Introduction to python
Introduction, background, basics
Python is an interpreted language (unlike, for example, Fortran, which is a compiled language). Scripts or code can be written in a text editor and then run via python, or commands can be entered interactively at the command line. There are different versions of python, specifically python-2.7 and python-3, which differ in some important ways. In addition, there are development (e.g., idle), interactive (e.g., ipython, jupyter) and GUI-based (e.g., spyder) tools that can be used.
In this section we will just cover some basics needed to work with data and make some straightforward plots. One of the most important things to understand in python is that indentation is extremely important. Most compiled languages make use of code blocks contained between beginning and ending statements, for example, “if” and “endif”, “subroutine” and “return”, “do” and “continue”, and so on. In python, blocks are delineated by indenting the code within that block (same column). For example, an “if” statement is not ended by an “endif” but rather by the line of code that lines up in the same column as the “if” statement itself. We will go through several examples.
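As a minimal sketch of how indentation delimits a block (the variable names and the value 30 are made up purely for illustration):

```python
# Indentation defines the block: the indented lines belong to the "if" or "else".
temperature = 30  # hypothetical value, chosen only for illustration

if temperature > 25:
    message = "warm"
    print("inside the if block")
else:
    message = "cool"
    print("inside the else block")

# This line lines up with "if", so it is outside the block and always runs.
print("message =", message)
```

There is no “endif”; the first line that returns to the column of the “if” ends the block.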
Another thing to keep in mind is that, since python is an interpreted language, a function definition has to be executed before the function is called; in a script this usually means placing definitions first, otherwise a NameError (name is not defined) will occur. Compiled programs, on the other hand, can have subroutines defined at the beginning or end of a program.
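The ordering rule can be sketched as follows. The functions “double” and “triple” are hypothetical helpers, and the try/except is used here only so that the example keeps running after the error:

```python
def double(value):
    """Return twice the input (hypothetical helper for illustration)."""
    return 2 * value

# Works: double() was defined above this line.
result = double(4)
print(result)

# Calling a name that has not been defined yet raises NameError.
try:
    triple(4)
except NameError as err:
    error_text = str(err)
    print("NameError:", error_text)

def triple(value):
    """Defined too late for the call above, but fine for calls below."""
    return 3 * value

print(triple(4))
```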
Finally, while python has several “built-in” features or commands, to keep it lightweight almost all of the functions are kept externally. These are brought into a script when needed. For example, the package (library) NumPy (numpy) contains many useful functions. These are not loaded in by default, instead they need to be loaded (imported) explicitly.
Interactive commands
We will first start with an interactive python-3 session by entering “python3” at the terminal command prompt. Note that python3 will need to be in your path for this to work (see Section x.x). The terminal should respond with the prompt “>>>”, at which you can enter commands. In a jupyter notebook, however, this prompt is not shown.
Like other languages, python allows for a certain amount of interactive algebra, e.g.,
[38]:
3 + 10 * 30**2
[38]:
9003
The standard symbols +, -, * and /, along with ** (to the power) may all be used. Note that the symbol ‘^’, which is used to raise numbers to an exponent in other languages, does a “bit-wise exclusive OR” operation.
[3]:
3 * 4
[3]:
12
[4]:
3**2
[4]:
9
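To make the difference between ** and ^ concrete, here is a short sketch comparing the two on the same pair of numbers:

```python
power = 3 ** 2   # exponentiation: 3 squared
xor = 3 ^ 2      # bit-wise exclusive OR: 0b11 ^ 0b10 = 0b01

print(power)
print(xor)
```

The first line printed is 9 and the second is 1, so accidentally using ^ for a power gives a wrong answer without raising an error.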
Next, we will assign values to variables. Here it is worthwhile to note that python has four primitive variable types: integers, floats, strings and booleans. Unlike in many other programming languages, these do not need to be declared a priori (e.g., in Fortran you would declare a variable as “integer” before associating it with an integer value). Note that boolean variables can be “True” or “False”, which correspond to 1 and 0, respectively. For example:
[5]:
A = "help"
B = 5
C = 5.0
D = True
This sets A to a string, B to an integer, C to a float and D to a boolean. Arithmetic operations tend to preserve these classes, e.g., a float plus a float returns a float. Mixed cases are allowed between integers and floats (and booleans, which behave like 1 and 0), but not between strings and numbers. Two strings, however, can be added together (concatenated). For example,
[6]:
A = "help"
B = " me"
C = A + B
print(C)
help me
This, however, does not work:
[7]:
A = "help"
B = 4
C = A + B
print(C)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Input In [7], in <module>
1 A = "help"
2 B = 4
----> 3 C = A + B
4 print(C)
TypeError: can only concatenate str (not "int") to str
You’ll get a “TypeError” since A is class string and B is class integer. As a final note, python allows multiple assignments in a single line. This can be done either with comma or semi-colon:
[8]:
A, B, C = 2, 4, 8
print(A,B,C)
A = 2; B = 4; C = 8
print(A,B,C)
2 4 8
2 4 8
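Returning to the TypeError shown earlier, one possible way around it is to convert explicitly with the built-in functions str() and int(). This is only a sketch of the idea; which conversion is appropriate depends on what result is wanted:

```python
A = "help"
B = 4

# Convert the integer to a string before concatenating.
C = A + str(B)
print(C)

# Going the other way: convert a numeric string to an integer before adding.
D = int("4") + B
print(D)
```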
Internal modules
At this point you’ve probably noticed that python, when run interactively, will display variable values if you enter the variable name by itself, e.g.,
[8]:
A = "help"
B = " me"
A
[8]:
'help'
However, there are certain intrinsic functions built in to python, the print function used above being one example. At this point it might be worth a bit of background.
In python, all variables/constants/functions are objects with a type and associated attributes (features) and methods (also called functions). Attributes (features) are aspects of the specific object. For example, “make”, “model”, “year” might be attributes of the object “car”. Methods (functions) are actions that objects can perform. Following our example, methods of the object “car” might be “start”, “stop”, “accelerate”, etc.
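The car analogy can be written out as a toy class. This is a hypothetical sketch (the class “Car” and its members are invented here for illustration), meant only to show the difference between attributes and methods:

```python
class Car:
    """Toy class (hypothetical) mirroring the car analogy."""

    def __init__(self, make, model, year):
        # Attributes: data stored on the object.
        self.make = make
        self.model = model
        self.year = year
        self.running = False

    def start(self):
        # Method: an action the object can perform.
        self.running = True

    def stop(self):
        self.running = False


my_car = Car("Toyota", "Corolla", 2015)
print(my_car.make, my_car.year)   # attributes are read without parentheses
my_car.start()                    # methods are called with parentheses
print(my_car.running)
```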
As mentioned, python has a limited number of built-in functions, print being one. The list for python-3 (as of version 3.8; later releases add a few more, such as aiter() and anext()) is:
. | . | . | . | . |
---|---|---|---|---|
abs() | delattr() | hash() | memoryview() | set() |
all() | dict() | help() | min() | setattr() |
any() | dir() | hex() | next() | slice() |
ascii() | divmod() | id() | object() | sorted() |
bin() | enumerate() | input() | oct() | staticmethod() |
bool() | eval() | int() | open() | str() |
breakpoint() | exec() | isinstance() | ord() | sum() |
bytearray() | filter() | issubclass() | pow() | super() |
bytes() | float() | iter() | print() | tuple() |
callable() | format() | len() | property() | type() |
chr() | frozenset() | list() | range() | vars() |
classmethod() | getattr() | locals() | repr() | zip() |
compile() | globals() | map() | reversed() | __import__() |
complex() | hasattr() | max() | round() | |
Some of these functions are self-explanatory, e.g., abs() returns the absolute value of a variable. Note there is a help function as well. These all work the same in the sense of function and argument in parentheses, e.g., print(A) where print is the function and A is the argument.
Another useful built-in function is “type” that returns the variable class:
[9]:
A = True
B = 1.0
C = 1
D = "hello"
type(A), type(B), type(C), type(D)
[9]:
(bool, float, int, str)
There are also built-in functions to change a variable’s class, e.g., int() can be used to change a variable to an integer. Note that classes are set “dynamically”, in the sense that an operation between two integers yields an integer (except for division with “/”, which always yields a float), an operation between an integer and a float yields a float, and so on:
[11]:
A = 5
B = 2
C = A * B
D = A / B
type(C), type(D)
print("C = ",C,"D=",D)
C = 10 D= 2.5
[19]:
B = "275"
A = int(B)
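A few more cases of how the resulting class is determined, and of explicit conversion, are sketched below. The floor-division operator // is not discussed above and is included here only for comparison with /:

```python
A = 5
B = 2

print(A / B, type(A / B))       # true division always gives a float
print(4 / 2, type(4 / 2))       # still a float, even though it divides evenly
print(A // B, type(A // B))     # floor division of two integers gives an int
print(A * 1.0, type(A * 1.0))   # integer combined with float gives a float

print(int("275") + 1)           # string converted to an integer
print(float("2.5"))             # string converted to a float
print(int(2.9))                 # int() drops the fractional part
```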
Methods within objects are accessed a little differently, specifically by using the dot (.) as a separator. For example, the object “plot” might have a method called “line”, and this would be accessed as plot.line() or plot.line(form="dashed"). In an interactive session (e.g., ipython or jupyter), typing the object name and a dot and then pressing the Tab key will list all the available attributes and methods. Similarly, dir(object) will list them (dir is in the table above). As a quick example, we define a variable X to be the integer 5. Remember that all variables are objects, so we can list the attributes of X using X. followed by the Tab key, or dir(X). For an integer this gives things like “bit_length”, “real”, “to_bytes”, etc. Note that it will return different methods depending on the variable’s class. For example:
[11]:
X = 5
dir(X)
[11]:
['__abs__',
'__add__',
'__and__',
'__bool__',
'__ceil__',
'__class__',
'__delattr__',
'__dir__',
'__divmod__',
'__doc__',
'__eq__',
'__float__',
'__floor__',
'__floordiv__',
'__format__',
'__ge__',
'__getattribute__',
'__getnewargs__',
'__gt__',
'__hash__',
'__index__',
'__init__',
'__init_subclass__',
'__int__',
'__invert__',
'__le__',
'__lshift__',
'__lt__',
'__mod__',
'__mul__',
'__ne__',
'__neg__',
'__new__',
'__or__',
'__pos__',
'__pow__',
'__radd__',
'__rand__',
'__rdivmod__',
'__reduce__',
'__reduce_ex__',
'__repr__',
'__rfloordiv__',
'__rlshift__',
'__rmod__',
'__rmul__',
'__ror__',
'__round__',
'__rpow__',
'__rrshift__',
'__rshift__',
'__rsub__',
'__rtruediv__',
'__rxor__',
'__setattr__',
'__sizeof__',
'__str__',
'__sub__',
'__subclasshook__',
'__truediv__',
'__trunc__',
'__xor__',
'as_integer_ratio',
'bit_length',
'conjugate',
'denominator',
'from_bytes',
'imag',
'numerator',
'real',
'to_bytes']
This shows that there are several built-in methods that can be used on a variable of type integer (X). Compare this to the following:
[20]:
X = 5.5
dir(X)
[20]:
['__abs__',
'__add__',
'__bool__',
'__ceil__',
'__class__',
'__delattr__',
'__dir__',
'__divmod__',
'__doc__',
'__eq__',
'__float__',
'__floor__',
'__floordiv__',
'__format__',
'__ge__',
'__getattribute__',
'__getformat__',
'__getnewargs__',
'__gt__',
'__hash__',
'__init__',
'__init_subclass__',
'__int__',
'__le__',
'__lt__',
'__mod__',
'__mul__',
'__ne__',
'__neg__',
'__new__',
'__pos__',
'__pow__',
'__radd__',
'__rdivmod__',
'__reduce__',
'__reduce_ex__',
'__repr__',
'__rfloordiv__',
'__rmod__',
'__rmul__',
'__round__',
'__rpow__',
'__rsub__',
'__rtruediv__',
'__set_format__',
'__setattr__',
'__sizeof__',
'__str__',
'__sub__',
'__subclasshook__',
'__truediv__',
'__trunc__',
'as_integer_ratio',
'conjugate',
'fromhex',
'hex',
'imag',
'is_integer',
'real']
As mentioned, methods are accessed by appending the object (variable in this example) with the method name. Here are some examples:
[25]:
# this is a comment; hash tag is used to comment code
# here we set A = 5
A = 5
# another way to comment is with three single quotes; this starts and
# ends a comment section
B = 6
'''
this whole section will be comments and not interpreted by python
I can put whatever here
blah
'''
C = A + B # this line will add A and B
[26]:
# string example, __add__ method
A = "This is a string"
B = A.__add__(" right here")
print(B)
# float example, hex method
A = 500.0
B = A.hex
print(B)
# “fill in the blank” syntax using brackets ({}’s) and the format function with strings:
A = "I will go to {} at {}"
B = A.format("the store", "3:00")
print(B)
This is a string right here
<built-in method hex of float object at 0x7f7d2b571550>
I will go to the store at 3:00
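Note that in the float example above, A.hex is written without parentheses, so B is bound to the method itself rather than to its result, which is why the printed value reads “built-in method hex”. A short sketch of the difference (the variable names are chosen here for illustration):

```python
A = 500.0

method_object = A.hex    # no parentheses: a reference to the method itself
hex_string = A.hex()     # parentheses: the method is actually called

print(callable(method_object))
print(type(hex_string))
print(hex_string)
```

The same applies to any method: leaving off the parentheses is not an error, it simply does not call the method.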
At this point we’ve discussed variables and basic functions in python. In addition, python makes use of three different variable structures: lists, tuples and dictionaries. The most straightforward is a “list”. A list can be assigned to a variable with the use of square brackets ([]). You can think of a list like an array (row, column, table, etc.) of values. For example,
[28]:
A = [1, 5, 10, -1, 8]
type(A)
[28]:
list
Tuples are like lists, but they are “immutable”, i.e., their elements cannot be changed once the tuple is created. They are defined with parentheses instead of square brackets, so the difference in notation is subtle:
[29]:
A = (1, 4, 10, 11)
type(A)
[29]:
tuple
Before getting into the details of lists (we will, don’t worry), you access elements in a list with an index number, much like we did in Matlab. Note the extremely important difference between ()’s and []’s: python uses []’s for indexing and ()’s for function calls. Also note! python starts indexing with 0, not 1 as in Matlab or Fortran. You’ll either come to love this, or it will become a huge annoyance (e.g., the first value in a list has index 0, the second has an index of 1, and so on). A negative index counts from the end, so -1 refers to the last value.
[32]:
A = [ 1, 5, 10, -1, 8]
print(A[-1])
# replace 3rd value, 10, with a 20:
A[2] = 20
print(A)
8
[1, 5, 20, -1, 8]
[36]:
A = [1,5,10,-1,8]
A[2] = 20
We can also use this example to see the difference between a list (above, []’s) and a tuple (below, ()’s)
[17]:
A = ( 1, 5, 10, -1, 8 )
print(A[2])
# replace 3rd value, 10, with a 20:
A[2] = 20
print(A)
10
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-17-b41b8dc77efe> in <module>
2 print(A[2])
3 # replace 3rd value, 10, with a 20:
----> 4 A[2] = 20
5 print(A)
TypeError: 'tuple' object does not support item assignment
Finally, python has something called a dictionary data structure. These are like lists, but they contain key:value pairs. We will talk about these much later. One final useful function is “len” that shows how many elements are in a list or tuple:
[18]:
A = [1, 4, 10, 11]
len(A)
[18]:
4
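Although dictionaries are covered later, a brief sketch may help fix the idea of key:value pairs. The station names and temperatures below are made-up example data:

```python
# Hypothetical example data: station names mapped to temperatures.
temps = {"Hilo": 24.5, "Kona": 27.0}

print(temps["Hilo"])        # look up a value by its key
temps["Lihue"] = 25.5       # add a new key:value pair
print(len(temps))           # len() counts the pairs
print(list(temps.keys()))   # the keys, in insertion order
```

Where a list is indexed by position, a dictionary is indexed by key, and len() works on both.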
Individual elements in lists are addressed by an index number in []’s, so A[x] returns the element with index x in list A. As noted above, list elements are counted from 0, so the first number in a list is referenced as the 0th. A range in a list is obtained using a colon, A[start:end]. Again, to make it a little more confusing, the elements are referenced up to but not including the end index, so while A[2:6] seems to reference elements 2 through 6 in list A, it really means the 3rd (since it starts counting at zero) through the 6th (indices 2, 3, 4 and 5, since it doesn’t include index 6). An optional “increment” can also be included, A[start:end:increment]. Finally, the defaults are the beginning and the end of the list, and are used if no number is given, e.g., A[::2] means start at 0, go to the end, increment by two. Negative numbers can be used to count from the end or to go in the opposite “direction”. Here are some examples:
[37]:
A = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]
print(len(A))
print(A[3])
print(A[4:9])
print(A[2:6:2])
print(A[::3])
print(A[6:2:-1])
print(A[-1])
print(A[::-1])
11
40
[50, 60, 70, 80, 90]
[30, 50]
[10, 40, 70, 100]
[70, 60, 50, 40]
110
[110, 100, 90, 80, 70, 60, 50, 40, 30, 20, 10]
Like with variables, there are certain functions available with lists, tuples and dictionaries. For example,
[20]:
A = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]
dir(A)
[20]:
['__add__',
'__class__',
'__contains__',
'__delattr__',
'__delitem__',
'__dir__',
'__doc__',
'__eq__',
'__format__',
'__ge__',
'__getattribute__',
'__getitem__',
'__gt__',
'__hash__',
'__iadd__',
'__imul__',
'__init__',
'__init_subclass__',
'__iter__',
'__le__',
'__len__',
'__lt__',
'__mul__',
'__ne__',
'__new__',
'__reduce__',
'__reduce_ex__',
'__repr__',
'__reversed__',
'__rmul__',
'__setattr__',
'__setitem__',
'__sizeof__',
'__str__',
'__subclasshook__',
'append',
'clear',
'copy',
'count',
'extend',
'index',
'insert',
'pop',
'remove',
'reverse',
'sort']
Here you can see certain useful things that, hopefully, are self-explanatory (e.g., sort, reverse, etc.). As a final example, we will combine the “split” method (for strings) and the len() built-in function.
First, we define a string (students) that has names in it. Note, this is not a list, tuple or other kind of array; it’s a simple variable of type “str”. Next, we use the “split” method for strings to separate items using a comma as the specifier (comma denotes split place). The split method returns the pieces as a list. In other words, we start with a single variable (students) and end with a list of names. Note that the space after each comma is kept as part of the following name.
[21]:
students = "Matthew, Mark, Luke, John, Mary, Elizabeth"
print(type(students))
print(len(students))
A = students.split(',')
print(type(A))
print(len(A))
print(A)
<class 'str'>
42
<class 'list'>
6
['Matthew', ' Mark', ' Luke', ' John', ' Mary', ' Elizabeth']
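The names in the output above keep the space that followed each comma. Two possible ways to avoid this are sketched below; the list comprehension in the first option is a construct not yet covered and is used here only for brevity:

```python
students = "Matthew, Mark, Luke, John, Mary, Elizabeth"

# Option 1: split on the comma, then strip whitespace from each piece.
names = [name.strip() for name in students.split(',')]

# Option 2: split on the two-character separator ", " directly.
names_alt = students.split(', ')

print(names)
print(names == names_alt)
```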
Some classes have their own built-in operators, e.g., “>” (greater than) for numeric types. Another powerful one is “in”, which tests whether a value is contained in a sequence (list, tuple, string, etc.) and is often used with class “range”.
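A short sketch of the “in” operator with a few different classes, including its use with “range” in a loop (the values are arbitrary):

```python
A = [10, 20, 30]

in_list = 20 in A                 # membership test on a list
in_string = "ell" in "hello"      # substring test on a string
in_range = 5 in range(0, 10)      # membership test on a range
at_end = 10 in range(0, 10)       # the end of a range is excluded
print(in_list, in_string, in_range, at_end)

# "in" also drives a for loop over a range.
total = 0
for i in range(4):                # i takes the values 0, 1, 2, 3
    total = total + i
print(total)
```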
In the previous sections we learned about different types in python, e.g., int, list, etc. There are actually quite a few more. Here we simply list them; details can be found elsewhere.
Numeric: int, float, complex
Example method “hex”:
A = 12.0
B = A.hex()
print(B)
0x1.8000000000000p+3
B = 12.0
C = float.hex(B)
print(C)
0x1.8000000000000p+3
Sequence: list, tuple, range
Example method “sort” (note, sorts “in place”):
A = [10, 0, -1, 4, 2, 12, -20, 15]
A.sort()
print(A)
[-20, -1, 0, 2, 4, 10, 12, 15]
Example built-in function “sorted” (note, returns a new list and leaves the original unchanged):
A = [10, 0, -1, 4, 2, 12, -20, 15]
B = sorted(A)
print(B)
[-20, -1, 0, 2, 4, 10, 12, 15]
Text sequence: str
Example method “capitalize”:
A = 'this is a test'
B = str.capitalize(A)
print(B)
This is a test
Mapping: dict
Set: set, frozenset
Binary Sequence: bytes, bytearray, memoryview
Other: modules, classes, functions, methods
Functions and commands
As a quick example, we define a variable X to be a list of numbers:
[ ]:
X = [ -10, 5, 1, 2, 7, 4, 6]
Remember that all variables are objects, so we can list the attributes of X using X. followed by the Tab key, or dir(X). This gives things like “append”, “count”, “extend”, etc.:
[ ]:
dir(X)
To then take advantage of one of these, we use the “.” and method name, e.g.:
[ ]:
X.sort()
print(X)
Here X itself now has the new order. What happens if we instead try to assign the result to a new variable, e.g., Y = X.sort()?
[ ]:
X = [-10, 5, 1, 2, 7, 4, 6]
Y = X.sort()
print(X)
print(Y)
Wait, what? X.sort() reorders X itself (it sorts “in place”) and returns None, so Y is not a list. To get a new sorted list while leaving X unchanged, use the built-in function sorted(X) instead.
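The two behaviours can be compared side by side in a short sketch using the same list:

```python
X = [-10, 5, 1, 2, 7, 4, 6]

Y = sorted(X)      # built-in function: returns a new sorted list
print(X)           # X is still in its original order here
print(Y)

Z = X.sort()       # list method: sorts X in place and returns None
print(X)           # X is now sorted
print(Z)           # None
```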
Attributes can be distinguished from methods using “callable” (as methods are callable functions).
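As a sketch of that check, the loop below looks up a few names on an integer with the built-in getattr() and tests each one with callable(). The four names are taken from the dir() listing of an integer shown earlier; the dictionary “kinds” is just a convenient place to store the results:

```python
X = 5
kinds = {}

for name in ["real", "numerator", "bit_length", "to_bytes"]:
    member = getattr(X, name)   # look the name up on the object
    kinds[name] = "method" if callable(member) else "attribute"
    print(name, "->", kinds[name])
```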
Python scripts, typically ending in .py, are called modules. Modules can either be run directly, e.g., by entering python3 followed by the script name at the terminal, or can be imported by or within other scripts. Similarly, groups of modules bundled together are termed a “package”. Like modules, packages are imported into python when needed.
There are, however, a number of internal functions and modules that come with python (and thus don’t need to be imported). Built-in functions include things like abs (absolute value), print, float, etc. For these, the command can simply be used without importing anything. (See previous lecture)
The process of loading external packages or modules can be done in many different ways. First, the entire package or module can be loaded. The names within it are then accessed using the package or module name as a prefix and the name itself as a suffix. For example, let’s say there is a module called “sample_module” with a function called “dothis”:
import sample_module
x = sample_module.dothis
will bind the name “sample_module” to the object that is the “sample_module”. This does not give you access to any of the names inside the module itself, thus the need for the prefix. Alternately one could load all the names, e.g.,
from sample_module import *
x = dothis
This is usually not advisable since it may overwrite existing variable names. Specific names may be accessed through the prefix syntax, however. For example, to get the constant pi from the module numpy, one could import numpy and access it as numpy.pi. Alternately, a single function can be loaded directly from the module using the additional qualifier “from”, like the above:
from sample_module import dothis
x = dothis
The module may also be assigned a shorthand (alias) name to make things a little easier. This shorthand name is specified using “as” and is then used in place of the full name:
[ ]:
import numpy as np
x = np.sin(np.pi)
print(x)
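The same import styles can be tried without installing anything by using the standard-library math module. This is a sketch that swaps math in for numpy purely for convenience; the alias “m” is arbitrary:

```python
import math                  # whole module; names need the "math." prefix
from math import sqrt        # one name, usable without a prefix
import math as m             # whole module under a shorter alias

print(math.pi)
print(sqrt(16.0))
print(m.cos(0.0))

# sin(pi) is not exactly zero because of floating-point rounding,
# so compare against a small tolerance instead of testing for equality.
nearly_zero = abs(math.sin(math.pi)) < 1e-12
print(nearly_zero)
```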
To summarize with an example, here are four different ways of accessing the plotting function (plot) from the matplotlib package, which is in the pyplot set of routines (subpackage):
import package and subpackage, use the full name
import matplotlib.pyplot
matplotlib.pyplot.plot(x,y)
import single subpackage
from matplotlib import pyplot
pyplot.plot(x,y)
import subpackage as alias
from matplotlib import pyplot as plt
plt.plot(x,y)
import single function as alias
from matplotlib.pyplot import plot as plt
plt(x,y)
Some of the more useful modules that we will use include sys, which has system-level functions like exit; matplotlib, numpy and scipy, which allow for Matlab-like functions; pandas for reading data from files; and a few more. To find out what is available in a particular module, first import it, then use dir, e.g., import numpy followed by dir(numpy).
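As a small sketch of exploring a module, the example below uses the standard-library math module (chosen here only so that it runs without extra packages) and filters out the names that begin with an underscore:

```python
import math

# Keep only the public names (those not starting with an underscore).
public_names = [name for name in dir(math) if not name.startswith("_")]

print(len(public_names))
print(public_names[:5])

# Each function carries a short description in its __doc__ attribute.
print(math.sqrt.__doc__)
```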
Finally, it is possible to import local python scripts. For example, if you have your own utility in a python script called “my_sort.py”, you can import it as
import my_sort
Or, if it is in a different directory, use period (.) instead of slash (/) to demarcate directories. For example, if you keep your scripts in a subdirectory called “my_libs” and then another subdirectory called “system”, you can import as
import my_libs.system.my_sort
Note that this only works on paths under the current working directory. A better way would be to update the module search path within the main script (e.g., by appending the directory to sys.path) or to load the module with the importlib package.
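One possible way to import from an arbitrary directory is sketched below. To keep it self-contained, it first writes a tiny module into a temporary directory; the names “my_sort_demo” and “smallest_first” are hypothetical, and in practice the directory would be wherever your own scripts are kept:

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway directory containing a tiny module, purely for the demo.
lib_dir = tempfile.mkdtemp()
with open(os.path.join(lib_dir, "my_sort_demo.py"), "w") as f:
    f.write("def smallest_first(values):\n    return sorted(values)\n")

# Add the directory to the module search path, then import the module by name.
sys.path.append(lib_dir)
importlib.invalidate_caches()
my_sort_demo = importlib.import_module("my_sort_demo")

print(my_sort_demo.smallest_first([3, 1, 2]))
```

Once the directory is on sys.path, a plain “import my_sort_demo” statement would work as well; importlib.import_module is used here because it takes the module name as a string.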