Introduction to Python (Part 2)

Getting help

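In IPython, put ? before or after a name, or ?? after it, to get help (for example ?range, range? or range??). Alternatively, use the built-in help function.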

In [1]:
help(range)
Help on class range in module builtins:

class range(object)
 |  range(stop) -> range object
 |  range(start, stop[, step]) -> range object
 |
 |  Return an object that produces a sequence of integers from start (inclusive)
 |  to stop (exclusive) by step.  range(i, j) produces i, i+1, i+2, ..., j-1.
 |  start defaults to 0, and stop is omitted!  range(4) produces 0, 1, 2, 3.
 |  These are exactly the valid indices for a list of 4 elements.
 |  When step is given, it specifies the increment (or decrement).
 |
 |  Methods defined here:
 |
 |  __contains__(self, key, /)
 |      Return key in self.
 |
 |  __eq__(self, value, /)
 |      Return self==value.
 |
 |  __ge__(self, value, /)
 |      Return self>=value.
 |
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |
 |  __getitem__(self, key, /)
 |      Return self[key].
 |
 |  __gt__(self, value, /)
 |      Return self>value.
 |
 |  __hash__(self, /)
 |      Return hash(self).
 |
 |  __iter__(self, /)
 |      Implement iter(self).
 |
 |  __le__(self, value, /)
 |      Return self<=value.
 |
 |  __len__(self, /)
 |      Return len(self).
 |
 |  __lt__(self, value, /)
 |      Return self<value.
 |
 |  __ne__(self, value, /)
 |      Return self!=value.
 |
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.
 |
 |  __reduce__(...)
 |      helper for pickle
 |
 |  __repr__(self, /)
 |      Return repr(self).
 |
 |  __reversed__(...)
 |      Return a reverse iterator.
 |
 |  count(...)
 |      rangeobject.count(value) -> integer -- return number of occurrences of value
 |
 |  index(...)
 |      rangeobject.index(value, [start, [stop]]) -> integer -- return index of value.
 |      Raise ValueError if the value is not present.
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |
 |  start
 |
 |  step
 |
 |  stop
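
The methods and data descriptors listed in the help output can be tried out directly. The following is a small sketch; the particular range (2, 20, 3) is an arbitrary choice for illustration.

```python
# A range with explicit start, stop and step.
r = range(2, 20, 3)          # produces 2, 5, 8, 11, 14, 17

# The data descriptors listed at the end of the help output.
bounds = (r.start, r.stop, r.step)

# Methods from the listing: __len__, __contains__, index and count.
length = len(r)
has_eleven = 11 in r
position = r.index(8)
occurrences = r.count(5)

# The text that help() displays comes from the docstring.
first_doc_line = range.__doc__.splitlines()[0]

print(bounds, length, has_eleven, position, occurrences)
print(first_doc_line)
```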

Magics

See the IPython documentation for details of magic functions. Perhaps the most useful for data science is the ability to move data easily between Python and R.

In [2]:
%load_ext rpy2.ipython
In [4]:
x = %R 1:15
In [5]:
x
Out[5]:
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15], dtype=int32)
In [8]:
y = %R -i x x^2 + 3 + rnorm(length(x))
In [9]:
y
Out[9]:
array([   3.9395204 ,    5.34805803,   11.25033399,   19.58789804,
         28.24917345,   39.13577363,   52.51709934,   67.27538519,
         87.26379034,  102.63916992,  123.36500302,  146.54990275,
        171.79786825,  199.2817583 ,  228.31390622])
In [11]:
%%R -i x,y -o m

m <- lm(y ~ x)
summary(m)

Call:
lm(formula = y ~ x)

Residuals:
    Min      1Q  Median      3Q     Max
-18.492 -14.547  -3.388  10.827  30.470

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -42.573      9.490  -4.486 0.000613 ***
x             16.043      1.044  15.370 1.02e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 17.47 on 13 degrees of freedom
Multiple R-squared:  0.9478,        Adjusted R-squared:  0.9438
F-statistic: 236.2 on 1 and 13 DF,  p-value: 1.022e-09


In [18]:
%%R

coef(m)
(Intercept)           x
  -42.57263    16.04253

In [20]:
import numpy as np
In [25]:
np.array(m.rx('coefficients'))
Out[25]:
array([[-42.57263055,  16.04253416]])
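
As a rough cross-check that does not need R, the same straight-line fit can be sketched in plain Python using the closed-form least-squares formulas. This is an illustration only: it assumes a noise-free version of the data (y = x^2 + 3 without the rnorm term), so the slope and intercept come out close to, but not equal to, the R estimates above.

```python
# Noise-free stand-in for the data generated in R above (assumption: no rnorm term).
xs = list(range(1, 16))
ys = [x**2 + 3 for x in xs]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least squares for a single predictor:
# slope = covariance(x, y) / variance(x), intercept = mean_y - slope * mean_x
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

print(intercept, slope)
```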

Algorithmic complexity

More on data collections

Tuple and namedtuple

In [28]:
x = ('Tom', 'Jones', 'chemistry', 'statistics', '3.5')
In [27]:
from collections import namedtuple
In [29]:
student = namedtuple('student', 'first last major minor gpa')
In [30]:
x = student('Tom', 'Jones', 'chemistry', 'statistics', '3.5')
In [34]:
x
Out[34]:
student(first='Tom', last='Jones', major='chemistry', minor='statistics', gpa='3.5')
In [33]:
x.first, x.last
Out[33]:
('Tom', 'Jones')
In [32]:
x.gpa
Out[32]:
'3.5'
In [36]:
y = student(last='Rice', first='Anne', major='Philosophy', minor='English', gpa=4.0)
In [37]:
y
Out[37]:
student(first='Anne', last='Rice', major='Philosophy', minor='English', gpa=4.0)
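
A namedtuple behaves like an ordinary tuple with named fields added. The sketch below shows a few consequences; the capitalised class name Student and the numeric gpa are choices made for this illustration.

```python
from collections import namedtuple

# Same field names as above; the class name is capitalised here by convention.
Student = namedtuple('Student', 'first last major minor gpa')
tom = Student('Tom', 'Jones', 'chemistry', 'statistics', 3.5)

# A namedtuple is still a tuple: indexing and unpacking work as usual.
first, last, *rest = tom

# Fields are read-only; assignment raises AttributeError.
try:
    tom.gpa = 4.0
    mutable = True
except AttributeError:
    mutable = False

# _replace returns a modified copy; _asdict converts to a mapping.
tom_updated = tom._replace(gpa=3.7)
as_dict = dict(tom._asdict())

print(first, last, rest, mutable, tom_updated.gpa, as_dict)
```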

Unpacking

In [44]:
a, b, c, d = range(1, 5)
In [45]:
a, b, c, d
Out[45]:
(1, 2, 3, 4)
In [46]:
a, *b, c, d = range(1, 10)
In [47]:
a, b, c, d
Out[47]:
(1, [2, 3, 4, 5, 6, 7], 8, 9)
In [48]:
a, b
Out[48]:
(1, [2, 3, 4, 5, 6, 7])
In [49]:
a, b = b, a
In [50]:
a, b
Out[50]:
([2, 3, 4, 5, 6, 7], 1)
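
Unpacking also works in loop headers, on nested structures, and in function calls. A short sketch, with the names pairs, add and args made up for the example:

```python
# Unpacking each pair directly in the loop header.
pairs = [('a', 1), ('b', 2), ('c', 3)]
letters = []
total = 0
for letter, number in pairs:
    letters.append(letter)
    total += number

# Nested unpacking mirrors the shape of the data.
(x, y), label = (1, 2), 'point'

# A starred name may sit anywhere, but only one is allowed per target list.
*init, last = range(5)

# The same star syntax spreads a sequence over positional arguments.
def add(a, b, c):
    return a + b + c

args = (1, 2, 3)
result = add(*args)             # same as add(1, 2, 3)

print(letters, total, x, y, label, init, last, result)
```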

Appending and extending lists

In [38]:
xs = []
In [39]:
xs.append(1)
In [40]:
xs.append(2)
In [41]:
xs
Out[41]:
[1, 2]
In [42]:
xs.extend([3,4,5])
In [43]:
xs
Out[43]:
[1, 2, 3, 4, 5]
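
The difference between append and extend shows up when the argument is itself a list. A minimal sketch:

```python
# append adds its argument as a single element, even if it is a list.
nested = [1, 2]
nested.append([3, 4])

# extend adds each element of its argument.
flat = [1, 2]
flat.extend([3, 4])

# extend accepts any iterable, including a string.
chars = []
chars.extend('ab')

# += on a list extends it in place, so other references see the change.
alias = flat
flat += [5]

print(nested, flat, chars, alias)
```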

Dictionary creation and idioms

In [51]:
{'a': 1, 'b':2, 'c': 3}
Out[51]:
{'a': 1, 'b': 2, 'c': 3}
In [52]:
dict(a=1, b=2, c=3)
Out[52]:
{'a': 1, 'b': 2, 'c': 3}
In [53]:
dict(zip('abc', [1,2,3]))
Out[53]:
{'a': 1, 'b': 2, 'c': 3}
In [56]:
dict.fromkeys('abc', 0)
Out[56]:
{'a': 0, 'b': 0, 'c': 0}
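
One caveat with dict.fromkeys: every key receives the same value object. That is harmless for an immutable value such as 0, but surprising for a mutable one such as a list. The sketch below contrasts it with a dictionary comprehension, which creates a fresh list per key.

```python
# All three keys refer to one and the same list.
shared = dict.fromkeys('abc', [])
shared['a'].append(1)

# A comprehension evaluates [] once per key, giving independent lists.
separate = {k: [] for k in 'abc'}
separate['a'].append(1)

# get returns a default for a missing key instead of raising KeyError.
counts = {}
for char in 'abca':
    counts[char] = counts.get(char, 0) + 1

print(shared, separate, counts)
```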

Flow control

If-elif-else

In [57]:
for i in range(1, 31):
    if i % 15 == 0:
        print('fizzbuzz')
    elif i % 3 == 0:
        print('fizz')
    elif i % 5 == 0:
        print('buzz')
    else:
        print(i)
1
2
fizz
4
buzz
fizz
7
8
fizz
buzz
11
fizz
13
14
fizzbuzz
16
17
fizz
19
buzz
fizz
22
23
fizz
buzz
26
fizz
28
29
fizzbuzz

Ternary if operator

In [59]:
i = 45
x = 'fizzbuzz' if i % 15 == 0 else i
x
Out[59]:
'fizzbuzz'
In [60]:
i = 42
x = 'fizzbuzz' if i % 15 == 0 else i
x
Out[60]:
42
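
The conditional expression is equivalent to a short if/else statement, and because it is an expression it can be used inside other expressions such as a comprehension. A small sketch with made-up helper names:

```python
def with_statement(i):
    # The same logic written as an if/else statement.
    if i % 15 == 0:
        x = 'fizzbuzz'
    else:
        x = i
    return x

def with_expression(i):
    # The ternary form: value_if_true if condition else value_if_false.
    return 'fizzbuzz' if i % 15 == 0 else i

# Being an expression, it fits inside a list comprehension.
labels = ['even' if n % 2 == 0 else 'odd' for n in range(4)]

print(with_statement(45), with_expression(42), labels)
```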

The for loop

In [61]:
for x in range(3):
    print(x**2)
0
1
4

Using enumerate

In [86]:
for i, x in enumerate(range(10, 15)):
    print(i, x)
0 10
1 11
2 12
3 13
4 14
In [87]:
for i, x in enumerate(range(10, 15), start=1):
    print(i, x)
1 10
2 11
3 12
4 13
5 14
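
enumerate replaces the common pattern of looping over range(len(...)) and indexing by hand. A sketch comparing the two, using a made-up list of names:

```python
names = ['ann', 'bob', 'cy']

# Index-based loop: works, but the index bookkeeping is manual.
by_index = []
for i in range(len(names)):
    by_index.append((i, names[i]))

# enumerate yields the same (index, item) pairs directly.
by_enumerate = list(enumerate(names))

# start only changes the counter, not which items are visited.
numbered = [f'{n}. {name}' for n, name in enumerate(names, start=1)]

print(by_index, by_enumerate, numbered)
```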

Nested loops

In [62]:
for x in range(3):
    for y in range(3, 5):
        print(x, y)
0 3
0 4
1 3
1 4
2 3
2 4

The while loop

In [63]:
i = 0
while i < 5:
    print(i)
    i += 1
0
1
2
3
4
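
A while loop is most useful when the number of iterations is not known in advance. The sketch below counts the steps of the Collatz rule (halve if even, otherwise triple and add one) until the value reaches 1; the function name is made up, and the loop assumes the sequence does reach 1, which holds for the small inputs used here.

```python
def collatz_steps(n):
    """Count how many Collatz steps it takes to get from n down to 1."""
    steps = 0
    while n != 1:
        if n % 2 == 0:
            n = n // 2
        else:
            n = 3 * n + 1
        steps += 1
    return steps

steps_from_6 = collatz_steps(6)   # 6, 3, 10, 5, 16, 8, 4, 2, 1
steps_from_1 = collatz_steps(1)   # the loop body never runs
print(steps_from_6, steps_from_1)
```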

pass, continue and break

In [64]:
for i in range(5):
    if i == 3:
        pass
    print(i)
0
1
2
3
4
In [65]:
for i in range(5):
    if i == 3:
        continue
    print(i)
0
1
2
4
In [66]:
for i in range(5):
    if i == 3:
        break
    print(i)
0
1
2
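
A related feature is the else clause on a loop, which runs only when the loop ends without hitting break. A sketch of a simple search, with a hypothetical helper first_multiple:

```python
def first_multiple(values, k):
    """Return the first value divisible by k, or None if there is none."""
    for v in values:
        if v % k == 0:
            result = v
            break          # leave the loop early; the else clause is skipped
    else:
        result = None      # runs only when the loop finished without break
    return result

found = first_multiple([4, 9, 10, 15], 5)
missing = first_multiple([4, 9, 11], 5)
print(found, missing)
```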

Iterable objects and iterators

Iterators

In [67]:
xs = range(5)
In [68]:
next(xs)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-68-c652a0de46c2> in <module>()
----> 1 next(xs)

TypeError: 'range' object is not an iterator
In [69]:
it = iter(xs)
In [70]:
next(it), next(it), next(it), next(it), next(it)
Out[70]:
(0, 1, 2, 3, 4)
In [71]:
next(it)
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-71-2cdb14c0d4d6> in <module>()
----> 1 next(it)

StopIteration:
In [72]:
for x in it:
    print(x)
In [76]:
it = iter(xs)
for x in it:
    print(x)
0
1
2
3
4
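
The cells above show the two halves of the iteration protocol: iter turns an iterable into an iterator, and next pulls values from it until StopIteration is raised. The sketch below spells out roughly what a for loop does with these two functions, and why the exhausted iterator printed nothing while the range could be looped over again.

```python
xs = range(3)

# Roughly what `for x in xs: ...` does behind the scenes.
it = iter(xs)
seen = []
while True:
    try:
        x = next(it)
    except StopIteration:
        break              # the iterator is exhausted
    seen.append(x)

# An iterator is its own iterator, so it cannot be restarted...
same_object = iter(it) is it
leftover = list(it)

# ...whereas the range hands out a fresh iterator on every call to iter().
again = list(xs)

print(seen, same_object, leftover, again)
```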

Generators

In [80]:
def squares(i=1):
    while True:
        yield i**2
        i = i + 1
In [84]:
for x in squares():
    if x > 100:
        break
    print(x)
1
4
9
16
25
36
49
64
81
100

Comprehensions

Generator comprehensions

In [77]:
(i**2 for i in range(5))
Out[77]:
<generator object <genexpr> at 0x113e32c50>
In [78]:
list(i**2 for i in range(5))
Out[78]:
[0, 1, 4, 9, 16]
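
A generator comprehension is lazy and single-use: values are produced on demand and are gone once consumed. A short sketch:

```python
squares_gen = (i**2 for i in range(5))

# Functions that consume an iterable can take the generator directly;
# no intermediate list is built.
total = sum(squares_gen)          # 0 + 1 + 4 + 9 + 16

# The generator is now exhausted, so a second pass yields nothing.
second_pass = list(squares_gen)

# When the generator expression is the only argument,
# the extra pair of parentheses can be dropped.
largest = max(i**2 for i in range(5))

print(total, second_pass, largest)
```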

List comprehensions

In [88]:
[i**2 for i in range(5)]
Out[88]:
[0, 1, 4, 9, 16]

Dictionary comprehensions

In [91]:
{k: v for (v, k) in enumerate('abc', 1)}
Out[91]:
{'a': 1, 'b': 2, 'c': 3}
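
Dictionary comprehensions are also handy for transforming an existing dictionary. A sketch with a made-up dictionary codes, showing inversion, filtering and value mapping:

```python
codes = {'a': 1, 'b': 2, 'c': 3}

# Swap keys and values (safe here because the values are unique).
inverse = {v: k for k, v in codes.items()}

# Keep only the entries that satisfy a condition.
odd_codes = {k: v for k, v in codes.items() if v % 2 == 1}

# Transform the values while keeping the keys.
squared = {k: v**2 for k, v in codes.items()}

print(inverse, odd_codes, squared)
```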

Set comprehensions

In [92]:
{char for char in 'tweedledum'}
Out[92]:
{'d', 'e', 'l', 'm', 't', 'u', 'w'}
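
A set keeps one copy of each element and does not guarantee any particular order, so the display order of a set should not be relied on. A small sketch using the same word:

```python
word = 'tweedledum'

# Duplicates collapse: ten characters, seven distinct letters.
letters = {char for char in word}
unique_count = len(letters)

# sorted gives a list in a well-defined order.
ordered = sorted(letters)

# A condition can filter elements, as in the other comprehensions.
consonants = {char for char in word if char not in 'aeiou'}

print(unique_count, ordered, sorted(consonants))
```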

Nested comprehensions

In [93]:
[(i, j) for i in 'abc' for j in range(3)]
Out[93]:
[('a', 0),
 ('a', 1),
 ('a', 2),
 ('b', 0),
 ('b', 1),
 ('b', 2),
 ('c', 0),
 ('c', 1),
 ('c', 2)]
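
The for clauses in a nested comprehension run in the same order as the equivalent nested loops, with the leftmost clause as the outer loop. The sketch below checks this, and also compares against itertools.product from the standard library.

```python
from itertools import product

pairs = [(i, j) for i in 'abc' for j in range(3)]

# The same result written as explicit nested loops.
loop_pairs = []
for i in 'abc':
    for j in range(3):
        loop_pairs.append((i, j))

# itertools.product generates the same Cartesian product.
product_pairs = list(product('abc', range(3)))

# A trailing if clause can refer to both loop variables.
upper = [(i, j) for i in range(3) for j in range(3) if i < j]

print(pairs == loop_pairs == product_pairs, upper)
```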