Python: Functions¶
To a large extent, data analysis consists of a sequence of data transformations using functions. Given the centrality of functions in this course, this notebook covers Python function construction and usage in some depth.
Built-in functions¶
See Python docs
In [1]:
xs = ['apple', 'pear', 'grape', 'orange', 'rambutan', 'durian', 'longan', 'mango']
In [2]:
sorted(xs)
Out[2]:
['apple', 'durian', 'grape', 'longan', 'mango', 'orange', 'pear', 'rambutan']
In [3]:
sorted(xs, key=len)
Out[3]:
['pear', 'apple', 'grape', 'mango', 'orange', 'durian', 'longan', 'rambutan']
In [4]:
sorted(xs, key=len, reverse=True)
Out[4]:
['rambutan', 'orange', 'durian', 'longan', 'apple', 'grape', 'mango', 'pear']
In [5]:
max(xs)
Out[5]:
'rambutan'
In [6]:
max(xs, key=len)
Out[6]:
'rambutan'
In [7]:
min(xs)
Out[7]:
'apple'
In [8]:
min(xs, key=len)
Out[8]:
'pear'
In [9]:
sum(map(len, xs))
Out[9]:
45
Custom functions¶
It is very simple to write a function. The docstring (a triple-quoted string with one or more lines of function documentation) is not mandatory but highly recommended.
In [10]:
def power(x, n=2):
    """Returns x to the nth power.
    n has a default value of 2."""
    return x**n
In [11]:
help(power)
Help on function power in module __main__:
power(x, n=2)
    Returns x to the nth power.
    n has a default value of 2.
Give named arguments out of order¶
In [14]:
power(n=0.5, x=3)
Out[14]:
1.7320508075688772
With arbitrary arguments¶
In [15]:
def f(a, b, *args, **kwargs):
    """Example to illustrate the use of * and ** arguments."""
    return a, b, args, kwargs
In [16]:
a, b, args, kwargs = f(1, 2, 3, 4, 5, x=10, y=11, z=13)
In [17]:
a, b
Out[17]:
(1, 2)
In [18]:
args
Out[18]:
(3, 4, 5)
In [19]:
kwargs
Out[19]:
{'x': 10, 'y': 11, 'z': 13}
Required keyword arguments¶
In [20]:
def f1(a, b, *, c, d):
    """c and d MUST be given as keyword arguments."""
    return a, b, c, d
In [21]:
f1(1, 2, c=3, d=4)
Out[21]:
(1, 2, 3, 4)
In [22]:
try:
    f1(1, 2, 3, 4)
except Exception as e:
    print(e)
f1() takes 2 positional arguments but 4 were given
In [23]:
def f2(a, b, *args, c, d, **kwargs):
    """Combining required and optional arguments."""
    return a, b, c, d, args, kwargs
In [24]:
a, b, c, d, args, kwargs = f2(1, 2, 3, 4, c=5, d=6, e=7, f=8)
In [25]:
a, b
Out[25]:
(1, 2)
In [26]:
c, d
Out[26]:
(5, 6)
In [27]:
args
Out[27]:
(3, 4)
In [28]:
kwargs
Out[28]:
{'e': 7, 'f': 8}
In [29]:
try:
    f2(1, 2, 3, 4, 5, 6, e=7, f=8)
except Exception as e:
    print(e)
f2() missing 2 required keyword-only arguments: 'c' and 'd'
All arguments are keyword only¶
In [30]:
def f3(*, a, b, c):
    """a, b and c must all be given as keyword arguments."""
    return a, b, c
In [31]:
f3(c=1, b=2, a=4)
Out[31]:
(4, 2, 1)
In [32]:
try:
    f3(4, 2, 1)
except Exception as e:
    print(e)
f3() takes 0 positional arguments but 3 were given
Expanding function arguments¶
In [33]:
def f4(a, b, c, d):
    return a, b, c, d
In [34]:
a = 1
bc = [2, 3]
d = 4
In [35]:
f4(a, *bc, d)
Out[35]:
(1, 2, 3, 4)
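The same idea extends to keyword arguments. As a minimal sketch (the names cd and ab are introduced here only for illustration), a dictionary whose keys match the parameter names can be expanded with **, and * and ** can be combined in a single call:

```python
def f4(a, b, c, d):
    return a, b, c, d

# A dictionary whose keys match parameter names is expanded with **
cd = {'c': 3, 'd': 4}
result = f4(1, 2, **cd)

# * (positional) and ** (keyword) expansion can be mixed in one call
ab = (1, 2)
result2 = f4(*ab, **cd)

print(result, result2)
```

Both calls are equivalent to f4(1, 2, c=3, d=4).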
Function annotations¶
You can indicate the types of the arguments and the return value using function annotations. Python itself does not do anything with these except store them in a dictionary under the __annotations__ attribute, but third-party packages may use them if present. You will not be expected to use function annotations in this course.
In [36]:
def power2(x: float, n: dict(type=float, help='exponent') = 2) -> float:
    """Returns x to the nth power.
    n has a default value of 2."""
    return x**n
In [37]:
power2.__annotations__
Out[37]:
{'n': {'help': 'exponent', 'type': float}, 'return': float, 'x': float}
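To make the point that annotations are only metadata more concrete, here is a rough sketch of how a package might read them. The helper check_types is hypothetical and written only for this illustration, and the simplified power3 uses plain types as annotations so that isinstance can be applied to them:

```python
def power3(x: float, n: float = 2) -> float:
    """Returns x to the nth power (simplified annotations)."""
    return x**n

def check_types(func, **kwargs):
    """Hypothetical helper: return the names of the keyword arguments
    whose values do not match the annotated type of func."""
    hints = func.__annotations__
    return [name for name, value in kwargs.items()
            if name in hints and not isinstance(value, hints[name])]

# Python does not enforce annotations: ints are accepted without complaint
unchecked = power3(3, 2)

# A tool that reads __annotations__ can report mismatches
ok = check_types(power3, x=3.0, n=2.0)
bad = check_types(power3, x=3.0, n='two')
print(unchecked, ok, bad)
```

Here bad lists only 'n', since the string 'two' is not a float, while the call power3(3, 2) runs normally even though neither argument is a float.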
Lambda functions¶
Also known as anonymous functions. These are often used to construct short, single-use functions to pass to higher order functions such as map, filter and reduce.
In [38]:
power2 = lambda x, n=2: x**n
In [39]:
power2(3)
Out[39]:
9
In [40]:
power2(3, 4)
Out[40]:
81
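A typical single-use case is the key argument of sorted, seen earlier with len. The sketch below reuses the fruit list from the start of the notebook; the variable names by_last and by_len_then_name are introduced only for this example:

```python
xs = ['apple', 'pear', 'grape', 'orange', 'rambutan', 'durian', 'longan', 'mango']

# Sort by the last letter of each word; ties keep their original order
by_last = sorted(xs, key=lambda s: s[-1])

# Sort by length first, then alphabetically within each length
by_len_then_name = sorted(xs, key=lambda s: (len(s), s))

print(by_last)
print(by_len_then_name)
```

Neither lambda is needed anywhere else, so there is little reason to give it a name with def.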
Recursive functions¶
A recursive function is one that calls itself. Python is not optimized for such functions: each call adds a stack frame, and recursion that goes too deep fails with a RecursionError. Any recursive algorithm can be rewritten in a non-recursive form, which is usually preferable in Python, although the rewrite may not be obvious. Unlike in functional languages with tail call optimization, recursive functions are rarely used in Python because they are usually slower and consume more memory than the equivalent non-recursive version.
Recursive functions consist of
- a base case which terminates the computation
- a recursive call that MUST eventually end up in the base case
In [41]:
def factorial_1(n):
    """Recursive factorial function."""
    if n == 0:
        return 1
    else:
        return n * factorial_1(n-1)
In [42]:
factorial_1(50)
Out[42]:
30414093201713378043612608166064768844377641568960512000000000000
In [43]:
from functools import reduce
In [44]:
def factorial_2(n):
    """Non-recursive version."""
    # The initial value 1 makes factorial_2(0) return 1 instead of raising
    return reduce(lambda a, b: a*b, range(1, n+1), 1)
In [45]:
factorial_2(50)
Out[45]:
30414093201713378043612608166064768844377641568960512000000000000
The non-recursive version is usually more time and memory efficient¶
In [46]:
%timeit factorial_1(50)
14 µs ± 89.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [47]:
%timeit factorial_2(50)
10.6 µs ± 71.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
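The depth limit mentioned earlier can be observed directly. The following is a small sketch, assuming the interpreter's recursion limit has not been raised to an extreme value; factorial_3 is a hypothetical loop-based variant added only for comparison:

```python
import sys

def factorial_1(n):
    """Recursive factorial function."""
    if n == 0:
        return 1
    return n * factorial_1(n - 1)

def factorial_3(n):
    """Loop-based factorial (hypothetical name, for comparison only)."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result

limit = sys.getrecursionlimit()

# The recursive version needs more frames than the interpreter allows
try:
    factorial_1(limit + 100)
    hit_limit = False
except RecursionError:
    hit_limit = True

# The loop-based version uses a single frame and is unaffected
big = factorial_3(limit + 100)
print(hit_limit)
```

The loop-based version handles the same input without difficulty because it never nests calls.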
Higher order functions¶
A higher order function is a function that takes another function as an argument or returns a function. This is possible because functions are “first class” in Python: they can be treated like any other value, including being passed as arguments and returned from other functions.
The classical higher order functions are map, filter and reduce. Each is shown below alongside a more Pythonic equivalent that uses built-ins or comprehensions. However, map, filter and reduce are still important because parallel and distributed code is often most naturally expressed using these functional operators.
Map¶
In [48]:
list(map(lambda x: x**3, range(5)))
Out[48]:
[0, 1, 8, 27, 64]
In [49]:
[x**3 for x in range(5)]
Out[49]:
[0, 1, 8, 27, 64]
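One reason map carries over well to parallel code is that each call is independent of the others. As a hedged illustration, the standard library's concurrent.futures executors expose a map method with the same shape. The sketch below uses threads purely to show the interface; for CPU-bound pure Python code like this, threads are not expected to give a speed-up, and cube is a name chosen for this example:

```python
from concurrent.futures import ThreadPoolExecutor

def cube(x):
    return x**3

# pool.map applies cube to each item, possibly concurrently,
# and yields the results in the order of the inputs
with ThreadPoolExecutor(max_workers=4) as pool:
    cubes = list(pool.map(cube, range(5)))

print(cubes)
```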
Filter¶
In [50]:
list(filter(lambda x: x % 2 == 0,
            range(5)))
Out[50]:
[0, 2, 4]
In [51]:
[x for x in range(5) if x % 2 == 0]
Out[51]:
[0, 2, 4]
Reduce¶
In [52]:
from functools import reduce
In [53]:
reduce(lambda x, y: x+y,
       map(lambda x: x**3, range(5)))
Out[53]:
100
In [54]:
sum([x**3 for x in range(5)])
Out[54]:
100
Using the operator module¶
The operator module provides function equivalents of Python’s operators, which are convenient to pass to higher order functions.
In [55]:
import operator as op
In [56]:
reduce(op.add, map(lambda x: x**3, range(5)))
Out[56]:
100
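Two further uses of the operator module, sketched with made-up data (the pairs list is an assumption for this example): op.mul as the reducing function, and op.itemgetter to build a key function for sorted.

```python
from functools import reduce
import operator as op

# op.mul replaces lambda a, b: a*b
product = reduce(op.mul, range(1, 6))

# op.itemgetter(1) replaces lambda pair: pair[1]
pairs = [('pear', 3), ('apple', 1), ('mango', 2)]
by_count = sorted(pairs, key=op.itemgetter(1))

print(product, by_count)
```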
Custom function taking function arguments¶
In [57]:
def f(a, b, g, h):
    """Function taking functions g and h as arguments."""
    return g(a) + h(b)
In [58]:
f(2, 2, lambda x: x**2, lambda x: x**3)
Out[58]:
12
In [59]:
f('abc', -2, len, abs)
Out[59]:
5
Functions returning functions¶
The partial function takes a function as an argument and returns another function in which some of the arguments have already been filled in.
In [60]:
from functools import partial
In [61]:
def f1(a, b, c):
    """A function with 3 arguments."""
    return a, b, c
In [62]:
f1(1, 2, 3)
Out[62]:
(1, 2, 3)
f2 takes a single argument since b and c have been given values by partial¶
In [63]:
f2 = partial(f1, b=12, c=13)
In [64]:
f2(11)
Out[64]:
(11, 12, 13)
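To see what partial is doing, it can help to inspect the object it returns and compare it with a hand-written equivalent. This is only a sketch; f2_lambda is a name introduced here for comparison:

```python
from functools import partial

def f1(a, b, c):
    """A function with 3 arguments."""
    return a, b, c

f2 = partial(f1, b=12, c=13)

# The partial object remembers the original function and the frozen arguments
same_func = f2.func is f1
frozen = f2.keywords

# A keyword given at call time overrides the frozen value
overridden = f2(11, c=99)

# A roughly equivalent function written by hand
f2_lambda = lambda a: f1(a, b=12, c=13)

print(same_func, frozen, overridden, f2_lambda(11))
```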
Custom function returning a function¶
In [65]:
def timed(f):
    """Wraps the function f so that it also returns the time taken in seconds."""
    import time
    def func(*args, **kwargs):
        start = time.time()
        result = f(*args, **kwargs)
        elapsed = time.time() - start
        return elapsed, result
    return func
In [66]:
def my_sum(xs):
    s = 0
    for x in xs:
        s += x
    return s
In [67]:
my_sum(range(10000000))
Out[67]:
49999995000000
In [68]:
my_sum2 = timed(my_sum)
my_sum2(range(10000000))
Out[68]:
(0.8492159843444824, 49999995000000)
Decorators¶
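Python provides syntactic sugar for applying a decorating function such as timed above: placing @timed before a function definition is equivalent to defining the function and then rebinding its name to the result of timed applied to it.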
In [69]:
@timed
def my_sum3(xs):
    s = 0
    for x in xs:
        s += x
    return s
In [70]:
my_sum3(range(10000000))
Out[70]:
(0.9857957363128662, 49999995000000)
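One side effect of decorating is that the name and docstring seen by help now belong to the inner wrapper rather than the original function. A common remedy, sketched here as a variation on timed rather than as the notebook's own version, is functools.wraps from the standard library:

```python
import time
from functools import wraps

def timed(f):
    """Variant of timed that preserves the metadata of f."""
    @wraps(f)  # copies __name__, __doc__ and related attributes from f
    def func(*args, **kwargs):
        start = time.time()
        result = f(*args, **kwargs)
        elapsed = time.time() - start
        return elapsed, result
    return func

@timed
def my_sum3(xs):
    """Sum the items of xs with an explicit loop."""
    s = 0
    for x in xs:
        s += x
    return s

elapsed, total = my_sum3(range(1000))
print(my_sum3.__name__, total)
```

Without the @wraps(f) line, my_sum3.__name__ would be 'func' and the docstring would be lost.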
Giving decorators their own arguments¶
You will come across packages that provide decorators which can take arguments. This is one way that they can be implemented. See if you can follow how it works!
In [71]:
def timed2(fudge_factor=0.0):
    """Decorator with decorator arguments."""
    def timed(f):
        import time
        def func(*args, **kwargs):
            start = time.time()
            result = f(*args, **kwargs)
            elapsed = fudge_factor + time.time() - start
            return elapsed, result
        return func
    return timed
In [72]:
@timed2(fudge_factor=10)
def my_sum4(xs):
    s = 0
    for x in xs:
        s += x
    return s
In [73]:
my_sum4(range(10000000))
Out[73]:
(11.009344100952148, 49999995000000)
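One way to follow how this works is to apply the decorator by hand in two steps. The sketch below repeats timed2 so that it is self-contained; the names decorator and my_sum5 are introduced only for this illustration:

```python
import time

def timed2(fudge_factor=0.0):
    """Decorator with decorator arguments."""
    def timed(f):
        def func(*args, **kwargs):
            start = time.time()
            result = f(*args, **kwargs)
            elapsed = fudge_factor + time.time() - start
            return elapsed, result
        return func
    return timed

def my_sum(xs):
    s = 0
    for x in xs:
        s += x
    return s

# Step 1: calling timed2 returns the actual decorator (the inner timed)
decorator = timed2(fudge_factor=10)

# Step 2: the decorator wraps the function, as @decorator would
my_sum5 = decorator(my_sum)

elapsed, total = my_sum5(range(1000))
print(total)
```

Writing @timed2(fudge_factor=10) above a definition performs both steps at once: the expression after @ is evaluated first, and its result is then applied to the function being defined.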