Native Code Compilation¶
Exercise 1 (100 points)
The code given below performs a stochastic gradient descent to fit a quadratic polynomila to \(n\) data points. Maake the code run faster by:
- Using
numba
JIT (20 points) - Using
Cython
(30 poits) - Rewrite the
sgd
function in C and wrap for use in Python using Cython (50 points)
Replace the code stubs below with your otpimized code. Reprot the ratio original_time/optimized_time for each of the four parts.
In [4]:
def sgd(b, x, y, max_iter, alpha):
n = x.shape[0]
for i in range(max_iter):
for j in range(n):
b[0] -= alpha * (2*(b[0] + b[1]*x[j] + b[2]*x[j]**2 - y[j]))
b[1] -= alpha * (2*x[j] * (b[0] + b[1]*x[j] + b[2]*x[j]**2 - y[j]))
b[2] -= alpha * (2*x[j]**2 * (b[0] + b[1]*x[j] + b[2]*x[j]**2 - y[j]))
return b
In [5]:
n = 10000
x = np.linspace(0, 10, n)
y = 2*x**2 + 6*x + 3 + np.random.normal(0, 5, n)
k = 100
alpha = 0.0001
b0 = np.random.random(3)
In [6]:
%%time
np.random.seed(123)
b = sgd(b0, x, y, k, alpha)
print(b)
[ 7.90234588 -0.01733983 2.5184669 ]
CPU times: user 8.12 s, sys: 9.8 ms, total: 8.13 s
Wall time: 8.13 s
In [7]:
yhat = b[0] + b[1]*x+ b[2]*x**2
idx = sorted(np.random.choice(n, 100))
plt.scatter(x[idx], y[idx])
plt.plot(x[idx], yhat[idx], c='red')
pass
![homework/../_build/doctrees/nbsphinx/homework_Homework09_5_0.png](homework/../_build/doctrees/nbsphinx/homework_Homework09_5_0.png)
Using numba JIT¶
In [105]:
def sgd_numba(b, x, y, max_iter, alpha):
return np.zeros(3)
In [106]:
%%time
b = sgd_numba(b0, x, y, k, alpha)
print(b)
[ 0. 0. 0.]
CPU times: user 652 µs, sys: 119 µs, total: 771 µs
Wall time: 729 µs
Speed-up ratio¶
In [ ]:
Using Cython¶
In [107]:
def sgd_cython(b, x, y, max_iter, alpha):
return np.zeros(3)
In [108]:
%%time
np.random.seed(123)
b = sgd_cython(b0, x, y, k, alpha)
print(b)
[ 0. 0. 0.]
CPU times: user 660 µs, sys: 89 µs, total: 749 µs
Wall time: 697 µs
Speed-up ratio¶
In [ ]:
Using C and wrapping with Cython¶
In [109]:
def sgd_wrap(b, x, y, max_iter, alpha):
return np.zeros(3)
In [110]:
%%time
np.random.seed(123)
b = sgd_wrap(b0, x, y, k, alpha)
print(b)
[ 0. 0. 0.]
CPU times: user 662 µs, sys: 80 µs, total: 742 µs
Wall time: 689 µs
Speed-up ratio¶
In [ ]:
Hints for Part 3¶
The mechanics for how you can wrap the C sgd
function are
illustrated below.
In [9]:
%%file f.h
#pragma once
void foo(double *x, double *res, int n);
Overwriting f.h
In [14]:
%%file f.c
#include "f.h"
void foo(double *x, double *res, int n) {
for (int i=0; i<n; i++) {
res[i] = x[i]*x[i];
}
}
Overwriting f.c
In [26]:
%%file f_wrap.pyx
cdef extern from 'f.h':
void foo(double *x, double *res, int n)
def foo_wrap(double[:] x, double[:] res):
foo(&x[0], &res[0], len(x))
return res
Overwriting f_wrap.pyx
In [2]:
%%file setup.py
from distutils.core import setup, Extension
from Cython.Build import cythonize
ext = Extension("f_wrap",
sources=["f_wrap.pyx", "f.c"],
libraries=[],
extra_compile_args=["-w", "-std=c99"])
setup(
ext_modules = cythonize(
ext
))
Overwriting setup.py
In [3]:
! python setup.py -q build_ext --inplace
In [4]:
from f_wrap import foo_wrap
In [7]:
x = np.arange(10).astype('float')
res = np.zeros_like(x)
np.array(foo_wrap(x, res))
Out[7]:
array([ 0., 1., 4., 9., 16., 25., 36., 49., 64., 81.])
In [ ]: