Native Code Compilation¶

Exercise 1 (100 points)

The code given below performs a stochastic gradient descent to fit a quadratic polynomila to \(n\) data points. Maake the code run faster by:

Using numba JIT (20 points)
Using Cython (30 poits)
Rewrite the sgd function in C and wrap for use in Python using Cython (50 points)

Replace the code stubs below with your otpimized code. Reprot the ratio original_time/optimized_time for each of the four parts.

In [4]:

def sgd(b, x, y, max_iter, alpha):
    n = x.shape[0]
    for i in range(max_iter):
        for j in range(n):
            b[0] -= alpha * (2*(b[0] + b[1]*x[j] + b[2]*x[j]**2 - y[j]))
            b[1] -= alpha * (2*x[j] * (b[0] + b[1]*x[j] + b[2]*x[j]**2 - y[j]))
            b[2] -= alpha * (2*x[j]**2 * (b[0] + b[1]*x[j] + b[2]*x[j]**2 - y[j]))
    return b

In [5]:

n = 10000
x = np.linspace(0, 10, n)
y = 2*x**2 + 6*x + 3 + np.random.normal(0, 5, n)
k = 100
alpha = 0.0001
b0 = np.random.random(3)

In [6]:

%%time
np.random.seed(123)
b = sgd(b0, x, y, k, alpha)
print(b)

[ 7.90234588 -0.01733983  2.5184669 ]
CPU times: user 8.12 s, sys: 9.8 ms, total: 8.13 s
Wall time: 8.13 s

In [7]:

yhat = b[0] + b[1]*x+ b[2]*x**2
idx = sorted(np.random.choice(n, 100))
plt.scatter(x[idx], y[idx])
plt.plot(x[idx], yhat[idx], c='red')
pass

homework/../_build/doctrees/nbsphinx/homework_Homework09_5_0.png

Using numba JIT¶

In [105]:

def sgd_numba(b, x, y, max_iter, alpha):
    return np.zeros(3)

In [106]:

%%time
b = sgd_numba(b0, x, y, k, alpha)
print(b)

[ 0.  0.  0.]
CPU times: user 652 µs, sys: 119 µs, total: 771 µs
Wall time: 729 µs

Speed-up ratio¶

In [ ]:

Using Cython¶

In [107]:

def sgd_cython(b, x, y, max_iter, alpha):
    return np.zeros(3)

In [108]:

%%time
np.random.seed(123)
b = sgd_cython(b0, x, y, k, alpha)
print(b)

[ 0.  0.  0.]
CPU times: user 660 µs, sys: 89 µs, total: 749 µs
Wall time: 697 µs

Speed-up ratio¶

In [ ]:

Using C and wrapping with Cython¶

In [109]:

def sgd_wrap(b, x, y, max_iter, alpha):
    return np.zeros(3)

In [110]:

%%time
np.random.seed(123)
b = sgd_wrap(b0, x, y, k, alpha)
print(b)

[ 0.  0.  0.]
CPU times: user 662 µs, sys: 80 µs, total: 742 µs
Wall time: 689 µs

Speed-up ratio¶

In [ ]:

Hints for Part 3¶

The mechanics for how you can wrap the C sgd function are illustrated below.

In [9]:

%%file f.h

#pragma once
void foo(double *x, double *res, int n);

Overwriting f.h

In [14]:

%%file f.c
#include "f.h"

void foo(double *x, double *res, int n) {
    for (int i=0; i<n; i++) {
        res[i] = x[i]*x[i];
    }
}

Overwriting f.c

In [26]:

%%file f_wrap.pyx

cdef extern from 'f.h':
    void foo(double *x, double *res, int n)

def foo_wrap(double[:] x, double[:] res):
    foo(&x[0], &res[0], len(x))
    return res

Overwriting f_wrap.pyx

In [2]:

%%file setup.py
from distutils.core import setup, Extension
from Cython.Build import cythonize

ext = Extension("f_wrap",
                sources=["f_wrap.pyx", "f.c"],
                libraries=[],
                extra_compile_args=["-w", "-std=c99"])

setup(
    ext_modules = cythonize(
            ext
    ))

Overwriting setup.py

In [3]:

! python setup.py -q build_ext --inplace

In [4]:

from f_wrap import foo_wrap

In [7]:

x = np.arange(10).astype('float')
res = np.zeros_like(x)
np.array(foo_wrap(x, res))

Out[7]:

array([  0.,   1.,   4.,   9.,  16.,  25.,  36.,  49.,  64.,  81.])

In [ ]:

Native Code Compilation¶

Using numba JIT¶

Speed-up ratio¶

Using Cython¶

Speed-up ratio¶

Using C and wrapping with Cython¶

Speed-up ratio¶

Hints for Part 3¶

Page contents

This Page