{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Foreign Function Interface\n", "====" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "plt.style.use('ggplot')" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "Wrapping functions written in C\n", "----\n", "\n", "### Steps \n", "\n", "- Write the C header and implemntation files\n", "- Write the Cython `.pxd` file to decalre C function signatures\n", "- Write the Cython `.pyx` file to wrap the C functions for Python\n", "- Write `setup.py` to automate buiding of the Python extension module\n", "- Run `python setup.py build_ext --inplace` to build the module\n", "- Import module in Python like any other Python module" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### C header file" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting c_math.h\n" ] } ], "source": [ "%%file c_math.h\n", "\n", "#pragma once\n", "double plus(double a, double b);\n", "double mult(double a, double b);\n", "double square(double a);\n", "double acc(double *xs, int size);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### C implementation file" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting c_math.c\n" ] } ], "source": [ "%%file c_math.c\n", "#include \n", "#include \"c_math.h\"\n", "\n", "double plus(double a, double b) {\n", " return a + b;\n", "};\n", "\n", "double mult(double a, double b) {\n", " return a * b;\n", "};\n", "\n", "double square(double a) {\n", " return pow(a, 2);\n", "};\n", "\n", "double acc(double *xs, int size) {\n", " double s = 0;\n", " for (int i=0; i.pxd` in the regular Cython `.pyx` files to get access to functions decalred in the `.pxd` files." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting cy_math.pxd\n" ] } ], "source": [ "%%file cy_math.pxd\n", "\n", "cdef extern from \"c_math.h\":\n", " double plus(double a, double b)\n", " double mult(double a, double b)\n", " double square(double a)\n", " double acc(double *xs, int size)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Cython \"implementation\" file\n", "\n", "Here is whhere we actaully wrap the C code for use in Python. Note especially how we handle passing in of arrays to a C funciton expecting a pointer to double using `typed memoryviews`." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting cy_math.pyx\n" ] } ], "source": [ "%%file cy_math.pyx\n", "\n", "cimport cy_math\n", "\n", "def py_plus(double a, double b):\n", " return cy_math.plus(a, b)\n", "\n", "def py_mult(double a, double b):\n", " return cy_math.mult(a, b)\n", "\n", "def py_square(double a):\n", " return cy_math.square(a)\n", "\n", "def py_sum(double[::1] xs):\n", " cdef int size = len(xs)\n", " return cy_math.acc(&xs[0], size)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Build script `setup.py`\n", "\n", "This is build script for Python, similar to a Makefile" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting setup.py\n" ] } ], "source": [ "%%file setup.py\n", "\n", "from distutils.core import setup, Extension\n", "from Cython.Build import cythonize\n", "import numpy as np\n", "\n", "ext = Extension(\"cy_math\",\n", " sources=[\"cy_math.pyx\", \"c_math.c\"],\n", " libraries=[\"m\"],\n", " extra_compile_args=[\"-w\", \"-std=c99\"])\n", "\n", "setup(name = \"Math Funcs\",\n", " ext_modules = cythonize(ext))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Building an extension module" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Compiling cy_math.pyx because it changed.\n", "[1/1] Cythonizing cy_math.pyx\n", "running clean\n", "removing 'build/temp.macosx-10.5-x86_64-2.7' (and everything under it)\n" ] } ], "source": [ "! python setup.py clean\n", "! python setup.py -q build_ext --inplace" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "cy_math.c cy_math.pyx\r\n", "\u001b[31mcy_math.cpython-35m-darwin.so\u001b[m\u001b[m \u001b[31mcy_math.so\u001b[m\u001b[m\r\n", "cy_math.pxd\r\n" ] } ], "source": [ "! ls cy_math*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using the extension module in Python" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "7.0\n", "12.0\n", "9.0\n", "45.0\n" ] } ], "source": [ "import cy_math\n", "import numpy as np\n", "\n", "print(cy_math.py_plus(3, 4))\n", "print(cy_math.py_mult(3, 4))\n", "print(cy_math.py_square(3))\n", "\n", "xs = np.arange(10, dtype='float')\n", "print(cy_math.py_sum(xs))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Confirm that we are geting C speedups by comparing with pure Python accumulator" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def acc(xs):\n", " s = 0\n", " for x in xs:\n", " s += x\n", " return s" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3 loops, best of 3: 204 ms per loop\n", "3 loops, best of 3: 1.98 ms per loop\n" ] } ], "source": [ "import cy_math\n", "\n", "xs = np.arange(1000000, dtype='float')\n", "%timeit -r3 -n3 acc(xs)\n", "%timeit -r3 -n3 cy_math.py_sum(xs)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "C++\n", "----\n", "\n", "This is almost similar to C. We will use Cython to wrap a simple funciton. " ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting add.hpp\n" ] } ], "source": [ "%%file add.hpp\n", "#pragma once\n", "int add(int a, int b);" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting add.cpp\n" ] } ], "source": [ "%%file add.cpp\n", "int add(int a, int b) {\n", " return a+b;\n", "}" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting plus.pyx\n" ] } ], "source": [ "%%file plus.pyx\n", "\n", "cdef extern from 'add.cpp':\n", " int add(int a, int b)\n", " \n", "def plus(a, b):\n", " return add(a, b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Note that essentially the only differnece from C is `language=\"C++\"` and the floag `-std=c++11`" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting setup.py\n" ] } ], "source": [ "%%file setup.py \n", "from distutils.core import setup, Extension\n", "from Cython.Build import cythonize\n", "\n", "ext = Extension(\"plus\",\n", " sources=[\"plus.pyx\", \"add.cpp\"],\n", " extra_compile_args=[\"-w\", \"-std=c++11\"])\n", "\n", "setup(\n", " ext_modules = cythonize(\n", " ext,\n", " language=\"c++\", \n", " ))" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Compiling plus.pyx because it changed.\n", "[1/1] Cythonizing plus.pyx\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "error: invalid argument '-std=c++11' not allowed with 'C/ObjC'\n", "error: command 'gcc' failed with exit status 1\n" ] } ], "source": [ "%%bash\n", "python setup.py -q build_ext --inplace" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "ename": "ImportError", "evalue": "No module named 'plus'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mImportError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;32mimport\u001b[0m \u001b[0mplus\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0mplus\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mplus\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m4\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mImportError\u001b[0m: No module named 'plus'" ] } ], "source": [ "import plus\n", "\n", "plus.plus(3, 4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Wrap an R function from libRmath using `ctypes`\n", "----\n", "\n", "R comes with a standalone C library of special functions and distributions, as described in the official documentation. These functions can be wrapped for use in Python." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Building the Rmath standalone library\n", "\n", "```bash\n", "git clone https://github.com/JuliaLang/Rmath-julia.git\n", "cd Rmath-julia/src\n", "make\n", "cd ../..\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Functions to wrap" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "! grep \"\\s.norm(\" Rmath-julia/include/Rmath.h" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from ctypes import CDLL, c_int, c_double" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%%bash\n", "ls Rmath-julia/src/*so" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "lib = CDLL('Rmath-julia/src/libRmath-julia.so')\n", "\n", "def rnorm(mu=0, sigma=1):\n", " lib.rnorm.argtypes = [c_double, c_double]\n", " lib.rnorm.restype = c_double\n", " return lib.rnorm(mu, sigma)\n", "\n", "def dnorm(x, mean=0, sd=1, log=0):\n", " lib.dnorm4.argtypes = [c_double, c_double, c_double, c_int]\n", " lib.dnorm4.restype = c_double\n", " return lib.dnorm4(x, mean, sd, log)\n", "\n", "def pnorm(q, mu=0, sd=1, lower_tail=1, log_p=0):\n", " lib.pnorm5.argtypes = [c_double, c_double, c_double, c_int, c_int]\n", " lib.pnorm5.restype = c_double\n", " return lib.pnorm5(q, mu, sd, lower_tail, log_p)\n", "\n", "def qnorm(p, mu=0, sd=1, lower_tail=1, log_p=0):\n", " lib.qnorm5.argtypes = [c_double, c_double, c_double, c_int, c_int]\n", " lib.qnorm5.restype = c_double\n", " return lib.qnorm5(p, mu, sd, lower_tail, log_p)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "pnorm(0, mu=2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "qnorm(0.022750131948179212, mu=2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "plt.hist([rnorm() for i in range(100)])\n", "pass" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "xs = np.linspace(-3,3,100)\n", "plt.plot(xs, list(map(dnorm, xs)))\n", "pass" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using Cython to wrap standalone library" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%%file rmath.pxd\n", "\n", "cdef extern from \"Rmath-julia/include/Rmath.h\":\n", " double dnorm(double, double, double, int)\n", " double pnorm(double, double, double, int, int)\n", " double qnorm(double, double, double, int, int)\n", " double rnorm(double, double)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%%file rmath.pyx\n", "\n", "cimport rmath\n", "\n", "def rnorm_(mu=0, sigma=1):\n", " return rmath.rnorm(mu, sigma)\n", "\n", "def dnorm_(x, mean=0, sd=1, log=0):\n", " return rmath.dnorm(x, mean, sd, log)\n", "\n", "def pnorm_(q, mu=0, sd=1, lower_tail=1, log_p=0):\n", " return rmath.pnorm(q, mu, sd, lower_tail, log_p)\n", "\n", "def qnorm_(p, mu=0, sd=1, lower_tail=1, log_p=0):\n", " return rmath.qnorm(p, mu, sd, lower_tail, log_p)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%%file setup.py \n", "from distutils.core import setup, Extension\n", "from Cython.Build import cythonize\n", "\n", "ext = Extension(\"rmath\",\n", " sources=[\"rmath.pyx\"],\n", " include_dirs=[\"Rmath-julia/include\"],\n", " library_dirs=[\"Rmath-julia/src\"],\n", " libraries=[\"Rmath-julia\"],\n", " runtime_library_dirs=[\"Rmath-julia/src\"],\n", " extra_compile_args=[\"-w\", \"-std=c99\", \"-DMATHLIB_STANDALONE\"],\n", " extra_link_args=[],\n", " )\n", "\n", "setup(\n", " ext_modules = cythonize(\n", " ext\n", " ))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "! python setup.py build_ext --inplace" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import rmath\n", "\n", "plt.hist([rmath.rnorm_() for i in range(100)])\n", "pass" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "xs = np.linspace(-3,3,100)\n", "plt.plot(xs, list(map(rmath.dnorm_, xs)))\n", "pass" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### `Cython` wrappers are faster than `ctypes`" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%timeit pnorm(0, mu=2)\n", "%timeit rmath.pnorm_(0, mu=2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Fortran\n", "----" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "! pip install fortran-magic" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%load_ext fortranmagic" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%%fortran\n", "\n", "subroutine fort_sum(N, s)\n", " integer*8, intent(in) :: N\n", " integer*8, intent(out) :: s\n", " integer*8 i\n", " s = 0\n", " do i = 1, N\n", " s = s + i*i\n", " end do\n", "end " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "fort_sum(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Antoher example from the [documentation](http://nbviewer.ipython.org/github/mgaitan/fortran_magic/blob/master/documentation.ipynb)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%%fortran --link lapack\n", "\n", "subroutine solve(A, b, x, n)\n", " ! solve the matrix equation A*x=b using LAPACK\n", " implicit none\n", "\n", " real*8, dimension(n,n), intent(in) :: A\n", " real*8, dimension(n), intent(in) :: b\n", " real*8, dimension(n), intent(out) :: x\n", "\n", " integer :: pivot(n), ok\n", "\n", " integer, intent(in) :: n\n", " x = b\n", "\n", " ! find the solution using the LAPACK routine SGESV\n", " call DGESV(n, 1, A, n, pivot, x, n, ok)\n", " \n", "end subroutine" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "A = np.array([[1, 2.5], [-3, 4]])\n", "b = np.array([1, 2.5])\n", "\n", "solve(A, b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Interfacing with R\n", "----" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%load_ext rpy2.ipython" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%%R\n", "library(ggplot2)\n", "suppressPackageStartupMessages(\n", " ggplot(mtcars, aes(x=wt, y=mpg)) + geom_point() + geom_smooth(method=loess)\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Converting between Python and R" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%R -o mtcars" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### `mtcars` is now a Python dataframe" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "mtcars.head(n=3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### We can also pass data from Ptyhon to R" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "x = np.linspace(0, 2*np.pi, 100)\n", "y = np.sin(x)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%%R -i x,y\n", "plot(x, y, main=\"Sine curve in R base graphics\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.1" } }, "nbformat": 4, "nbformat_minor": 0 }