{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Python: Numeric data\n",
    "\n",
    "The foundation for numerical computation in Python is the `numpy` package, and essentially all scientific libraries in Python build on it, e.g. `scipy`, `pandas`, `statsmodels`, `scikit-learn` and `cv2`. The basic data structure in `numpy` is the `ndarray`, and it is essential to become familiar with how to slice and dice this object.\n",
    "\n",
    "NumPy also has the `random` and `linalg` modules, which we will discuss in later lectures."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "Resources\n",
    "----\n",
    "\n",
    "- [Numpy for R users](http://mathesaurus.sourceforge.net/r-numpy.html)\n",
    "- [NumPy: creating and manipulating numerical data](http://www.scipy-lectures.org/intro/numpy/index.html)\n",
    "- [Advanced Numpy](http://www.scipy-lectures.org/advanced/advanced_numpy/index.html)\n",
    "- [100 Numpy Exercises](http://www.labri.fr/perso/nrougier/teaching/numpy.100/)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### NDArray\n",
    "\n",
    "The base structure in `numpy` is the `ndarray`, used to represent vectors, matrices and higher-dimensional arrays. Each `ndarray` has the following attributes:\n",
    "\n",
    "- `dtype`: the element type, which corresponds to a data type in C\n",
    "- `shape`: the dimensions of the array\n",
    "- `strides`: the number of bytes to step in each direction when traversing the array"
   ]
  },
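  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a minimal sketch of how these three attributes relate (the array `a` and the name `expected` are illustrative and not part of the lecture's own examples): for an array in numpy's default row-major layout, the stride of the last axis equals the item size, and the stride of the first axis equals the row length times the item size.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# A small array in row-major (C-contiguous) layout, the numpy default\n",
    "a = np.arange(6, dtype=np.int64).reshape(2, 3)\n",
    "\n",
    "# One step along a row skips one element; one step down a column skips a whole row\n",
    "expected = (a.shape[1] * a.itemsize, a.itemsize)\n",
    "print(a.strides, expected)\n",
    "\n",
    "# Transposing swaps the strides instead of copying the data\n",
    "print(a.T.strides)\n",
    "```\n",
    "\n",
    "With 8-byte integers both tuples in the first `print` should agree, and the transpose should report the same two numbers in reverse order."
   ]
  },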
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1 2 3 4 5 6]\n",
      "dtype int64\n",
      "shape (6,)\n",
      "strides (8,)\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "x = np.array([1,2,3,4,5,6])\n",
    "print(x)\n",
    "print('dtype', x.dtype)\n",
    "print('shape', x.shape)\n",
    "print('strides', x.strides)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[1 2 3]\n",
      " [4 5 6]]\n",
      "dtype int64\n",
      "shape (2, 3)\n",
      "strides (24, 8)\n"
     ]
    }
   ],
   "source": [
    "x.shape = (2,3)\n",
    "print(x)\n",
    "print('dtype', x.dtype)\n",
    "print('shape', x.shape)\n",
    "print('strides', x.strides)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[ 1.+0.j  2.+0.j  3.+0.j]\n",
      " [ 4.+0.j  5.+0.j  6.+0.j]]\n",
      "dtype complex128\n",
      "shape (2, 3)\n",
      "strides (48, 16)\n"
     ]
    }
   ],
   "source": [
    "x = x.astype('complex')\n",
    "print(x)\n",
    "print('dtype', x.dtype)\n",
    "print('shape', x.shape)\n",
    "print('strides', x.strides)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Array creation\n",
    "----"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1, 2, 3])"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.array([1,2,3])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 1.,  2.,  3.])"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.array([1,2,3], np.float64)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0, 1, 2])"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.arange(3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 3. ,  3.5,  4. ,  4.5,  5. ,  5.5])"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.arange(3, 6, 0.5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1, 2, 3],\n",
       "       [4, 5, 6]])"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.array([[1,2,3],[4,5,6]])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 1.,  1.,  1.])"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.ones(3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.,  0.,  0.,  0.],\n",
       "       [ 0.,  0.,  0.,  0.],\n",
       "       [ 0.,  0.,  0.,  0.]])"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.zeros((3,4))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1.,  0.,  0.,  0.],\n",
       "       [ 0.,  1.,  0.,  0.],\n",
       "       [ 0.,  0.,  1.,  0.],\n",
       "       [ 0.,  0.,  0.,  1.]])"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.eye(4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1, 0, 0, 0],\n",
       "       [0, 2, 0, 0],\n",
       "       [0, 0, 3, 0],\n",
       "       [0, 0, 0, 4]])"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.diag([1,2,3,4])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[  0.,   1.,   4.,   9.,  16.],\n",
       "       [  1.,   2.,   5.,  10.,  17.],\n",
       "       [  4.,   5.,   8.,  13.,  20.],\n",
       "       [  9.,  10.,  13.,  18.,  25.]])"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.fromfunction(lambda i, j: i**2+j**2, (4,5))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Array manipulation\n",
    "----"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[  0.,   1.,   4.,   9.,  16.],\n",
       "       [  1.,   2.,   5.,  10.,  17.],\n",
       "       [  4.,   5.,   8.,  13.,  20.],\n",
       "       [  9.,  10.,  13.,  18.,  25.]])"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = np.fromfunction(lambda i, j: i**2+j**2, (4,5))\n",
    "x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(4, 5)"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "20"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x.size"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "dtype('float64')"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x.dtype"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0,  1,  4,  9, 16],\n",
       "       [ 1,  2,  5, 10, 17],\n",
       "       [ 4,  5,  8, 13, 20],\n",
       "       [ 9, 10, 13, 18, 25]])"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x.astype(np.int64)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[  0.,   1.,   4.,   9.],\n",
       "       [  1.,   2.,   5.,  10.],\n",
       "       [  4.,   5.,   8.,  13.],\n",
       "       [  9.,  10.,  13.,  18.],\n",
       "       [ 16.,  17.,  20.,  25.]])"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x.T"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[  0.,   1.,   4.,   9.,  16.,   1.,   2.,   5.,  10.,  17.],\n",
       "       [  4.,   5.,   8.,  13.,  20.,   9.,  10.,  13.,  18.,  25.]])"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x.reshape(2,-1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Array indexing\n",
    "----"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[  0.,   1.,   4.,   9.,  16.],\n",
       "       [  1.,   2.,   5.,  10.,  17.],\n",
       "       [  4.,   5.,   8.,  13.,  20.],\n",
       "       [  9.,  10.,  13.,  18.,  25.]])"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([  0.,   1.,   4.,   9.,  16.])"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([  0.,   1.,   4.,   9.,  16.])"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x[0,:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0.,  1.,  4.,  9.])"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x[:,0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([  9.,  10.,  13.,  18.,  25.])"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x[-1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2.0"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x[1,1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[  1.,   4.],\n",
       "       [  2.,   5.],\n",
       "       [  5.,   8.],\n",
       "       [ 10.,  13.]])"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x[:, 1:3]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Boolean indexing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[False, False,  True,  True,  True],\n",
       "       [False,  True,  True,  True,  True],\n",
       "       [ True,  True,  True,  True,  True],\n",
       "       [ True,  True,  True,  True,  True]], dtype=bool)"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x >= 2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([  4.,   9.,  16.,   5.,  10.,  17.,   4.,   5.,   8.,  13.,  20.,\n",
       "         9.,  10.,  13.,  18.,  25.])"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x[x > 2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Fancy indexing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 1.,  4.])"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x[0, [1,2]]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Calculations and broadcasting\n",
    "----\n",
    "\n",
    "Broadcasting refers to the set of rules that numpy uses to perfrom operations on arrays with different shapes. See official [documentation](http://docs.scipy.org/doc/numpy-1.10.1/user/basics.broadcasting.html) for a clear explanation of the rules. Array shapes can be manipulated using the `reshape` method or by inserting a new axis with `np.newaxis`. Note that `np.newaxis` is an alias for `None`, which I sometimes use in my examples."
   ]
  },
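  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A small sketch of the idea, using illustrative names (`col`, `row`, `table`) that do not appear elsewhere in these notes: a column of shape (3, 1) and a row of shape (1, 4) are stretched along their length-1 axes to produce a (3, 4) result.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "col = np.arange(3)[:, np.newaxis]   # shape (3, 1)\n",
    "row = np.arange(4)[np.newaxis, :]   # shape (1, 4)\n",
    "\n",
    "# Axes of length 1 are stretched to match the other operand\n",
    "table = col * 10 + row\n",
    "print(table.shape)\n",
    "print(table)\n",
    "\n",
    "# None can be used in place of np.newaxis\n",
    "same = np.arange(3)[:, None]\n",
    "print(same.shape == col.shape)\n",
    "```\n",
    "\n",
    "Neither input is copied or tiled explicitly; the stretching happens inside the arithmetic operation."
   ]
  },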
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.,  1.,  4.],\n",
       "       [ 1.,  2.,  5.]])"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = np.fromfunction(lambda i, j: i**2+j**2, (2,3))\n",
    "x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[  0.,   5.,  20.],\n",
       "       [  5.,  10.,  25.]])"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x * 5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[  0.,   2.,   8.],\n",
       "       [  2.,   4.,  10.]])"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x + x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 17.,  22.],\n",
       "       [ 22.,  30.]])"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x @ x.T"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[  1.,   2.,   5.],\n",
       "       [  2.,   5.,  14.],\n",
       "       [  5.,  14.,  41.]])"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x.T @ x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.        ,  0.69314718,  1.60943791],\n",
       "       [ 0.69314718,  1.09861229,  1.79175947]])"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.log1p(x)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[   1.        ,    2.71828183,   54.59815003],\n",
       "       [   2.71828183,    7.3890561 ,  148.4131591 ]])"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.exp(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Combining and splitting arrays\n",
    "----"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.,  1.,  4.],\n",
       "       [ 1.,  2.,  5.]])"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.,  1.,  4.],\n",
       "       [ 1.,  2.,  5.],\n",
       "       [ 0.,  1.,  4.],\n",
       "       [ 1.,  2.,  5.]])"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.r_[x, x]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.,  1.,  4.],\n",
       "       [ 1.,  2.,  5.],\n",
       "       [ 0.,  1.,  4.],\n",
       "       [ 1.,  2.,  5.]])"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.vstack([x, x])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.,  1.,  4.],\n",
       "       [ 1.,  2.,  5.],\n",
       "       [ 0.,  1.,  4.],\n",
       "       [ 1.,  2.,  5.]])"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.concatenate([x, x], axis=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.,  1.,  4.,  0.,  1.,  4.],\n",
       "       [ 1.,  2.,  5.,  1.,  2.,  5.]])"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.c_[x,x]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.,  1.,  4.,  0.,  1.,  4.],\n",
       "       [ 1.,  2.,  5.,  1.,  2.,  5.]])"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.hstack([x, x])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.,  1.,  4.,  0.,  1.,  4.],\n",
       "       [ 1.,  2.,  5.,  1.,  2.,  5.]])"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.concatenate([x,x], axis=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.,  1.,  4.],\n",
       "       [ 1.,  2.,  5.],\n",
       "       [ 0.,  1.,  4.],\n",
       "       [ 1.,  2.,  5.]])"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y = np.r_[x, x]\n",
    "y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "a, b, c = np.hsplit(y, 3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.],\n",
       "       [ 1.],\n",
       "       [ 0.],\n",
       "       [ 1.]])"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1.],\n",
       "       [ 2.],\n",
       "       [ 1.],\n",
       "       [ 2.]])"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 4.],\n",
       "       [ 5.],\n",
       "       [ 4.],\n",
       "       [ 5.]])"
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "c"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[array([[ 0.,  1.,  4.],\n",
       "        [ 1.,  2.,  5.],\n",
       "        [ 0.,  1.,  4.]]), array([[ 1.,  2.,  5.]])]"
      ]
     },
     "execution_count": 51,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.vsplit(y, [3])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[array([[ 0.,  1.,  4.],\n",
       "        [ 1.,  2.,  5.],\n",
       "        [ 0.,  1.,  4.]]), array([[ 1.,  2.,  5.]])]"
      ]
     },
     "execution_count": 52,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.split(y, [3], axis=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.,  1.,  4.],\n",
       "       [ 1.,  2.,  5.],\n",
       "       [ 0.,  1.,  4.],\n",
       "       [ 1.,  2.,  5.]])"
      ]
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.hstack(np.hsplit(y, 3))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Reductions\n",
    "----"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.,  1.,  4.],\n",
       "       [ 1.,  2.,  5.],\n",
       "       [ 0.,  1.,  4.],\n",
       "       [ 1.,  2.,  5.]])"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "26.0"
      ]
     },
     "execution_count": 55,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y.sum()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([  2.,   6.,  18.])"
      ]
     },
     "execution_count": 56,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y.sum(0) # column sum"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 5.,  8.,  5.,  8.])"
      ]
     },
     "execution_count": 57,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y.sum(1) # row sum"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Standardize by column mean and standard deviation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "z = (y - y.mean(0))/y.std(0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[-1., -1., -1.],\n",
       "       [ 1.,  1.,  1.],\n",
       "       [-1., -1., -1.],\n",
       "       [ 1.,  1.,  1.]])"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "z"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([ 0.,  0.,  0.]), array([ 1.,  1.,  1.]))"
      ]
     },
     "execution_count": 60,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "z.mean(0), z.std(0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Standardize by row mean and standard deviation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "z = (y - y.mean(1)[:,None])/y.std(1)[:,None]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[-0.98058068, -0.39223227,  1.37281295],\n",
       "       [-0.98058068, -0.39223227,  1.37281295],\n",
       "       [-0.98058068, -0.39223227,  1.37281295],\n",
       "       [-0.98058068, -0.39223227,  1.37281295]])"
      ]
     },
     "execution_count": 62,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "z"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([ -7.40148683e-17,   7.40148683e-17,  -7.40148683e-17,\n",
       "          7.40148683e-17]), array([ 1.,  1.,  1.,  1.]))"
      ]
     },
     "execution_count": 63,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "z.mean(1), z.std(1)"
   ]
  },
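  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `[:, None]` indexing above restores the axis removed by the reduction so that the row statistics broadcast against `y`. As a hedged alternative sketch (the names `z_keep` and `z_none` are illustrative), the `keepdims=True` argument of `mean` and `std` keeps the reduced axis in place with length 1, which should give the same result.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "y = np.array([[0., 1., 4.],\n",
    "              [1., 2., 5.],\n",
    "              [0., 1., 4.],\n",
    "              [1., 2., 5.]])\n",
    "\n",
    "# keepdims=True yields (4, 1) statistics that broadcast against the (4, 3) array\n",
    "z_keep = (y - y.mean(1, keepdims=True)) / y.std(1, keepdims=True)\n",
    "\n",
    "# Equivalent to inserting the axis by hand\n",
    "z_none = (y - y.mean(1)[:, None]) / y.std(1)[:, None]\n",
    "\n",
    "print(np.allclose(z_keep, z_none))\n",
    "```\n",
    "\n",
    "Column standardization needs neither trick, because a length-3 vector of column statistics already lines up with the last axis of `y`."
   ]
  },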
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example: Calculating pairwise distance matrix using broadcasting and vectorization\n",
    "\n",
    "Calculate the pairwise distance matrix between the following points:\n",
    "\n",
    "- (0,0)\n",
    "- (4,0)\n",
    "- (4,3)\n",
    "- (0,3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def distance_matrix_py(pts):\n",
    "    \"\"\"Returns matrix of pairwise Euclidean distances. Version with explicit Python loops.\"\"\"\n",
    "    n = len(pts)\n",
    "    p = len(pts[0])\n",
    "    m = np.zeros((n, n))\n",
    "    for i in range(n):\n",
    "        for j in range(n):\n",
    "            s = 0\n",
    "            for k in range(p):\n",
    "                s += (pts[i,k] - pts[j,k])**2\n",
    "            m[i, j] = s**0.5\n",
    "    return m"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def distance_matrix_np(pts):\n",
    "    \"\"\"Returns matrix of pairwise Euclidean distances. Vectorized numpy version.\"\"\"\n",
    "    return np.sum((pts[None,:] - pts[:, None])**2, -1)**0.5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0, 0],\n",
       "       [4, 0],\n",
       "       [4, 3],\n",
       "       [0, 3]])"
      ]
     },
     "execution_count": 66,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pts = np.array([(0,0), (4,0), (4,3), (0,3)])\n",
    "pts"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(4, 2)"
      ]
     },
     "execution_count": 67,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pts.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.,  4.,  5.,  3.],\n",
       "       [ 4.,  0.,  3.,  5.],\n",
       "       [ 5.,  3.,  0.,  4.],\n",
       "       [ 3.,  5.,  4.,  0.]])"
      ]
     },
     "execution_count": 68,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "n = pts.shape[0]\n",
    "p = pts.shape[1]\n",
    "dist = np.zeros((n, n))\n",
    "for i in range(n):\n",
    "    for j in range(n):\n",
    "        s = 0\n",
    "        for k in range(p):\n",
    "            s += (pts[i, k] - pts[j, k])**2\n",
    "        dist[i, j] = np.sqrt(s)\n",
    "dist"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Using broadcasting"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(1, 4, 2)"
      ]
     },
     "execution_count": 69,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pts[None, :].shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(4, 1, 2)"
      ]
     },
     "execution_count": 70,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pts[:, None].shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[[ 0,  0],\n",
       "        [ 4,  0],\n",
       "        [ 4,  3],\n",
       "        [ 0,  3]],\n",
       "\n",
       "       [[-4,  0],\n",
       "        [ 0,  0],\n",
       "        [ 0,  3],\n",
       "        [-4,  3]],\n",
       "\n",
       "       [[-4, -3],\n",
       "        [ 0, -3],\n",
       "        [ 0,  0],\n",
       "        [-4,  0]],\n",
       "\n",
       "       [[ 0, -3],\n",
       "        [ 4, -3],\n",
       "        [ 4,  0],\n",
       "        [ 0,  0]]])"
      ]
     },
     "execution_count": 71,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "m = pts[None, :] - pts[:, None]\n",
    "m"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[[ 0,  0],\n",
       "        [16,  0],\n",
       "        [16,  9],\n",
       "        [ 0,  9]],\n",
       "\n",
       "       [[16,  0],\n",
       "        [ 0,  0],\n",
       "        [ 0,  9],\n",
       "        [16,  9]],\n",
       "\n",
       "       [[16,  9],\n",
       "        [ 0,  9],\n",
       "        [ 0,  0],\n",
       "        [16,  0]],\n",
       "\n",
       "       [[ 0,  9],\n",
       "        [16,  9],\n",
       "        [16,  0],\n",
       "        [ 0,  0]]])"
      ]
     },
     "execution_count": 72,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "m**2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(4, 4, 2)"
      ]
     },
     "execution_count": 73,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "(m**2).shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We want to end up with a 4 by 4 matrix, so we sum over the axis of length 2. This is `axis=2`, or equivalently `axis=-1`, since it is the first axis counting from the end."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0, 16, 25,  9],\n",
       "       [16,  0,  9, 25],\n",
       "       [25,  9,  0, 16],\n",
       "       [ 9, 25, 16,  0]])"
      ]
     },
     "execution_count": 74,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.sum((pts[None, :] - pts[:, None])**2, -1)"
   ]
  },
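  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The shapes involved in the subtraction can be checked without building the full array. The following is a minimal sketch, assuming `pts` holds the four corners of a 4 by 3 rectangle (an assumption chosen to be consistent with the differences printed above) and a NumPy version that provides `np.broadcast_shapes` (1.20 or later).\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Assumed example data: corners of a 4 by 3 rectangle\n",
    "pts = np.array([[0, 0], [4, 0], [4, 3], [0, 3]])\n",
    "\n",
    "a = pts[None, :]   # shape (1, 4, 2)\n",
    "b = pts[:, None]   # shape (4, 1, 2)\n",
    "\n",
    "# Axes of length 1 are stretched to match the other operand\n",
    "out_shape = np.broadcast_shapes(a.shape, b.shape)\n",
    "print(out_shape)\n",
    "\n",
    "# Summing over the last axis collapses the coordinate dimension\n",
    "sq_dist = np.sum((a - b)**2, axis=-1)\n",
    "print(sq_dist.shape)\n",
    "```\n",
    "\n",
    "Entry `[i, j]` of `sq_dist` is the squared distance between point `i` and point `j`, so the matrix is symmetric with zeros on the diagonal."
   ]
  },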
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Basically, the distance matrix can be calculated in one line of numpy code"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.,  4.,  5.,  3.],\n",
       "       [ 4.,  0.,  3.,  5.],\n",
       "       [ 5.,  3.,  0.,  4.],\n",
       "       [ 3.,  5.,  4.,  0.]])"
      ]
     },
     "execution_count": 75,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.sqrt(np.sum((pts[None, :] - pts[:, None])**2, -1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's put them in functions and compare the time."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 76,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def pdist1(pts):\n",
    "    n = pts.shape[0]\n",
    "    p = pts.shape[1]\n",
    "    dist = np.zeros((n, n))\n",
    "    for i in range(n):\n",
    "        for j in range(n):\n",
    "            s = 0\n",
    "            for k in range(p):\n",
    "                s += (pts[i, k] - pts[j, k])**2\n",
    "            dist[i, j] = s\n",
    "    return np.sqrt(dist)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 77,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def pdist2(pts):\n",
    "    return np.sqrt(np.sum((pts[None, :] - pts[:, None])**2, -1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Check that the outputs are the same"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 78,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 78,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.allclose(pdist1(pts), pdist2(pts))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 79,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "pts = np.random.random((1000, 2))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 80,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1 loops, best of 3: 3.26 s per loop\n"
     ]
    }
   ],
   "source": [
    "%timeit pdist1(pts)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "10 loops, best of 3: 77.3 ms per loop\n"
     ]
    }
   ],
   "source": [
    "%timeit pdist2(pts)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### But don't give up on loops yet"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 82,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from numba import njit"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 83,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "@njit\n",
    "def pdist3(pts):\n",
    "    n = pts.shape[0]\n",
    "    p = pts.shape[1]\n",
    "    dist = np.zeros((n, n))\n",
    "    for i in range(n):\n",
    "        for j in range(n):\n",
    "            s = 0\n",
    "            for k in range(p):\n",
    "                s += (pts[i, k] - pts[j, k])**2\n",
    "            dist[i, j] = s\n",
    "    return np.sqrt(dist)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 84,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The slowest run took 27.17 times longer than the fastest. This could mean that an intermediate result is being cached \n",
      "1 loops, best of 3: 16.1 ms per loop\n"
     ]
    }
   ],
   "source": [
    "%timeit pdist3(pts)"
   ]
  },
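  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### What is going on?\n",
    "\n",
    "This is roughly 5 times faster than the broadcasting version (16.1 ms versus 77.3 ms in the runs above)! We have just performed Just-In-Time (JIT) compilation of the function; the first call includes the compilation time, which is why `%timeit` reports one run as much slower than the rest. JIT compilation will be discussed in a later lecture."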
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example: Constructing leave-one-out arrays\n",
    "\n",
    "Another example of numpy trickery is to construct a leave-one-out matrix of a vector of length k. In the matrix, each row is a vector of length k-1, with a different vector component dropped each time. This can be used for leave-one-out cross-validation (LOOCV) to evaluate the out-of-sample accuracy of a predictive model.\n",
    "\n",
    "For example, suppose you have data points [(1,4), (2,7), (3,11), (4,9), (5,15)] that you want to perform LOOCV on for a simple regression model. For each cross-validation fold, you use one point for testing, and the remaining 4 points for training. In other words, you want the training sets to be:\n",
    "```\n",
    "[(2,7), (3,11), (4,9), (5,15)]\n",
    "[(1,4), (3,11), (4,9), (5,15)]\n",
    "[(1,4), (2,7),  (4,9), (5,15)]\n",
    "[(1,4), (2,7), (3,11), (5,15)]\n",
    "[(1,4), (2,7), (3,11), (4,9)]\n",
    "```\n",
    "Here is one way to create the training sets using numpy tricks."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create a triangular matrix with N rows and N-1 columns, offset from the diagonal by -1."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 85,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "N = 5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 86,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1.,  0.,  0.,  0.,  0.],\n",
       "       [ 1.,  1.,  0.,  0.,  0.],\n",
       "       [ 1.,  1.,  1.,  0.,  0.],\n",
       "       [ 1.,  1.,  1.,  1.,  0.],\n",
       "       [ 1.,  1.,  1.,  1.,  1.]])"
      ]
     },
     "execution_count": 86,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.tri(N)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 87,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1.,  0.,  0.,  0.],\n",
       "       [ 1.,  1.,  0.,  0.],\n",
       "       [ 1.,  1.,  1.,  0.],\n",
       "       [ 1.,  1.,  1.,  1.],\n",
       "       [ 1.,  1.,  1.,  1.]])"
      ]
     },
     "execution_count": 87,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.tri(N, N-1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 88,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.,  0.,  0.,  0.],\n",
       "       [ 1.,  0.,  0.,  0.],\n",
       "       [ 1.,  1.,  0.,  0.],\n",
       "       [ 1.,  1.,  1.,  0.],\n",
       "       [ 1.,  1.,  1.,  1.]])"
      ]
     },
     "execution_count": 88,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.tri(N, N-1, -1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Use broadcasting to create a new index matrix"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 89,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1, 2, 3, 4])"
      ]
     },
     "execution_count": 89,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.arange(1, N)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 90,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1.,  2.,  3.,  4.],\n",
       "       [ 0.,  2.,  3.,  4.],\n",
       "       [ 0.,  1.,  3.,  4.],\n",
       "       [ 0.,  1.,  2.,  4.],\n",
       "       [ 0.,  1.,  2.,  3.]])"
      ]
     },
     "execution_count": 90,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.arange(1, N) - np.tri(N, N-1, -1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 91,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "idx = np.arange(1, N) - np.tri(N, N-1, -1).astype('int')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 92,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1,  4],\n",
       "       [ 2,  7],\n",
       "       [ 3, 11],\n",
       "       [ 4,  9],\n",
       "       [ 5, 15]])"
      ]
     },
     "execution_count": 92,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data = np.array([(1,4), (2,7), (3,11), (4,9), (5,15)])\n",
    "data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 93,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[[ 2,  7],\n",
       "        [ 3, 11],\n",
       "        [ 4,  9],\n",
       "        [ 5, 15]],\n",
       "\n",
       "       [[ 1,  4],\n",
       "        [ 3, 11],\n",
       "        [ 4,  9],\n",
       "        [ 5, 15]],\n",
       "\n",
       "       [[ 1,  4],\n",
       "        [ 2,  7],\n",
       "        [ 4,  9],\n",
       "        [ 5, 15]],\n",
       "\n",
       "       [[ 1,  4],\n",
       "        [ 2,  7],\n",
       "        [ 3, 11],\n",
       "        [ 5, 15]],\n",
       "\n",
       "       [[ 1,  4],\n",
       "        [ 2,  7],\n",
       "        [ 3, 11],\n",
       "        [ 4,  9]]])"
      ]
     },
     "execution_count": 93,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data[idx]"
   ]
  },
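  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The same index matrix can also be built by masking out the diagonal of a grid of column indices. This is only a sketch of an alternative construction, not the method used above; the names `cols`, `keep`, `idx_tri` and `idx_mask` are introduced here for illustration.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "N = 5\n",
    "\n",
    "# Index matrix built with np.tri, as in the cells above\n",
    "idx_tri = np.arange(1, N) - np.tri(N, N-1, -1).astype('int')\n",
    "\n",
    "# Alternative: mask out the diagonal of an N x N grid of column indices\n",
    "cols = np.tile(np.arange(N), (N, 1))    # every row is 0, 1, ..., N-1\n",
    "keep = ~np.eye(N, dtype=bool)           # False on the diagonal\n",
    "idx_mask = cols[keep].reshape(N, N-1)   # row i omits index i\n",
    "\n",
    "print(np.array_equal(idx_tri, idx_mask))\n",
    "```\n",
    "\n",
    "Either index matrix can be used for fancy indexing, e.g. `data[idx_mask]`, to obtain the stacked training sets."
   ]
  },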
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### All but one\n",
    "\n",
    "R uses negative indexing to mean delete the component at that index. Because Python uses negative indexing to mean count from the end, we have to do a little more work to get the same effect. Here are two ways of deleting one item from a vector."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 94,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def f1(a, k):\n",
    "    idx = np.ones(a.shape, dtype=bool)\n",
    "    idx[k] = False\n",
    "    return a[idx]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 95,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def f2(a, k):\n",
    "    return np.r_[a[:k], a[k+1:]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 96,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "a = np.arange(100)\n",
    "k = 50"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 97,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The slowest run took 6.39 times longer than the fastest. This could mean that an intermediate result is being cached \n",
      "100000 loops, best of 3: 12.4 µs per loop\n"
     ]
    }
   ],
   "source": [
    "%timeit f1(a, k)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 98,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "10000 loops, best of 3: 47.6 µs per loop\n"
     ]
    }
   ],
   "source": [
    "%timeit f2(a, k)"
   ]
  },
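  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "NumPy also provides `np.delete`, which returns a copy of the array with the given index removed. The sketch below is a third option alongside `f1` and `f2`; it makes no claim about relative speed, which is best checked with `%timeit` on your own machine.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "a = np.arange(100)\n",
    "k = 50\n",
    "\n",
    "# np.delete returns a new array without the element at index k;\n",
    "# the original array a is left unchanged\n",
    "b = np.delete(a, k)\n",
    "\n",
    "print(b.shape)\n",
    "print(k in b)\n",
    "```\n",
    "\n",
    "Like the two functions above, this produces a copy, so it is not an in-place operation."
   ]
  },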
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "### Universal functions (Ufuncs)\n",
    "\n",
    "Functions that work on both scalars and arrays are known as ufuncs. For arrays, ufuncs apply the function in an element-wise fashion. Use of ufuncs is an essential aspect of vectorization and is typically much more computationally efficient than using an explicit loop over each element."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 99,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXwAAAEACAYAAACwB81wAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xm8zmX+x/HXB5E2mUkykZJoX+hHaTuVRBlpMdkyLUMq\nqmGUaaNNJamkTZOKNPrRJlMTTU5DSpTIFkqKdiYljJxz/f64bjN+Ott97uX6fu/7/Xw8zuMsvuf+\nvh3H577uazXnHCIikvuqhA4gIiLZoYIvIpInVPBFRPKECr6ISJ5QwRcRyRMq+CIieSItBd/MHjez\nr81sfhnXjDCzZWb2gZkdmY77iohIxaWrhf8EcHppf2hm7YD9nXMHAJcCj6TpviIiUkFpKfjOuRnA\nv8q45CxgTOLaWUAtM6ubjnuLiEjFZKsPf2/g820+X534moiIZIkGbUVE8kS1LN1nNdBgm8/rJ772\nC2amzX1ERJLknLPyrklnC98SbyWZBPQAMLNjgO+dc1+X9kDOuVi+DRo0KKXv37zZ8cQTjqZNHc2b\nO4YPd6xcmdxjbNrkeO01R6dOjt13d/Ts6VixIjv5Q78pv/Lna/6KSksL38yeAQqAX5vZZ8AgoLqv\n3W6Uc+4VMzvDzJYDPwEXpeO+ucI5mDABBgyAAw6Ahx6Ck08GK/f5+pdq1IA2bfzbV1/5x2reHC66\nCK6/HmrXTn9+EYmHtBR851zXClzTJx33yjWffAJXXAGrVsHTT8MJJ6TvsffaC265BS67DAYNgqZN\n4d57oWvXyj2ZiEi8adA2jQoKCip8rXPw6KPQogUUFMD776e32G+rXj0YNQpefRVuvx06d4a1a395\nXTL5o0j5w1L+6LNk+n+ywcxc1DKl2/r10Ls3zJsHEyf6lne2bNwI113nu5D+93+hVavs3VtEMsPM\ncFketJUKWL4cWraEatVg1qzsFnuAmjV9t86oUdCxI4wZk937i0g4auFn0Zw58Nvfwo03+n710P3o\nixb5PJ06wZAhUEVP/yKxVNEWvgp+lkyZAt26wWOP+ZZ1VHz3HZx9Nuy/Pzz+OFStGjqRiCRLBT9C\nnn/et+ifew6OPz50ml/66Sf/JLTHHr6LZ4cdQicSkWSo4EfE5MlwySXw97/DUUeFTlO6TZvg3HP9\nPP7x46F69dCJRKSiNGgbAVOnwsUXw8svR7vYA+y4o38lsmWLX6RVXBw6kYikmwp+hrz1ll/g9Pzz\nfq59HNSoAc8+C599Bv37+7UCIpI7VPAzYPly3z0ydmw0++zLUrMmTJoEr78OQ4eGTiMi6ZSt3TLz\nxtq1cOaZMHgwtG0bOk3l1K7txxyOOw722Qe6dAmdSETSQYO2abR5s9+07OijYdiw0GlS9+GHcMop\nvvg3bx46jYiURrN0Aujd2+9Q+dxzuTOf/bnnoF8/ePddqKtDKUUiqaIFX106aTJmDEybBrNn506x\nBz8WMX++f//GG5quKRJnauGnwfz5cOqpvuAfemjoNOlXXOxX4zZq5PfhEZFo0Tz8LFm3zrd+778/\nN4s9+D12nnwSXnjBz+ARkXhSCz8Fzvm59r/6FTz4YOg0mTdzpm/pz5kDDRqUf72IZIda+Fkwbpyf\nyZILM3IqolUr+OMf/TTNLVtCpxGRZKmFX0krVvgVtK+/DkccETpN9hQXQ7t2vvgPGhQ6jYiApmVm\n1JYt/ljCs8/2WxDkm9Wr/d5Af/87NGsWOo2IqEsng+6+2+8788c/hk4Sxt57w/Dh0KMH/PvfodOI\nSEWphZ+kxYvhxBP9wGXDhqHThOOcn53UpAnceWfoNCL5TV06GVBUBCecAN27w+WXh04T3jffwOGH\n++maxx4bOo1I/lKXTgY88IA/Dap379BJomHPPWHECOjZ0+8jJCLRphZ+BX38MbRsCW+/DQccEDpN\ndDgH7dv7WTvXXx86jUh+UpdOGjnntz
o+9VS45prQaaJn5Uq/m+bMmb5PX0SyS106aTRhAnzxRf7O\nyilPw4a+dd+7t07JEokyFfxy/PCD3x744Yd9/72UrG9f/7MaMyZ0EhEpjbp0ytGvH3z/PYweHTpJ\n9M2eDR06wJIlUKtW6DQi+UN9+Gkwb54/wWrhQthjj9Bp4uEPf4Bdd9U2yiLZpIKfIufgpJOgWze4\n9NLQaeLj22/h4IOhsBAOOSR0GpH8oEHbFE2c6Puk//CH0EnipU4dv6la374awBWJGhX8EmzcCAMG\n+ENNcum4wmzp3RvWrPFPmiISHSr4JbjnHjj6aN+lI8mrVg3uuw+uvRY2bQqdRkS2Uh/+dlav9vvD\nzJkD++0XLEZO6NDB7z00YEDoJCK5TYO2lXThhVCvHtxxR7AIOWPJEjj+eP9es5xEMkcFvxLmzYPT\nT4elS2G33YJEyDl9+oCZ33hORDJDBb8S2rWDM87wM0wkPbZO05w+HQ48MHQakdykaZlJev11WLZM\nc+7TrU4d34evnTRFwlMLH38w99FHw5//DJ06ZfXWeWHjRr+l9HPP+S2mRSS91MJPwvjxfmO0884L\nnSQ31azpF2MNHKjFWCIh5X3B//lnuPFGuOsuP7gomXHRRX6L6alTQycRyV95X/BHj4b994eCgtBJ\nclu1anD77b7brLg4dBqR/JTXBX/TJrjtNv8mmXfuuf5V1IQJoZOI5Ke8LvgPP+yP5mvRInSS/GAG\nQ4b4/vyiotBpRPJP3s7S+fFHP3Nk6lQ47LCM304SnIMTT4ReveCCC0KnEckNWnhVjttvh0WLYNy4\njN9KtjNtGvTs6bdcqFYtdBqR+MvqtEwza2tmS8xsqZldW8Kfn2Rm35vZ+4m3G9Jx38pat87v5jho\nUMgU+evkk6FBAxg7NnQSkfyScgvfzKoAS4FTgS+A2UBn59ySba45CejvnOtQgcfLeAv/ttv8fjk6\ncDuc6dOhRw/46COoXj10GpF4y2YLvwWwzDm30jn3MzAeOKukTGm4V8rWrfMHm9wQ9DWGnHCCH0N5\n4onQSUTyRzoK/t7A59t8virxte0da2YfmNnfzOzgNNy3Uh54ANq2hSZNQiWQrW6+2W9D/fPPoZOI\n5IdsDZm9B+zjnNtgZu2AF4FSS+7gwYP/83FBQQEFaVoVtbV1P2NGWh5OUnTssb6VP3YsXHxx6DQi\n8VFYWEhhYWHS35eOPvxjgMHOubaJzwcCzjl3VxnfswJo7pxbW8KfZawP/7bbfJ+xBguj45//9MVe\nM3ZEKi+bffizgcZm1tDMqgOdgUnbham7zcct8E80vyj2mbR+PYwYoW16o+bEE6F+ffjrX0MnEcl9\nKRd851wR0AeYAiwExjvnFpvZpWbWK3HZeWa2wMzmAvcB56d632Q98oifDqhDOKLnppv8qy+tvhXJ\nrLxYeLVxo98g7dVX4Ygj0vrQkgbO+Vk7V1wBXbqETiMSP9oPfxujR/sDTlTso8nMT5MdMkQ7aYpk\nUs4X/M2bYehQ9d1H3emn+0No/va30ElEclfOF/ynn/Zz7nW0XrSZ+b3yhwzRqVgimZLTBb+oyJ9k\ndd11oZNIRZxzDqxdC2++GTqJSG7K6YL/4otQu7ZOs4qLqlXh2mt9K19E0i9nC75zcOed/uBsnVUb\nH927w+LFMGdO6CQiuSdnC/60aX6xVYdy9+eUKKleHf70J7/HjoikV87Ow2/Txs/pvuiiNISSrPrp\nJ9h3X5g50++1IyJly+t5+O+957sFunULnUQqY+ed4bLL4J57QicRyS052cI//3w/DbNfvzSFkqz7\n5hu/DcaiRbDXXqHTiERb3p5p+8kn0KIFrFgBu+6axmCSdVdcAbvv7s8fFpHS5W3B79sXdtlFg365\n4OOP/Ss1PXmLlC0vC/6aNX6Qb+FCqFcvzcEkCHXPiZQvLwv+rbfCp5/C44+nN5OEM2eOX4H78cd+\nrx
0R+aW8m6WzcSM8+KCfwy254+ijoVEjmDgxdBKR+MuZgj92LPzP/8BBB4VOIunWvz8MG6ZN1URS\nlRMFv7jYz9lW6z43nXmmX4ylTdVEUpMTBf+VV/wsjhNPDJ1EMqFKFT9oO2xY6CQi8ZYTg7Ynnww9\ne0LXrhkKJcFt3Oi3WygsVLedyPbyZtD2/fdh+XLo1Cl0EsmkmjXh8sth+PDQSUTiK/Yt/AsugMMP\nhwEDMhhKIuGbb6BpU1i6FOrUCZ1GJDryYh7+qlW+2H/yiV+CL7mvZ09o0ABuuil0EpHoyIuCP3Cg\n79u9//4Mh5LIWLgQWrf2C+xq1AidRiQacr7gr1/vB/HefdcvzJH8cfrp/qyDCy8MnUQkGnJ+0HbM\nGD8NU8U+//Tr5wdvI9ZWEYm8WBb84mK47z64+urQSSSENm2gqAjeeCN0EpF4iWXBf/VVv9DqhBNC\nJ5EQzPyTvaZoiiQnln34rVvD73/vp2RKftq4ERo2hBkzoEmT0GlEwsrZPvz58/2xd+efHzqJhFSz\nJvTqBSNGhE4iEh+xa+Ffcgnstx/ccEMWQ0kkffEFHHKIPxFL6zAkn+XktMxvv/Uv37XSUrbq1g2a\nNfNbKIvkq5zs0nn0UTj3XBV7+a+rroIHHoAtW0InEYm+2BT8zZvh4Yf9f3CRrVq0gL33hpdeCp1E\nJPpiU/AnTvQbZx12WOgkEjVXX+3XZYhI2WJT8O+/X617KVnHjn5vnblzQycRibZYFPx33vEDtu3b\nh04iUbTDDnDFFdpET6Q8sZil06ULtGyprRSkdGvWQOPG8NFHsOeeodOIZFfOTMtcvdr3269YAbVq\nBQwmkdezJ+yzD9x4Y+gkItmVMwX/hhtg3To/9U6kLB9+6LdO/vRTqF49dBqR7MmJefibNsFjj0Gf\nPqGTSBwcdpg/4HzixNBJRKIp0gV//Hi/irJp09BJJC6uvFKDtyKliWzBd85vjKWpmJKM9u39jK5Z\ns0InEYmeyBb8t96Cn37yh12IVFTVqr4LUGM+Ir8U2UHbTp3gpJPUfy/J+/57f/TlwoVQr17oNCKZ\nF+tZOitXOo48Elau9CdbiSTr8sv9fPzBg0MnEcm8WBf8gQMdGzdqfxSpvEWL4JRTfKOhRo3QaUQy\nK9YFv04dx8yZfuWkSGWddpo/CrN799BJRDIrq/PwzaytmS0xs6Vmdm0p14wws2Vm9oGZHVnW47Vo\noWIvqds6RTNibRqRYFIu+GZWBRgJnA4cAnQxswO3u6YdsL9z7gDgUuCRsh6zb99UU4nAGWfA2rWa\noimyVTpa+C2AZc65lc65n4HxwFnbXXMWMAbAOTcLqGVmdUt7wNNOS0MqyXuaoiny/6Wj4O8NfL7N\n56sSXyvrmtUlXPPfUJFdHSBxc9FF8Oqr/sBzkXRaudJP/Y2TaqEDlGTwNnPpCgoKKCgoCJZF4m33\n3aFzZ38e8s03h04jueSuu/z52iF+rwoLCyksLEz6+1KepWNmxwCDnXNtE58PBJxz7q5trnkEmOac\nezbx+RLgJOfc1yU83i/2wxdJhaZoSrpFbXFfNmfpzAYam1lDM6sOdAYmbXfNJKBHItgxwPclFXuR\nTDj4YDj8cJgwIXQSyRVPPAFt20aj2Ccj5YLvnCsC+gBTgIXAeOfcYjO71Mx6Ja55BVhhZsuBR4HL\nU72vSDL69tUUTUmPoiIYOdJP+42bSC68ilomib+iImjSBMaNg2OOCZ1G4mzyZLjlFj/d18rtRMmO\nnDgARSRdtk7RHDEidBKJuxEjfOs+KsU+GWrhS974/nvYbz8/0Pab34ROI3EU1QkAauGLbGf33aFr\nV3ikzHXeIqUbORJ69YpWsU+GWviSVxYvhpNPjl4LTaJv6yvERYuiNztHLXyREhx0EBxxBDz7bOgk\nEjejR/v9maJW7JOhgi9558or/cCbXkhKRRUV+T2Z4n7Gtgq+5J12
7fzL87ffDp1E4mLyZKhb12/d\nHmcq+JJ3qlT570IskYrYOhUz7jRoK3nphx9g331h3jxo0CB0GomyBQugTRv49FOoXj10mpJp0Fak\nDLvtBhdcAA8/HDqJRN2IEdC7d3SLfTLUwpe8tXw5tGrlp2jWrBk6jUTRmjX+uNUlS3wfflSphS9S\njsaN/SDcM8+ETiJR9dhj0LFjtIt9MtTCl7w2dSr07+/78uO4N4pkzs8/+z3vX34ZjjwydJqyqYUv\nUgGtW0NxMUybFjqJRM3zz/uCH/VinwwVfMlrZn66naZoyvbuvz/+C622py4dyXsbNkDDhn4hVuPG\nodNIFMyeDb/7nR/Yr1o1dJryqUtHpIJ22gl69vRL50XAt+779IlHsU+GWvgiwKpV/tzbFSugVq3Q\naSSk1avhsMPi9bugFr5IEurXh9NP94dTS3576CHo3j0+xT4ZauGLJMyaBV26wLJlufdSXipmwwa/\n5cbMmfEaz1ELXyRJLVv6BTaTJoVOIqE8/TQce2y8in0yVPBFttGvHwwfHjqFhOAc3HcfXH116CSZ\no4Ivso2zz4bPP/fT8iS/TJniN0grKAidJHNU8EW2Ua2a3yv/3ntDJ5FsGz7ct+5zeYsNDdqKbGfd\nOn9Y9fz5fvaO5L6te96vWBHPw+01aCtSSbVqQY8eMHJk6CSSLcOHwxVXxLPYJ0MtfJESfPKJ3zr5\n009hl11Cp5FM+uorOOggv43Cr38dOk3lqIUvkoJGjfzg3ejRoZNIpj34oF9/Eddinwy18EVK8c47\n0LUrLF3qB3Ml92xdaDVjBjRpEjpN5amFL5KiY46BevXghRdCJ5FMGTPGL7SKc7FPhgq+SBn+9CcY\nNswvypHcUlTkB2v79QudJHtU8EXK0KGDP8j6rbdCJ5F0mzQJdt8dTjwxdJLsUcEXKUPVqr4FeM89\noZNIug0bBgMG5PZCq+1p0FakHFsH9qZPh6ZNQ6eRdJg5Ey64wA/I58LOqBq0FUmTnXaCyy5TKz+X\n3H23f+WWC8U+GWrhi1TAt9/6mRyLF8Nee4VOI6lYuhSOP95vo7DzzqHTpIda+CJpVKcOdOsGI0aE\nTiKpuuce6N07d4p9MtTCF6mgrdstrFgBu+4aOo1UxpdfwiGHwEcf+SfxXKEWvkiaNWoErVvDY4+F\nTiKVdf/9/pVaLhX7ZKiFL5KE996Djh3h44/9YRkSH+vW+Sft997zs65yiVr4IhnQvDkceCCMGxc6\niSTrkUegXbvcK/bJUAtfJEnTpvlBv0WL8m9aX1xt2uQPtZkyBQ47LHSa9FMLXyRDCgqgdm1tqhYn\nTz3lX53lYrFPhlr4IpUwaRIMHuz7g/NpaX4cbdniV0g/9ZSff5+L1MIXyaD27WHzZnjttdBJpDzj\nx0ODBrlb7JOhFr5IJY0bB6NGwZtvhk4ipSkuhkMP9dMxTzstdJrMUQtfJMPOPx9WrfKbqkk0vfCC\nXyTXunXoJNGQUsE3s9pmNsXMPjKz18ysVinXfWpm88xsrpm9m8o9RaKiWjX485/h1ltDJ5GSOAe3\n3w7XX69xlq1SbeEPBF53zjUF3gD+XMp1xUCBc+4o51yLFO8pEhk9evhl+u+8EzqJbO/VV/2pVu3b\nh04SHakW/LOApxIfPwV0LOU6S8O9RCKnenUYOFCt/KhxDm67Da67Dqqo8vxHqj+KPZ1zXwM4574C\n9izlOgdMNbPZZtYzxXuKRMrFF8O8eX6KpkTD1Knw/fdw3nmhk0RLtfIuMLOpQN1tv4Qv4DeUcHlp\n02uOc859aWZ18IV/sXNuRmn3HDx48H8+LigooKCgoLyYIsHUqAHXXONb+S++GDqNOOfXSNx0U+6u\nhC4sLKSwsDDp70tpWqaZLcb3zX9tZnsB05xzB5XzPYOAH51zw0v5c03LlNjZuBH23x9eeQWOPDJ0\nmvw2ZQpcfTV8+GHuFvztZWta
5iTgwsTHvwdeKiHITma2S+LjnYE2wIIU7ysSKTVrwrXXwqBBoZPk\nt3xo3aci1YJ/F3CamX0EnArcCWBm9cxscuKausAMM5sLvAO87JybkuJ9RSLn0kt9P/6cOaGT5K+t\nffedOoVOEk1aaSuSRg89BJMn+64dyS7noFUruPJK6NIldJrs0kpbkQAuuQQWLoS33w6dJP9Mngzr\n1/sV0FIyFXyRNKpRA264QX352VZc7H/ut96qefdl0Y9GJM0uvBCWL9ematk0YYJ/sj3rrNBJok19\n+CIZ8PTT8OCDMHOm9nHJtC1b4JBDYOTI3N4RsyzqwxcJqGtX2LDBH5QimTVmDNSrpx0xK0ItfJEM\n+dvf/Arc+fM1JzxTNm3yp1k98wwcd1zoNOGohS8S2BlnwK9+BWPHhk6Su0aOhKOOyu9inwy18EUy\naMYM6NbNb6G8446h0+SWtWt96376dDjwwNBpwlILXyQCjj8ejjjCt0Qlve64A845R8U+GWrhi2TY\nkiW+8C9ZAnvsETpNbli5Epo1gwUL/IBtvqtoC18FXyQL+vb17x94IGyOXNGjB+y7L9xyS+gk0aCC\nLxIh330HBx3k+/SbNg2dJt7efRc6dvTjIrvuGjpNNKgPXyRC9tjDT9G85prQSeLNObjqKhgyRMW+\nMlTwRbLkyiv9oRz/+EfoJPH1zDN+ZW2PHqGTxJO6dESy6IUX/CZfH3wAO+wQOk28/PSTn5Ezfrzm\n3W9PXToiEdSxI9Svr2malTF0qJ/tpGJfeWrhi2TZRx/5wvXhh7DXXqHTxMPHH0PLlvD++7DPPqHT\nRI9m6YhE2DXXwDffwJNPhk4Sfc75bSpOPlmD3qVRwReJsB9/9NM0n31WXRTlef55uPFGmDsXqlcP\nnSaa1IcvEmG77gr33usPPt+8OXSa6Fq/Hq6+2p8VrGKfOhV8kUDOO8+vFr377tBJouvmm31Xzkkn\nhU6SG9SlIxLQypXQvLk/GatJk9BpouW993zf/fz5ULdu6DTRpi4dkRho2NDPy7/0Uj84Kd7mzXDx\nxTBsmIp9OqngiwTWt68fxH388dBJouOuu/x6he7dQyfJLerSEYmABQt8X/Xs2b5fP59t/VnMneuL\nvpRPXToiMXLooTBggO/GKC4OnSacn3/2P4Pbb1exzwQVfJGI6N/fH8r94IOhk4Rz663+HOCePUMn\nyU3q0hGJkGXLoFUreOut/Ju189ZbcO65vitHp1glR106IjF0wAH+FKfOneHf/w6dJnt++AEuuAAe\nfVTFPpPUwheJGOegUyf4zW9gxIjQabLj97+HHXf0BV+SV9EWfrVshBGRijODv/wFjjoKTjnFb6mc\ny0aP9scWzpkTOknuUwtfJKJmzYIOHXwxbNgwdJrMmDsX2rSBf/7TbyYnlaM+fJGYa9kSrr0WzjkH\nNmwInSb9/vUvv5/QyJEq9tmiFr5IhDnnz28tKoJx43x3Ty4oLvZdVY0awX33hU4Tf2rhi+QAMxg1\nyk/XHDo0dJr0GTjQz8zJpb9THGjQViTiatb0h5+3bAkHHwy//W3oRKkZNQpeegneflt73GebunRE\nYuLdd+HMM32xbNUqdJrKmTLFT8GcPh0aNw6dJneoS0ckx7RoAWPHwtlnw6JFodMkb84cv/vlhAkq\n9qGo4IvESNu2cM89/v1nn4VOU3Hz5vlXJ48/DscfHzpN/lIfvkjMdO8Oa9ZAQQG88Ub0t1NetMg/\nQY0cGf/xh7hTwReJoauugqpV/Vmv//hHdLtIFizwxX7oUL9dhISlgi8SU336+FkuBQV+MPTgg0Mn\n+v+mT/cLq+69F7p2DZ1GQAVfJNZ69YKddvJF/+mn/TYFUfDiiz7buHFw2mmh08hWmpYpkgOmT4ff\n/Q6uu863/EOtyC0uhjvugIcegkmToHnzMDnyTUWnZargi+SIFSv8oOjRR/ttlXfbLbv3X7PG72
n/\n44/w7LN+e2fJDs3DF8kz++0H77wDNWrAEUf4HSiz5bXXoFkzfzbvG2+o2EdVSgXfzM4zswVmVmRm\nzcq4rq2ZLTGzpWZ2bSr3FJHS7bKLP0TkgQf8qVl9+8J332Xufl9+6e9z2WV+y4ShQ2GHHTJ3P0lN\nqi38D4GzgTdLu8DMqgAjgdOBQ4AuZnZgiveNpMLCwtARUqL8YaUzf/v2MH++323zwAPhzjth48a0\nPTxr18LgwXD44X7HywULoEaNwvTdIIC4//5UREoF3zn3kXNuGVBW31ELYJlzbqVz7mdgPHBWKveN\nqrj/wih/WOnOv8cefrHTzJkwe7Y/ROWaa2D58so/5pIlMGCAP3t31Sr/2EOG+JlC+vlHXzamZe4N\nfL7N56vwTwIikgVNmsBzz/ktlkeNgmOPhaZNoXVrf4Riixb+PNmSrF3rt0WYPt3vgbN2re/C+eAD\naNAgu38PSV25Bd/MpgJ1t/0S4IDrnXMvZyqYiKTXAQfA3XfDrbfCm2/CtGnQv7/v+qldG+rX9+83\nbvRv33wD69b5bpsWLeCRR/yTRRVN9YittEzLNLNpQH/n3Psl/NkxwGDnXNvE5wMB55y7q5TH0pxM\nEZEkVWRaZjq7dEq72WygsZk1BL4EOgNdSnuQioQWEZHkpTots6OZfQ4cA0w2s1cTX69nZpMBnHNF\nQB9gCrAQGO+cW5xabBERSVbkVtqKiEhmRGb4Jc6Ls8zscTP72szmh85SGWZW38zeMLOFZvahmV0Z\nOlMyzKyGmc0ys7mJ/INCZ0qWmVUxs/fNbFLoLMkys0/NbF7i5/9u6DzJMrNaZjbBzBYn/g+0DJ2p\nosysSeLn/n7i/bqy/v9GooWfWJy1FDgV+ALf79/ZObckaLAKMrPjgfXAGOfc4aHzJMvM9gL2cs59\nYGa7AO8BZ8Xl5w9gZjs55zaYWVXgLeBK51xsio+Z/RFoDuzmnOsQOk8yzOwToLlz7l+hs1SGmT0J\nvOmce8LMqgE7Oed+CBwraYk6ugpo6Zz7vKRrotLCj/XiLOfcDCCWv+wAzrmvnHMfJD5eDyzGr5+I\nDefchsSHNfCTEcK3ZCrIzOoDZwB/CZ2lkozo1JKkmNluwAnOuScAnHNb4ljsE1oDH5dW7CE6/0gl\nLc6KVcHJFWa2L3AkMCtskuQkukTmAl8BU51zs0NnSsK9wABi9CS1HQdMNbPZZtYzdJgk7Qd8Z2ZP\nJLpFRplZzdChKul84K9lXRCVgi8RkOjOmQhclWjpx4Zzrtg5dxRQH2hpZhE7/6lkZnYm8HXiFZZR\n9jYlUXXzgDPRAAABiElEQVScc64Z/lXKFYkuzrioBjQDHkz8HTYAA8NGSp6Z7QB0ACaUdV1UCv5q\nYJ9tPq+f+JpkSaLvciIw1jn3Uug8lZV4OT4NaBs6SwUdB3RI9IP/FTjZzMYEzpQU59yXifffAi8Q\nr61TVgGfO+fmJD6fiH8CiJt2wHuJf4NSRaXg/2dxlplVxy/Oittshbi2zrYaDSxyzt0fOkiyzGwP\nM6uV+LgmcBoQiwFn59x1zrl9nHON8L/3bzjneoTOVVFmtlPilSFmtjPQBlgQNlXFOee+Bj43syaJ\nL50KLAoYqbK6UE53DkTkTFvnXJGZbV2cVQV4PE6Ls8zsGaAA+LWZfQYM2joIFAdmdhzQDfgw0Q/u\ngOucc38Pm6zC6gFPJWYpVAGedc69EjhTvqgLvJDYEqUaMM45NyVwpmRdCYxLdIt8AlwUOE9SzGwn\n/IBtr3KvjcK0TBERybyodOmIiEiGqeCLiOQJFXwRkTyhgi8ikidU8EVE8oQKvohInlDBFxHJEyr4\nIiJ54v8A42wJvv9FA/kAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x1115c3198>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "\n",
    "xs = np.linspace(0, 2*np.pi, 100)\n",
    "ys = np.sin(xs) # np.sin is a universal function\n",
    "plt.plot(xs, ys);"
   ]
  },
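  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the element-wise behaviour concrete, the following sketch compares `np.sin` applied to a whole array with an explicit Python loop over the same values. The names `ys_ufunc` and `ys_loop` are introduced here for illustration only.\n",
    "\n",
    "```python\n",
    "import math\n",
    "import numpy as np\n",
    "\n",
    "xs = np.linspace(0, 2*np.pi, 100)\n",
    "\n",
    "# Vectorized: one call applies sin to every element\n",
    "ys_ufunc = np.sin(xs)\n",
    "\n",
    "# Explicit loop: one Python-level call per element\n",
    "ys_loop = np.array([math.sin(x) for x in xs])\n",
    "\n",
    "print(isinstance(np.sin, np.ufunc))\n",
    "print(np.allclose(ys_ufunc, ys_loop))\n",
    "\n",
    "# The same ufunc also accepts a scalar\n",
    "print(np.sin(0.0))\n",
    "```\n",
    "\n",
    "Both versions give the same values; the ufunc version moves the loop out of Python and into compiled code."
   ]
  },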
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Generalized ufuncs\n",
    "\n",
    "A universal function performs vectorized looping over scalars. A generalized ufunc performs looping over vectors or arrays. NumPy exposes only a handful of generalized ufuncs directly (`np.matmul` is one, as of NumPy 1.16). However, they play an important role for JIT compilation with `numba`, a topic we will cover in future lectures."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 100,
   "metadata": {},
   "outputs": [
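    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(n?,k),(k,m?)->(n?,m?)\n"
     ]
    }
   ],
   "source": [
    "# numpy.core.umath_tests is an internal module that NumPy warns against importing;\n",
    "# np.matmul is itself a generalized ufunc (NumPy >= 1.16), so we use it instead\n",
    "matrix_multiply = np.matmul\n",
    "\n",
    "print(matrix_multiply.signature)"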
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 101,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "us = np.random.random((5, 2, 3)) # 5 2x3 matrices\n",
    "vs = np.random.random((5, 3, 4)) # 5 3x4 matrices"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 102,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[[ 0.45041464,  0.73156889,  0.20199586],\n",
       "        [ 0.07597661,  0.13069672,  0.57386177]],\n",
       "\n",
       "       [[ 0.10830059,  0.26695388,  0.44054188],\n",
       "        [ 0.57974703,  0.17978862,  0.52472549]],\n",
       "\n",
       "       [[ 0.40794462,  0.35751635,  0.36870809],\n",
       "        [ 0.63494551,  0.11960905,  0.51381859]],\n",
       "\n",
       "       [[ 0.49510212,  0.46783668,  0.07856113],\n",
       "        [ 0.28401281,  0.47199107,  0.54560703]],\n",
       "\n",
       "       [[ 0.79458848,  0.43756637,  0.06759583],\n",
       "        [ 0.40228528,  0.50838122,  0.56375008]]])"
      ]
     },
     "execution_count": 102,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "us"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 103,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[[  5.12266952e-02,   8.40637879e-01,   2.38644940e-01,\n",
       "           4.35712252e-01],\n",
       "        [  7.81217918e-01,   6.77203544e-01,   7.28623630e-01,\n",
       "           8.70980358e-02],\n",
       "        [  9.53278422e-01,   6.57880491e-01,   4.45387233e-01,\n",
       "           5.54732924e-02]],\n",
       "\n",
       "       [[  6.52946888e-01,   2.29030756e-01,   5.91273241e-01,\n",
       "           8.29711164e-06],\n",
       "        [  5.25251656e-01,   5.06625358e-01,   1.87996526e-01,\n",
       "           3.57795468e-02],\n",
       "        [  6.45830540e-01,   7.67992588e-01,   3.53764451e-01,\n",
       "           8.93663158e-01]],\n",
       "\n",
       "       [[  7.05223217e-01,   5.68551438e-01,   2.21699577e-01,\n",
       "           3.66118249e-01],\n",
       "        [  3.59834031e-01,   8.86827366e-01,   8.90595276e-01,\n",
       "           9.32417623e-01],\n",
       "        [  9.20454420e-01,   8.21082903e-01,   9.46367477e-01,\n",
       "           6.79992096e-02]],\n",
       "\n",
       "       [[  5.17735631e-01,   7.57666718e-01,   9.76354847e-01,\n",
       "           8.23073506e-01],\n",
       "        [  9.12873687e-01,   7.01160089e-01,   9.41861241e-01,\n",
       "           5.75122377e-01],\n",
       "        [  4.29768204e-01,   6.52651201e-01,   4.52733332e-01,\n",
       "           7.76359616e-01]],\n",
       "\n",
       "       [[  3.29293395e-01,   6.01805518e-01,   8.82878227e-01,\n",
       "           9.58704548e-01],\n",
       "        [  1.57352491e-01,   8.34085919e-01,   2.60621284e-01,\n",
       "           1.61751295e-01],\n",
       "        [  1.81993190e-01,   3.98928140e-01,   1.31517889e-01,\n",
       "           4.99537484e-03]]])"
      ]
     },
     "execution_count": 103,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "vs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 104,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# perform matrix multiplication for each of the 5 sets of matrices\n",
    "ws = matrix_multiply(us, vs) "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 105,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(5, 2, 4)"
      ]
     },
     "execution_count": 105,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ws.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 106,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[[ 0.78714627,  1.00694579,  0.73049393,  0.27117477],\n",
       "        [ 0.65304469,  0.52990956,  0.36895086,  0.07632137]],\n",
       "\n",
       "       [[ 0.4954479 ,  0.49838267,  0.2700697 ,  0.40324843],\n",
       "        [ 0.81186204,  0.62685067,  0.56221777,  0.47536541]],\n",
       "\n",
       "       [[ 0.75571755,  0.85173269,  0.75777686,  0.50778237],\n",
       "        [ 0.96376431,  0.88895942,  0.73355161,  0.37892998]],\n",
       "\n",
       "       [[ 0.71717088,  0.75442382,  0.95959983,  0.73756047],\n",
       "        [ 0.81239634,  0.90221944,  0.96886187,  0.92880331]],\n",
       "\n",
       "       [[ 0.34280688,  0.87012156,  0.82445404,  0.83289018],\n",
       "        [ 0.31506361,  0.89102688,  0.5618071 ,  0.47072019]]])"
      ]
     },
     "execution_count": 106,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ws"
   ]
  },
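  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One way to see what the generalized ufunc is doing is to compare it with an explicit loop over the leading axis. The sketch below uses `np.matmul`, which is a generalized ufunc in NumPy 1.16 and later, in place of the `matrix_multiply` function used above, and draws its own random inputs with a seeded generator (an assumption made so the example is self-contained).\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "rng = np.random.default_rng(0)\n",
    "us = rng.random((5, 2, 3))   # stack of five 2x3 matrices\n",
    "vs = rng.random((5, 3, 4))   # stack of five 3x4 matrices\n",
    "\n",
    "# Generalized ufunc: loops over the leading axis and\n",
    "# multiplies the trailing 2D blocks pairwise\n",
    "ws = np.matmul(us, vs)\n",
    "\n",
    "# Reference: multiply each pair of matrices explicitly\n",
    "ws_loop = np.stack([u @ v for u, v in zip(us, vs)])\n",
    "\n",
    "print(ws.shape)\n",
    "print(np.allclose(ws, ws_loop))\n",
    "```\n",
    "\n",
    "The core signature describes only the last two axes; any leading axes are treated as loop dimensions and broadcast in the usual way."
   ]
  },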
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Saving and loading NDArrays\n",
    "----"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Saving to and loading from text files"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 108,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1, 2, 3],\n",
       "       [4, 5, 6],\n",
       "       [7, 8, 9]])"
      ]
     },
     "execution_count": 108,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x1 = np.arange(1,10).reshape(3,3)\n",
    "x1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 110,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "np.savetxt('../data/x1.txt', x1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 111,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1.000000000000000000e+00 2.000000000000000000e+00 3.000000000000000000e+00\r\n",
      "4.000000000000000000e+00 5.000000000000000000e+00 6.000000000000000000e+00\r\n",
      "7.000000000000000000e+00 8.000000000000000000e+00 9.000000000000000000e+00\r\n"
     ]
    }
   ],
   "source": [
    "!cat ../data/x1.txt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 112,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1.,  2.,  3.],\n",
       "       [ 4.,  5.,  6.],\n",
       "       [ 7.,  8.,  9.]])"
      ]
     },
     "execution_count": 112,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x2 = np.loadtxt('../data/x1.txt')\n",
    "x2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
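    "### Saving to and loading from binary files\n",
    "\n",
    "The binary `.npy` format is much faster to read and write than text, and it stores the dtype and shape in the file header, so both are restored on loading."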
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 115,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "np.save('../data/x1.npy', x1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 116,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "�NUMPY\u0001\u0000F\u0000{'descr': '<i8', 'fortran_order': False, 'shape': (3, 3), }          \r\n",
      "\u0001\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0002\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0003\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0004\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0005\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0006\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0007\u0000\u0000\u0000\u0000\u0000\u0000\u0000\b\u0000\u0000\u0000\u0000\u0000\u0000\u0000\t\u0000\u0000\u0000\u0000\u0000\u0000\u0000"
     ]
    }
   ],
   "source": [
    "!cat ../data/x1.npy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 117,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1, 2, 3],\n",
       "       [4, 5, 6],\n",
       "       [7, 8, 9]])"
      ]
     },
     "execution_count": 117,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x3 = np.load('../data/x1.npy')\n",
    "x3"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}