{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Solving Linear Equations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import scipy.linalg as la"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Linear Equations\n",
    "\n",
    "Consider a set of  $m$ linear equations in $n$ unknowns:\n",
    "\n",
    "\\begin{align*}\n",
    "a_{11} x_1 + &a_{12} x_2& +& ... + &a_{1n} x_n &=& b_1\\\\\n",
    "\\vdots  && &&\\vdots &= &\\vdots\\\\\n",
    "a_{m1} x_1 + &a_{m2} x_2& +& ... + &a_{mn} x_n &=&b_m \n",
    "\\end{align*}\n",
    "\n",
    "We can let\n",
    "\n",
    "\\begin{align*}\n",
    "    A=\\left[\\begin{matrix}a_{11}&\\cdots&a_{1n}\\\\\n",
    "               \\vdots & &\\vdots\\\\\n",
    "               a_{m1}&\\cdots&a_{mn}\\end{matrix}\\right]\n",
    "\\end{align*}\n",
    "\n",
    "\\begin{align*}\n",
    "x = \\left[\\begin{matrix}x_1\\\\\n",
    "               \\vdots\\\\\n",
    "               x_n\\end{matrix}\\right] & \\;\\;\\;\\;\\textrm{   and } &\n",
    "b =  \\left[\\begin{matrix}b_1\\\\\n",
    "               \\vdots\\\\\n",
    "               b_m\\end{matrix}\\right]\n",
    "\\end{align*}\n",
    "\n",
    "and re-write the system\n",
    "    \n",
    "$$ Ax = b$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Linear independence and existence of solutions\n",
    "\n",
    "* If $A$ is an $m\\times n$ matrix and $m>n$, if all $m$ rows are linearly independent, then the system is *overdetermined* and *inconsistent*.  The system cannot be solved exactly.  This is the usual case in data analysis, and why least squares is so important. For example, we may be finding the parameters of a linear model, where there are $m$ data points and $n$ parameters.\n",
    " \n",
    "* If $A$ is an $m\\times n$ matrix and $m<n$, if all $m$ rows are linearly independent, then the system is *underdetermined* and there are *infinite* solutions.\n",
    "\n",
    "* If $A$ is an $m\\times n$ matrix and some of its rows are linearly dependent, then the system is *reducible*.  We can get rid of some equations. In other words, there are equations in the system that do not give us any new information.\n",
    "\n",
    "* If $A$ is a square matrix and its rows are linearly independent, the system has a unique solution. ($A$ is invertible.)\n",
    "\n",
    "For a linear system, we can only get a unique solution, no solution, or infinite solutions."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Using `solve` to find unique solutions\n",
    "\n",
    "\\begin{align}\n",
    "x + 2y &= 3 \\\\\n",
    "3x + 4y &= 17\n",
    "\\end{align}\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1, 2],\n",
       "       [3, 4]])"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A = np.array([[1,2],[3,4]])\n",
    "A"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 3],\n",
       "       [17]])"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b = np.array([3,17]).reshape((-1,1))\n",
    "b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[11.],\n",
       "       [-4.]])"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = la.solve(A, b)\n",
    "x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.allclose(A @ x, b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Under the hood of `solve`\n",
    "\n",
    "The `solve` function uses the `dgesv` fortran function in `lapack` (Linear Algebra Package) to do the actual work. In turn, `lapack` calls even lower level routines in `blas` (Basic Linear Algebra Subprograms). In particular, `solve` uses the LU matrix decomposition that we will show in detail shortly. \n",
    "\n",
    "There is rarely any reason to use `blas` or `lapack` functions directly becuase the `linalg` package provides more convenient functions that also perfrom error checking, but you can use Python to experiment with `lapack` or `blas` before using them in a language like C or Fortran.\n",
    "\n",
    "- [How to interpret lapack function names](http://www.netlib.org/lapack/lug/node24.html)\n",
    "- [Summary of BLAS functions](http://cvxopt.org/userguide/blas.html)\n",
    "- [Sumary of Lapack functions](http://cvxopt.org/userguide/lapack.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import scipy.linalg.lapack as lapack"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[11.],\n",
       "       [-4.]])"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lu, piv, x, info = lapack.dgesv(A, b)\n",
    "x"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Gaussian elimination"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's review how Gaussian elimination (ge) works.  We will deal with a $3\\times 3$ system of equations for conciseness, but everything here generalizes to the $n\\times n$ case. Consider the following equation:\n",
    "\n",
    "$$\\left(\\begin{matrix}a_{11}&a_{12} & a_{13}\\\\a_{21}&a_{22}&a_{23}\\\\a_{31}&a_{32}&a_{33}\\end{matrix}\\right)\\left(\\begin{matrix}x_1\\\\x_2\\\\x_3\\end{matrix}\\right) = \\left(\\begin{matrix}b_1\\\\b_2\\\\b_3\\end{matrix}\\right)$$\n",
    "\n",
    "For simplicity, let us assume that the leftmost matrix $A$ is non-singular.  To solve the system using ge, we start with the 'augmented matrix':\n",
    "\n",
    "$$\\left(\\begin{array}{ccc|c}a_{11}&a_{12} & a_{13}& b_1 \\\\a_{21}&a_{22}&a_{23}&b_2\\\\a_{31}&a_{32}&a_{33}&b_3\\end{array}\\right)$$\n",
    "\n",
    "We begin at the first entry, $a_{11}$.  If $a_{11} \\neq 0$, then we divide the first row by $a_{11}$ and then subtract the appropriate multiple of the first row from each of the other rows, zeroing out the first entry of all rows. (If $a_{11}$ is zero, we need to permute rows.  We will not go into detail of that here.)  The result is as follows:\n",
    "\n",
    "$$\\left(\\begin{array}{ccc|c}\n",
    "1 & \\frac{a_{12}}{a_{11}} & \\frac{a_{13}}{a_{11}} & \\frac{b_1}{a_{11}} \\\\\n",
    "0 & a_{22} - a_{21}\\frac{a_{12}}{a_{11}} & a_{23} - a_{21}\\frac{a_{13}}{a_{11}}  & b_2 - a_{21}\\frac{b_1}{a_{11}}\\\\\n",
    "0&a_{32}-a_{31}\\frac{a_{12}}{a_{11}} & a_{33} - a_{31}\\frac{a_{13}}{a_{11}}  &b_3- a_{31}\\frac{b_1}{a_{11}}\\end{array}\\right)$$\n",
    "\n",
    "We repeat the procedure for the second row, first dividing by the leading entry, then subtracting the appropriate multiple of the resulting row from each of the third and first rows, so that the second entry in row 1 and in row 3 are zero. Gaussian elimination stops at *row echelon* form (upper triangular, with ones on the diagonal), and then uses *back substitution* to obtain the final answer.\n",
    "\n",
    "Note that in some cases, it is necessary to permute rows to obtain row echelon form (when the pivot would otherwise be zero). This is called *partial pivoting*. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example\n",
    "\n",
    "We perform Gaussian elimination on the the following augmented matrix\n",
    "                           \n",
    "$$\n",
    "\\left(\n",
    "\\begin{array}{ccc|c}\n",
    "1 & 3 & 4 & 1\\\\\n",
    "2 & 1 & 3 & 2 \\\\\n",
    "4 & 7 & 2 & 3\n",
    "\\end{array}\n",
    "\\right)\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We need to multiply row $1$ by $2$ and subtract from row $2$ to eliminate the first entry in row $2$, and then multiply row $1$ by $4$ and subtract from row $3$.\n",
    "\n",
    "$$\n",
    "\\left(\n",
    "\\begin{array}{ccc|c}\n",
    "1 & 3 & 4 & 1 \\\\\n",
    "0 & -5 & -5 & 0\\\\\n",
    "0 & -5 &-14 & -1\n",
    "\\end{array}\n",
    "\\right)\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And then we eliminate the second entry in the third row:\n",
    "\n",
    "$$\n",
    "\\left(\n",
    "\\begin{matrix}\n",
    "1 & 3 & 4 & 1\\\\\n",
    "0 & -5 & -5 & 0 \\\\\n",
    "0 & 0 & -9 & -1\n",
    "\\end{matrix}\n",
    "\\right)\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can now do back-substitution to solve\n",
    "\n",
    "$$\n",
    "9x_3 = 1 \\\\\n",
    "x_2 = -x_3 = 0 \\\\\n",
    "x_1 = -3x_2 -4x_3 + 1\n",
    "$$\n",
    "\n",
    "to get\n",
    "\n",
    "$$\n",
    "x_1 = 8/9 \\\\\n",
    "x_2 = -1/9 \\\\\n",
    "x_3 = 1/9 \n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Check"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.88888889],\n",
       "       [-0.11111111],\n",
       "       [ 0.11111111]])"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A = np.array([\n",
    "    [1,3,4],\n",
    "    [2,1,3],\n",
    "    [4,7,2]\n",
    "])\n",
    "b = np.array([1,2,3]).reshape((-1,1))\n",
    "\n",
    "la.solve(A, b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Gauss-Jordan elimination\n",
    "\n",
    "With Gauss-Jordan elimination, we make all the pivots equal to 1, and set all entries above and below a pivot 0.\n",
    "\n",
    "$$\n",
    "\\left(\n",
    "\\begin{matrix}\n",
    "1 & 0 & 0 & 8/9 \\\\\n",
    "0 & 1 & 0 & -1/9 \\\\\n",
    "0 & 0 & 1 & 1/9\n",
    "\\end{matrix}\n",
    "\\right)\n",
    "$$\n",
    "\n",
    "Then we can just read off the values of $x_1$, $x_2$ and $x_3$ directly. This is known as the *reduced row echelon form*. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Gaussian-Jordan elimination and the number of solutions\n",
    "\n",
    "Consider 3 matrices $m > n$, $m = n$ and $m < n$ and provide some intuition as to the existence of a no, unique or infinite solutions."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## LU Decomposition\n",
    "\n",
    "LU stands for 'Lower Upper', and so an LU decomposition of a matrix $A$ is a decomposition so that \n",
    "$$A= LU$$\n",
    "where $L$ is lower triangular and $U$ is upper triangular.\n",
    "\n",
    "Now, LU decomposition is essentially gaussian elimination, but we work only with the matrix $A$ (as opposed to the augmented matrix). "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Gaussian elimination is all fine when we are solving a system one time, for one outcome $b$.  Many applications involve solutions to multiple problems, where the left-hand-side of our matrix equation does not change, but there are many outcome vectors $b$.  In this case, it is more efficient to *decompose* $A$.\n",
    "\n",
    "First, we start just as in ge, but we 'keep track' of the various multiples required to eliminate entries.  For example, consider the matrix\n",
    "\n",
    "$$A = \\left(\\begin{matrix} 1 & 3 & 4 \\\\\n",
    "                           2 & 1 & 3\\\\\n",
    "                           4 & 7 & 2\n",
    "                           \\end{matrix}\\right)$$\n",
    "\n",
    "We need to multiply row $1$ by $2$ and subtract from row $2$ to eliminate the first entry in row $2$, and then multiply row $1$ by $4$ and subtract from row $3$. Instead of entering zeroes into the first entries of rows $2$ and $3$, we record the multiples required for their elimination, as so:\n",
    "\n",
    "$$\\left(\\begin{matrix} 1 & 3 & 4 \\\\\n",
    "                           (2)& -5 & -5\\\\\n",
    "                           (4)&-5 &-14\n",
    "                           \\end{matrix}\\right)$$\n",
    "                           \n",
    "\n",
    "And then we eliminate the second entry in the third row:\n",
    "\n",
    "\n",
    "$$\\left(\\begin{matrix} 1 & 3 & 4 \\\\\n",
    "                           (2)& -5 & -5\\\\\n",
    "                           (4)& (1)&-9\n",
    "                           \\end{matrix}\\right)$$\n",
    "                           \n",
    "And now we have the decomposition:\n",
    "$$L= \\left(\\begin{matrix} 1 & 0 & 0 \\\\\n",
    "                           2& 1 & 0\\\\\n",
    "                           4& 1 &1\n",
    "                           \\end{matrix}\\right) \\,\n",
    "                          U = \\left(\\begin{matrix} 1 & 3 & 4 \\\\\n",
    "                           0& -5 & -5\\\\\n",
    "                           0&0&-9\n",
    "                           \\end{matrix}\\right)$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Elementary matrices\n",
    "\n",
    "Why does the algorithm for LU work? Gaussian elimination consists of 3 elementary operations\n",
    "\n",
    "- Op1: swapping two rows\n",
    "- Op2: replace a row by the sum of that row and a multiple of another\n",
    "- Op3: multiplying a row by a non-zero scalar\n",
    "\n",
    "These can be recast as matrix operations - in particular, pre-multiplication with corresponding elementary matrices.\n",
    "\n",
    "- Op1: swapping two rows uses a permutation matrix\n",
    "\n",
    "$$\n",
    "\\begin{bmatrix}\n",
    "0 & 1 \\\\\n",
    "1 & 0 \n",
    "\\end{bmatrix}\\begin{bmatrix}\n",
    "3 & 4 \\\\\n",
    "1 & 2 \n",
    "\\end{bmatrix} = \\begin{bmatrix}\n",
    "1 & 2 \\\\\n",
    "3 & 4 \n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "- Op2: replace a row by the sum of that row and a multiple of another uses an lower triangular matrix\n",
    "\n",
    "$$\n",
    "\\begin{bmatrix}\n",
    "1 & 0 \\\\\n",
    "-3 & 1 \n",
    "\\end{bmatrix}\\begin{bmatrix}\n",
    "1 & 2 \\\\\n",
    "3 & 4 \n",
    "\\end{bmatrix} = \\begin{bmatrix}\n",
    "1 & 2 \\\\\n",
    "0 & -2 \n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "Note: The inverse operation just substitutes the negative of the multiple, and is also lower triangular.\n",
    "\n",
    "- Op3: multiplying a row by a non-zero scalar uses an lower triangular matrix\n",
    "\n",
    "$$\n",
    "\\begin{bmatrix}\n",
    "1 & 0 \\\\\n",
    "0 & -0.5 \n",
    "\\end{bmatrix}\\begin{bmatrix}\n",
    "1 & 2 \\\\\n",
    "0 & -2 \n",
    "\\end{bmatrix} = \\begin{bmatrix}\n",
    "1 & 2 \\\\\n",
    "0 & 1 \n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "Note: The inverse operation just substitutes the inverse of the scalar, and is also lower triangular.\n",
    "\n",
    "Multiplying an upper triangular matrix by another lower triangular matrix gives an lower triangular matrix. Hence if we put the permutations aside (i.e. keep a separate permutation matrix), Gaussian elimination can be expressed as a product of lower triangular matrices, which is just another lower triangular matrix $L$, with the original matrix $A$ to give an upper triangular matrix $U$. The lower triangular matrix $L$ is then the product of the inverse operations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "A = np.array([\n",
    "    [1,3,4],\n",
    "    [2,1,3],\n",
    "    [4,7,2]\n",
    "])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1, 3, 4],\n",
       "       [2, 1, 3],\n",
       "       [4, 7, 2]])"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Construct U"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "b1 = np.array([\n",
    "    [1, 0, 0],\n",
    "    [-2, 1, 0],\n",
    "    [0, 0, 1]\n",
    "])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "b2 = np.array([\n",
    "    [1, 0, 0],\n",
    "    [0, 1, 0],\n",
    "    [-4, 0, 1]\n",
    "])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "b3 = np.array([\n",
    "    [1, 0, 0],\n",
    "    [0, 1, 0],\n",
    "    [0, -1, 1]\n",
    "])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1,  3,  4],\n",
       "       [ 0, -5, -5],\n",
       "       [ 4,  7,  2]])"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b1 @ A"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[  1,   3,   4],\n",
       "       [  0,  -5,  -5],\n",
       "       [  0,  -5, -14]])"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b2 @ b1 @ A"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1,  3,  4],\n",
       "       [ 0, -5, -5],\n",
       "       [ 0,  0, -9]])"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b3 @ b2 @ b1 @ A"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1,  3,  4],\n",
       "       [ 0, -5, -5],\n",
       "       [ 0,  0, -9]])"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "U = b3 @ b2 @ b1 @ A\n",
    "U"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Construct L"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "ib1 = np.array([\n",
    "    [1, 0, 0],\n",
    "    [2, 1, 0],\n",
    "    [0, 0, 1]\n",
    "])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "ib2 = np.array([\n",
    "    [1, 0, 0],\n",
    "    [0, 1, 0],\n",
    "    [4, 0, 1]\n",
    "])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "ib3 = np.array([\n",
    "    [1, 0, 0],\n",
    "    [0, 1, 0],\n",
    "    [0, 1, 1]\n",
    "])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1, 0, 0],\n",
       "       [2, 1, 0],\n",
       "       [4, 1, 1]])"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "L = ib1 @ ib2 @ ib3\n",
    "L"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### A is factorized into LU"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1, 3, 4],\n",
       "       [2, 1, 3],\n",
       "       [4, 7, 2]])"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "L @ U"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1, 3, 4],\n",
       "       [2, 1, 3],\n",
       "       [4, 7, 2]])"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can now use the LU decomposition to solve for *any* $b$ without having to perform Gaussian elimination again. \n",
    "\n",
    "- First solve $Ly = b$\n",
    "- Then solve $Ux = y$\n",
    "\n",
    "Since $L$ and $U$ are triangular, they can be cheaply solved by substitution."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1],\n",
       "       [2],\n",
       "       [3]])"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1.],\n",
       "       [ 0.],\n",
       "       [-1.]])"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y = la.solve_triangular(L, b, lower=True)\n",
    "y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.88888889],\n",
       "       [-0.11111111],\n",
       "       [ 0.11111111]])"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = la.solve_triangular(U, y)\n",
    "x"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### LDU Decomposition\n",
    "\n",
    "Note that $L$ is a unit triangular matrix while $U$ is not. It is sometimes instructive to factorize $A = LDU$ where both $L$ and $U$ are unit triangular, and $D$ is a diagonal matrix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1,  3,  4],\n",
       "       [ 0, -5, -5],\n",
       "       [ 0,  0, -9]])"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "U"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1,  0,  0],\n",
       "       [ 0, -5,  0],\n",
       "       [ 0,  0, -9]])"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "D = np.diag(np.diag(U))\n",
    "D"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1.,  3.,  4.],\n",
       "       [-0.,  1.,  1.],\n",
       "       [-0., -0.,  1.]])"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "U1 = U/np.diag(U)[:, None]\n",
    "U1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1.,  3.,  4.],\n",
       "       [ 0., -5., -5.],\n",
       "       [ 0.,  0., -9.]])"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "D @ U1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.allclose(A, L @ D @ U1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### LU Decomposition in practice"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "P, L, U = la.lu(A)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1.  ,  0.  ,  0.  ],\n",
       "       [ 0.5 ,  1.  ,  0.  ],\n",
       "       [ 0.25, -0.5 ,  1.  ]])"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "L"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 4. ,  7. ,  2. ],\n",
       "       [ 0. , -2.5,  2. ],\n",
       "       [ 0. ,  0. ,  4.5]])"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "U"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "P is a permutation matrix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0., 0., 1.],\n",
       "       [0., 1., 0.],\n",
       "       [1., 0., 0.]])"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "P"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In practice, we can store both L and U in a single matrix LU."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "LU, P = la.lu_factor(A)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 4.  ,  7.  ,  2.  ],\n",
       "       [ 0.5 , -2.5 ,  2.  ],\n",
       "       [ 0.25, -0.5 ,  4.5 ]])"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "LU"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.88888889],\n",
       "       [-0.11111111],\n",
       "       [ 0.11111111]])"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "la.lu_solve((LU, P), b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Cholesky Decomposition"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Recall that a square matrix $A$ is positive definite if\n",
    "\n",
    "$$u^TA u > 0$$\n",
    "\n",
    "for any non-zero n-dimensional vector $u$,\n",
    "\n",
    "and a symmetric, positive-definite matrix $A$ is a positive-definite matrix such that\n",
    "\n",
    "$$A = A^T$$\n",
    "\n",
    "For a positive definite square matrix, all the pivots (diagonal elements of $U$) are positive. If the matrix is also symmetric, then we must have an $LDU$ decomposition of the form $LDL^T$, where all the diagonal elements of $D$ are positive. Given this, $D^{1/2}$ is well-defined, and we have\n",
    "\n",
    "$$\n",
    "A = LDL^T = LD^{1/2}D^{1/2}L^T = LD^{1/2}(LD^{1/2})^T = CC^{T}\n",
    "$$\n",
    "\n",
    "where $C$ is lower-triangular with positive diagonal elements and $C^T$ is its transpose.  This decomposition is known as the Cholesky decomposition, and $C$ may be interpreted as the 'square root' of the matrix $A$.  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Algorithm"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let $A$ be an $n\\times n$ matrix.  We find the matrix $L$ using the following iterative procedure:\n",
    "\n",
    "\n",
    "$$A = \\left(\\begin{matrix}a_{11}&A_{12}\\\\A_{12}&A_{22}\\end{matrix}\\right) =\n",
    "\\left(\\begin{matrix}\\ell_{11}&0\\\\\n",
    "L_{12}&L_{22}\\end{matrix}\\right)\n",
    "\\left(\\begin{matrix}\\ell_{11}&L_{12}\\\\0&L_{22}\\end{matrix}\\right)\n",
    "$$\n",
    "\n",
    "1.) Let $\\ell_{11} = \\sqrt{a_{11}}$\n",
    "\n",
    "2.) $L_{12} = \\frac{1}{\\ell_{11}}A_{12}$\n",
    "\n",
    "3.) Solve $A_{22} - L_{12}L_{12}^T = L_{22}L_{22}^T$ for $L_{22}$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$A = \\left(\\begin{matrix}1&3&5\\\\3&13&23\\\\5&23&42\\end{matrix}\\right)$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\\ell_{11} = \\sqrt{a_{11}} = 1$$\n",
    "\n",
    "$$L_{12} = \\frac{1}{\\ell_{11}} A_{12} = A_{12}$$\n",
    "\n",
    "$\\begin{eqnarray*}\n",
    "A_{22} - L_{12}L_{12}^T &=& \\left(\\begin{matrix}13&23\\\\23&42\\end{matrix}\\right) - \\left(\\begin{matrix}9&15\\\\15&25\\end{matrix}\\right)\\\\\n",
    "&=& \\left(\\begin{matrix}4&8\\\\8&17\\end{matrix}\\right)\n",
    "\\end{eqnarray*}$\n",
    "\n",
    "This is also symmetric and positive definite, and can be solved by another iteration\n",
    "\n",
    "$\\begin{eqnarray*}\n",
    "&=& \\left(\\begin{matrix}2&0\\\\4&\\ell_{33}\\end{matrix}\\right) \\left(\\begin{matrix}2&4\\\\0&\\ell_{33}\\end{matrix}\\right)\\\\\n",
    "&=& \\left(\\begin{matrix}4&8\\\\8&16+\\ell_{33}^2\\end{matrix}\\right)\n",
    "\\end{eqnarray*}$\n",
    "\n",
    "And so we conclude that $\\ell_{33}=1$.\n",
    "\n",
    "\n",
    "This yields the decomposition:\n",
    "\n",
    "\n",
    "$$\\left(\\begin{matrix}1&3&5\\\\3&13&23\\\\5&23&42\\end{matrix}\\right) = \n",
    "\\left(\\begin{matrix}1&0&0\\\\3&2&0\\\\5&4&1\\end{matrix}\\right)\\left(\\begin{matrix}1&3&5\\\\0&2&4\\\\0&0&1\\end{matrix}\\right)$$\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "A = np.array([\n",
    "    [1,3,5],\n",
    "    [3,13,23],\n",
    "    [5,23,42]\n",
    "])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1., 3., 5.],\n",
       "       [0., 2., 4.],\n",
       "       [0., 0., 1.]])"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "C = la.cholesky(A)\n",
    "C"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1.75],\n",
       "       [-0.25],\n",
       "       [ 0.  ]])"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "C1 = la.cho_factor(A)\n",
    "la.cho_solve(C1, b)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}