{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Matrix Decompositions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Matrix decompositions are an important step in solving linear systems in a computationally efficient manner. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import os\n",
    "import sys\n",
    "import glob\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "%matplotlib inline\n",
    "%precision 4\n",
    "plt.style.use('ggplot')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Reference**\n",
    "\n",
    "[SciPy's official tutorial on Linear algebra](http://docs.scipy.org/doc/scipy/reference/tutorial/linalg.html)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## LU Decomposition and Gaussian Elimination"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "LU stands for \"Lower Upper\", and so an LU decomposition of a matrix $A$ is a decomposition such that \n",
    "$$A= LU$$\n",
    "where $L$ is lower triangular and $U$ is upper triangular.\n",
    "\n",
    "Now, LU decomposition is essentially Gaussian elimination, but we work only with the matrix $A$ (as opposed to the augmented matrix). \n",
    "\n",
    "Let's review how Gaussian elimination (GE) works.  We will deal with a $3\\times 3$ system of equations for conciseness, but everything here generalizes to the $n\\times n$ case. Consider the following equation:\n",
    "\n",
    "$$\\left(\\begin{matrix}a_{11}&a_{12} & a_{13}\\\\a_{21}&a_{22}&a_{23}\\\\a_{31}&a_{32}&a_{33}\\end{matrix}\\right)\\left(\\begin{matrix}x_1\\\\x_2\\\\x_3\\end{matrix}\\right) = \\left(\\begin{matrix}b_1\\\\b_2\\\\b_3\\end{matrix}\\right)$$\n",
    "\n",
    "For simplicity, let us assume that the leftmost matrix $A$ is non-singular.  To solve the system using GE, we start with the \"augmented matrix\":\n",
    "\n",
    "$$\\left(\\begin{array}{ccc|c}a_{11}&a_{12} & a_{13}& b_1 \\\\a_{21}&a_{22}&a_{23}&b_2\\\\a_{31}&a_{32}&a_{33}&b_3\\end{array}\\right)$$\n",
    "\n",
    "We begin at the first entry, $a_{11}$.  If $a_{11} \\neq 0$, then we divide the first row by $a_{11}$ and then subtract the appropriate multiple of the first row from each of the other rows, zeroing out the first entry of all rows. (If $a_{11}$ is zero, we need to permute rows.  We will not go into details of that here.)  The result is as follows:\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\\begin{align}\n",
    "\\left(\\begin{array}{ccc|c}\n",
    "1 & \\frac{a_{12}}{a_{11}} & \\frac{a_{13}}{a_{11}} & \\frac{b_1}{a_{11}} \\\\\n",
    "0 & a_{22} - a_{21}\\frac{a_{12}}{a_{11}} & a_{23} - a_{21}\\frac{a_{13}}{a_{11}}  & b_2 - a_{21}\\frac{b_1}{a_{11}}\\\\\n",
    "0&a_{32}-a_{31}\\frac{a_{12}}{a_{11}} & a_{33} - a_{31}\\frac{a_{13}}{a_{11}}  &b_3- a_{31}\\frac{b_1}{a_{11}}\\end{array}\\right)\n",
    "\\end{align}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We repeat the procedure for the second row: first divide by the leading entry, then subtract the appropriate multiple of the resulting row from each of the third and first rows, so that the second entry in row 1 and in row 3 are zero.  We *could* continue until the matrix on the left is the identity. In that case, we can then just \"read off\" the solution: i.e., the vector $x$ is the resulting column vector on the right.  Usually, it is more efficient to stop at *reduced row  echelon* form (upper triangular, with ones on the diagonal), and then use *back substitution* to obtain the final answer.\n",
    "\n",
    "Note that in some cases, it is necessary to permute rows to obtain reduced row echelon form.  This is called *partial pivoting*.  If we also manipulate columns, that is called *full pivoting*.\n",
    "\n",
    "It should be mentioned that we may obtain the inverse of a matrix using GE, by reducing the matrix $A$ to the identity, with the identity matrix as the augmented portion.  \n",
    "\n",
    "Now, this is all fine when we are solving a system one time, for one outcome $b$.  Many applications involve solutions to multiple problems, where the left-hand-side of our matrix equation does not change, but there are many outcome vectors $b$.  In this case, it is more efficient to *decompose* $A$.\n",
    "\n",
    "First, we start just as in GE, but we \"keep track\" of the various multiples required to eliminate entries. For example, consider the matrix\n",
    "\n",
    "\\begin{align}\n",
    "A = \\left(\\begin{matrix} 1 & 3 & 4 \\\\\n",
    "                           2& 1& 3\\\\\n",
    "                           4&1&2\n",
    "                           \\end{matrix}\\right)\n",
    "\\end{align}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We need to multiply row $1$ by $2$ and subtract from row $2$ to eliminate the first entry in row $2$, and then multiply row $1$ by $4$ and subtract from row $3$. Instead of entering zeroes into the first entries of rows $2$ and $3$, we record the multiples required for their elimination, as so:\n",
    "\n",
    "\\begin{align}\n",
    "\\left(\\begin{matrix} 1 & 3 & 4 \\\\\n",
    "                           (2)& -5 & -5\\\\\n",
    "                           (4)&-11&-14\n",
    "                           \\end{matrix}\\right)\n",
    "\\end{align}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "And then we eliminate the second entry in the third row:\n",
    "\n",
    "\n",
    "\\begin{align}\n",
    "\\left(\\begin{matrix} 1 & 3 & 4 \\\\\n",
    "                           (2)& -5 & -5\\\\\n",
    "                           (4)&(\\frac{11}{5})&-3\n",
    "                           \\end{matrix}\\right)\n",
    "\\end{align}\n",
    "                           \n",
    "And now we have the decomposition:\n",
    "\n",
    "\\begin{align}\n",
    "L= \\left(\\begin{matrix} 1 & 0 & 0 \\\\\n",
    "                           2& 1 & 0\\\\\n",
    "                           4&\\frac{11}5&1\n",
    "                           \\end{matrix}\\right)\n",
    "                          U = \\left(\\begin{matrix} 1 & 3 & 4 \\\\\n",
    "                           0& -5 & -5\\\\\n",
    "                           0&0&-3\n",
    "                           \\end{matrix}\\right)\n",
    "\\end{align}\n",
    "                                                  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[ 1.  3.  4.]\n",
      " [ 2.  1.  3.]\n",
      " [ 4.  1.  2.]]\n",
      "[[ 1.   0.   0. ]\n",
      " [ 2.   1.   0. ]\n",
      " [ 4.   2.2  1. ]]\n",
      "[[ 1  3  4]\n",
      " [ 0 -5 -5]\n",
      " [ 0  0 -3]]\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "import scipy.linalg as la\n",
    "np.set_printoptions(suppress=True) \n",
    "\n",
    "A = np.array([[1,3,4],[2,1,3],[4,1,2]])\n",
    "\n",
    "L = np.array([[1,0,0],[2,1,0],[4,11/5,1]])\n",
    "U = np.array([[1,3,4],[0,-5,-5],[0,0,-3]])\n",
    "print(L.dot(U))\n",
    "print(L)\n",
    "print(U)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can solve the system by solving two back-substitution problems:\n",
    "    \n",
    "$$Ly = b$$    and\n",
    "$$Ux=y$$\n",
    "\n",
    "\n",
    "These are both $O(n^2)$, so it is more efficient to decompose when there are multiple outcomes to solve for."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's do this with numpy:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[1 3 4]\n",
      " [2 1 3]\n",
      " [4 1 2]]\n",
      "[[ 4.  1.  2.]\n",
      " [ 1.  3.  4.]\n",
      " [ 2.  1.  3.]]\n",
      "\n",
      "[[ 4.  1.  2.]\n",
      " [ 1.  3.  4.]\n",
      " [ 2.  1.  3.]]\n",
      "[[ 0.  1.  0.]\n",
      " [ 0.  0.  1.]\n",
      " [ 1.  0.  0.]]\n",
      "[[ 1.      0.      0.    ]\n",
      " [ 0.25    1.      0.    ]\n",
      " [ 0.5     0.1818  1.    ]]\n",
      "[[ 4.      1.      2.    ]\n",
      " [ 0.      2.75    3.5   ]\n",
      " [ 0.      0.      1.3636]]\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "import scipy.linalg as la\n",
    "np.set_printoptions(suppress=True) \n",
    "\n",
    "A = np.array([[1,3,4],[2,1,3],[4,1,2]])\n",
    "\n",
    "print(A)\n",
    "\n",
    "P, L, U = la.lu(A)\n",
    "print(np.dot(P.T, A))\n",
    "print\n",
    "print(np.dot(L, U))\n",
    "print(P)\n",
    "print(L)\n",
    "print(U)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that the numpy decomposition uses *partial pivoting* (matrix rows are permuted to use the largest pivot).  This is because small pivots can lead to numerical instability, another reason why one should use library functions whenever possible!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Cholesky Decomposition"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Recall that a square matrix $A$ is positive definite if\n",
    "\n",
    "$$u^TA u > 0$$\n",
    "\n",
    "for any non-zero n-dimensional vector $u$,\n",
    "\n",
    "and a symmetric, positive-definite matrix $A$ is a positive-definite matrix such that\n",
    "\n",
    "$$A = A^T$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let $A$ be a symmetric, positive-definite matrix.  There is a unique decomposition such that\n",
    "\n",
    "$$A = L L^T$$\n",
    "\n",
    "where $L$ is lower-triangular with positive diagonal elements and $L^T$ is its transpose.  This decomposition is known as the *Cholesky decompostion*, and $L$ may be interpreted as the \"square root\" of the matrix $A$.  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Algorithm:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let $A$ be an $n\\times n$ matrix.  We find the matrix $L$ using the following iterative procedure:\n",
    "\n",
    "\n",
    "\\begin{align}\n",
    "A = \\left(\\begin{matrix}a_{11}&A_{12}^T\\\\A_{12}&A_{22}\\end{matrix}\\right) =\n",
    "\\left(\\begin{matrix}\\ell_{11}&0\\\\\n",
    "L_{12}&L_{22}\\end{matrix}\\right)\n",
    "\\left(\\begin{matrix}\\ell_{11}&L_{12}^T\\\\0&L_{22}^{T}\\end{matrix}\\right)\n",
    "\\end{align}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "1.) Let $\\ell_{11} = \\sqrt{a_{11}}$\n",
    "\n",
    "2.) $L_{12} = \\frac{1}{\\ell_{11}}A_{12}$\n",
    "\n",
    "3.) Solve $A_{22} - L_{12}L_{12}^T = L_{22}L_{22}^T$ for $L_{22}$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$A = \\left(\\begin{matrix}1&3&5\\\\3&13&23\\\\5&23&42\\end{matrix}\\right)$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\\ell_{11} = \\sqrt{a_{11}} = 1$$\n",
    "\n",
    "$$L_{12} = \\frac{1}{\\ell_{11}} A_{12} = A_{12}$$\n",
    "\n",
    "$\\begin{eqnarray*}\n",
    "A_{22} - L_{12}L_{12}^T &=& \\left(\\begin{matrix}13&23\\\\23&42\\end{matrix}\\right) - \\left(\\begin{matrix}9&15\\\\15&25\\end{matrix}\\right)\\\\\n",
    "&=& \\left(\\begin{matrix}4&8\\\\8&17\\end{matrix}\\right)\\\\\n",
    "&=& \\left(\\begin{matrix}2&0\\\\4&\\ell_{33}\\end{matrix}\\right) \\left(\\begin{matrix}2&4\\\\0&\\ell_{33}\\end{matrix}\\right)\\\\\n",
    "&=& \\left(\\begin{matrix}4&8\\\\8&16+\\ell_{33}^2\\end{matrix}\\right)\n",
    "\\end{eqnarray*}$\n",
    "\n",
    "And so we conclude that $\\ell_{33}=1$.\n",
    "\n",
    "\n",
    "This yields the decomposition:\n",
    "\n",
    "\\begin{align}\n",
    "\\left(\\begin{matrix}1&3&5\\\\3&13&23\\\\5&23&42\\end{matrix}\\right) = \n",
    "\\left(\\begin{matrix}1&0&0\\\\3&2&0\\\\5&4&1\\end{matrix}\\right)\\left(\\begin{matrix}1&3&5\\\\0&2&4\\\\0&0&1\\end{matrix}\\right)\n",
    "\\end{align}\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, with numpy:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[  1.   3.   5.]\n",
      " [  3.  13.  23.]\n",
      " [  5.  23.  42.]]\n",
      "[[ 1.  3.  5.]\n",
      " [ 0.  2.  4.]\n",
      " [ 0.  0.  1.]]\n",
      "[[ 1  3  5]\n",
      " [ 3 13 23]\n",
      " [ 5 23 42]]\n"
     ]
    }
   ],
   "source": [
    "A = np.array([[1,3,5],[3,13,23],[5,23,42]])\n",
    "L = la.cholesky(A)\n",
    "print(np.dot(L.T, L))\n",
    "\n",
    "print(L)\n",
    "print(A)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Cholesky decomposition is about twice as fast as LU decomposition (though both scale as $n^3$)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Matrix Decompositions for PCA and Least Squares"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Eigendecomposition"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Eigenvectors and Eigenvalues"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First recall that an *eigenvector* of a matrix $A$ is a non-zero vector $v$ such that\n",
    "\n",
    "$$Av = \\lambda v$$\n",
    "\n",
    "for some scalar $\\lambda$\n",
    "\n",
    "The value $\\lambda$ is called an *eigenvalue* of $A$.\n",
    "\n",
    "An $n\\times n$ matrix $A$ is said to be *diagonalizable* if there exists an $n \\times n$ nonsingular matrix $B$ such that $B^{-1}AB =  \\Lambda$ for some diagonal matrix $\\Lambda$.\n",
    "\n",
    "If an $n\\times n$ matrix $A$ has $n$ linearly independent eigenvectors, then $A$ may be decomposed in the following manner:\n",
    "\n",
    "$$A = B\\Lambda B^{-1}$$\n",
    "\n",
    "where $\\Lambda$ is a diagonal matrix whose diagonal entries are the eigenvalues of $A$ and the columns of $B$ are the corresponding eigenvectors of $A$.\n",
    "\n",
    "Facts: \n",
    "\n",
    "* An $n\\times n$ matrix is diagonizable $\\iff$ It has $n$ linearly independent eigenvectors.\n",
    "* A symmetric, positive definite matrix has only positive eigenvalues and its eigendecomposition \n",
    "$$A=B\\Lambda B^{-1}$$\n",
    "\n",
    "is via an orthogonal transformation $B$. (i.e. its eigenvectors are an orthonormal set) \n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Calculating Eigenvalues"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It is easy to see from the definition that if $v$ is an eigenvector of an $n\\times n$ matrix $A$ with eigenvalue $\\lambda$, then\n",
    "\n",
    "$$(A - \\lambda I) v = \\bf{0}$$\n",
    "\n",
    "where $I$ is the identity matrix of dimension $n$ and $\\bf{0}$ is an n-dimensional zero vector. Therefore, the eigenvalues of $A$ satisfy:\n",
    "\n",
    "$$\\det\\left(A-\\lambda I\\right)=0$$\n",
    "\n",
    "The left-hand side above is a polynomial in $\\lambda$, and is called the *characteristic polynomial* of $A$.  Thus, to find the eigenvalues of $A$, we find the roots of the characteristic polynomial.\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Computationally, however, computing the characteristic polynomial and then solving for the roots is prohibitively expensive.  Therefore, in practice, numerical methods are used - both to find eigenvalues and their corresponding eigenvectors. We won't go into the specifics of the algorithms used to calculate eigenvalues, but here is a ``numpy`` example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[-0.+0.j  1.+0.j  1.+0.j]\n",
      " [ 2.+0.j  1.+0.j  0.+0.j]\n",
      " [ 3.+0.j  4.+0.j  5.+0.j]]\n",
      "[ 5.8541+0.j -0.8541+0.j  1.0000+0.j]\n"
     ]
    }
   ],
   "source": [
    "A = np.array([[0,1,1],[2,1,0],[3,4,5]])\n",
    "\n",
    "u, V = la.eig(A)\n",
    "print(np.dot(V,np.dot(np.diag(u), la.inv(V))))\n",
    "print(u)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**NB:** Many matrices are *not* diagonizable, and many have *complex* eigenvalues (even if all entries are real).   "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[ 0  1]\n",
      " [-1  0]]\n",
      "[[ 0.+0.j  1.+0.j]\n",
      " [-1.+0.j  0.+0.j]]\n",
      "[ 0.+1.j  0.-1.j]\n"
     ]
    }
   ],
   "source": [
    "A = np.array([[0,1],[-1,0]])\n",
    "print(A)\n",
    "\n",
    "u, V = la.eig(A)\n",
    "print(np.dot(V,np.dot(np.diag(u), la.inv(V))))\n",
    "print(u)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ 5.8541+0.j -0.8541+0.j  1.0000+0.j]\n",
      "[ 5.8541 -0.8541  1.    ]\n"
     ]
    }
   ],
   "source": [
    "# If you know the eigenvalues must be real \n",
    "# because A is a positive definite (e.g. covariance) matrix \n",
    "# use real_if_close\n",
    "\n",
    "A = np.array([[0,1,1],[2,1,0],[3,4,5]])\n",
    "u, V = la.eig(A)\n",
    "print(u)\n",
    "print np.real_if_close(u)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Singular Values"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For any $m\\times n$ matrix $A$, we define its *singular values* to be the square roots of the non-zero eigenvalues of $A^TA$.  These are well-defined as $A^TA$ is always symmetric and positive-definite ??, so its eigenvalues are real and positive.  Singular values are important properties of a matrix.  Geometrically, a matrix $A$ maps the unit sphere in $\\mathbb{R}^n$ to an ellipse.  The singular values are the lengths of the semi-axes.  \n",
    "\n",
    "Singular values also provide a measure of the *stability* of a matrix.  We'll revisit this in the end of the lecture."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## QR decompositon"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As with the previous decompositions, *$QR$ decomposition* is a method to write a $m \\times n$ matrix $A$ of *full column rank* as the product of two matrices of simpler forms.  In this case, we want:\n",
    "\n",
    "$$ A= QR$$\n",
    "where $Q$ is an $m\\times n$ matrix with $Q Q^T ?? = I$ ($Q^T Q = I_{n}$)(i.e. $Q$ is *orthogonal*)??($Q$ is not orthogonal!) and $R$ is an $n\\times n$ upper-triangular matrix with positive diagonal elements.\n",
    "\n",
    "This is really just the matrix form of the Gram-Schmidt orthogonalization of the columns of $A$.  The G-S algorithm itself is unstable, so various other methods have been developed to compute the QR decomposition.  We won't cover those in detail as they are a bit beyond our scope."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The first $k$ columns of $Q$ are an orthonormal basis for the column space of the first $k$ columns of $A$. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Iterative QR decomposition is often used in the computation of eigenvalues."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Singular Value Decomposition"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Another important matrix decomposition is singular value decomposition or SVD.   For any $m\\times n$ matrix $A$, we may write:\n",
    "\n",
    "$$A= UDV$$\n",
    "\n",
    "where $U$ is a unitary (orthogonal in the real case) $m\\times m$ matrix, $D$ is a rectangular, diagonal $m\\times n$ matrix with diagonal entries $d_1,...,d_m$ all non-negative. $V$ is a unitary (orthogonal) $n\\times n$ matrix. SVD is used in principle component analysis and in the computation of the Moore-Penrose pseudo-inverse."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Stability and Condition Number"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It is important that numerical algorithms be *stable* and *efficient*.  Efficiency is a property of an algorithm, but stability can be a property of the system itself."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\\left(\\begin{matrix}8&6&4&1\\\\1&4&5&1\\\\8&4&1&1\\\\1&4&3&6\\end{matrix}\\right)x = \\left(\\begin{matrix}19\\\\11\\\\14\\\\14\\end{matrix}\\right)$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 1.,  1.,  1.,  1.])"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A = np.array([[8,6,4,1],[1,4,5,1],[8,4,1,1],[1,4,3,6]])\n",
    "b = np.array([19,11,14,14])\n",
    "la.solve(A,b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([-2.34 ,  9.745, -4.85 , -1.34 ])"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b = np.array([19.01,11.05,14.07,14.05])\n",
    "la.solve(A,b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that the *tiny* perturbations in the outcome vector $b$ cause *large* differences in the solution!  When this happens, we say that the matrix $A$ *ill-conditioned*.  This happens when a matrix is \"close\" to being singular (i.e. non-invertible)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Condition Number"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "  \n",
    "\n",
    "A measure of this type of behavior is called the *condition number*.  It is defined as:\n",
    "\n",
    "$$ cond(A) = ||A||\\cdot ||A^{-1}|| $$\n",
    "\n",
    "In general, it is difficult to compute.\n",
    "\n",
    "Fact: \n",
    "\n",
    "$$cond(A) = \\frac{\\lambda_1}{\\lambda_n}$$\n",
    "\n",
    "where $\\lambda_1$ is the maximum singular value of $A$ and $\\lambda_n$ is the smallest.  The higher the condition number, the more unstable the system.  In general if there is a large discrepancy between minimal and maximal singular values, the condition number is large."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ 15.5457   6.9002   3.8363   0.0049]\n",
      "3198.6725812\n"
     ]
    }
   ],
   "source": [
    "U, s, V = np.linalg.svd(A)\n",
    "print(s)\n",
    "print(max(s)/min(s))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Preconditioning"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can sometimes improve on this behavior by \"pre-conditioning\".  Instead of solving\n",
    "\n",
    "$$Ax=b$$\n",
    "\n",
    "we solve\n",
    "\n",
    "$$D^{-1}Ax=D^{-1}b$$\n",
    "\n",
    "where $D^{-1}A$ has a lower condition number than $A$ itself.   \n",
    "\n",
    "Preconditioning is a *very* involved topic, quite out of the range of this course.  It is mentioned here only to make you aware that such a thing exists, should you ever run into an ill-conditioned problem!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<font color=red>Exercises</font>\n",
    "----"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**1**. Compute the LU decomposition of the following matrix by hand and using numpy\n",
    "\n",
    "$$\\left(\\begin{matrix}1&2&3\\\\2&-4&6\\\\3&-9&-3\\end{matrix}\\right)$$\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Solution:\n",
    "\n",
    "First by hand:\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**2**. Compute the Cholesky decomposition of the following matrix by hand and using numpy\n",
    "\n",
    "$$\\left(\\begin{matrix}1&2&3\\\\2&-4&6\\\\3&6&-3\\end{matrix}\\right)$$\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Your code here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**3**. Write a function in Python to solve a system\n",
    "\n",
    "$$Ax = b$$\n",
    "\n",
    "using SVD decomposition.  Your function should take $A$ and $b$ as input and return $x$.\n",
    "\n",
    "Your function should include the following:\n",
    "\n",
    "* First, check that $A$ is invertible - return error message if it is not\n",
    "* Invert $A$ using SVD and ``solve``\n",
    "* return $x$\n",
    "\n",
    "Test your function for correctness."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Your code here\n",
    "\n",
    "def svdsolver(A,b):\n",
    "    U, s, V = np.linalg.svd(A)\n",
    "    if np.prod(s) == 0:\n",
    "       print(\"Matrix is singular\")\n",
    "    else:\n",
    "       return np.dot(np.dot((V.T).dot(np.diag(s**(-1))), U.T),b)\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ 5. -2.]\n",
      "[ 5. -2.]\n"
     ]
    }
   ],
   "source": [
    "A = np.array([[1,1],[1,2]])\n",
    "b = np.array([3,1])\n",
    "print(np.linalg.solve(A,b))\n",
    "print(svdsolver(A,b))\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}