{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Matrix Decompositions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Matrix decompositions are an important step in solving linear systems in a computationally efficient manner. " ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import os\n", "import sys\n", "import glob\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pandas as pd\n", "%matplotlib inline\n", "%precision 4\n", "plt.style.use('ggplot')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Reference**\n", "\n", "[SciPy's official tutorial on Linear algebra](http://docs.scipy.org/doc/scipy/reference/tutorial/linalg.html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## LU Decomposition and Gaussian Elimination" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "LU stands for \"Lower Upper\", and so an LU decomposition of a matrix $A$ is a decomposition such that \n", "$$A= LU$$\n", "where $L$ is lower triangular and $U$ is upper triangular.\n", "\n", "Now, LU decomposition is essentially Gaussian elimination, but we work only with the matrix $A$ (as opposed to the augmented matrix). \n", "\n", "Let's review how Gaussian elimination (GE) works. We will deal with a $3\\times 3$ system of equations for conciseness, but everything here generalizes to the $n\\times n$ case. Consider the following equation:\n", "\n", "$$\\left(\\begin{matrix}a_{11}&a_{12} & a_{13}\\\\a_{21}&a_{22}&a_{23}\\\\a_{31}&a_{32}&a_{33}\\end{matrix}\\right)\\left(\\begin{matrix}x_1\\\\x_2\\\\x_3\\end{matrix}\\right) = \\left(\\begin{matrix}b_1\\\\b_2\\\\b_3\\end{matrix}\\right)$$\n", "\n", "For simplicity, let us assume that the leftmost matrix $A$ is non-singular. To solve the system using GE, we start with the \"augmented matrix\":\n", "\n", "$$\\left(\\begin{array}{ccc|c}a_{11}&a_{12} & a_{13}& b_1 \\\\a_{21}&a_{22}&a_{23}&b_2\\\\a_{31}&a_{32}&a_{33}&b_3\\end{array}\\right)$$\n", "\n", "We begin at the first entry, $a_{11}$. If $a_{11} \\neq 0$, then we divide the first row by $a_{11}$ and then subtract the appropriate multiple of the first row from each of the other rows, zeroing out the first entry of all rows. (If $a_{11}$ is zero, we need to permute rows. We will not go into details of that here.) The result is as follows:\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\\begin{align}\n", "\\left(\\begin{array}{ccc|c}\n", "1 & \\frac{a_{12}}{a_{11}} & \\frac{a_{13}}{a_{11}} & \\frac{b_1}{a_{11}} \\\\\n", "0 & a_{22} - a_{21}\\frac{a_{12}}{a_{11}} & a_{23} - a_{21}\\frac{a_{13}}{a_{11}} & b_2 - a_{21}\\frac{b_1}{a_{11}}\\\\\n", "0&a_{32}-a_{31}\\frac{a_{12}}{a_{11}} & a_{33} - a_{31}\\frac{a_{13}}{a_{11}} &b_3- a_{31}\\frac{b_1}{a_{11}}\\end{array}\\right)\n", "\\end{align}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We repeat the procedure for the second row: first divide by the leading entry, then subtract the appropriate multiple of the resulting row from each of the third and first rows, so that the second entry in row 1 and in row 3 are zero. We *could* continue until the matrix on the left is the identity. In that case, we can then just \"read off\" the solution: i.e., the vector $x$ is the resulting column vector on the right. Usually, it is more efficient to stop at *reduced row echelon* form (upper triangular, with ones on the diagonal), and then use *back substitution* to obtain the final answer.\n", "\n", "Note that in some cases, it is necessary to permute rows to obtain reduced row echelon form. This is called *partial pivoting*. If we also manipulate columns, that is called *full pivoting*.\n", "\n", "It should be mentioned that we may obtain the inverse of a matrix using GE, by reducing the matrix $A$ to the identity, with the identity matrix as the augmented portion. \n", "\n", "Now, this is all fine when we are solving a system one time, for one outcome $b$. Many applications involve solutions to multiple problems, where the left-hand-side of our matrix equation does not change, but there are many outcome vectors $b$. In this case, it is more efficient to *decompose* $A$.\n", "\n", "First, we start just as in GE, but we \"keep track\" of the various multiples required to eliminate entries. For example, consider the matrix\n", "\n", "\\begin{align}\n", "A = \\left(\\begin{matrix} 1 & 3 & 4 \\\\\n", " 2& 1& 3\\\\\n", " 4&1&2\n", " \\end{matrix}\\right)\n", "\\end{align}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We need to multiply row $1$ by $2$ and subtract from row $2$ to eliminate the first entry in row $2$, and then multiply row $1$ by $4$ and subtract from row $3$. Instead of entering zeroes into the first entries of rows $2$ and $3$, we record the multiples required for their elimination, as so:\n", "\n", "\\begin{align}\n", "\\left(\\begin{matrix} 1 & 3 & 4 \\\\\n", " (2)& -5 & -5\\\\\n", " (4)&-11&-14\n", " \\end{matrix}\\right)\n", "\\end{align}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "And then we eliminate the second entry in the third row:\n", "\n", "\n", "\\begin{align}\n", "\\left(\\begin{matrix} 1 & 3 & 4 \\\\\n", " (2)& -5 & -5\\\\\n", " (4)&(\\frac{11}{5})&-3\n", " \\end{matrix}\\right)\n", "\\end{align}\n", " \n", "And now we have the decomposition:\n", "\n", "\\begin{align}\n", "L= \\left(\\begin{matrix} 1 & 0 & 0 \\\\\n", " 2& 1 & 0\\\\\n", " 4&\\frac{11}5&1\n", " \\end{matrix}\\right)\n", " U = \\left(\\begin{matrix} 1 & 3 & 4 \\\\\n", " 0& -5 & -5\\\\\n", " 0&0&-3\n", " \\end{matrix}\\right)\n", "\\end{align}\n", " " ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 1. 3. 4.]\n", " [ 2. 1. 3.]\n", " [ 4. 1. 2.]]\n", "[[ 1. 0. 0. ]\n", " [ 2. 1. 0. ]\n", " [ 4. 2.2 1. ]]\n", "[[ 1 3 4]\n", " [ 0 -5 -5]\n", " [ 0 0 -3]]\n" ] } ], "source": [ "import numpy as np\n", "import scipy.linalg as la\n", "np.set_printoptions(suppress=True) \n", "\n", "A = np.array([[1,3,4],[2,1,3],[4,1,2]])\n", "\n", "L = np.array([[1,0,0],[2,1,0],[4,11/5,1]])\n", "U = np.array([[1,3,4],[0,-5,-5],[0,0,-3]])\n", "print(L.dot(U))\n", "print(L)\n", "print(U)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can solve the system by solving two back-substitution problems:\n", " \n", "$$Ly = b$$ and\n", "$$Ux=y$$\n", "\n", "\n", "These are both $O(n^2)$, so it is more efficient to decompose when there are multiple outcomes to solve for." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's do this with numpy:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1 3 4]\n", " [2 1 3]\n", " [4 1 2]]\n", "[[ 4. 1. 2.]\n", " [ 1. 3. 4.]\n", " [ 2. 1. 3.]]\n", "\n", "[[ 4. 1. 2.]\n", " [ 1. 3. 4.]\n", " [ 2. 1. 3.]]\n", "[[ 0. 1. 0.]\n", " [ 0. 0. 1.]\n", " [ 1. 0. 0.]]\n", "[[ 1. 0. 0. ]\n", " [ 0.25 1. 0. ]\n", " [ 0.5 0.1818 1. ]]\n", "[[ 4. 1. 2. ]\n", " [ 0. 2.75 3.5 ]\n", " [ 0. 0. 1.3636]]\n" ] } ], "source": [ "import numpy as np\n", "import scipy.linalg as la\n", "np.set_printoptions(suppress=True) \n", "\n", "A = np.array([[1,3,4],[2,1,3],[4,1,2]])\n", "\n", "print(A)\n", "\n", "P, L, U = la.lu(A)\n", "print(np.dot(P.T, A))\n", "print\n", "print(np.dot(L, U))\n", "print(P)\n", "print(L)\n", "print(U)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the numpy decomposition uses *partial pivoting* (matrix rows are permuted to use the largest pivot). This is because small pivots can lead to numerical instability, another reason why one should use library functions whenever possible!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Cholesky Decomposition" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Recall that a square matrix $A$ is positive definite if\n", "\n", "$$u^TA u > 0$$\n", "\n", "for any non-zero n-dimensional vector $u$,\n", "\n", "and a symmetric, positive-definite matrix $A$ is a positive-definite matrix such that\n", "\n", "$$A = A^T$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let $A$ be a symmetric, positive-definite matrix. There is a unique decomposition such that\n", "\n", "$$A = L L^T$$\n", "\n", "where $L$ is lower-triangular with positive diagonal elements and $L^T$ is its transpose. This decomposition is known as the *Cholesky decompostion*, and $L$ may be interpreted as the \"square root\" of the matrix $A$. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Algorithm:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let $A$ be an $n\\times n$ matrix. We find the matrix $L$ using the following iterative procedure:\n", "\n", "\n", "\\begin{align}\n", "A = \\left(\\begin{matrix}a_{11}&A_{12}^T\\\\A_{12}&A_{22}\\end{matrix}\\right) =\n", "\\left(\\begin{matrix}\\ell_{11}&0\\\\\n", "L_{12}&L_{22}\\end{matrix}\\right)\n", "\\left(\\begin{matrix}\\ell_{11}&L_{12}^T\\\\0&L_{22}^{T}\\end{matrix}\\right)\n", "\\end{align}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "1.) Let $\\ell_{11} = \\sqrt{a_{11}}$\n", "\n", "2.) $L_{12} = \\frac{1}{\\ell_{11}}A_{12}$\n", "\n", "3.) Solve $A_{22} - L_{12}L_{12}^T = L_{22}L_{22}^T$ for $L_{22}$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Example:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$A = \\left(\\begin{matrix}1&3&5\\\\3&13&23\\\\5&23&42\\end{matrix}\\right)$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\\ell_{11} = \\sqrt{a_{11}} = 1$$\n", "\n", "$$L_{12} = \\frac{1}{\\ell_{11}} A_{12} = A_{12}$$\n", "\n", "$\\begin{eqnarray*}\n", "A_{22} - L_{12}L_{12}^T &=& \\left(\\begin{matrix}13&23\\\\23&42\\end{matrix}\\right) - \\left(\\begin{matrix}9&15\\\\15&25\\end{matrix}\\right)\\\\\n", "&=& \\left(\\begin{matrix}4&8\\\\8&17\\end{matrix}\\right)\\\\\n", "&=& \\left(\\begin{matrix}2&0\\\\4&\\ell_{33}\\end{matrix}\\right) \\left(\\begin{matrix}2&4\\\\0&\\ell_{33}\\end{matrix}\\right)\\\\\n", "&=& \\left(\\begin{matrix}4&8\\\\8&16+\\ell_{33}^2\\end{matrix}\\right)\n", "\\end{eqnarray*}$\n", "\n", "And so we conclude that $\\ell_{33}=1$.\n", "\n", "\n", "This yields the decomposition:\n", "\n", "\\begin{align}\n", "\\left(\\begin{matrix}1&3&5\\\\3&13&23\\\\5&23&42\\end{matrix}\\right) = \n", "\\left(\\begin{matrix}1&0&0\\\\3&2&0\\\\5&4&1\\end{matrix}\\right)\\left(\\begin{matrix}1&3&5\\\\0&2&4\\\\0&0&1\\end{matrix}\\right)\n", "\\end{align}\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, with numpy:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 1. 3. 5.]\n", " [ 3. 13. 23.]\n", " [ 5. 23. 42.]]\n", "[[ 1. 3. 5.]\n", " [ 0. 2. 4.]\n", " [ 0. 0. 1.]]\n", "[[ 1 3 5]\n", " [ 3 13 23]\n", " [ 5 23 42]]\n" ] } ], "source": [ "A = np.array([[1,3,5],[3,13,23],[5,23,42]])\n", "L = la.cholesky(A)\n", "print(np.dot(L.T, L))\n", "\n", "print(L)\n", "print(A)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Cholesky decomposition is about twice as fast as LU decomposition (though both scale as $n^3$)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Matrix Decompositions for PCA and Least Squares" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Eigendecomposition" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Eigenvectors and Eigenvalues" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First recall that an *eigenvector* of a matrix $A$ is a non-zero vector $v$ such that\n", "\n", "$$Av = \\lambda v$$\n", "\n", "for some scalar $\\lambda$\n", "\n", "The value $\\lambda$ is called an *eigenvalue* of $A$.\n", "\n", "An $n\\times n$ matrix $A$ is said to be *diagonalizable* if there exists an $n \\times n$ nonsingular matrix $B$ such that $B^{-1}AB = \\Lambda$ for some diagonal matrix $\\Lambda$.\n", "\n", "If an $n\\times n$ matrix $A$ has $n$ linearly independent eigenvectors, then $A$ may be decomposed in the following manner:\n", "\n", "$$A = B\\Lambda B^{-1}$$\n", "\n", "where $\\Lambda$ is a diagonal matrix whose diagonal entries are the eigenvalues of $A$ and the columns of $B$ are the corresponding eigenvectors of $A$.\n", "\n", "Facts: \n", "\n", "* An $n\\times n$ matrix is diagonizable $\\iff$ It has $n$ linearly independent eigenvectors.\n", "* A symmetric, positive definite matrix has only positive eigenvalues and its eigendecomposition \n", "$$A=B\\Lambda B^{-1}$$\n", "\n", "is via an orthogonal transformation $B$. (i.e. its eigenvectors are an orthonormal set) \n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Calculating Eigenvalues" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is easy to see from the definition that if $v$ is an eigenvector of an $n\\times n$ matrix $A$ with eigenvalue $\\lambda$, then\n", "\n", "$$(A - \\lambda I) v = \\bf{0}$$\n", "\n", "where $I$ is the identity matrix of dimension $n$ and $\\bf{0}$ is an n-dimensional zero vector. Therefore, the eigenvalues of $A$ satisfy:\n", "\n", "$$\\det\\left(A-\\lambda I\\right)=0$$\n", "\n", "The left-hand side above is a polynomial in $\\lambda$, and is called the *characteristic polynomial* of $A$. Thus, to find the eigenvalues of $A$, we find the roots of the characteristic polynomial.\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Computationally, however, computing the characteristic polynomial and then solving for the roots is prohibitively expensive. Therefore, in practice, numerical methods are used - both to find eigenvalues and their corresponding eigenvectors. We won't go into the specifics of the algorithms used to calculate eigenvalues, but here is a ``numpy`` example:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[-0.+0.j 1.+0.j 1.+0.j]\n", " [ 2.+0.j 1.+0.j 0.+0.j]\n", " [ 3.+0.j 4.+0.j 5.+0.j]]\n", "[ 5.8541+0.j -0.8541+0.j 1.0000+0.j]\n" ] } ], "source": [ "A = np.array([[0,1,1],[2,1,0],[3,4,5]])\n", "\n", "u, V = la.eig(A)\n", "print(np.dot(V,np.dot(np.diag(u), la.inv(V))))\n", "print(u)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**NB:** Many matrices are *not* diagonizable, and many have *complex* eigenvalues (even if all entries are real). " ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 0 1]\n", " [-1 0]]\n", "[[ 0.+0.j 1.+0.j]\n", " [-1.+0.j 0.+0.j]]\n", "[ 0.+1.j 0.-1.j]\n" ] } ], "source": [ "A = np.array([[0,1],[-1,0]])\n", "print(A)\n", "\n", "u, V = la.eig(A)\n", "print(np.dot(V,np.dot(np.diag(u), la.inv(V))))\n", "print(u)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 5.8541+0.j -0.8541+0.j 1.0000+0.j]\n", "[ 5.8541 -0.8541 1. ]\n" ] } ], "source": [ "# If you know the eigenvalues must be real \n", "# because A is a positive definite (e.g. covariance) matrix \n", "# use real_if_close\n", "\n", "A = np.array([[0,1,1],[2,1,0],[3,4,5]])\n", "u, V = la.eig(A)\n", "print(u)\n", "print np.real_if_close(u)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Singular Values" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For any $m\\times n$ matrix $A$, we define its *singular values* to be the square roots of the non-zero eigenvalues of $A^TA$. These are well-defined as $A^TA$ is always symmetric and positive-definite ??, so its eigenvalues are real and positive. Singular values are important properties of a matrix. Geometrically, a matrix $A$ maps the unit sphere in $\\mathbb{R}^n$ to an ellipse. The singular values are the lengths of the semi-axes. \n", "\n", "Singular values also provide a measure of the *stability* of a matrix. We'll revisit this in the end of the lecture." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## QR decompositon" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As with the previous decompositions, *$QR$ decomposition* is a method to write a $m \\times n$ matrix $A$ of *full column rank* as the product of two matrices of simpler forms. In this case, we want:\n", "\n", "$$ A= QR$$\n", "where $Q$ is an $m\\times n$ matrix with $Q Q^T ?? = I$ ($Q^T Q = I_{n}$)(i.e. $Q$ is *orthogonal*)??($Q$ is not orthogonal!) and $R$ is an $n\\times n$ upper-triangular matrix with positive diagonal elements.\n", "\n", "This is really just the matrix form of the Gram-Schmidt orthogonalization of the columns of $A$. The G-S algorithm itself is unstable, so various other methods have been developed to compute the QR decomposition. We won't cover those in detail as they are a bit beyond our scope." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first $k$ columns of $Q$ are an orthonormal basis for the column space of the first $k$ columns of $A$. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Iterative QR decomposition is often used in the computation of eigenvalues." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Singular Value Decomposition" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another important matrix decomposition is singular value decomposition or SVD. For any $m\\times n$ matrix $A$, we may write:\n", "\n", "$$A= UDV$$\n", "\n", "where $U$ is a unitary (orthogonal in the real case) $m\\times m$ matrix, $D$ is a rectangular, diagonal $m\\times n$ matrix with diagonal entries $d_1,...,d_m$ all non-negative. $V$ is a unitary (orthogonal) $n\\times n$ matrix. SVD is used in principle component analysis and in the computation of the Moore-Penrose pseudo-inverse." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Stability and Condition Number" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is important that numerical algorithms be *stable* and *efficient*. Efficiency is a property of an algorithm, but stability can be a property of the system itself." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Example" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\\left(\\begin{matrix}8&6&4&1\\\\1&4&5&1\\\\8&4&1&1\\\\1&4&3&6\\end{matrix}\\right)x = \\left(\\begin{matrix}19\\\\11\\\\14\\\\14\\end{matrix}\\right)$$" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([ 1., 1., 1., 1.])" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = np.array([[8,6,4,1],[1,4,5,1],[8,4,1,1],[1,4,3,6]])\n", "b = np.array([19,11,14,14])\n", "la.solve(A,b)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([-2.34 , 9.745, -4.85 , -1.34 ])" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b = np.array([19.01,11.05,14.07,14.05])\n", "la.solve(A,b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the *tiny* perturbations in the outcome vector $b$ cause *large* differences in the solution! When this happens, we say that the matrix $A$ *ill-conditioned*. This happens when a matrix is \"close\" to being singular (i.e. non-invertible)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Condition Number" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " \n", "\n", "A measure of this type of behavior is called the *condition number*. It is defined as:\n", "\n", "$$ cond(A) = ||A||\\cdot ||A^{-1}|| $$\n", "\n", "In general, it is difficult to compute.\n", "\n", "Fact: \n", "\n", "$$cond(A) = \\frac{\\lambda_1}{\\lambda_n}$$\n", "\n", "where $\\lambda_1$ is the maximum singular value of $A$ and $\\lambda_n$ is the smallest. The higher the condition number, the more unstable the system. In general if there is a large discrepancy between minimal and maximal singular values, the condition number is large." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Example" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 15.5457 6.9002 3.8363 0.0049]\n", "3198.6725812\n" ] } ], "source": [ "U, s, V = np.linalg.svd(A)\n", "print(s)\n", "print(max(s)/min(s))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Preconditioning" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can sometimes improve on this behavior by \"pre-conditioning\". Instead of solving\n", "\n", "$$Ax=b$$\n", "\n", "we solve\n", "\n", "$$D^{-1}Ax=D^{-1}b$$\n", "\n", "where $D^{-1}A$ has a lower condition number than $A$ itself. \n", "\n", "Preconditioning is a *very* involved topic, quite out of the range of this course. It is mentioned here only to make you aware that such a thing exists, should you ever run into an ill-conditioned problem!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Exercises\n", "----" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**1**. Compute the LU decomposition of the following matrix by hand and using numpy\n", "\n", "$$\\left(\\begin{matrix}1&2&3\\\\2&-4&6\\\\3&-9&-3\\end{matrix}\\right)$$\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Solution:\n", "\n", "First by hand:\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**2**. Compute the Cholesky decomposition of the following matrix by hand and using numpy\n", "\n", "$$\\left(\\begin{matrix}1&2&3\\\\2&-4&6\\\\3&6&-3\\end{matrix}\\right)$$\n" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**3**. Write a function in Python to solve a system\n", "\n", "$$Ax = b$$\n", "\n", "using SVD decomposition. Your function should take $A$ and $b$ as input and return $x$.\n", "\n", "Your function should include the following:\n", "\n", "* First, check that $A$ is invertible - return error message if it is not\n", "* Invert $A$ using SVD and ``solve``\n", "* return $x$\n", "\n", "Test your function for correctness." ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Your code here\n", "\n", "def svdsolver(A,b):\n", " U, s, V = np.linalg.svd(A)\n", " if np.prod(s) == 0:\n", " print(\"Matrix is singular\")\n", " else:\n", " return np.dot(np.dot((V.T).dot(np.diag(s**(-1))), U.T),b)\n", " " ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 5. -2.]\n", "[ 5. -2.]\n" ] } ], "source": [ "A = np.array([[1,1],[1,2]])\n", "b = np.array([3,1])\n", "print(np.linalg.solve(A,b))\n", "print(svdsolver(A,b))\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.2" } }, "nbformat": 4, "nbformat_minor": 0 }