{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "import matplotlib as mpl\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "import scipy.linalg as la\n",
    "\n",
    "sns.set_context('notebook', font_scale=1.5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**1**. Exact geometric solutions with $n = m$\n",
    "\n",
    "- Find the equation of the line that passes through the points (2,1) and (3,7)\n",
    "- Find the equation of the circle that passes through the points (1,7), (6,2) and (4,6)\n",
    "\n",
    "Hint: The equation of a circle can be written as\n",
    "\n",
    "$$\n",
    "(x - a)^2 + (y - b)^2 = r^2\n",
    "$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Find the equation of the line that passes through the points (2,1) and (3,7)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Find the equation of the circle that passes through the points (1,7), (6,2) and (4,6)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**2**. Interpolating polynomials and choice of basis\n",
    "\n",
    "We have \n",
    "\n",
    "| $x_i$ | $y_i$ |\n",
    "| ----- | -------- |\n",
    "| 0     | -5       |\n",
    "| 1     | -3       |\n",
    "| -1     | -15       |\n",
    "| 2     | 39       |\n",
    "| -2     | -9       |\n",
    "\n",
    "Find interpolating polynomials using\n",
    "\n",
    "- Monomial basis $f_i(x_i) = x^i$ - code this using only simple linear algebra operations (including solve)\n",
    "- Lagrange basis\n",
    "$$\n",
    "l_j(x_j) = \\prod_{0 \\le m \\le k, m \\ne j} \\frac{x - x_m}{x_j - x_m}\n",
    "$$\n",
    "\n",
    "The Lagrange interpolation uses the values of $y$ as the coefficient for the basis polynomials. Do this manually and then using the scipy.interpolate package"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "x = np.array([0,1,-1,2,-2])\n",
    "y = np.array([-5, -3, -15, 39, -9])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Monomial basis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Lagrange basis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using library functions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from scipy.interpolate import lagrange"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**3**. Markov chains\n",
    "\n",
    "$$\n",
    "P = \\pmatrix{\n",
    "    p_{11} & p_{12} & p_{13} \\\\\n",
    "    p_{21} & p_{22} & p_{33} \\\\\n",
    "    p_{31} & p_{32} & p_{33} \\\\\n",
    "    }\n",
    "$$\n",
    "\n",
    "By convention, the $rows$ of a Markov transition matrix sum to 1, and $p_{32}$ is the probability that the system will change from state 3 to state 2. Therefore, to see the next state of an initial probability row vector $v_k$, we need to perform left multiplication\n",
    "\n",
    "$$\n",
    "v_{k+1}^T = v_{k}^T P\n",
    "$$\n",
    "\n",
    "If this is confusing, you can work with the matrix $P^T$ and do right-multiplication with column vectors. In this case, $p_{32}$ is the probability that the system will change from state 2 to state 3.\n",
    "\n",
    "![img](../data/markov.png)\n",
    "\n",
    "Find the stationary vector $\\pi^T = \\pi^T P$ for the transition graph shown \n",
    "\n",
    "- by solving a set of linear equations\n",
    "- by solving an eigenvector problem\n",
    "- Check that the resulting vector is invariant with respect to the transition matrix"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Brute force check"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**4**. Graphs\n",
    "\n",
    "$M$ is the adjacency matrix of a directed graph $G$. Find the vertices that belong to a clique.\n",
    "\n",
    "$$\n",
    "M = \\pmatrix{\n",
    "    0 & 1 & 0 & 1 & 1 \\\\\n",
    "    1 & 0 & 0 & 1 & 0 \\\\\n",
    "    1 & 1 & 0 & 1 & 0 \\\\\n",
    "    1 & 1 & 0 & 0 & 0 \\\\\n",
    "    1 & 0 & 0 & 1 & 0\n",
    "    }\n",
    "$$\n",
    "\n",
    "A clique is defined as a subset of a graph where\n",
    "\n",
    "1. The subset has at least 3 vertices\n",
    "2. All pairs of vertices are connected\n",
    "3. The subset is as large as possible\n",
    "\n",
    "Because of the symmetry required in condition 2, we only need to consider the graph $S$ where $s_{ij} = 1$ if vertcies $i$ and $j$ communicate and 0 otherwise. Then the on-zero diagonal entries of $S^3$ is the set states recurrent in 3 steps. That is, there is a bi-directional path ${s_i \\leftrightarrow s_j \\leftrightarrow s_k \\leftrightarrow s_i}$, which means that the vertices $\\{s_i, s_j, s_k\\}$ form a subset of a clique."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "M = np.array([\n",
    "    [0,1,0,1,1],\n",
    "    [1,0,0,1,1],\n",
    "    [1,1,0,1,0],\n",
    "    [1,1,0,0,0],\n",
    "    [1,0,0,1,0]   \n",
    "])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**5**. Suppose we wish to solve the problem $t = Mt + b$ - here the notation is from one type of such problems where $t$ is the temperature, $M$ is a matrix for diffusion, and $b$ represent fixed boundary conditions. Suppose we have a 5 by 5 grid system whose boundary temperatures are fixed. Let $M$ is a matrix with $1/4$ for the $\\pm 1$ off-diagonals and 0 elsewhere (i.e. diffusion is approximated by the average temperature of the 4 N, S, E, W neighbors), and $b$ is the vector $(5,2,3,3,0,1,3,0,1)$ - this assumes the temperatures along the bottom = 0, right edge = 1, top = 2 and left edge = 3. Find the equilibrium temperature at each of the 9 interior points\n",
    "\n",
    "- by solving a linear equation\n",
    "- by iteration"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Direct solution - not possible for large matrices"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Jacobi iteration"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**6**. Iterated affine maps\n",
    "\n",
    "Define the following mapping in $\\mathbb{R}^2$\n",
    "\n",
    "$$\n",
    "T_i: \\pmatrix{x \\\\ y} \\to s \\pmatrix{\\cos \\theta & - \\sin \\theta \\\\ \\sin \\theta & \\cos \\theta} \\pmatrix{x \\\\ y} + \\pmatrix{a_i \\\\ b_i}\n",
    "$$\n",
    "\n",
    "Suppose $s = 1/3$, $\\theta = 0$, and $\\pmatrix{a_i \\\\ b_i}$ are \n",
    "\n",
    "$$\n",
    "\\pmatrix{0 \\\\ 0}, \\pmatrix{1/3 \\\\ 0},\n",
    "\\pmatrix{2/3 \\\\ 0}, \\pmatrix{0 \\\\ 1/3},\n",
    "\\pmatrix{2/3 \\\\ 1/3}, \\pmatrix{0 \\\\ 2/3}, \n",
    "\\pmatrix{1/3 \\\\ 2/3}, \\pmatrix{2/3 \\\\ 2/3}\n",
    "$$\n",
    "\n",
    "Generate 1,000 points by first randomly selecting a point in the unit square, then applying at random one of th transformations $T_i$ to the point. Plot the resulting 1,000 points as a scatter plot on in a square frame.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ab =[\n",
    "    [0,0],\n",
    "    [1/3,0],\n",
    "    [2/3,0],\n",
    "    [0,1/3],\n",
    "    [2/3,1/3],\n",
    "    [0,2/3],\n",
    "    [1/3,2/3],\n",
    "    [2/3,2/3]\n",
    "]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**7**. The Fibonacci sequence came about from this toy model of rabbit population dynamics\n",
    "\n",
    "- A baby rabbit matures into an adult in 1 time unit\n",
    "- An adult gives birth to exactly 1 baby in 1 time unit\n",
    "- Rabbits are immortal\n",
    "\n",
    "This gives the well known formula for the number of rabbits over discrete time $F_{k+2} = F_{k} + F_{k+1}$\n",
    "\n",
    "- Express this model as a matrix equation, and calculate the long-term growth rate"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let the population at any time be expreessed as the vector\n",
    "$$\n",
    "\\pmatrix{\\text{adult} \\\\ \\text{baby} }\n",
    "$$\n",
    "\n",
    "In the next time step, there will be \n",
    "\n",
    "- 1 adult from each adult, and one adult from each baby\n",
    "- 1 baby from each adult"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**8**. Age-structured population growth\n",
    "\n",
    "Suppose that we observe the following Leslie matrix\n",
    "\n",
    "$$\n",
    "L = \\pmatrix{\n",
    "0 & 3 & 2 & 0.5 \\\\\n",
    "0.8 & 0 & 0 & 0 \\\\\n",
    "0 & 0.9 & 0 & 0 \\\\\n",
    "0 & 0 & 0.7 & 0\n",
    "}\n",
    "$$\n",
    "\n",
    "![img](../data/leslie.png)\n",
    "\n",
    "- Starting with just 1,000 females in age-group 0-15 at time 0 and nobody else, what is the expected population after 90 years?\n",
    "- Suppose we could alter the fertility in a *single* age group for this population - can we achieve a steady state non-zero population?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "L = np.array([\n",
    "    [0,3,2,0.5],\n",
    "    [0.8,0,0,0],\n",
    "    [0,0.9,0,0],\n",
    "    [0,0,0.7,0]\n",
    "])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "x0 = np.array([1000,0,0,0]).reshape(-1,1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that the real eigenvalue with real eigenvector is dominant $\\vert L_1 \\vert > \\vert L_k \\vert$.\n",
    "\n",
    "A theorem says this will be true if you have any two positive consecutive entries in the first row of $L$.\n",
    "\n",
    "The growth rate is determined by the dominant real eigenvalue with real eigenvector - in the long term, whether the population will grow, shrink or reach steady state depends on whether this is greater than, less than or equal to 1 respectively."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**9**.\n",
    "\n",
    "You are given the following set of data to fit a quadratic polynomial to\n",
    "\n",
    "```python\n",
    "x = np.arange(10)\n",
    "y = np.array([  1.58873597,   7.55101533,  10.71372171,   7.90123225,\n",
    "                -2.05877605, -12.40257359, -28.64568712, -46.39822281,\n",
    "                -68.15488905, -97.16032044])\n",
    "```\n",
    "\n",
    "- Find the least squares solution by using the normal equations $A^T A \\hat{x} = A^T y$. (5 points)\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "x = np.arange(10)\n",
    "y = np.array([  1.58873597,   7.55101533,  10.71372171,   7.90123225,\n",
    "                -2.05877605, -12.40257359, -28.64568712, -46.39822281,\n",
    "                -68.15488905, -97.16032044])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**10**. \n",
    "\n",
    "You are given the following data\n",
    "\n",
    "```python\n",
    "A = np.array([[1, 8, 0, 7],\n",
    "       [0, 2, 9, 4],\n",
    "       [2, 8, 8, 3],\n",
    "       [4, 8, 6, 1],\n",
    "       [2, 1, 9, 6],\n",
    "       [0, 7, 0, 1],\n",
    "       [4, 0, 2, 4],\n",
    "       [1, 4, 9, 5],\n",
    "       [6, 2, 6, 6],\n",
    "       [9, 9, 6, 3]], dtype='float')\n",
    "\n",
    "b = np.array([[2],\n",
    "       [5],\n",
    "       [0],\n",
    "       [0],\n",
    "       [6],\n",
    "       [7],\n",
    "       [2],\n",
    "       [6],\n",
    "       [7],\n",
    "       [9]], dtype='float')\n",
    "```\n",
    "\n",
    "- Using SVD directly (not via `lstsq`), find the least squares solution to $Ax = b$ (10 points)\n",
    "- Use SVD to find the best rank 3 approximation of A (10 points)\n",
    "- Calculate the approximation error in terms of the Frobenius norm (5 points)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "A = np.array([[1, 8, 0, 7],\n",
    "       [0, 2, 9, 4],\n",
    "       [2, 8, 8, 3],\n",
    "       [4, 8, 6, 1],\n",
    "       [2, 1, 9, 6],\n",
    "       [0, 7, 0, 1],\n",
    "       [4, 0, 2, 4],\n",
    "       [1, 4, 9, 5],\n",
    "       [6, 2, 6, 6],\n",
    "       [9, 9, 6, 3]], dtype='float')\n",
    "\n",
    "b = np.array([[2],\n",
    "       [5],\n",
    "       [0],\n",
    "       [0],\n",
    "       [6],\n",
    "       [7],\n",
    "       [2],\n",
    "       [6],\n",
    "       [7],\n",
    "       [9]], dtype='float')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**11**.\n",
    "\n",
    "The page rank of a node is given by the equation\n",
    "\n",
    "![i1](https://wikimedia.org/api/rest_v1/media/math/render/svg/6bb0f1469218a064274fd4691143e9ce64639dc2)\n",
    "\n",
    "and at steady state, we have the page rank vector $R$\n",
    "\n",
    "![i3](https://wikimedia.org/api/rest_v1/media/math/render/svg/65d2fed50688deaca4640b117c88a9e7a3c2ef0d)\n",
    "\n",
    "where $d$ is the damping factor, $N$ is the number of nodes, $1$ is a vector of ones, and \n",
    "\n",
    "![i2.5](https://wikimedia.org/api/rest_v1/media/math/render/svg/3e82b446a376633a386b10668703a4547f167d1c)\n",
    "\n",
    "where $L(p_j)$ is the number of outgoing links from node $p_j$.\n",
    "\n",
    "Consider the graph\n",
    "\n",
    "![i0](../data/pagerank.png)\n",
    "\n",
    "If $d = 0.9$ find the page rank of each node\n",
    "\n",
    "- By solving a linear system \n",
    "- By eigendecomposition\n",
    "\n",
    "Note: The Markov matrix constructed as instructed does not follow the usual convention. Here the columns of our Markov matrix are probability vectors, and the page rank is considered to be a column vector of the steady state probabilities."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**12**.\n",
    "\n",
    "Recall that a covariance matrix is a matrix whose entries are\n",
    "\n",
    "![img](https://wikimedia.org/api/rest_v1/media/math/render/svg/4df2969e65403dd04f2c64137d21ff59b5f54190)\n",
    "\n",
    "1. Find the sample covariance matrix of the 4 features of the **iris** data set at http://bit.ly/2ow0oJO using basic `numpy` operations on `ndarrasy`. Do **not** use the `np.cov` or equivalent functions in `pandas` (except for checking). Remember to scale by $1/(n-1)$ for the sample covariance. (10 points)\n",
    "2. Plot the first 2 principal components of the `iris` data by using eigendecoposition, coloring each data point by the species (10 points)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "url = 'http://bit.ly/2ow0oJO'\n",
    "iris = pd.read_csv(url)\n",
    "iris.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}