{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Constrained Optimization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from scipy.linalg import cholesky, solve_triangular, cho_solve, cho_factor\n",
    "from scipy.linalg import solve\n",
    "from scipy.optimize import minimize"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Lagrange multipliers and constrained optimization\n",
    "\n",
    "Recall why Lagrange multipliers are useful for constrained optimization - a stationary point must be where the constraint surface $g$ touches a level set of the function $f$ (since the value of $f$ does not change on a level set). At that point, $f$ and $g$ are parallel, and hence their gradients are also parallel (since the gradient is normal to the level set). So we want to solve\n",
    "\n",
    "$$\\nabla f = -\\lambda \\nabla g$$\n",
    "\n",
    "or equivalently,\n",
    "\n",
    "$$\\nabla f + \\lambda \\nabla g = 0$$\n",
    "\n",
    "![Lagrange multipliers](https://upload.wikimedia.org/wikipedia/commons/thumb/b/bf/LagrangeMultipliers2D.svg/300px-LagrangeMultipliers2D.svg.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Maximize $f (x, y, z) = xy + yz$ subject to the constraints $x + 2y = 6$ and $x − 3z = 0$.\n",
    "\n",
    "We set up the equations\n",
    "\n",
    "$$\n",
    "F (x, y, z, λ, μ) = xy + yz − λ(x + 2y − 6) − μ(x − 3z)\n",
    "$$\n",
    "\n",
    "Now set partial derivatives to zero and solve the following set of equations\n",
    "\n",
    "\\begin{align}\n",
    "y - \\lambda - \\mu &= 0 \\\\\n",
    "x + z - 2\\lambda &= 0 \\\\\n",
    "y +3\\mu &= 0 \\\\\n",
    "x + 2y - 6 &= 0 \\\\\n",
    "x - 3z &= 0\n",
    "\\end{align}\n",
    "\n",
    "which is a linear equation in $x, y, z, \\lambda, \\mu$\n",
    "\n",
    "$$\n",
    "\\begin{pmatrix}\n",
    "0 & 1 & 0 & -1 & -1 \\\\\n",
    "1 & 0 & 1 & -2 & 0 \\\\\n",
    "0 & 1 & 0 & 0 & 3 \\\\\n",
    "1 & 2 & 0 & 0 & 0 \\\\\n",
    "1 & 0 & -3 & 0 & 0 \\\\\n",
    "\\end{pmatrix}\\pmatrix{x \\\\ y \\\\ z \\\\ \\lambda \\\\ \\mu} = \\pmatrix{0 \\\\ 0 \\\\ 0 \\\\ 6 \\\\ 0}\n",
    "$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "A = np.array([\n",
    "    [0, 1, 0, -1, -1],\n",
    "    [1, 0, 1, -2, 0],\n",
    "    [0, 1, 0, 0, 3],\n",
    "    [1, 2, 0, 0, 0],\n",
    "    [1, 0,-3, 0, 0]])\n",
    "\n",
    "b = np.array([0,0,0,6,0])\n",
    "\n",
    "sol = solve(A, b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 3. ,  1.5,  1. ,  2. , -0.5])"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sol"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def f(x, y, z):\n",
    "    return x*y + y*z"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "6.0"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "f(*sol[:3])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Using `scipy.optimize`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We first need to set this as a minimization problem."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def f(x):\n",
    "    return -(x[0]*x[1] + x[1]*x[2])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "cons = ({'type': 'eq',\n",
    "         'fun' : lambda x: np.array([x[0] + 2*x[1] - 6, x[0] - 3*x[2]])})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([2.99999979, 1.50000011, 0.99999993])"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x0 = np.array([2,2,0.67])\n",
    "res = minimize(f, x0, constraints=cons)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "-5.999999999999969"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "res.fun"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([2.99999979, 1.50000011, 0.99999993])"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "res.x"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example 2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Find the minimum of the following quadratic function on $\\mathbb{R}^2$ \n",
    "\n",
    "$$f(x) = x^TAx +b^Tx +c$$\n",
    "where\n",
    "$$A = \\left(\\begin{matrix}13&5\\\\5&7\\end{matrix}\\right), b = \\left(\\begin{matrix}1\\\\1\\end{matrix}\\right) \\textrm {and } c = 2$$\n",
    "\n",
    "Under the constraints:\n",
    "$$g(x) = 2x_1-5x_2=2 \\;\\;\\;\\;\\;\\; \\textrm{ and } \\;\\;\\;\\;\\;\\; h(x) = x_1+x_2=1$$\n",
    "\n",
    "1. Use a matrix decomposition method to find the minimum of the *unconstrained* problem without using `scipy.optimize` (Use library functions - no need to code your own). Note: for full credit you should exploit matrix structure. \n",
    "2. Find the solution using constrained optimization with the `scipy.optimize` package. \n",
    "2. Use Lagrange multipliers and solving the resulting set of equations directly without using `scipy.optimize`. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Solve unconstrained problem\n",
    "\n",
    "To find the minimum, we differentiate $f(x)$ with respect to $x^T$ and set it equal to $0$. We thus need to solve\n",
    "\n",
    "$$\n",
    "2Ax + b = 0\n",
    "$$\n",
    "\n",
    "or\n",
    "\n",
    "$$ \n",
    "Ax = -\\frac{b}{2}\n",
    "$$\n",
    "\n",
    "We see that $A$ is a symmetric positive definite real matrix. Hence we use Cholesky factorization."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Steps\n",
    "\n",
    "$$\n",
    "L = \\text{cholesky}(A) \\\\\n",
    "\\text{solve } Ly = b \\\\\n",
    "\\text{solve } L^Tx = y\n",
    "$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "A = np.array([\n",
    "    [13,5],\n",
    "    [5,7]\n",
    "])\n",
    "b = np.array([1,1]).reshape(-1,1)\n",
    "c = 2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "L = cholesky(A, lower=True)\n",
    "y = solve_triangular(L, -b/2, lower=True)\n",
    "x = solve_triangular(L.T, y, lower=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[-0.01515152],\n",
       "       [-0.06060606]])"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Short cut"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[-0.01515152],\n",
       "       [-0.06060606]])"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cho_solve(cho_factor(A), -b/2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Solve constrained problem using `scipy.optimize`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "f = lambda x, A, b, c: x.T @ A @ x + b.T @ x + c"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from scipy.optimize import minimize"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1.00000000e+00, 3.41607086e-16])"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cons = ({'type': 'eq', 'fun': lambda x: 2*x[0] - 5*x[1] - 2},\n",
    "        {'type': 'eq', 'fun': lambda x: x[0] + x[1] - 1})\n",
    "\n",
    "res = minimize(f, [0,0], constraints=cons, args=(A, b, c))\n",
    "res.x"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Solve constrained problem using Lagrange multipliers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We set up the equations\n",
    "\n",
    "$$\n",
    "F(x_1, x_2, \\lambda, \\mu) = f + \\lambda g + \\mu h \n",
    "$$\n",
    "\n",
    "Sometimes this is written as \n",
    "\n",
    "$$\n",
    "F(x_1, x_2, \\lambda, \\mu) = f - \\lambda g - \\mu h\n",
    "$$\n",
    "\n",
    "All this means is a final sign change in the estimated values of $\\lambda$ and $\\mu$.\n",
    "\n",
    "We show the original equations for convenience\n",
    "\n",
    "$$f(x) = x^TAx +b^Tx +c$$\n",
    "\n",
    "where\n",
    "\n",
    "$$A = \\left(\\begin{matrix}13&5\\\\5&7\\end{matrix}\\right), b = \\left(\\begin{matrix}1\\\\1\\end{matrix}\\right) \\textrm {and } c = 2$$\n",
    "\n",
    "Under the constraints:\n",
    "\n",
    "$$g(x) = 2x_1-5x_2=2 \\;\\;\\;\\;\\;\\; \\textrm{ and } \\;\\;\\;\\;\\;\\; h(x) = x_1+x_2=1$$\n",
    "\n",
    "To make the calculations explicit, we rewrite $F$ as\n",
    "\n",
    "$$\n",
    "F = 13x_1^2 + 10xy_1x_2+ 7x_2^2 + x_1 + x_2 +2 + \\lambda(2x_1 - 5x_2 -2) + \\mu(x_1 + x_2 -1)\n",
    "$$\n",
    "\n",
    "Taking partial derivatives, we get\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "\\frac{\\delta F}{\\delta x_1} &=& 26 x_1 + 10 x_2 + 1 + 2\\lambda + \\mu &= 0 \\\\\n",
    "\\frac{\\delta F}{\\delta x_2} &=& 10 x_1 + 14 x_2 + 1 - 5\\lambda + \\mu &= 0 \\\\\n",
    "\\frac{\\delta F}{\\delta \\lambda} &=& 2 x_1 - 5 x_2 -2 &= 0 \\\\\n",
    "\\frac{\\delta F}{\\delta \\mu} &=& x_1 + x_2 - 1 &= 0\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "Plugging in the numbers and expressing in matrix form, we get\n",
    "\n",
    "$$\n",
    "\\begin{bmatrix}\n",
    "26 & 10 & 2 & 1 \\\\\n",
    "10 & 14 & -5 & 1 \\\\\n",
    "2 & -5 & 0 & 0 \\\\\n",
    "1 & 1 & 0 & 0 \n",
    "\\end{bmatrix} \\begin{bmatrix}\n",
    "x_1 \\\\\n",
    "x_2 \\\\\n",
    "\\lambda \\\\\n",
    "\\mu\n",
    "\\end{bmatrix} = \\begin{bmatrix}\n",
    "-1 \\\\\n",
    "-1 \\\\\n",
    "2 \\\\\n",
    "1\n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "With a bit of practice, you can probably just write the matrix directly by inspection."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "A = np.array([\n",
    "    [26, 10, 2, 1],\n",
    "    [10, 14, -5, 1],\n",
    "    [2, -5, 0, 0],\n",
    "    [1, 1, 0, 0]\n",
    "])\n",
    "b = np.array([-1, -1, 2, 1]).reshape(-1,1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1.00000000e+00],\n",
       "       [-4.37360585e-17],\n",
       "       [-2.28571429e+00],\n",
       "       [-2.24285714e+01]])"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "solve(A, b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}