{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Constrained Optimization" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np\n", "from scipy.linalg import cholesky, solve_triangular, cho_solve, cho_factor\n", "from scipy.linalg import solve\n", "from scipy.optimize import minimize" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Lagrange multipliers and constrained optimization\n", "\n", "Recall why Lagrange multipliers are useful for constrained optimization - a stationary point must be where the constraint surface $g$ touches a level set of the function $f$ (since the value of $f$ does not change on a level set). At that point, $f$ and $g$ are parallel, and hence their gradients are also parallel (since the gradient is normal to the level set). So we want to solve\n", "\n", "$$\\nabla f = -\\lambda \\nabla g$$\n", "\n", "or equivalently,\n", "\n", "$$\\nabla f + \\lambda \\nabla g = 0$$\n", "\n", "![Lagrange multipliers](https://upload.wikimedia.org/wikipedia/commons/thumb/b/bf/LagrangeMultipliers2D.svg/300px-LagrangeMultipliers2D.svg.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Maximize $f (x, y, z) = xy + yz$ subject to the constraints $x + 2y = 6$ and $x − 3z = 0$.\n", "\n", "We set up the equations\n", "\n", "$$\n", "F (x, y, z, λ, μ) = xy + yz − λ(x + 2y − 6) − μ(x − 3z)\n", "$$\n", "\n", "Now set partial derivatives to zero and solve the following set of equations\n", "\n", "\\begin{align}\n", "y - \\lambda - \\mu &= 0 \\\\\n", "x + z - 2\\lambda &= 0 \\\\\n", "y +3\\mu &= 0 \\\\\n", "x + 2y - 6 &= 0 \\\\\n", "x - 3z &= 0\n", "\\end{align}\n", "\n", "which is a linear equation in $x, y, z, \\lambda, \\mu$\n", "\n", "$$\n", "\\begin{pmatrix}\n", "0 & 1 & 0 & -1 & -1 \\\\\n", "1 & 0 & 1 & -2 & 0 \\\\\n", "0 & 1 & 0 & 0 & 3 \\\\\n", "1 & 2 & 0 & 0 & 0 \\\\\n", "1 & 0 & -3 & 0 & 0 \\\\\n", "\\end{pmatrix}\\pmatrix{x \\\\ y \\\\ z \\\\ \\lambda \\\\ \\mu} = \\pmatrix{0 \\\\ 0 \\\\ 0 \\\\ 6 \\\\ 0}\n", "$$" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "A = np.array([\n", " [0, 1, 0, -1, -1],\n", " [1, 0, 1, -2, 0],\n", " [0, 1, 0, 0, 3],\n", " [1, 2, 0, 0, 0],\n", " [1, 0,-3, 0, 0]])\n", "\n", "b = np.array([0,0,0,6,0])\n", "\n", "sol = solve(A, b)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 3. , 1.5, 1. , 2. , -0.5])" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sol" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def f(x, y, z):\n", " return x*y + y*z" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "6.0" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "f(*sol[:3])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using `scipy.optimize`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We first need to set this as a minimization problem." ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def f(x):\n", " return -(x[0]*x[1] + x[1]*x[2])" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": true }, "outputs": [], "source": [ "cons = ({'type': 'eq',\n", " 'fun' : lambda x: np.array([x[0] + 2*x[1] - 6, x[0] - 3*x[2]])})" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([2.99999979, 1.50000011, 0.99999993])" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x0 = np.array([2,2,0.67])\n", "res = minimize(f, x0, constraints=cons)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "-5.999999999999969" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "res.fun" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([2.99999979, 1.50000011, 0.99999993])" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "res.x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Find the minimum of the following quadratic function on $\\mathbb{R}^2$ \n", "\n", "$$f(x) = x^TAx +b^Tx +c$$\n", "where\n", "$$A = \\left(\\begin{matrix}13&5\\\\5&7\\end{matrix}\\right), b = \\left(\\begin{matrix}1\\\\1\\end{matrix}\\right) \\textrm {and } c = 2$$\n", "\n", "Under the constraints:\n", "$$g(x) = 2x_1-5x_2=2 \\;\\;\\;\\;\\;\\; \\textrm{ and } \\;\\;\\;\\;\\;\\; h(x) = x_1+x_2=1$$\n", "\n", "1. Use a matrix decomposition method to find the minimum of the *unconstrained* problem without using `scipy.optimize` (Use library functions - no need to code your own). Note: for full credit you should exploit matrix structure. \n", "2. Find the solution using constrained optimization with the `scipy.optimize` package. \n", "2. Use Lagrange multipliers and solving the resulting set of equations directly without using `scipy.optimize`. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Solve unconstrained problem\n", "\n", "To find the minimum, we differentiate $f(x)$ with respect to $x^T$ and set it equal to $0$. We thus need to solve\n", "\n", "$$\n", "2Ax + b = 0\n", "$$\n", "\n", "or\n", "\n", "$$ \n", "Ax = -\\frac{b}{2}\n", "$$\n", "\n", "We see that $A$ is a symmetric positive definite real matrix. Hence we use Cholesky factorization." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Steps\n", "\n", "$$\n", "L = \\text{cholesky}(A) \\\\\n", "\\text{solve } Ly = b \\\\\n", "\\text{solve } L^Tx = y\n", "$$" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "A = np.array([\n", " [13,5],\n", " [5,7]\n", "])\n", "b = np.array([1,1]).reshape(-1,1)\n", "c = 2" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": true }, "outputs": [], "source": [ "L = cholesky(A, lower=True)\n", "y = solve_triangular(L, -b/2, lower=True)\n", "x = solve_triangular(L.T, y, lower=False)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[-0.01515152],\n", " [-0.06060606]])" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Short cut" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[-0.01515152],\n", " [-0.06060606]])" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cho_solve(cho_factor(A), -b/2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Solve constrained problem using `scipy.optimize`" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": true }, "outputs": [], "source": [ "f = lambda x, A, b, c: x.T @ A @ x + b.T @ x + c" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from scipy.optimize import minimize" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1.00000000e+00, 3.41607086e-16])" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cons = ({'type': 'eq', 'fun': lambda x: 2*x[0] - 5*x[1] - 2},\n", " {'type': 'eq', 'fun': lambda x: x[0] + x[1] - 1})\n", "\n", "res = minimize(f, [0,0], constraints=cons, args=(A, b, c))\n", "res.x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Solve constrained problem using Lagrange multipliers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We set up the equations\n", "\n", "$$\n", "F(x_1, x_2, \\lambda, \\mu) = f + \\lambda g + \\mu h \n", "$$\n", "\n", "Sometimes this is written as \n", "\n", "$$\n", "F(x_1, x_2, \\lambda, \\mu) = f - \\lambda g - \\mu h\n", "$$\n", "\n", "All this means is a final sign change in the estimated values of $\\lambda$ and $\\mu$.\n", "\n", "We show the original equations for convenience\n", "\n", "$$f(x) = x^TAx +b^Tx +c$$\n", "\n", "where\n", "\n", "$$A = \\left(\\begin{matrix}13&5\\\\5&7\\end{matrix}\\right), b = \\left(\\begin{matrix}1\\\\1\\end{matrix}\\right) \\textrm {and } c = 2$$\n", "\n", "Under the constraints:\n", "\n", "$$g(x) = 2x_1-5x_2=2 \\;\\;\\;\\;\\;\\; \\textrm{ and } \\;\\;\\;\\;\\;\\; h(x) = x_1+x_2=1$$\n", "\n", "To make the calculations explicit, we rewrite $F$ as\n", "\n", "$$\n", "F = 13x_1^2 + 10xy_1x_2+ 7x_2^2 + x_1 + x_2 +2 + \\lambda(2x_1 - 5x_2 -2) + \\mu(x_1 + x_2 -1)\n", "$$\n", "\n", "Taking partial derivatives, we get\n", "\n", "$$\n", "\\begin{align}\n", "\\frac{\\delta F}{\\delta x_1} &=& 26 x_1 + 10 x_2 + 1 + 2\\lambda + \\mu &= 0 \\\\\n", "\\frac{\\delta F}{\\delta x_2} &=& 10 x_1 + 14 x_2 + 1 - 5\\lambda + \\mu &= 0 \\\\\n", "\\frac{\\delta F}{\\delta \\lambda} &=& 2 x_1 - 5 x_2 -2 &= 0 \\\\\n", "\\frac{\\delta F}{\\delta \\mu} &=& x_1 + x_2 - 1 &= 0\n", "\\end{align}\n", "$$\n", "\n", "Plugging in the numbers and expressing in matrix form, we get\n", "\n", "$$\n", "\\begin{bmatrix}\n", "26 & 10 & 2 & 1 \\\\\n", "10 & 14 & -5 & 1 \\\\\n", "2 & -5 & 0 & 0 \\\\\n", "1 & 1 & 0 & 0 \n", "\\end{bmatrix} \\begin{bmatrix}\n", "x_1 \\\\\n", "x_2 \\\\\n", "\\lambda \\\\\n", "\\mu\n", "\\end{bmatrix} = \\begin{bmatrix}\n", "-1 \\\\\n", "-1 \\\\\n", "2 \\\\\n", "1\n", "\\end{bmatrix}\n", "$$\n", "\n", "With a bit of practice, you can probably just write the matrix directly by inspection." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": true }, "outputs": [], "source": [ "A = np.array([\n", " [26, 10, 2, 1],\n", " [10, 14, -5, 1],\n", " [2, -5, 0, 0],\n", " [1, 1, 0, 0]\n", "])\n", "b = np.array([-1, -1, 2, 1]).reshape(-1,1)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 1.00000000e+00],\n", " [-4.37360585e-17],\n", " [-2.28571429e+00],\n", " [-2.24285714e+01]])" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "solve(A, b)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.1" } }, "nbformat": 4, "nbformat_minor": 2 }