Constrained Optimization¶

In [19]:

import numpy as np
from scipy.linalg import cholesky, solve_triangular, cho_solve, cho_factor
from scipy.linalg import solve
from scipy.optimize import minimize

Lagrange multipliers and constrained optimization¶

Recall why Lagrange multipliers are useful for constrained optimization - a stationary point must be where the constraint surface \(g\) touches a level set of the function \(f\) (since the value of \(f\) does not change on a level set). At that point, \(f\) and \(g\) are parallel, and hence their gradients are also parallel (since the gradient is normal to the level set). So we want to solve

\[\nabla f = -\lambda \nabla g\]

or equivalently,

\[\nabla f + \lambda \nabla g = 0\]

Lagrange multipliers

In practice we solve the constrained optimization problem by finding the partial derivatives of the Lagrangian \(L\). We show for a system with two variables \(x\) and \(y\)

\[L = f(x,y) + \lambda g(x,y)\]

giving

\begin{align} \frac{\partial L}{\partial x} &= \frac{\partial f}{\partial x} + \lambda \frac{\partial g}{\partial x} \\ \frac{\partial L}{\partial y} &= \frac{\partial f}{\partial y} + \lambda \frac{\partial g}{\partial y} \\ \frac{\partial L}{\partial \lambda} &= g(x, y) \end{align}

Setting the partial derivatives to zero for the first two equations results in \(\nabla f = -\lambda \nabla g\), and setting the third equation to zero results in \(g(x, y) = 0\), the constraints we want to follow.

Example 1¶

Maximize \(f (x, y, z) = xy + yz\) subject to the constraints \(x + 2y = 6\) and \(x − 3z = 0\).

We set up the equations

\[F (x, y, z, λ, μ) = xy + yz − λ(x + 2y − 6) − μ(x − 3z)\]

Now set partial derivatives to zero and solve the following set of equations

\begin{align} y - \lambda - \mu &= 0 \\ x + z - 2\lambda &= 0 \\ y +3\mu &= 0 \\ x + 2y - 6 &= 0 \\ x - 3z &= 0 \end{align}

which is a linear equation in \(x, y, z, \lambda, \mu\)

\[\begin{split}\begin{pmatrix} 0 & 1 & 0 & -1 & -1 \\ 1 & 0 & 1 & -2 & 0 \\ 0 & 1 & 0 & 0 & 3 \\ 1 & 2 & 0 & 0 & 0 \\ 1 & 0 & -3 & 0 & 0 \\ \end{pmatrix}\pmatrix{x \\ y \\ z \\ \lambda \\ \mu} = \pmatrix{0 \\ 0 \\ 0 \\ 6 \\ 0}\end{split}\]

In [5]:

A = np.array([
    [0, 1, 0, -1, -1],
    [1, 0, 1, -2, 0],
    [0, 1, 0, 0, 3],
    [1, 2, 0, 0, 0],
    [1, 0,-3, 0, 0]])

b = np.array([0,0,0,6,0])

sol = solve(A, b)

In [6]:

sol

Out[6]:

array([ 3. ,  1.5,  1. ,  2. , -0.5])

In [12]:

def f(x, y, z):
    return x*y + y*z

In [14]:

f(*sol[:3])

Out[14]:

6.0

Using `scipy.optimize`¶

We first need to set this as a minimization problem.

In [15]:

def f(x):
    return -(x[0]*x[1] + x[1]*x[2])

In [16]:

cons = ({'type': 'eq',
         'fun' : lambda x: np.array([x[0] + 2*x[1] - 6, x[0] - 3*x[2]])})

In [23]:

x0 = np.array([2,2,0.67])
res = minimize(f, x0, constraints=cons)

Out[23]:

array([2.99999979, 1.50000011, 0.99999993])

In [25]:

res.fun

Out[25]:

-5.999999999999969

In [26]:

res.x

Out[26]:

array([2.99999979, 1.50000011, 0.99999993])

Example 2¶

Find the minimum of the following quadratic function on \(\mathbb{R}^2\)

\[ \begin{align}\begin{aligned}f(x) = x^TAx +b^Tx +c\\where\end{aligned}\end{align} \]

\[\begin{split}A = \left(\begin{matrix}13&5\\5&7\end{matrix}\right), b = \left(\begin{matrix}1\\1\end{matrix}\right) \textrm {and } c = 2\end{split}\]

Under the constraints:

\[g(x) = 2x_1-5x_2=2 \;\;\;\;\;\; \textrm{ and } \;\;\;\;\;\; h(x) = x_1+x_2=1\]

Use a matrix decomposition method to find the minimum of the unconstrained problem without using scipy.optimize (Use library functions - no need to code your own). Note: for full credit you should exploit matrix structure.
Find the solution using constrained optimization with the scipy.optimize package.
Use Lagrange multipliers and solving the resulting set of equations directly without using scipy.optimize.

Solve unconstrained problem¶

To find the minimum, we differentiate \(f(x)\) with respect to \(x^T\) and set it equal to \(0\). We thus need to solve

\[2Ax + b = 0\]

or

\[Ax = -\frac{b}{2}\]

We see that \(A\) is a symmetric positive definite real matrix. Hence we use Cholesky factorization.

Steps

\[\begin{split}L = \text{cholesky}(A) \\ \text{solve } Ly = b \\ \text{solve } L^Tx = y\end{split}\]

In [3]:

A = np.array([
    [13,5],
    [5,7]
])
b = np.array([1,1]).reshape(-1,1)
c = 2

In [4]:

L = cholesky(A, lower=True)
y = solve_triangular(L, -b/2, lower=True)
x = solve_triangular(L.T, y, lower=False)

In [5]:

Out[5]:

array([[-0.01515152],
       [-0.06060606]])

In [6]:

cho_solve(cho_factor(A), -b/2)

Out[6]:

array([[-0.01515152],
       [-0.06060606]])

Solve constrained problem using `scipy.optimize`¶

In [7]:

f = lambda x, A, b, c: x.T @ A @ x + b.T @ x + c

In [8]:

from scipy.optimize import minimize

In [9]:

cons = ({'type': 'eq', 'fun': lambda x: 2*x[0] - 5*x[1] - 2},
        {'type': 'eq', 'fun': lambda x: x[0] + x[1] - 1})

res = minimize(f, [0,0], constraints=cons, args=(A, b, c))
res.x

Out[9]:

array([1.00000000e+00, 3.41607086e-16])

Solve constrained problem using Lagrange multipliers¶

We set up the equations

\[F(x_1, x_2, \lambda, \mu) = f + \lambda g + \mu h\]

Sometimes this is written as

\[F(x_1, x_2, \lambda, \mu) = f - \lambda g - \mu h\]

All this means is a final sign change in the estimated values of \(\lambda\) and \(\mu\).

We show the original equations for convenience

\[f(x) = x^TAx +b^Tx +c\]

where

\[\begin{split}A = \left(\begin{matrix}13&5\\5&7\end{matrix}\right), b = \left(\begin{matrix}1\\1\end{matrix}\right) \textrm {and } c = 2\end{split}\]

Under the constraints:

\[g(x) = 2x_1-5x_2=2 \;\;\;\;\;\; \textrm{ and } \;\;\;\;\;\; h(x) = x_1+x_2=1\]

To make the calculations explicit, we rewrite \(F\) as

\[F = 13x_1^2 + 10xy_1x_2+ 7x_2^2 + x_1 + x_2 +2 + \lambda(2x_1 - 5x_2 -2) + \mu(x_1 + x_2 -1)\]

Taking partial derivatives, we get

\[\begin{split}\begin{align} \frac{\delta F}{\delta x_1} &=& 26 x_1 + 10 x_2 + 1 + 2\lambda + \mu &= 0 \\ \frac{\delta F}{\delta x_2} &=& 10 x_1 + 14 x_2 + 1 - 5\lambda + \mu &= 0 \\ \frac{\delta F}{\delta \lambda} &=& 2 x_1 - 5 x_2 -2 &= 0 \\ \frac{\delta F}{\delta \mu} &=& x_1 + x_2 - 1 &= 0 \end{align}\end{split}\]

Plugging in the numbers and expressing in matrix form, we get

\[\begin{split}\begin{bmatrix} 26 & 10 & 2 & 1 \\ 10 & 14 & -5 & 1 \\ 2 & -5 & 0 & 0 \\ 1 & 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \lambda \\ \mu \end{bmatrix} = \begin{bmatrix} -1 \\ -1 \\ 2 \\ 1 \end{bmatrix}\end{split}\]

With a bit of practice, you can probably just write the matrix directly by inspection.

In [11]:

A = np.array([
    [26, 10, 2, 1],
    [10, 14, -5, 1],
    [2, -5, 0, 0],
    [1, 1, 0, 0]
])
b = np.array([-1, -1, 2, 1]).reshape(-1,1)

In [12]:

solve(A, b)

Out[12]:

array([[ 1.00000000e+00],
       [-4.37360585e-17],
       [-2.28571429e+00],
       [-2.24285714e+01]])

In [ ]:

Constrained Optimization¶

Lagrange multipliers and constrained optimization¶

Example 1¶

Using `scipy.optimize`¶

Example 2¶

Solve unconstrained problem¶

Solve constrained problem using `scipy.optimize`¶

Solve constrained problem using Lagrange multipliers¶

Page contents

Previous page

Next page

This Page

Constrained Optimization¶

Lagrange multipliers and constrained optimization¶

Example 1¶

Using scipy.optimize¶

Example 2¶

Solve unconstrained problem¶

Solve constrained problem using scipy.optimize¶

Solve constrained problem using Lagrange multipliers¶

Using `scipy.optimize`¶

Solve constrained problem using `scipy.optimize`¶