Optimization

In [17]:
import os
import sys
import glob
import operator as op
import itertools as it
from functools import reduce, partial
import numpy as np
import pandas as pd
from pandas import DataFrame, Series
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_context("notebook", font_scale=1.5)
%matplotlib inline

Background for Exercises 1 and 2

The first two exercises are about using Newton’s method to find the cube roots of unity: find \(z\) such that \(z^3 = 1\). From the fundamental theorem of algebra, we know there must be exactly 3 complex roots, since this is a degree 3 polynomial.

We start with Euler’s fabulous equation

\[e^{ix} = \cos x + i \sin x\]

Raising \(e^{ix}\) to the \(n\)th power, where \(n\) is an integer, Euler’s formula with \(nx\) in place of \(x\) gives

\[(e^{ix})^n = e^{i(nx)} = \cos nx + i \sin nx\]

Whenever \(nx\) is an integer multiple of \(2\pi\) (i.e. \(nx = 2\pi k\) for some integer \(k\)), we have

\[\cos nx + i \sin nx = 1\]

So

\[e^{2\pi i \frac{k}{n}}\]

is an \(n\)th root of 1 for \(k = 0, 1, \ldots, n-1\).

So the cube roots of unity are \(1, e^{2\pi i/3}, e^{4\pi i/3}\).

While we can do this analytically, the idea is to use Newton’s method to find the basins of attraction for these roots, and in the process, discover some rather perplexing behavior of Newton’s method.
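
For reference, Newton’s method in the complex plane uses the same update as in the real case; the quantity subtracted at each iteration is the Newton step referred to in Exercise 1:

\[z_{k+1} = z_k - \frac{f(z_k)}{f'(z_k)}\]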

In [18]:
from sympy import Symbol, exp, I, pi, N, expand
from sympy import init_printing
init_printing()
In [19]:
expand(exp(2*pi*I/3), complex=True)
Out[19]:
\[- \frac{1}{2} + \frac{\sqrt{3} i}{2}\]
In [20]:
expand(exp(4*pi*I/3), complex=True)
Out[20]:
\[- \frac{1}{2} - \frac{\sqrt{3} i}{2}\]
In [21]:
plt.figure(figsize=(4,4))
# the three cube roots of unity as (real, imaginary) pairs
roots = np.array([[1,0], [-0.5, np.sqrt(3)/2], [-0.5, -np.sqrt(3)/2]])
plt.scatter(roots[:,0], roots[:,1], s=50, c='red')
# the unit circle, for reference
xp = np.linspace(0, 2*np.pi, 100)
plt.plot(np.cos(xp), np.sin(xp), c='blue');
[Figure: the three cube roots of unity (red) on the unit circle (blue)]

Exercise 1 (20 points). Newton’s method for functions of complex variables - stability and basins of attraction.

  1. Write a function with the following function signature newton(z, f, fprime, max_iter=100, tol=1e-6) where
    • z is a starting value (a complex number, e.g. 3 + 4j)
    • f is a function of z
    • fprime is the derivative of f
    The function will run until either max_iter is reached or the absolute value of the Newton step is less than tol. In either case, the function should return the number of iterations taken and the final value of z as a tuple (i, z).
  2. Define the functions f and fprime that will result in Newton’s method finding the cube roots of 1. Find 3 starting points that will give different roots, and print both the start and end points.
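
One possible shape for the solution, as a hedged sketch (the update-then-check order and the particular starting points are assumptions, not part of the specification):

def newton(z, f, fprime, max_iter=100, tol=1e-6):
    # Newton-Raphson iteration for a complex function f, starting from z
    for i in range(1, max_iter + 1):
        step = f(z)/fprime(z)       # the Newton step
        z = z - step
        if abs(step) < tol:         # converged
            return i, z
    return max_iter, z              # max_iter reached without converging

# For the cube roots of 1, use f(z) = z^3 - 1 and f'(z) = 3z^2
f = lambda z: z**3 - 1
fprime = lambda z: 3*z**2
for z0 in [1 + 1j, -1 + 1j, -1 - 1j]:   # by symmetry, one point near each root
    i, z = newton(z0, f, fprime)
    print(z0, '->', z)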
In [ ]:

Exercise 2 (20 points). Write the following two plotting functions to see some (pretty) aspects of Newton’s algorithm in the complex plane.

  1. The first function plot_newton_iters(f, fprime, n=200, extent=[-1,1,-1,1], cmap='hsv') calculates and stores the number of iterations taken for convergence (or max_iter) for each point in a 2D array. The 2D array limits are given by extent - for example, when extent = [-1,1,-1,1] the corners of the plot are \(-1-i\), \(1-i\), \(1+i\) and \(-1+i\). There are n grid points in both the real and imaginary axes. The argument cmap specifies the color map to use - the suggested defaults are fine. Finally, plot the image using plt.imshow - make sure the axis ticks are correctly scaled. Make a plot for the cube roots of 1.
  2. The second function plot_newton_basins(f, fprime, n=200, extent=[-1,1,-1,1], cmap='jet') has the same arguments, but this time the grid stores the identity of the root that the starting point converged to. Make a plot for the cube roots of 1 - since there are 3 roots, there should be only 3 colors in the plot.
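
A hedged sketch of how the grids might be filled, assuming the newton function from Exercise 1 and no guard against \(f'(z) = 0\) (the default grid here happens to avoid \(z = 0\)):

def plot_newton_iters(f, fprime, n=200, extent=[-1,1,-1,1], cmap='hsv'):
    xmin, xmax, ymin, ymax = extent
    m = np.zeros((n, n))
    for s, y in enumerate(np.linspace(ymin, ymax, n)):       # rows: imaginary axis
        for r, x in enumerate(np.linspace(xmin, xmax, n)):   # cols: real axis
            i, z = newton(x + y*1j, f, fprime)
            m[s, r] = i                                      # iterations to converge
    plt.imshow(m, cmap=cmap, extent=extent, origin='lower')  # ticks scaled by extent
    return m

def plot_newton_basins(f, fprime, n=200, extent=[-1,1,-1,1], cmap='jet'):
    targets = np.array([1, np.exp(2j*np.pi/3), np.exp(4j*np.pi/3)])  # the 3 roots
    xmin, xmax, ymin, ymax = extent
    m = np.zeros((n, n))
    for s, y in enumerate(np.linspace(ymin, ymax, n)):
        for r, x in enumerate(np.linspace(xmin, xmax, n)):
            i, z = newton(x + y*1j, f, fprime)
            m[s, r] = np.argmin(np.abs(targets - z))         # identity of nearest root
    plt.imshow(m, cmap=cmap, extent=extent, origin='lower')
    return m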
In [ ]:

Exercise 3 (30 points). Create your own steepest descent optimizer.

  1. Your optimizer should take as arguments a function \(f\), its gradient, an initial guess, a step size, a tolerance (to determine convergence), and a maximum number of steps. It should output convergence status, total number of steps, and the value of the input variable that optimizes \(f\).

  2. Setting \(\alpha = -0.05\), use your optimizer to find the minimum of

    \[f(x,y) = x^2 + 2y^2\]

    using an initial guess of \((0.1,0.1)\), a tolerance of \(0.001\), and a max of \(100\) iterations.

  3. In this case, it is possible to (analytically) determine the optimal step size. Compute the optimal step size \(\alpha^*\) for the initial point of \((0.1,0.1)\) and run the optimizer using step sizes \(\alpha = \alpha^*/2\) and \(\alpha = 2\alpha^*\). What do you observe about the behavior of the optimizer as step size is varied?
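
A hedged sketch of one such optimizer, assuming the update \(x \leftarrow x + \alpha \nabla f(x)\) (so the negative \(\alpha\) in part 2 moves downhill); the argument order and return format are one reasonable reading of the specification:

def steepest_descent(f, grad, x0, alpha, tol, max_iter):
    # f is unused with a fixed step size, but kept to match the requested interface
    x = np.asarray(x0, dtype=float)
    for i in range(1, max_iter + 1):
        step = alpha * grad(x)              # step along the gradient direction
        x = x + step
        if np.linalg.norm(step) < tol:      # converged
            return True, i, x
    return False, max_iter, x               # failed to converge within max_iter

f = lambda x: x[0]**2 + 2*x[1]**2
grad = lambda x: np.array([2*x[0], 4*x[1]])
steepest_descent(f, grad, [0.1, 0.1], -0.05, 0.001, 100)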

In [ ]:

Exercise 4 (30 points). Conjugate Gradient.

Consider the function:

\[f:\mathbb{R}^2\rightarrow \mathbb{R}\]
\[(x,y) \mapsto x^2 + 2y^2 - 2x + y\]

The following is intended to help you understand the conjugate gradient method by computing the first two steps by hand (‘by hand’ means that you write code to explicitly compute each of the values requested).

  1. Express the function \(f\) in matrix form.
  2. Starting at some initial point \(x_0\in \mathbb{R}^2\), compute \(\alpha_0,p_0,x_1,\alpha_1,p_1\) and \(x_2\) as defined in the notes on Newton-CG.
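
A hedged sketch of the two steps, assuming the standard linear conjugate gradient recurrences for a quadratic \(f(x) = \frac{1}{2}x^TAx - b^Tx\); the \(A\) and \(b\) below come from matching coefficients with \(f\), and \(x_0 = 0\) is an arbitrary choice of starting point:

A = np.array([[2.0, 0.0],
              [0.0, 4.0]])              # (1/2) x^T A x = x^2 + 2y^2
b = np.array([2.0, -1.0])               # -b^T x = -2x + y

x0 = np.zeros(2)                        # initial point
r0 = b - A @ x0                         # residual = negative gradient at x0
p0 = r0                                 # first search direction
a0 = (r0 @ r0) / (p0 @ A @ p0)          # alpha_0: exact line search along p0
x1 = x0 + a0 * p0

r1 = r0 - a0 * (A @ p0)                 # updated residual
b0 = (r1 @ r1) / (r0 @ r0)              # beta_0
p1 = r1 + b0 * p0                       # next direction, A-conjugate to p0
a1 = (r1 @ r1) / (p1 @ A @ p1)          # alpha_1
x2 = x1 + a1 * p1

print(x2, np.linalg.solve(A, b))        # in 2D, x2 is already the exact minimizer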
In [ ]: