Optimization
In [17]:
import os
import sys
import glob
import operator as op
import itertools as it
from functools import reduce, partial
import numpy as np
import pandas as pd
from pandas import DataFrame, Series
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_context("notebook", font_scale=1.5)
%matplotlib inline
Background for Exercises 1 and 2
The first 2 exercises are about using Newton’s method to find the cube roots of unity - find \(z\) such that \(z^3 = 1\). From the fundamental theorem of algebra, we know there must be exactly 3 complex roots since this is a degree 3 polynomial.
We start with Euler’s fabulous equation
\[e^{ix} = \cos x + i \sin x\]
Raising \(e^{ix}\) to the \(n\)th power where \(n\) is an integer, we get from Euler’s formula with \(nx\) substituting for \(x\)
\[\left(e^{ix}\right)^n = e^{inx} = \cos nx + i \sin nx\]
Whenever \(nx\) is an integer multiple of \(2\pi\) (i.e. \(nx = 2\pi k\) for some integer \(k\)), we have
\[e^{inx} = 1\]
So
\[z = e^{2\pi i k/n}\]
is a root of 1 for \(k = 0, 1, \ldots, n-1\).
So the cube roots of unity are \(1, e^{2\pi i/3}, e^{4\pi i/3}\).
While we can do this analytically, the idea is to use Newton’s method to find the basins of attraction for these roots, and in the process, discover some rather perplexing behavior of Newton’s method.
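As a quick illustration of the update rule these exercises build on (not a substitute for the exercises themselves), here is a minimal sketch of the complex Newton iteration \(z \leftarrow z - f(z)/f'(z)\) applied to \(f(z) = z^3 - 1\); the starting point 0.4 + 0.9j is an arbitrary choice.
In [ ]:
z = 0.4 + 0.9j                       # arbitrary starting point in the complex plane
for i in range(50):
    step = (z**3 - 1) / (3 * z**2)   # Newton step f(z)/f'(z)
    z = z - step
    if abs(step) < 1e-6:
        break
print(i, z)                          # should land near one of the three cube roots of 1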
In [18]:
from sympy import Symbol, exp, I, pi, N, expand
from sympy import init_printing
init_printing()
In [19]:
expand(exp(2*pi*I/3), complex=True)
Out[19]:
\[- \frac{1}{2} + \frac{\sqrt{3}}{2} i\]
In [20]:
expand(exp(4*pi*I/3), complex=True)
Out[20]:
\[- \frac{1}{2} - \frac{\sqrt{3}}{2} i\]
In [21]:
plt.figure(figsize=(4,4))
roots = np.array([[1,0], [-0.5, np.sqrt(3)/2], [-0.5, -np.sqrt(3)/2]])
plt.scatter(roots[:,0], roots[:,1], s=50, c='red')
xp = np.linspace(0, 2*np.pi, 100)
plt.plot(np.cos(xp), np.sin(xp), c='blue');
Exercise 1 (20 points). Newton’s method for functions of complex variables - stability and basins of attraction.
- Write a function with the following function signature

  newton(z, f, fprime, max_iter=100, tol=1e-6)

  where
  - z is a starting value (a complex number, e.g. 3 + 4j)
  - f is a function of z
  - fprime is the derivative of f

  The function will run until either max_iter is reached or the absolute value of the Newton step is less than tol. In either case, the function should return the number of iterations taken and the final value of z as a tuple (i, z).
- Define functions f and fprime that will result in Newton’s method finding the cube roots of 1. Find 3 starting points that will give different roots, and print both the start and end points. (A minimal sketch follows this list.)
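One possible sketch satisfying this specification; the three starting points are illustrative choices only, and you should verify that they do converge to different roots.
In [ ]:
def newton(z, f, fprime, max_iter=100, tol=1e-6):
    # Newton's method for a complex-valued function; returns (iterations, final z)
    for i in range(max_iter):
        step = f(z) / fprime(z)
        z = z - step
        if abs(step) < tol:
            break
    return i, z

f = lambda z: z**3 - 1
fprime = lambda z: 3 * z**2

for z0 in [1 + 1j, -1 + 1j, -1 - 1j]:   # illustrative starting points
    i, z = newton(z0, f, fprime)
    print(z0, '->', z, 'after', i, 'iterations')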
In [ ]:
Exercise 2 (20 points). Write the following two plotting functions to see some (pretty) aspects of Newton’s algorithm in the complex plane.
- The first function

  plot_newton_iters(f, fprime, n=200, extent=[-1,1,-1,1], cmap='hsv')

  calculates and stores the number of iterations taken for convergence (or max_iter) for each point in a 2D array. The 2D array limits are given by extent - for example, when extent = [-1,1,-1,1] the corners of the plot are (-1, -i), (1, -i), (1, i), (-1, i). There are n grid points along both the real and imaginary axes. The argument cmap specifies the color map to use - the suggested defaults are fine. Finally, plot the image using plt.imshow - make sure the axis ticks are correctly scaled. Make a plot for the cube roots of 1.
- The second function

  plot_newton_basins(f, fprime, n=200, extent=[-1,1,-1,1], cmap='jet')

  has the same arguments, but this time the grid stores the identity of the root that the starting point converged to. Make a plot for the cube roots of 1 - since there are 3 roots, there should be only 3 colors in the plot. (A sketch of both functions follows this list.)
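A minimal sketch of both functions, assuming the newton, f, and fprime defined in Exercise 1. The basin function labels roots simply by matching each final value against roots already seen; it assumes every grid point converges and does not guard against starting points where fprime vanishes.
In [ ]:
def plot_newton_iters(f, fprime, n=200, extent=[-1, 1, -1, 1], cmap='hsv'):
    # Number of Newton iterations to convergence for each starting point on a grid
    xmin, xmax, ymin, ymax = extent
    m = np.zeros((n, n))
    for r, x in enumerate(np.linspace(xmin, xmax, n)):
        for s, y in enumerate(np.linspace(ymin, ymax, n)):
            m[s, r] = newton(x + y*1j, f, fprime)[0]
    plt.imshow(m, cmap=cmap, extent=extent, origin='lower')
    plt.colorbar()

def plot_newton_basins(f, fprime, n=200, extent=[-1, 1, -1, 1], cmap='jet'):
    # Identity of the root each starting point converges to
    xmin, xmax, ymin, ymax = extent
    m = np.zeros((n, n))
    roots_seen = []
    for r, x in enumerate(np.linspace(xmin, xmax, n)):
        for s, y in enumerate(np.linspace(ymin, ymax, n)):
            _, z = newton(x + y*1j, f, fprime)
            for k, root in enumerate(roots_seen):
                if abs(z - root) < 1e-3:   # matches a root already seen
                    m[s, r] = k
                    break
            else:
                roots_seen.append(z)       # a new root
                m[s, r] = len(roots_seen) - 1
    plt.imshow(m, cmap=cmap, extent=extent, origin='lower')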
In [ ]:
Exercise 3 (30 points). Create your own steepest descent optimizer.
Your optimizer should take as arguments a function \(f\), its gradient, an initial guess, a step size, a tolerance (to determine convergence), and a maximum number of steps. It should output convergence status, total number of steps, and the value of the input variable that optimizes \(f\).
Setting \(\alpha = -0.05\), use your optimizer to find the minimum of
\[f(x,y) = x^2 + 2y^2\]
using an initial guess of \((0.1,0.1)\), a tolerance of \(0.001\), and a max of \(100\) iterations.
In this case, it is possible to (analytically) determine the optimal step size. Compute the optimal step size \(\alpha^*\) for the initial point of \((0.1,0.1)\) and run the optimizer using step sizes \(\alpha = \alpha^*/2\) and \(\alpha = 2\alpha^*\). What do you observe about the behavior of the optimizer as step size is varied?
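A minimal sketch of one way to write such an optimizer. It assumes the update convention \(x \leftarrow x + \alpha \nabla f(x)\) implied by the negative step size above, and takes “convergence” to mean the update step has norm below the tolerance; the function name steepest_descent and the return format are illustrative choices, not a required interface.
In [ ]:
def steepest_descent(f, grad, x0, alpha, tol, max_iter):
    # Fixed-step gradient method: x <- x + alpha * grad(x)
    # (alpha < 0 moves downhill under this sign convention)
    x = np.asarray(x0, dtype=float)
    for i in range(1, max_iter + 1):
        step = alpha * grad(x)
        x = x + step
        if np.linalg.norm(step) < tol:   # converged: the update has become tiny
            return True, i, x
    return False, max_iter, x

f = lambda x: x[0]**2 + 2*x[1]**2
grad = lambda x: np.array([2*x[0], 4*x[1]])
print(steepest_descent(f, grad, [0.1, 0.1], alpha=-0.05, tol=0.001, max_iter=100))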
In [ ]:
Exercise 4 (30 points). Conjugate Gradient.
Consider the function:
The following is intended to help you understand the conjugate gradient method, by computing the first two steps by hand (‘by hand’ means that you write code to explicitly compute each of the values requested).
- Express the function \(f\) in matrix form.
- Starting at some initial point \(x_0\in \mathbb{R}^2\), compute \(\alpha_0,p_0,x_1,\alpha_1,p_1\) and \(x_2\) as defined in the notes on Newton-CG. (A generic sketch, using a placeholder quadratic, follows this list.)
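Since the specific function is given in the exercise (part 1 asks for its matrix form), the sketch below uses a placeholder quadratic \(f(x) = \tfrac{1}{2}x^TAx - b^Tx\) with made-up values of \(A\), \(b\), and \(x_0\); only the structure of the first two conjugate gradient steps is the point. The recursions follow standard linear CG, which you should check against the definitions in the Newton-CG notes.
In [ ]:
# Placeholder data - replace A, b, x0 with the values from part 1 of this exercise
A = np.array([[2.0, 0.0],
              [0.0, 4.0]])   # hypothetical symmetric positive definite matrix
b = np.array([0.0, 0.0])     # hypothetical linear term
x0 = np.array([0.1, 0.1])    # hypothetical starting point

r0 = b - A @ x0              # initial residual (negative gradient at x0)
p0 = r0                      # first search direction
alpha0 = (r0 @ r0) / (p0 @ (A @ p0))
x1 = x0 + alpha0 * p0

r1 = r0 - alpha0 * (A @ p0)  # updated residual
beta0 = (r1 @ r1) / (r0 @ r0)
p1 = r1 + beta0 * p0         # next direction, A-conjugate to p0
alpha1 = (r1 @ r1) / (p1 @ (A @ p1))
x2 = x1 + alpha1 * p1

print(alpha0, p0, x1)
print(alpha1, p1, x2)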
In [ ]: