Linear Algebra and Some Fun MathΒΆ

Question 1 (20 points).

Euclid’s algorithm for finding the greatest common divisor of two numbers is

gcd(a, 0) = a
gcd(a, b) = gcd(b, a modulo b)
  • Write a function to find the greatest common divisor in Python (8 points)
  • What is the greatest common divisor of 17384 and 1928? (2 point)
  • Write a function to calculate the least common multiple (8 points)
  • What is the least common multiple of 17384 and 1928? (2 point)
In [1]:





Question 2 (20 points).

Consider the linear transformation \(f(x)\) on \(\mathbb{R}^3\) that takes the standard basis \(\left\{e_1,e_2,e_3\right\}\) to \(\left\{v_1,v_2,v_3\right\}\) where

\[\begin{split}v_1=\left(\begin{matrix}10\\-10\\16\end{matrix}\right), v_2=\left(\begin{matrix}2\\-5\\20\end{matrix}\right) \textrm {and } v_3=\left(\begin{matrix}1\\-4\\13\end{matrix}\right)\end{split}\]
  1. Write a matrix \(A\) that represents the same linear transformaton. (4 points)
  2. Compute the rank of \(A\) using two different methods (do not use matrix_rank!). (4 points)
  3. Find the eigenvalues and eigenvectors of \(A\). (4 points)
  4. What is the matrix representation of \(f\) with respect to the eigenbasis? (48 points)
In [1]:





Exercise 3 (20 pts). Avodiing catastrophic cancellation.

Read the Wikipedia entry on loss of significance. Then answer the following problem:

The tail of the standard logistic distributon is given by \(1 - F(t) = 1 - (1+e^{-t})^{-1}\).

  • Define a function f1 to calculate the tail probability of the logistic distribution using the formula given above
  • Use `sympy <http://docs.sympy.org/latest/index.html>`__ to find the exact value of the tail distribution (using the same symbolic formula) to 20 decimal digits
  • Calculate the relative error of f1 when \(t = 25\) (The relative error is given by abs(exact - approximate)/exact)
  • Rewrite the expression for the tail of the logistic distribution using simple algebra so that there is no risk of cancellation, and write a function f2 using this formula. Calculate the relative error of f2 when \(t = 25\).
  • How much more accurate is f2 compared with f1 in terms of the relative error?
In [1]:




Exercise 4 (40 pts). One of the goals of the course it that you will be able to implement novel algorihtms from the literature.

  • Implement the mean-shift algorithm in 1D as described here.

    • Use the following function signature

      def mean_shift(xs, x, kernel, max_iters=100, tol=1e-6):
      
    • xs is the data set, x is the starting location, and kernel is a kernel function

    • tol is the difference in \(||x||\) across iterations

  • Use the following kernels with bandwidth \(h\) (a default value of 1.0 will work fine)

    • Flat - return 1 if \(||x|| < h\) and 0 otherwise

    • Gaussian

      \[\frac{1}{\sqrt{2 \pi h}}e^{\frac{-||x||^2}{h^2}}\]
    • Note that \(||x||\) is the norm of the data point being evaluated minus the current value of \(x\)

  • Use both kernels to find all 3 modes of the data set in x1d.npy

  • Modify the algorihtm abd/or kernels so that it now works in an arbitrary number of dimensions.

  • Uset both kernels to find all 3 modes of the data set in x2d.npy

  • Plot the path of successive intermeidate solutions of the mean-shift algorithm starting from x0 = (-4, 5) until it converges onto a mode in the 2D data for each kernel. Superimposet the path on top of a contour plot of the data density. Repeat for x0 = (0, 0) and x0 = (10, 10) .

In [1]: