Linear least squares

In [1]:
import warnings
warnings.filterwarnings(action="ignore", module="scipy", message="^internal gelsd")
import numpy as np
import scipy.linalg as la

Normal equations

  • Suppose \(Ax = b\) is inconsistent. In that case we can instead look for the \(\widehat{x}\) that makes \(A\widehat{x}\) as close as possible to \(b\). In other words, we need to minimize \(\Vert Ax - b \Vert^2\) over \(x\).
  • The minimum occurs when the residual \(A\widehat{x} - b\) is orthogonal to \(C(A)\), that is, when \(\langle A\widehat{x} - b, Av \rangle = 0\) for every \(v\)
  • Solving \(A^T(A\widehat{x} - b) = 0\) gives the normal equations \(A^TA\widehat{x} = A^Tb\), and hence \(\widehat{x} = (A^TA)^{-1}A^T b\) when \(A\) has full column rank (see the derivation after this list)
  • The corresponding vector in \(C(A)\) is \(A\widehat{x} = A(A^TA)^{-1}A^T b\)
  • Note that \(P_A = A(A^TA)^{-1}A^T\) is the projection matrix for \(C(A)\)
  • This makes sense, since \(A\widehat{x}\) must lie in \(C(A)\), and the vector in \(C(A)\) nearest to \(b\) is the projection of \(b\) onto \(C(A)\).
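
Written out step by step (this assumes \(A\) has full column rank, so that \(A^TA\) is invertible):

\[\begin{split}\Vert A\widehat{x} - b \Vert^2 \text{ is minimal} &\iff A^T(A\widehat{x} - b) = 0 \\ &\iff A^TA\widehat{x} = A^Tb \\ &\iff \widehat{x} = (A^TA)^{-1}A^Tb\end{split}\]
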
In [2]:
m = 100
n = 5
A = np.random.rand(m, n)
b = np.random.rand(m, 1)

Using least squares function (SVD)

In [3]:
x, res, rank, s = la.lstsq(A, b)
In [4]:
x
Out[4]:
array([[0.31192056],
       [0.15414167],
       [0.18783713],
       [0.07997706],
       [0.17878726]])
In [5]:
(A @ x).shape
Out[5]:
(100, 1)
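
Besides the solution, la.lstsq also returns the residual sum of squares, the effective rank of A, and the singular values of A. A quick sanity check (a sketch, reusing x, res, rank and n from the cells above):

print(rank == n)                                 # effective rank equals the number of columns here
print(np.allclose(res, la.norm(A @ x - b)**2))   # res is the squared norm of the residual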

Using normal equations (projection) - for illustration only.

In [6]:
la.inv(A.T @ A) @ A.T @ b
Out[6]:
array([[0.31192056],
       [0.15414167],
       [0.18783713],
       [0.07997706],
       [0.17878726]])
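
Explicitly inverting \(A^TA\) works for illustration, but a more numerically sound way to use the normal equations is to solve the linear system \(A^TA\widehat{x} = A^Tb\) directly. A minimal sketch, reusing the A and b above:

# solve (A^T A) xhat = A^T b without forming the inverse
xhat = la.solve(A.T @ A, A.T @ b)
np.allclose(xhat, x)   # should agree with the lstsq solution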

Projection matrices

  • Let \(P\) be the projection matrix onto \(C(A)\)

  • Since it is a projection operator, \(P^2 = P\) (check; this is verified numerically below)

  • Check that \(I-P\) is also idempotent (it turns out to be a projection matrix as well)

  • Where does \(I-P\) project to?

  • Show \(C(I-P) = N(P)\)

  • Show \(\rho(P) + \nu(P) = n\)

  • This implies \(\mathbb{R}^n = C(P) \oplus C(I-P)\)

  • Trivial example

    \[\begin{split}P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, (I-P) = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}\end{split}\]
  • Geometry of \(P\) and \(I-P\) in linear least squares

In [7]:
P = A @ la.inv(A.T @ A) @ A.T
In [8]:
P.shape
Out[8]:
(100, 100)
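
A quick numerical check of the properties listed above (a sketch using the P just computed; np.linalg.matrix_rank counts the rank, and here the ambient dimension is \(m = 100\)):

print(np.allclose(P @ P, P))                     # P is idempotent: P^2 = P
print(np.allclose((np.eye(m) - P) @ (np.eye(m) - P),
                  np.eye(m) - P))                # I - P is idempotent as well
r = np.linalg.matrix_rank(P)
print(r, m - r)                                  # rank(P) + nullity(P) = m; here 5 + 95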
In [9]:
x, res, rank, s = la.lstsq(A, b)
In [10]:
Q = np.eye(m) - P
In [11]:
(P @ b)[:5]
Out[11]:
array([[0.2763989 ],
       [0.33523772],
       [0.4619637 ],
       [0.50292904],
       [0.49148152]])
In [12]:
(A @ x)[:5]
Out[12]:
array([[0.2763989 ],
       [0.33523772],
       [0.4619637 ],
       [0.50292904],
       [0.49148152]])
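
The two vectors agree in every entry, not just the first five shown; a one-line check:

np.allclose(P @ b, A @ x)   # projecting b onto C(A) gives exactly the fitted values A @ x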
In [13]:
la.norm(Q @ b)**2
Out[13]:
9.834201864805834
In [14]:
res
Out[14]:
array([9.83420186])
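
In other words, the squared length of \((I-P)b\), the component of \(b\) orthogonal to \(C(A)\), matches the residual reported by la.lstsq:

np.allclose(la.norm(Q @ b)**2, res)   # ||(I - P) b||^2 equals the lstsq residual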