Review of solving simultaneous linear equations (HS)¶
References
- Harville, David A., Matrix Algebra from a Statistician’s Perspective
- Lay, David C., Linear Algebra and Its Applications, 4th ed.
Linear Systems: Consistency & Solutions¶
Motivation¶
In many instances, the solution to a problem in statistics can be reduced to solving a system of linear equations. For example, the problem of obtaining least squares estimates of the parameters in multiple linear regression reduces to solving a system of linear equations called the normal equations (to be discussed in the next lecture), or the constrained normal equations if the parameters are subject to linear constraints. (?? reference)
Consider a set of \(m\) linear equations in \(n\) unknowns:
\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}
\]
Collectively, these equations are called a system of linear equations or simply a linear system.
We can let:
\[
\mathbf{A} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}, \quad
\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \quad
\mathbf{b} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}
\]
And re-write the system as:
\[
\mathbf{A}\mathbf{x} = \mathbf{b},
\]
where \(\mathbf{A}\) is called the coefficient matrix (or matrix of coefficients), \(\mathbf{x}\) is the vector of unknowns, and \(\mathbf{b}\) is the vector of constants; any \(\mathbf{x}\) satisfying \(\mathbf{A}\mathbf{x} = \mathbf{b}\) is a solution to the linear system.
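As a quick numerical check of the matrix form (a sketch with arbitrary numbers), we can build \(\mathbf{A}\) and a known \(\mathbf{x}\) with numpy and confirm that the matrix-vector product \(\mathbf{A}\mathbf{x}\) reproduces the right-hand side \(\mathbf{b}\):
In [ ]:
import numpy as np

# The system  x1 + 2*x2 = 5,  3*x1 + 4*x2 = 11  written as A x = b
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
x = np.array([1.0, 2.0])   # a known solution vector
b = A @ x                  # matrix-vector product gives the right-hand side
print(b)                   # [ 5. 11.]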
Solving linear systems¶
The basic strategy for solving linear systems is Gaussian elimination (GE).
Let’s review how Gaussian elimination (GE) works. We will deal with a \(3\times 3\) system of equations for conciseness, but everything here generalizes to the \(n\times n\) case. Consider the following equation:
\[
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
= \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}
\]
For simplicity, let us assume that the leftmost matrix \(A\) is non-singular. To solve the system using GE, we start with the “augmented matrix”:
\[
\left(\begin{array}{ccc|c}
a_{11} & a_{12} & a_{13} & b_1 \\
a_{21} & a_{22} & a_{23} & b_2 \\
a_{31} & a_{32} & a_{33} & b_3
\end{array}\right)
\]
We begin at the first entry, \(a_{11}\). If \(a_{11} \neq 0\), then we divide the first row by \(a_{11}\) and subtract the appropriate multiple of the first row from each of the other rows, zeroing out the first entry of each remaining row. (If \(a_{11}\) is zero, we need to permute rows; we will not go into the details of that here.) The result is as follows:
\[
\left(\begin{array}{ccc|c}
1 & a_{12}/a_{11} & a_{13}/a_{11} & b_1/a_{11} \\
0 & \ast & \ast & \ast \\
0 & \ast & \ast & \ast
\end{array}\right)
\]
where each \(\ast\) denotes an entry altered by the elimination step.
We repeat the procedure for the second row: first divide by the leading entry, then subtract the appropriate multiple of the resulting row from each of the first and third rows, so that the second entry in rows 1 and 3 is zero. We could continue until the matrix on the left is the identity, in which case we can just “read off” the solution: the vector \(x\) is the resulting column vector on the right. Usually, however, it is more efficient to stop at row echelon form (upper triangular, with ones on the diagonal) and then use back substitution to obtain the final answer.
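The following is a minimal sketch of this procedure in numpy, not production code: it assumes every pivot it encounters is non-zero (so no row permutations), reduces the augmented matrix to row echelon form, and back-substitutes.
In [ ]:
import numpy as np

def gaussian_elimination(A, b):
    """Solve A x = b by reducing [A | b] to row echelon form and
    back-substituting. Assumes non-zero pivots (no row swaps)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, b.reshape(-1, 1)])      # augmented matrix [A | b]
    # Forward elimination: ones on the diagonal, zeros below
    for k in range(n):
        M[k] = M[k] / M[k, k]                 # scale pivot row
        for i in range(k + 1, n):
            M[i] = M[i] - M[i, k] * M[k]      # zero out entry below pivot
    # Back substitution on the resulting upper-triangular system
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = M[i, -1] - M[i, i + 1:n] @ x[i + 1:]
    return x

A = np.array([[ 2.0,  1.0, 1.0],
              [ 4.0, -6.0, 0.0],
              [-2.0,  7.0, 2.0]])
b = np.array([5.0, -2.0, 9.0])
print(gaussian_elimination(A, b))   # [1. 1. 2.]
print(np.linalg.solve(A, b))        # numpy agrees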
Note that in some cases it is necessary to permute rows to obtain row echelon form, for instance when a pivot entry is zero. In numerical practice one swaps in the row whose pivot-column entry has the largest absolute value; this is called partial pivoting. If we also permute columns, that is called full pivoting.
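In practice, pivoting is usually left to a library. As an illustration, scipy’s LU factorization carries out Gaussian elimination with partial pivoting and returns the row permutation it used (the matrix below is an arbitrary example whose \(a_{11}=0\) forces a swap):
In [ ]:
import numpy as np
from scipy.linalg import lu

A = np.array([[0.0, 2.0, 1.0],     # a11 = 0, so a row swap is unavoidable
              [1.0, 1.0, 1.0],
              [2.0, 1.0, 3.0]])
P, L, U = lu(A)                    # factors A as P @ L @ U
print(P)                           # permutation matrix recording the row interchanges
print(U)                           # upper-triangular factor of the permuted system
print(np.allclose(P @ L @ U, A))   # True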
It should be mentioned that we may also obtain the inverse of a matrix using GE: augment \(A\) with the identity matrix, reduce \(A\) to the identity, and the augmented portion is then \(A^{-1}\).
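A sketch of that idea, reusing the no-pivoting assumption from the elimination sketch above: row-reduce \([\,A \mid I\,]\) until the left block is the identity, at which point the right block is \(A^{-1}\).
In [ ]:
import numpy as np

def inverse_by_ge(A):
    """Invert A by row-reducing [A | I] to [I | A^{-1}].
    Assumes A is non-singular with non-zero pivots (no row swaps)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])             # augmented matrix [A | I]
    for k in range(n):
        M[k] = M[k] / M[k, k]                 # scale pivot row
        for i in range(n):
            if i != k:
                M[i] = M[i] - M[i, k] * M[k]  # clear the rest of column k
    return M[:, n:]                           # right block is now the inverse

A = np.array([[2.0, 1.0],
              [5.0, 3.0]])
print(inverse_by_ge(A))   # [[ 3. -1.] [-5.  2.]]
print(np.linalg.inv(A))   # numpy agrees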
Summary¶
To solve a linear system, we apply elementary row operations to the augmented matrix until row echelon form is obtained, and then use back substitution to find the solutions (or continue to reduced row echelon form, from which the solutions can be read off directly).
Elementary row operations include:
1. (Replacement) Replace one row by the sum of itself and a multiple of another row.
2. (Interchange) Interchange two rows.
3. (Scaling) Multiply all entries in a row by a non-zero constant.
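Each of these operations is a one-line array assignment in numpy; a brief illustration on an arbitrary matrix:
In [ ]:
import numpy as np

M = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])

M[1] = M[1] - 4 * M[0]     # replacement: row 2 <- row 2 - 4 * row 1
M[[0, 2]] = M[[2, 0]]      # interchange: swap rows 1 and 3
M[0] = M[0] / M[0, 0]      # scaling: divide a row by a non-zero constant
print(M)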
Consistency (Existence of one or more solutions)¶
A linear system is said to be consistent if it has one or more solutions; a linear system is inconsistent if no solution exists.
Fact: There are ONLY three possible scenarios for a linear system (see the rank-based check after this list):
- no solution
- exactly one solution
- infinitely many solutions
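A numerical way to distinguish the three scenarios is the rank criterion: the system is consistent exactly when \(\operatorname{rank}(\mathbf{A}) = \operatorname{rank}([\mathbf{A} \mid \mathbf{b}])\), and the solution is unique when that common rank equals the number of unknowns. A sketch (the helper name classify_system is ours):
In [ ]:
import numpy as np

def classify_system(A, b):
    """Classify A x = b: consistent iff rank(A) == rank([A | b]);
    the solution is unique iff that common rank equals the number of unknowns."""
    A = np.asarray(A, dtype=float)
    aug = np.hstack([A, np.asarray(b, dtype=float).reshape(-1, 1)])
    r = np.linalg.matrix_rank(A)
    if r < np.linalg.matrix_rank(aug):
        return "no solution"
    return "exactly one solution" if r == A.shape[1] else "infinitely many solutions"

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])                     # second row is twice the first
print(classify_system(A, [3.0, 6.0]))          # infinitely many solutions
print(classify_system(A, [3.0, 7.0]))          # no solution
print(classify_system(np.eye(2), [1.0, 2.0]))  # exactly one solution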
Linear Independence¶
- If \(A\) is an \(m\times n\) matrix with \(m>n\) (more equations than unknowns), the system is overdetermined: it is consistent only when \(\mathbf{b}\) happens to lie in the column space of \(A\), which is generally not the case, so the system typically cannot be solved exactly. This is the usual case in data analysis, and it is why least squares is so important (see the sketch after this list).
- If \(A\) is an \(m\times n\) matrix with \(m<n\) and all \(m\) rows are linearly independent, then the system is underdetermined and there are infinitely many solutions.
- If \(A\) is an \(m\times n\) matrix and some of its rows are linearly dependent, then the system is reducible: provided the same dependence holds among the corresponding entries of \(\mathbf{b}\), the redundant equations can be dropped (otherwise the system is inconsistent).
- If \(A\) is a square matrix and its rows are linearly independent, the system has a unique solution. (\(A\) is invertible.)
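A sketch of the first two cases using numpy’s lstsq, which returns the least squares solution for an overdetermined system and the minimum-norm solution for an underdetermined one (the matrices are arbitrary examples):
In [ ]:
import numpy as np

# Overdetermined: 3 equations, 2 unknowns -- generally no exact solution
A_over = np.array([[1.0, 1.0],
                   [1.0, 2.0],
                   [1.0, 3.0]])
b_over = np.array([1.0, 2.0, 2.5])
x_ls, residual, rank, _ = np.linalg.lstsq(A_over, b_over, rcond=None)
print(x_ls, residual)   # least squares estimate and its squared residual norm

# Underdetermined: 1 equation, 2 unknowns -- infinitely many solutions;
# lstsq returns the minimum-norm one
A_under = np.array([[1.0, 1.0]])
b_under = np.array([2.0])
x_mn = np.linalg.lstsq(A_under, b_under, rcond=None)[0]
print(x_mn)             # [1. 1.], the minimum-norm solution of x1 + x2 = 2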
Solutions¶
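As a concluding sketch (the system below is an arbitrary example), numpy solves a square non-singular system directly; np.linalg.solve uses LU factorization with partial pivoting under the hood: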
In [ ]:
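import numpy as np

A = np.array([[ 3.0,  2.0, -1.0],
              [ 2.0, -2.0,  4.0],
              [-1.0,  0.5, -1.0]])
b = np.array([1.0, -2.0, 0.0])

x = np.linalg.solve(A, b)     # solve the square non-singular system
print(x)                      # [ 1. -2. -2.]
print(np.allclose(A @ x, b))  # True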