Systems of Linear Equations
Basic Terms
System of Linear Equations
\[\begin{align*} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ &\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m \end{align*}\]- $x_1, x_2, \cdots, x_n$ are the unknowns.
- $a_{11}, \cdots, a_{mn}$ are the coefficients and $b_1, \cdots, b_m$ are the constants.
As a Matrix Equation
A system of linear equations can be written compactly as:
\[A\mathbf{x} = \mathbf{b}\]where
- $A$ is the coefficient matrix
- $\mathbf{x}$ is the unknown vector
- $\mathbf{b}$ is the constant vector
If the system has a solution, $\mathbf{b}$ can be expressed as a linear combination of the columns of $A$, with the entries of $\mathbf{x}$ as the coefficients.
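As a concrete illustration (a made-up $2 \times 2$ system, not from the text), NumPy can solve $A\mathbf{x} = \mathbf{b}$ directly and confirm the linear-combination view:

```python
import numpy as np

# A hypothetical 2x2 system:
#    x1 + 2*x2 = 5
#  3*x1 + 4*x2 = 6
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])   # coefficient matrix
b = np.array([5.0, 6.0])     # constant vector

x = np.linalg.solve(A, b)    # unknown vector, here [-4. , 4.5]

# b is the linear combination x[0]*A[:, 0] + x[1]*A[:, 1] of the columns of A
assert np.allclose(x[0] * A[:, 0] + x[1] * A[:, 1], b)
```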
Solution
Every $n$-tuple $(x_1, x_2, \cdots, x_n)$ that satisfies the system of linear equations is called a solution.
Every system of linear equations has either:
- No solution: this is when we resort to approximation methods such as least-squares regression.
- Unique solution
- Infinitely many solutions
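The three cases can be distinguished by comparing the rank of $A$ with the rank of $A$ extended by $\mathbf{b}$ (the rank criterion, which the text does not cover; the matrix below is a made-up singular example):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])   # rank 1: the second row is twice the first

outcomes = []
for b in (np.array([1.0, 3.0]), np.array([1.0, 2.0])):
    rank_A = np.linalg.matrix_rank(A)
    rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))
    if rank_Ab > rank_A:
        outcomes.append("no solution")          # b lies outside the column space
    elif rank_A == A.shape[1]:
        outcomes.append("unique solution")
    else:
        outcomes.append("infinitely many solutions")

print(outcomes)  # ['no solution', 'infinitely many solutions']
```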
Augmented matrix
The matrix equation $A\mathbf{x} = \mathbf{b}$ can be compacted further into an augmented matrix $[A \mid \mathbf{b}]$:
e.g.
\[\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ \end{bmatrix} = \begin{bmatrix} 10 \\ 11 \\ 12 \\ \end{bmatrix}\]can be represented as an augmented matrix:
\[\left[ \begin{array}{ccc|c} 1 & 2 & 3 & 10 \\ 4 & 5 & 6 & 11 \\ 7 & 8 & 9 & 12 \\ \end{array} \right]\]Finding the solution of a system of linear equations is equivalent to finding the reduced row echelon form of the augmented matrix.
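For instance, SymPy can reduce the augmented matrix above. Note this particular system is consistent but underdetermined, since the third row is a combination of the first two:

```python
import sympy as sp

# Augmented matrix [A | b] from the example above
M = sp.Matrix([[1, 2, 3, 10],
               [4, 5, 6, 11],
               [7, 8, 9, 12]])

rref, pivot_cols = M.rref()   # reduced row echelon form + pivot column indices
print(pivot_cols)             # (0, 1): pivots in the first two columns only
sp.pprint(rref)               # the last row reduces to all zeros
```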
Pivots / Row Echelon / Reduced Row Echelon Form
Basic / Free Variables
Take the following system:
\[\left[ \begin{array}{cccc|c} 1 & 0 & 8 & -4 & 42 \\ 0 & 1 & 2 & 12 & 8 \\ \end{array} \right]\]Which is a compact form of:
\[\begin{bmatrix} 1 & 0 & 8 & -4 \\ 0 & 1 & 2 & 12 \\ \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ \end{bmatrix} = \begin{bmatrix} 42 \\ 8 \\ \end{bmatrix}\]Basic variables are the variables corresponding to the pivot columns.
The pivot columns are the columns with the pivot entries, which in this case are the first two columns.
Free variables are the variables corresponding to the non-pivot columns.
So in this case, $x_1$ and $x_2$ are basic variables, and $x_3$ and $x_4$ are free variables.
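The classification can be read off programmatically from the pivot column indices (a sketch using SymPy's `rref` on the matrix above):

```python
import sympy as sp

A = sp.Matrix([[1, 0, 8, -4],
               [0, 1, 2, 12]])

_, pivot_cols = A.rref()   # matrix is already in RREF; pivots at columns 0, 1
basic = [f"x{i + 1}" for i in range(A.cols) if i in pivot_cols]
free = [f"x{i + 1}" for i in range(A.cols) if i not in pivot_cols]
print(basic, free)  # ['x1', 'x2'] ['x3', 'x4']
```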
Homogeneous System
A system of linear equations is homogeneous if the constant vector $\mathbf{b}$ is the zero vector.
\[A\mathbf{x} = \mathbf{0}\]A homogeneous system always has at least one solution, which is the trivial solution $\mathbf{x} = \mathbf{0}$.
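A quick check with a made-up singular matrix: the trivial solution always works, and a singular $A$ admits nontrivial solutions as well (SymPy's `nullspace` returns a basis of the solution set):

```python
import sympy as sp

A = sp.Matrix([[1, 2],
               [2, 4]])      # singular: rank 1

zero = sp.zeros(2, 1)
assert A * zero == zero      # the trivial solution x = 0 always satisfies Ax = 0

basis = A.nullspace()        # nontrivial solutions exist because A is singular
print(basis)                 # [Matrix([[-2], [1]])]
```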
Why homogeneous?
In math, equations are homogeneous if all the terms are of the same degree.
Let’s expand the system $A\mathbf{x} = \mathbf{b}$:
The equations are of the form:
\[a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1\]Each term $a_{1j}x_j$ on the left has degree $1$. If $b_1$ is a nonzero constant, it has degree $0$, so the equation is not homogeneous. The only way to make all terms the same degree is to have $b_1 = 0$.
Therefore, a homogeneous system is of the form $A\mathbf{x} = \mathbf{0}$.
Particular Solution
A particular solution or special solution of a system is any one solution that satisfies it (possibly one of many).
The easiest way to find one is to first bring the augmented matrix to reduced row echelon form.
e.g.
\[\left[ \begin{array}{ccccc|c} 1 & -2 & 0 & 0 & -2 & 2 \\ 0 & 0 & 1 & 0 & 1 & -1 \\ 0 & 0 & 0 & 1 & -2 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ \end{array} \right]\]- Basic variables: $x_1, x_3, x_4$
- Free variables: $x_2, x_5$
Setting the free variables to $0$, and solving for the basic variables will give you a particular solution:
\[x = \begin{bmatrix} 2 \\ 0 \\ -1 \\ 1 \\ 0 \\ \end{bmatrix}\]since
\[2\mathbf{c}_1 + 0\mathbf{c}_2 - 1\mathbf{c}_3 + 1\mathbf{c}_4 + 0\mathbf{c}_5 = \begin{bmatrix} 2 \\ -1 \\ 1 \\ 0 \\ \end{bmatrix}\]where $\mathbf{c}_i$ denotes the $i$-th column of the coefficient matrix.
General Solution
A general solution is the set of all possible solutions.
A particular solution combined with the null space or kernel of the coefficient matrix forms the general solution set.
Say we have a particular solution $\mathbf{x_p}$, s.t. $A\mathbf{x_p} = \mathbf{b}$.
For every $\mathbf{u} \in \ker(A)$, $A\mathbf{u} = \mathbf{0}$.
Then $A(\mathbf{x_p} + \mathbf{u}) = A\mathbf{x_p} + A\mathbf{u} = \mathbf{b}$, and more generally $A(\mathbf{x_p} + \lambda \mathbf{u}) = \mathbf{b}$ for any $\lambda \in \mathbb{R}$.
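This can be verified numerically for the example from the previous section; `u` below is one kernel vector of that system (obtained by setting $x_2 = 1$, $x_5 = 0$):

```python
import numpy as np

# Coefficient matrix and constant vector from the RREF example above
A = np.array([[1, -2, 0, 0, -2],
              [0,  0, 1, 0,  1],
              [0,  0, 0, 1, -2],
              [0,  0, 0, 0,  0]], dtype=float)
b = np.array([2, -1, 1, 0], dtype=float)

x_p = np.array([2, 0, -1, 1, 0], dtype=float)  # particular solution
u = np.array([2, 1, 0, 0, 0], dtype=float)     # kernel vector: x2 = 1, x5 = 0

assert np.allclose(A @ x_p, b)   # x_p solves the system
assert np.allclose(A @ u, 0)     # u is in ker(A)
for lam in (-3.0, 0.0, 7.5):     # shifting x_p along u stays a solution
    assert np.allclose(A @ (x_p + lam * u), b)
print("ok")
```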
Then the general solution set is:
\[\set{ \mathbf{x} \in \mathbb{R}^n \mid \mathbf{x} = \mathbf{x_p} + \lambda_1 \mathbf{u}_1 + \cdots + \lambda_k \mathbf{u}_k,\ \lambda_1, \cdots, \lambda_k \in \mathbb{R} }\]where $\mathbf{u}_1, \cdots, \mathbf{u}_k$ form a basis of $\ker(A)$.
Minus-1 Trick
There is a quick trick for identifying the kernel of a matrix, i.e., finding the solutions to a homogeneous system $A\mathbf{x} = \mathbf{0}$.
First reduce the matrix to a reduced row echelon form, e.g.
\[A = \left[ \begin{array}{ccccc} 1 & 3 & 0 & 0 & 3 \\ 0 & 0 & 1 & 0 & 9 \\ 0 & 0 & 0 & 1 & -4 \\ \end{array} \right]\]Then extend the matrix to a square matrix by inserting rows so that the diagonal holds a $1$ in every pivot column and a $-1$ in every non-pivot column.
\[\tilde{A} = \left[ \begin{array}{ccccc} 1 & 3 & 0 & 0 & 3 \\ 0 & \mathbf{-1} & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 9 \\ 0 & 0 & 0 & 1 & -4 \\ 0 & 0 & 0 & 0 & \mathbf{-1} \\ \end{array} \right]\]A basis for the solution set of the homogeneous system then consists of the columns of $\tilde{A}$ that contain a $-1$:
\[\ker(A) = \operatorname{span}\left\{ \begin{bmatrix} 3 \\ -1 \\ 0 \\ 0 \\ 0 \\ \end{bmatrix}, \begin{bmatrix} 3 \\ 0 \\ 9 \\ -4 \\ -1 \\ \end{bmatrix} \right\}\]Solving Systems of Linear Equations in Practice
Direct Methods
To solve a system of linear equations $A\mathbf{x} = \mathbf{b}$, one often resorts to calculating the inverse matrix $A^{-1}$.
However, the inverse exists only for square matrices of full rank, so it is not defined for most matrices.
Moore-Penrose pseudoinverse
The Moore-Penrose pseudoinverse is a generalization of the inverse. Under the mild assumption that $A$ has linearly independent columns, it is given by $(A^TA)^{-1}A^T$.
A quick derivation:
\[A\mathbf{x} = \mathbf{b} \Rightarrow A^TA\mathbf{x} = A^T\mathbf{b} \Rightarrow \mathbf{x} = (A^TA)^{-1}A^T\mathbf{b}\]However, this is also not recommended in practice: forming $A^TA$ is computationally expensive and numerically less stable than dedicated least-squares routines.
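A small comparison with a made-up tall matrix: the textbook formula versus NumPy's built-in routines, which are preferred numerically because they avoid forming $A^TA$ explicitly:

```python
import numpy as np

# Hypothetical 3x2 matrix with linearly independent columns
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])

x_formula = np.linalg.inv(A.T @ A) @ A.T @ b     # (A^T A)^{-1} A^T b
x_pinv = np.linalg.pinv(A) @ b                   # SVD-based pseudoinverse
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)  # least-squares solver

assert np.allclose(x_formula, x_pinv)
assert np.allclose(x_formula, x_lstsq)
print(x_pinv)  # [1. 2.]
```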
Gaussian elimination is likewise impractical for the large matrices that arise in real applications.
Iterative Methods
In practice, iterative methods such as the Jacobi, Gauss-Seidel, and conjugate gradient methods are used to approach the solution indirectly.
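As a minimal sketch of one such method, here is Jacobi iteration on a made-up example; it converges for, e.g., strictly diagonally dominant matrices:

```python
import numpy as np

def jacobi(A, b, iters=50):
    """Jacobi iteration: x_{k+1} = D^{-1} (b - R x_k),
    where D is the diagonal of A and R = A - D."""
    D = np.diag(A)            # diagonal entries of A
    R = A - np.diag(D)        # off-diagonal part
    x = np.zeros_like(b)      # start from the zero vector
    for _ in range(iters):
        x = (b - R @ x) / D   # elementwise divide by the diagonal
    return x

# Strictly diagonally dominant test system with exact solution (1, 1)
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
b = np.array([3.0, 3.0])
print(jacobi(A, b))  # ≈ [1. 1.]
```

Each sweep updates every unknown from the previous iterate only, so the method touches $A$ just through matrix-vector products; this is what makes iterative methods attractive for large sparse systems.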