Eigenvalues / Eigenvectors

Table of contents
  1. Characteristic Polynomial
    1. Roots of the Characteristic Polynomial
  2. Eigenvalues and Eigenvectors
    1. Finding Eigenvalues via Characteristic Polynomial
      1. Eigenvectors and the Kernel
      2. Rank of $\boldsymbol{A} - \lambda \boldsymbol{I}$
      3. Determinant of $\boldsymbol{A} - \lambda \boldsymbol{I}$
      4. Solving the Characteristic Polynomial
    2. Eigenvalues of a Transpose
  3. Defective Matrix
    1. Linear Independence of Eigenvectors
  4. Diagonalizable Matrix (Non-Defective Matrix)
    1. Eigen-Decomposition
    2. Determinant of a Matrix Using Eigenvalues
    3. Trace of a Matrix Using Eigenvalues
    4. Spectral Theorem
  5. Eigenspace
    1. Geometric Multiplicity

Characteristic Polynomial

For $\lambda \in \mathbb{R}$ and square matrix $\boldsymbol{A} \in \mathbb{R}^{n \times n}$,

Remember that $\lambda$ is a variable here.

The characteristic polynomial of $\boldsymbol{A}$ is:

$$ \begin{align*} \label{eq:characteristic-polynomial} p_{\boldsymbol{A}}(\lambda) &= \det(\boldsymbol{A} - \lambda \boldsymbol{I}) \\[1em] &= (-1)^n \lambda^n + c_{n-1} \lambda^{n-1} + \cdots + c_1 \lambda + c_0 \\[1em] &= (-1)^n \lambda^n + (-1)^{n-1} \tr(\boldsymbol{A}) \lambda^{n-1} + \cdots + (-1)^0 \det(\boldsymbol{A}) \end{align*} $$

where $c_i \in \mathbb{R}$.

Sometimes it is defined as: \[p_{\boldsymbol{A}}(\lambda) = \det(\lambda \boldsymbol{I} - \boldsymbol{A})\]

The only difference is that this version is always monic (its leading coefficient is $1$):

\[p_{\boldsymbol{A}}(\lambda) = \lambda^n + c_{n-1} \lambda^{n-1} + \cdots + c_1 \lambda + c_0\]

unlike the $(-1)^n$ leading coefficient in the first definition.
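
As a quick numerical sanity check (a minimal sketch using NumPy; the matrix is made up), `np.poly` of a square matrix returns the coefficients of the monic version $\det(\lambda \boldsymbol{I} - \boldsymbol{A})$, so the $\lambda^{n-1}$ coefficient should equal $-\tr(\boldsymbol{A})$ and the constant term $(-1)^n \det(\boldsymbol{A})$:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])
n = A.shape[0]

# Coefficients of the monic characteristic polynomial det(lambda*I - A),
# ordered from the highest power of lambda down to the constant term.
coeffs = np.poly(A)

print(coeffs)                                                # [ 1. -7. 10.]
print(np.isclose(coeffs[1], -np.trace(A)))                   # True: lambda^{n-1} coefficient
print(np.isclose(coeffs[-1], (-1) ** n * np.linalg.det(A)))  # True: constant term
```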

Roots of the Characteristic Polynomial

These are the roots $\lambda$ obtained by solving the polynomial equation:

$$ p_{\boldsymbol{A}}(\lambda) = 0 $$

These roots are also called the eigenvalues of $\boldsymbol{A}$.
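
Numerically (a sketch with the same made-up matrix), the roots of the characteristic polynomial and the eigenvalues reported by `np.linalg.eigvals` coincide:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

# Roots of the characteristic polynomial vs. eigenvalues computed directly.
roots = np.roots(np.poly(A))
eigenvalues = np.linalg.eigvals(A)

print(np.sort(roots))        # [2. 5.]
print(np.sort(eigenvalues))  # [2. 5.]
```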


Eigenvalues and Eigenvectors

Let $\boldsymbol{A} \in \mathbb{R}^{n \times n}$.

For the eigenvalue equation:

$$ \boldsymbol{A} \boldsymbol{x} = \lambda \boldsymbol{x} $$

A non-zero vector $\boldsymbol{x} \in \mathbb{R}^n \setminus \{\boldsymbol{0}\}$ and a scalar $\lambda \in \mathbb{R}$ that satisfy this equation are called an eigenvector and an eigenvalue of $\boldsymbol{A}$, respectively.

Intuition

Intuitively, it means that for the linear transformation represented by $\boldsymbol{A}$, there is a vector that is left unchanged by the transformation, except for its magnitude and possibly a reversal of direction (if $\lambda < 0$).

In other words, the transformation $\boldsymbol{A}$ can only scale its eigenvectors, each by the corresponding eigenvalue.

Some like to sort the eigenvalues in descending order and call them the first, second, etc. eigenvalues.
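
As a small NumPy sketch (matrix chosen arbitrarily), each eigenvector column returned by `np.linalg.eig` should satisfy the eigenvalue equation, and sorting the eigenvalues in descending order gives the "first", "second", etc. eigenvalues mentioned above:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

eigvals, eigvecs = np.linalg.eig(A)   # eigvecs[:, i] pairs with eigvals[i]

# Check A x = lambda x for every eigenpair.
for lam, x in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ x, lam * x))   # True

# "First", "second", ... eigenvalues: descending order.
order = np.argsort(eigvals)[::-1]
print(eigvals[order])                    # [5. 2.]
```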

Finding Eigenvalues via Characteristic Polynomial

Eigenvectors and the Kernel

The above equation can be rewritten as:

$$ (\boldsymbol{A} - \lambda \boldsymbol{I}) \boldsymbol{x} = \boldsymbol{0} $$

or equivalently:

\[(\lambda \boldsymbol{I} - \boldsymbol{A}) \boldsymbol{x} = \boldsymbol{0}\]

With this homogeneous system, we see that the non-trivial solutions $\boldsymbol{x}$ (the eigenvectors) lie in the kernel of the matrix $(\boldsymbol{A} - \lambda \boldsymbol{I})$:

$$ \boldsymbol{x} \in \ker(\boldsymbol{A} - \lambda \boldsymbol{I}) $$
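
To make this concrete (a sketch; the kernel is recovered from the SVD here, and the eigenvalue $\lambda = 5$ belongs to the same made-up matrix as before), the basis of $\ker(\boldsymbol{A} - \lambda \boldsymbol{I})$ is exactly the eigenvector direction, up to scaling:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])
lam = 5.0                                # an eigenvalue of A

M = A - lam * np.eye(2)

# Null space of M via the SVD: right singular vectors whose singular
# values are (numerically) zero span ker(M).
_, s, vh = np.linalg.svd(M)
kernel_basis = vh[s < 1e-10].T

print(kernel_basis)                      # one column, proportional to (2, 1)
print(np.allclose(M @ kernel_basis, 0))  # True: it solves (A - lambda*I) x = 0
```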

Rank of $\boldsymbol{A} - \lambda \boldsymbol{I}$

We saw that the eigenvector is a non-trivial element of the kernel.

This means that the columns of $\boldsymbol{A} - \lambda \boldsymbol{I}$ are linearly dependent (some non-trivial linear combination of them equals zero).

Therefore, $\boldsymbol{A} - \lambda \boldsymbol{I}$ cannot be full rank:

$$ \rank(\boldsymbol{A} - \lambda \boldsymbol{I}) < n $$
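
A quick rank check (a sketch with the same made-up matrix; $5$ is an eigenvalue, $7$ is not):

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

# A - lambda*I loses rank exactly when lambda is an eigenvalue.
print(np.linalg.matrix_rank(A - 5.0 * np.eye(2)))  # 1, i.e. < n = 2
print(np.linalg.matrix_rank(A - 7.0 * np.eye(2)))  # 2, full rank
```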

Determinant of $\boldsymbol{A} - \lambda \boldsymbol{I}$

Recall that the determinant of a matrix is non-zero if and only if the matrix is full rank.

(Equivalently, a matrix is invertible only when its columns are linearly independent; if they are not, the determinant must be zero, so that the inverse is undefined.)

However, we just saw that $\boldsymbol{A} - \lambda \boldsymbol{I}$ cannot be full rank.

Hence:

$$ \det(\boldsymbol{A} - \lambda \boldsymbol{I}) = 0 $$
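
And the matching determinant check (a sketch): the determinant vanishes, up to floating-point error, exactly at the eigenvalues:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

# det(A - lambda*I) is ~0 at an eigenvalue and non-zero elsewhere.
print(np.isclose(np.linalg.det(A - 5.0 * np.eye(2)), 0.0))  # True  (5 is an eigenvalue)
print(np.isclose(np.linalg.det(A - 7.0 * np.eye(2)), 0.0))  # False (7 is not)
```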

Solving the Characteristic Polynomial

Because we know

\[\det(\boldsymbol{A} - \lambda \boldsymbol{I}) = 0\]

we can use the characteristic polynomial to find the eigenvalues $\lambda$.
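
For example, with the same made-up matrix $\boldsymbol{A} = \begin{bmatrix} 4 & 2 \\ 1 & 3 \end{bmatrix}$:

$$ \det(\boldsymbol{A} - \lambda \boldsymbol{I}) = \begin{vmatrix} 4 - \lambda & 2 \\ 1 & 3 - \lambda \end{vmatrix} = (4 - \lambda)(3 - \lambda) - 2 = \lambda^2 - 7\lambda + 10 = (\lambda - 5)(\lambda - 2) $$

Setting this to zero gives the eigenvalues $\lambda_1 = 5$ and $\lambda_2 = 2$.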

Eigenvalues of a Transpose

$\boldsymbol{A}^\top$ and $\boldsymbol{A}$ have the same eigenvalues, since $\det(\boldsymbol{A}^\top - \lambda \boldsymbol{I}) = \det((\boldsymbol{A} - \lambda \boldsymbol{I})^\top) = \det(\boldsymbol{A} - \lambda \boldsymbol{I})$, i.e. they share the same characteristic polynomial.

However, the eigenvectors are not necessarily the same.


Defective Matrix

A square matrix $\boldsymbol{A} \in \mathbb{R}^{n \times n}$ is defective if it does not have $n$ linearly independent eigenvectors.
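
The classic example is a shear matrix (a sketch; `np.linalg.eig` returns two numerically parallel eigenvector columns here, so there is no basis of eigenvectors):

```python
import numpy as np

# Shear matrix: the eigenvalue 1 is repeated, but there is only one
# linearly independent eigenvector, so the matrix is defective.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])

eigvals, eigvecs = np.linalg.eig(A)
print(eigvals)                                  # [1. 1.]
print(eigvecs)                                  # columns are (numerically) parallel
print(np.isclose(np.linalg.det(eigvecs), 0.0))  # True: eigenvectors do not span R^2
```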

Linear Independence of Eigenvectors

Eigenvectors corresponding to distinct eigenvalues are linearly independent.

Therefore, if $\boldsymbol{A} \in \mathbb{R}^{n \times n}$ has $n$ distinct eigenvalues, its eigenvectors form a basis of $\mathbb{R}^n$.


Diagonalizable Matrix (Non-Defective Matrix)

We say that a matrix $\boldsymbol{A} \in \mathbb{R}^{n \times n}$ is diagonalizable (or non-defective) if it is similar to a diagonal matrix $\boldsymbol{D}$.

This says that we can turn $\boldsymbol{A}$ into a diagonal matrix by a change of basis.

In other words, there exists an invertible matrix $\boldsymbol{P}$ such that:

\[\boldsymbol{D} = \boldsymbol{P}^{-1} \boldsymbol{A} \boldsymbol{P}\]

This can be rewritten as:

\[\boldsymbol{A} = \boldsymbol{P} \boldsymbol{D} \boldsymbol{P}^{-1}\]

Eigen-Decomposition

Any non-defective matrix $\boldsymbol{A} \in \mathbb{R}^{n \times n}$ can be decomposed into:

$$ \boldsymbol{A} = \boldsymbol{P} \boldsymbol{D} \boldsymbol{P}^{-1} $$

where $\boldsymbol{P}$ is a matrix whose columns are the eigenvectors of $\boldsymbol{A}$, and $\boldsymbol{D}$ is a diagonal matrix whose diagonal elements are the eigenvalues of $\boldsymbol{A}$.

What?

Rewrite the above equation as:

\[\boldsymbol{A} \boldsymbol{P} = \boldsymbol{P} \boldsymbol{D}\]

Let’s say $\boldsymbol{P} = [\boldsymbol{p}_1 \dots \boldsymbol{p}_n]$, then:

\[\boldsymbol{A} \boldsymbol{P} = [\boldsymbol{A} \boldsymbol{p}_1 \dots \boldsymbol{A} \boldsymbol{p}_n]\]

Keeping in mind that $\boldsymbol{D}$ is a diagonal matrix, let its principal diagonal elements be $c_1, \dots, c_n$; then:

\[\boldsymbol{P} \boldsymbol{D} = [c_1 \boldsymbol{p}_1 \dots c_n \boldsymbol{p}_n]\]

Since $\boldsymbol{A} \boldsymbol{p}_i = c_i \boldsymbol{p}_i$,

we see that each $\boldsymbol{p}_i$ is an eigenvector of $\boldsymbol{A}$, and $c_i$ is the corresponding eigenvalue.
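
Numerically (a sketch with the same made-up diagonalizable matrix), `np.linalg.eig` already hands back the diagonal of $\boldsymbol{D}$ and the columns of $\boldsymbol{P}$, so both $\boldsymbol{A} \boldsymbol{P} = \boldsymbol{P} \boldsymbol{D}$ and $\boldsymbol{A} = \boldsymbol{P} \boldsymbol{D} \boldsymbol{P}^{-1}$ can be checked directly:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

eigvals, P = np.linalg.eig(A)   # columns of P are eigenvectors of A
D = np.diag(eigvals)            # eigenvalues on the diagonal

print(np.allclose(A @ P, P @ D))                 # A P = P D
print(np.allclose(A, P @ D @ np.linalg.inv(P)))  # A = P D P^{-1}
```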

Determinant of a Matrix Using Eigenvalues

What follows is a quick way to calculate the determinant of a matrix using its eigenvalues.

A property of the determinant is that it is invariant under change of basis.

So for similar matrices $\boldsymbol{A}$ and $\boldsymbol{D}$, the determinants are equal. And the determinant of the diagonal matrix $\boldsymbol{D}$ is easy to calculate: it is just the product of its diagonal elements, which are the eigenvalues of $\boldsymbol{A}$.

Therefore the determinant of a matrix is the product of its eigenvalues:

$$ \det(\boldsymbol{A}) = \prod_{i=1}^n \lambda_i $$

The $\lambda_i$ are not necessarily distinct! A matrix can have repeated eigenvalues, and they are counted with repetition here.

Trace of a Matrix Using Eigenvalues

The same logic applies to the trace, as it is also invariant under a change of basis.

The trace of $\boldsymbol{D}$ is the sum of its diagonal elements (the eigenvalues of $\boldsymbol{A}$).

Therefore the trace of a matrix is the sum of its eigenvalues:

$$ \tr(\boldsymbol{A}) = \sum_{i=1}^n \lambda_i $$

Again, repetition counts.
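
Both identities are easy to verify numerically (a sketch; the eigenvalues are taken with repetition, exactly as `np.linalg.eigvals` returns them):

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

eigvals = np.linalg.eigvals(A)

print(np.isclose(np.prod(eigvals), np.linalg.det(A)))  # det(A) = product of eigenvalues
print(np.isclose(np.sum(eigvals), np.trace(A)))        # tr(A)  = sum of eigenvalues
```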

Spectral Theorem

A symmetric matrix $\boldsymbol{A} \in \mathbb{R}^{n \times n}$ is always diagonalizable.

Further, the eigenvectors of $\boldsymbol{A}$ can be chosen to form an orthonormal basis, so the eigenvector matrix $\boldsymbol{P}$ is orthogonal:

\[\boldsymbol{P}^\top \boldsymbol{P} = \boldsymbol{I} \iff \boldsymbol{P}^\top = \boldsymbol{P}^{-1}\]

Therefore the decomposition is:

$$ \boldsymbol{A} = \boldsymbol{P} \boldsymbol{D} \boldsymbol{P}^\top $$
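
A sketch with a symmetric matrix (values made up): `np.linalg.eigh`, which is intended for symmetric/Hermitian matrices, returns an orthonormal eigenvector matrix, so both $\boldsymbol{P}^\top \boldsymbol{P} = \boldsymbol{I}$ and $\boldsymbol{A} = \boldsymbol{P} \boldsymbol{D} \boldsymbol{P}^\top$ can be verified directly:

```python
import numpy as np

# A symmetric matrix.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

eigvals, P = np.linalg.eigh(A)   # eigh is meant for symmetric/Hermitian input
D = np.diag(eigvals)

print(np.allclose(P.T @ P, np.eye(3)))  # eigenvectors are orthonormal
print(np.allclose(A, P @ D @ P.T))      # A = P D P^T
```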


Eigenspace

Eigenvectors are not unique; that is, any non-zero scalar multiple of an eigenvector is also an eigenvector.

The eigenvectors that share the same eigenvalue $\lambda$ span a subspace of $\mathbb{R}^n$, called the eigenspace of $\boldsymbol{A}$ with respect to $\lambda$ (denoted with a subscript):

$$ E_{\lambda} = \ker(\boldsymbol{A} - \lambda \boldsymbol{I}) $$

Geometric Multiplicity

Let $\lambda$ be an eigenvalue of square matrix $\boldsymbol{A}$.

Then the geometric multiplicity of $\lambda$ is the number of linearly independent eigenvectors corresponding to $\lambda$, that is,

$$ \dim(E_{\lambda}) $$

Algebraic multiplicity, on the other hand, is the multiplicity of $\lambda$ as a root of the characteristic polynomial. For example, if an eigenvalue appears as a double root, then its algebraic multiplicity is 2.
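
As a closing sketch (matrices made up), the geometric multiplicity can be computed as $\dim(E_{\lambda}) = n - \rank(\boldsymbol{A} - \lambda \boldsymbol{I})$. The defective shear matrix from earlier has algebraic multiplicity 2 but geometric multiplicity 1, while $2\boldsymbol{I}$ has both equal to 2:

```python
import numpy as np

def geometric_multiplicity(A, lam):
    """Dimension of the eigenspace E_lambda = ker(A - lambda*I)."""
    n = A.shape[0]
    return n - np.linalg.matrix_rank(A - lam * np.eye(n))

shear = np.array([[1.0, 1.0],
                  [0.0, 1.0]])      # eigenvalue 1 has algebraic multiplicity 2
scaled_identity = 2.0 * np.eye(2)   # eigenvalue 2 has algebraic multiplicity 2

print(geometric_multiplicity(shear, 1.0))            # 1 (defective)
print(geometric_multiplicity(scaled_identity, 2.0))  # 2
```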