Matthew Bennett


I am a data scientist working on time series forecasting (using R and Python 3) at the London Ambulance Service NHS Trust. I earned my PhD in cognitive neuroscience at the University of Glasgow, working with fMRI data and neural networks. I favour Linux machines and working in the terminal, with Vim as my editor of choice.

View my GitHub Profile

View my LinkedIn Profile

View my CV

Eigenvalues and eigenvectors

Eigenvalues

At this point in my studies, I have learned about eigenvalues and how to discover them in the context of the determinant (reviewed below), but this method is not practical for matrices larger than 3 by 3. Therefore, I had to skip ahead to quickly implement the algorithms for finding eigenvalues (and the corresponding eigenvectors). At present the methods cannot handle complex numbers, and so should only be used for symmetric and triangular real matrices (since these always have real eigenvalues). This means I don't have a deep understanding of how these algorithms work, so I'll postpone the usual description of how the code works until later and instead focus on what I understand theoretically.

Thinking of a matrix $A$ as encoding a linear transformation of a vector space, any nonzero vector $x$ that points along the same line after the transformation (i.e. after the multiplication $Ax$) as before it is called an eigenvector. The amount by which its length is scaled is called the eigenvalue (typically denoted $\lambda$). We usually think of the eigenvalues and eigenvectors as properties of the matrix $A$. You can also view the eigenvalues as the roots of the characteristic polynomial, which comes from the determinant. We have 2 unknowns to find: $x$ and $\lambda$:

$$ \begin{equation} Ax = \lambda x \end{equation} $$
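
To make this concrete with a small example of my own: take $A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$ and $x = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$. Then:

$$ \begin{equation} Ax = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \end{bmatrix} = 3x \end{equation} $$

The direction of $x$ is unchanged and its length is tripled, so $x$ is an eigenvector of $A$ with eigenvalue $\lambda = 3$.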

The first thing we do is rewrite the equation, inserting the identity matrix $I$ (which changes nothing, but will let us factor out $x$):

$$ \begin{equation} Ax = \lambda Ix \end{equation} $$

Then rearrange:

$$ \begin{equation} Ax - \lambda Ix = 0 \end{equation} $$

Factoring out the $x$:

$$ \begin{equation} (A - \lambda I)x = 0 \end{equation} $$

Since the eigenvector $x$ is nonzero, this tells us that the matrix $(A - \lambda I)$ must be singular, which means its determinant is zero.

In the 2 by 2 case, the matrix $(A - \lambda I)$ is:

$$ \begin{equation} \begin{bmatrix} a - \lambda & b \\ c & d - \lambda \end{bmatrix} \end{equation} $$

The determinant of that matrix is:

$$ \begin{equation} (a - \lambda)(d - \lambda) - bc = 0 \end{equation} $$

Expanding, we see the familiar polynomial form:

$$ \begin{equation} \lambda^2 - (a + d)\lambda + (ad - bc) = 0 \end{equation} $$

Notice the trace and the determinant appearing in the above equation:

$$ \begin{equation} \lambda^2 - (\text{trace})\lambda + (\text{determinant}) = 0 \end{equation} $$

We factor that quadratic:

$$ \begin{equation} (\lambda - z_1)(\lambda - z_2) = 0 \end{equation} $$

Where $z_1$ and $z_2$ are the numbers whose sum is the trace and whose product is the determinant (expanding the factored form gives $\lambda^2 - (z_1 + z_2)\lambda + z_1 z_2 = 0$). Setting $\lambda$ equal to either $z_1$ or $z_2$ makes the product zero, so $z_1$ and $z_2$ are the eigenvalues of the matrix.
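
Returning to the small example from earlier: $\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$ has trace $2 + 2 = 4$ and determinant $4 - 1 = 3$, so its characteristic polynomial is:

$$ \begin{equation} \lambda^2 - 4\lambda + 3 = (\lambda - 1)(\lambda - 3) = 0 \end{equation} $$

giving eigenvalues $z_1 = 1$ and $z_2 = 3$. Sure enough, $z_1 + z_2 = 4$ (the trace), $z_1 z_2 = 3$ (the determinant), and $\lambda = 3$ matches the eigenvector $x$ we found above.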

Code implementation

def eigvalues(self, epsilon=0.0001, max_its=100):
    if not (self.is_symmetric() or self.is_lower_tri() or self.is_upper_tri()):
        print('Matrix is not symmetric or triangular and may therefore have complex eigenvalues which this method cannot handle. Interpret results with care!')

    if self.is_upper_tri() or self.is_lower_tri():
        # the eigenvalues of a triangular matrix sit on its diagonal
        return Mat([self.diag()])
    if self.is_singular():
        print('Matrix is singular!')
        return None

    old_eig = 0
    final_eigs = []
    # work on a copy so deflation doesn't destroy the original matrix
    A = dc(self)
    for its in range(max_its):

        # obtain off diagonal zeros with a similarity transform
        # (E A E^-1), which preserves the eigenvalues
        _, E, _, _, _, _ = A.elimination()
        Einv = E.inverse()
        A = E.multiply(A).multiply(Einv)

        # shift A by -cI, where c is the current eigenvalue estimate
        shift = eye(A.size()).multiply_elwise(old_eig)
        A = A.subtract(shift)

        # one QR step: factor A = QR, form RQ (another similarity
        # transform), then undo the shift
        _, Q, R = A.qr()
        A = R.multiply(Q)
        A = A.add(shift)

        # the bottom right entry of A converges to an eigenvalue
        current_eig = A.diag()[-1]
        diff = old_eig - current_eig
        old_eig = current_eig
        if abs(diff) < epsilon:
            if min(A.size()) == 2:
                # the last two eigenvalues sit on the 2 by 2 diagonal
                final_eigs += A.diag()
                return Mat([final_eigs])
            else:
                # deflate: store the converged eigenvalue, then drop
                # the last row and column and carry on with the
                # smaller matrix
                final_eigs.append(current_eig)
                A = A.data[:-1]
                A = [row[:-1] for row in A]
                A = Mat(A)
                old_eig = A.diag()[-1]
    else:
        print('Did not converge!')
        return None
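
Since I've postponed the full explanation, here is a minimal sketch (using NumPy rather than my linalg library, purely for illustration) of the core idea the method builds on: the unshifted QR iteration. The method above adds shifts and deflation, which speed up convergence considerably.

import numpy as np

def qr_iteration_sketch(A, max_its=100):
    # each step factors A = QR and forms RQ; since RQ = Q^T A Q,
    # every step is a similarity transform and preserves the
    # eigenvalues, and for a real symmetric matrix the iterates
    # tend towards a diagonal matrix holding the eigenvalues
    A = np.array(A, dtype=float)
    for _ in range(max_its):
        Q, R = np.linalg.qr(A)
        A = R @ Q
    return np.diag(A)

print(qr_iteration_sketch([[1, 2, 3],
                           [2, 4, 4],
                           [3, 4, 5]]))  # approx. [9.95, 0.65, -0.61]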

Demo

We create a matrix, call the eigvalues method and print the result:

import linalg as la

A = la.Mat([[1, 2, 3],
            [2, 4, 4],
            [3, 4, 5]])

evals = A.eigvalues()

la.print_mat(evals, 2)

Outputs:

>>> la.print_mat(evals, 2)
[-0.61, 9.95, 0.65]
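
As a sanity check (this assumes NumPy is available; it is not part of my linalg library), numpy.linalg.eigvalsh computes the eigenvalues of a symmetric matrix and agrees with the result above. Note also that the three values sum to the trace of $A$ (10), as the polynomial view predicts:

import numpy as np

A = np.array([[1, 2, 3],
              [2, 4, 4],
              [3, 4, 5]])

print(np.linalg.eigvalsh(A))  # approx. [-0.61, 0.65, 9.95], in ascending order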

Eigenvectors

Once the eigenvalues are known, the code below recovers each eigenvector by inverse iteration: shift $A$ by the eigenvalue, then repeatedly solve the shifted system (normalising after each solve) until the solution settles on a direction. That direction is the eigenvector.

Code implementation

def eig(self, epsilon=0.0001, max_its=100):
    if self.is_singular():
        print('Matrix is singular!')
        return None, None
    evals = self.eigvalues()
    evects = []
    for evalue in evals.data[0]:
        # nudge the eigenvalue so the shift below doesn't make the
        # matrix exactly singular
        if evalue in self.diag():
            evalue -= 1e-12
        A_shifted = self.subtract(eye(self.size()).multiply_elwise(evalue))
        # start from an arbitrary unit vector
        b = gen_mat([self.size(0), 1], values=[1])
        b = b.norm()
        for its in range(max_its):
            old_b = dc(b)
            # solve the shifted system for the next estimate of b
            b = A_shifted.backsub(b)
            b = b.norm()
            # b may flip sign from one iteration to the next, so
            # compare it against both old_b and -old_b
            diff1 = b.subtract(old_b)
            diff2 = b.subtract(old_b.multiply_elwise(-1))
            if diff1.length() < epsilon or diff2.length() < epsilon:
                evects.append(b.tr().data[0])
                break
    evects = Mat(evects).tr()
    return evects, evals

Demo

We create a matrix, call the eig method and print the results. Then we check that an eigenvector multiplied by $A$ gives the same result as the eigenvector scaled by the eigenvalue:

A = la.Mat([[1, 2, 3],
            [2, 4, 4],
            [3, 4, 5]])

evects, evals = A.eig()

la.print_mat(evals, 2)
la.print_mat(evects, 2)

# extract the first eigenvector (the first column of evects)
evect = la.Mat([evects.tr().data[0]]).tr()
# multiply eigenvector by A and print result
la.print_mat(A.multiply(evect), 2)
# scale eigenvector by eigenvalue and print result
la.print_mat(evect.scale(evals.data[0][0]), 2)

Outputs:

>>> la.print_mat(evals, 2)
[-0.61, 9.95, 0.65]

>>> la.print_mat(evects, 2)
[-0.86, 0.37, 0.35]
[-0.07, 0.6, -0.8]
[0.51, 0.71, 0.49]

>>> la.print_mat(A.multiply(evect), 2)
[0.53]
[0.04]
[-0.31]

>>> la.print_mat(evect.scale(evals.data[0][0]), 2)
[0.53]
[0.04]
[-0.31]
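
For one last external check (again assuming NumPy is available), numpy.linalg.eigh returns the eigenvalues of a symmetric matrix in ascending order along with the eigenvectors as columns; these should match the results above up to column order and sign:

import numpy as np

A = np.array([[1, 2, 3],
              [2, 4, 4],
              [3, 4, 5]])

evals, evects = np.linalg.eigh(A)
print(evals)   # approx. [-0.61, 0.65, 9.95]
print(evects)  # eigenvectors as columns (order and sign may differ)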

< $A = QR$

Diagonalisation: $A = X\Lambda X^{-1}$ >

back to project main page
back to home