It is now quite easy to prove the easiest one of the so-called canonical form theorems. Our assumption about the scalar field (namely, that it is algebraically closed) is still in force.

Theorem 1. If \(A\) is any linear transformation on an \(n\) -dimensional vector space \(\mathcal{V}\) , then there exist \(n+1\) subspaces \(\mathcal{M}_{0}, \mathcal{M}_{1}, \ldots, \mathcal{M}_{n-1}, \mathcal{M}_{n}\) with the following properties:

  1. each \(\mathcal{M}_{j}\) ( \(j=0,1, \ldots, n-1, n\) ) is invariant under \(A\) ,
  2. the dimension of \(\mathcal{M}_{j}\) is \(j\) ,
  3. ( \(\mathcal{O} =\) ) \(\mathcal{M}_{0} \subset \mathcal{M}_{1} \subset \cdots \subset \mathcal{M}_{n-1} \subset \mathcal{M}_{n}\) ( \(= \mathcal{V}\) ).

Proof. If \(n=0\) or \(n=1\) , the result is trivial; we proceed by induction, assuming that the statement is correct for \(n-1\) . Consider the dual transformation \(A^{\prime}\) on \(\mathcal{V}^{\prime}\) ; since the scalar field is algebraically closed, \(A^{\prime}\) has at least one proper vector, say \(x^{\prime}\) , and hence there exists a one-dimensional subspace \(\mathcal{M}\) invariant under it, namely, the set of all multiples of \(x^{\prime}\) . Let us denote by \(\mathcal{M}_{n-1}\) the annihilator (in \(\mathcal{V}^{\prime \prime}=\mathcal{V}\) ) of \(\mathcal{M}\) , \(\mathcal{M}_{n-1}=\mathcal{M}^{0}\) ; then \(\mathcal{M}_{n-1}\) is an \((n-1)\) -dimensional subspace of \(\mathcal{V}\) , and \(\mathcal{M}_{n-1}\) is invariant under \(A\) . Consequently we may consider \(A\) as a linear transformation on \(\mathcal{M}_{n-1}\) alone, and, by the induction hypothesis, we may find \(\mathcal{M}_{0}, \mathcal{M}_{1}, \ldots, \mathcal{M}_{n-2}\) , \(\mathcal{M}_{n-1}\) , satisfying the conditions (1), (2), and (3). We write \(\mathcal{M}_{n}=\mathcal{V}\) , and we are done. ◻

The chief interest of this theorem comes from its matricial interpretation. Since \(\mathcal{M}_{1}\) is one-dimensional, we may find in it a vector \(x_{1} \neq 0\) . Since \(\mathcal{M}_{1} \subset \mathcal{M}_{2}\) , it follows that \(x_{1}\) is also in \(\mathcal{M}_{2}\) , and since \(\mathcal{M}_{2}\) is two-dimensional, we may find in it a vector \(x_{2}\) such that \(x_{1}\) and \(x_{2}\) span \(\mathcal{M}_{2}\) . We proceed in this way by induction, choosing vectors \(x_{j}\) so that \(x_{1}, \ldots, x_{j}\) lie in \(\mathcal{M}_{j}\) and span \(\mathcal{M}_{j}\) for \(j=1, \ldots, n\) . We obtain finally a basis \(\mathcal{X}=\{x_{1}, \ldots, x_{n}\}\) in \(\mathcal{V}\) ; let us compute the matrix of \(A\) in this coordinate system. Since \(x_{j}\) is in \(\mathcal{M}_{j}\) and since \(\mathcal{M}_{j}\) is invariant under \(A\) , it follows that \(A x_{j}\) must be a linear combination of \(x_{1}, \ldots, x_{j}\) . Hence in the expression \[A x_{j}=\sum_{i} \alpha_{i j} x_{i}\] the coefficient of \(x_{i}\) must vanish whenever \(i>j\) ; in other words, \(i>j\) implies \(\alpha_{i j}=0\) . Hence the matrix of \(A\) has the triangular form \begin{align} [A] = \begin{bmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1(n-1)} & \alpha_{1n}\\ 0 & \alpha_{22} & \cdots & \alpha_{2(n-1)} & \alpha_{2n}\\ \vdots & \vdots & \ddots & \vdots & \vdots\\ 0 & 0 & \cdots & \alpha_{(n-1)(n-1)} & \alpha_{(n-1)n}\\ 0 & 0 & \cdots & 0 & \alpha_{nn} \end{bmatrix}. \end{align} It is clear from this representation that \(\operatorname{det}(A-\alpha_{i i})=0\) for \(i=1, \ldots, n\) , so that the \(\alpha_{i i}\) are the proper values of \(A\) , appearing on the main diagonal of \([A]\) with the proper multiplicities. We sum up as follows.
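
The proof and the basis construction just described can be imitated numerically. The sketch below is an illustration of ours, not part of the text; it uses Python with NumPy and SciPy, the function name `triangularizing_basis` is our own, and the matrix is assumed generic enough for floating-point eigenvector computations to behave. A proper vector of the dual transformation \(A^{\prime}\) appears as a left eigenvector \(w\) of \(A\); its annihilator plays the role of \(\mathcal{M}_{n-1}\), and the induction step is a recursive call.

```python
import numpy as np
from scipy.linalg import null_space

def triangularizing_basis(A):
    """Return B whose columns x_1, ..., x_n realize the chain
    M_1 < M_2 < ... < M_n, so that inv(B) @ A @ B is upper triangular."""
    n = A.shape[0]
    if n == 1:
        return np.eye(1, dtype=complex)
    # A proper vector of the dual transformation A' is a left eigenvector
    # of A: a row w with w @ A = lam * w; eigenvectors of A.T serve.
    _, left_vecs = np.linalg.eig(A.T)
    w = left_vecs[:, 0]
    # M_{n-1} is the annihilator of w: all x with w @ x = 0.  null_space
    # returns an orthonormal basis N (an n x (n-1) matrix) of it.
    N = null_space(w.reshape(1, -1))
    # M_{n-1} is invariant under A, so in the basis N the restriction of
    # A to M_{n-1} is represented by the (n-1) x (n-1) matrix S.
    S = N.conj().T @ A @ N
    B_sub = triangularizing_basis(S)   # the induction step
    x_first = N @ B_sub               # x_1, ..., x_{n-1} span M_{n-1}
    x_last = w.conj().reshape(-1, 1)  # any vector outside M_{n-1}
    return np.hstack([x_first, x_last])

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
B = triangularizing_basis(A)
T = np.linalg.inv(B) @ A @ B
assert np.allclose(np.tril(T, -1), 0, atol=1e-8)   # triangular form
assert np.allclose(np.sort_complex(np.diag(T)),    # diagonal = proper values
                   np.sort_complex(np.linalg.eigvals(A)), atol=1e-6)
```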

Theorem 2. If \(A\) is a linear transformation on an \(n\) -dimensional vector space \(\mathcal{V}\) , then there exists a basis \(\mathcal{X}\) in \(\mathcal{V}\) such that the matrix \([A; \mathcal{X}]\) is triangular; or, equivalently, if \([A]\) is any matrix, there exists a non-singular matrix \([B]\) such that \([B]^{-1}[A][B]\) is triangular.
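
Numerical libraries provide a sharper version of the matricial statement: the Schur decomposition produces a unitary (in particular, non-singular) \([B]\) for any complex matrix. A quick check of ours, assuming SciPy is available:

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
# Complex Schur form: A = B @ T @ B^H with B unitary, T upper triangular,
# so inv(B) = B^H and inv(B) @ A @ B = T, as in Theorem 2.
T, B = schur(A, output='complex')
assert np.allclose(B @ T @ B.conj().T, A)   # similarity holds
assert np.allclose(np.tril(T, -1), 0)       # T is triangular
```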

The triangular form is useful for proving many results about linear transformations. It follows from it, for example, that for any polynomial \(p\) , the proper values of \(p(A)\) , including their algebraic multiplicities, are precisely the numbers \(p(\lambda)\) , where \(\lambda\) runs through the proper values of \(A\) ; indeed, if \([A]\) is triangular with diagonal entries \(\alpha_{11}, \ldots, \alpha_{nn}\) , then \([p(A)]\) is triangular with diagonal entries \(p(\alpha_{11}), \ldots, p(\alpha_{nn})\) .
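
This consequence is easy to test numerically; the following sketch of ours uses an arbitrary cubic polynomial and a generic random matrix (both our choices, not the text's):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
p_of_A = A @ A @ A - 2 * A + 5 * np.eye(4)   # p(t) = t^3 - 2t + 5
lam = np.linalg.eigvals(A)
# Proper values of p(A), multiplicities included, are the numbers p(lam).
assert np.allclose(np.sort_complex(np.linalg.eigvals(p_of_A)),
                   np.sort_complex(lam**3 - 2 * lam + 5))
```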

A large part of the theory of linear transformations is devoted to improving the triangularization result just obtained. The best thing a matrix can be is not triangular but diagonal (that is, \(\alpha_{i j}=0\) unless \(i=j\) ); if a linear transformation is such that its matrix with respect to a suitable coordinate system is diagonal, we shall call the transformation diagonable.
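
A transformation is diagonable precisely when its proper vectors span the whole space. The following sketch of ours probes this numerically, with the usual floating-point caveats, by checking whether the matrix of computed proper vectors has full rank:

```python
import numpy as np

def seems_diagonable(A, tol=1e-8):
    # Diagonable <=> there are n linearly independent proper vectors,
    # i.e. the matrix whose columns are the computed proper vectors
    # has full rank (numerically: is not close to singular).
    _, vecs = np.linalg.eig(A)
    return np.linalg.matrix_rank(vecs, tol=tol) == A.shape[0]

print(seems_diagonable(np.array([[1.0, 1.0], [1.0, 0.0]])))  # True
print(seems_diagonable(np.array([[1.0, 1.0], [0.0, 1.0]])))  # False
```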

EXERCISES

Exercise 1. Interpret the following matrices as linear transformations on \(\mathbb{C}^{2}\) or \(\mathbb{C}^{3}\) (as their size dictates) and, in each case, find a basis such that the matrix of the transformation with respect to that basis is triangular. (A numerical sketch for checking such computations follows the list.)

  1. \(\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}\) .
  2. \(\begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}\) .
  3. \(\begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}\) .
  4. \(\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\) .
  5. \(\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}\) .
  6. \(\begin{bmatrix} 0 & 1 & 1 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}\) .
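
The following sketch of ours is not a solution, but a way to check hand computations: `scipy.linalg.schur` (or the `triangularizing_basis` sketch above) triangularizes each matrix, and the diagonal of the result lists the proper values.

```python
import numpy as np
from scipy.linalg import schur

matrices = [
    [[1, 1], [0, 1]],
    [[1, 1], [1, 0]],
    [[1, 0], [1, 1]],
    [[1, 1], [1, 1]],
    [[0, 1, 0], [0, 0, 1], [1, 0, 0]],
    [[0, 1, 1], [0, 0, 1], [1, 0, 0]],
]
for M in matrices:
    T, B = schur(np.array(M, dtype=complex), output='complex')
    # The columns of B form a basis of the required kind; the diagonal
    # of T = inv(B) @ M @ B lists the proper values.
    print(np.round(np.diag(T), 4))
```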

Exercise 2. Two commuting linear transformations on a finite-dimensional vector space \(\mathcal{V}\) over an algebraically closed field can be simultaneously triangularized. In other words, if \(A B=B A\) , then there exists a basis \(\mathcal{X}\) such that both \([A; \mathcal{X}]\) and \([B; \mathcal{X}]\) are triangular. (Hint: to imitate the proof of Theorem 1, it is desirable to find a subspace \(\mathcal{M}\) of \(\mathcal{V}\) invariant under both \(A\) and \(B\) . With this in mind, consider any proper value \(\lambda\) of \(A\) and examine the set of all solutions of \(A x=\lambda x\) for the role of \(\mathcal{M}\) .)
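
For one special commuting pair the phenomenon is easy to observe numerically: if \(B=p(A)\) for a polynomial \(p\) , then any basis triangularizing \(A\) also triangularizes \(B\) . The sketch below is ours and checks only this special case with \(p(t)=t^{2}+3t\) ; the general case of the exercise is of course stronger.

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
B = A @ A + 3 * A                  # B = p(A), so A @ B == B @ A
T_A, Q = schur(A, output='complex')
T_B = Q.conj().T @ B @ Q           # the same basis, applied to B
assert np.allclose(A @ B, B @ A)           # the pair commutes
assert np.allclose(np.tril(T_A, -1), 0)    # A is triangularized
assert np.allclose(np.tril(T_B, -1), 0)    # and so is B, by the same basis
```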

Exercise 3. Formulate and prove the analogues of the results of this section for triangular matrices below the diagonal (instead of above it).

Exercise 4. Suppose that \(A\) is a linear transformation on an \(n\) -dimensional vector space. For every alternating \(n\) -linear form \(w\) , write \(\overline{A} w\) for the function defined by \begin{align} (\overline{A} w)(x_{1}, \ldots, x_{n}) &= w(A x_{1}, x_{2}, \ldots, x_{n})\\ &\quad + w(x_{1}, A x_{2}, \ldots, x_{n})\\ &\quad + \cdots\\ &\quad + w(x_{1}, x_{2}, \ldots, A x_{n}). \end{align} 

Since \(\overline{A} w\) is an alternating \(n\) -linear form, and, in fact, \(\overline{A}\) is a linear transformation on the (one-dimensional) space of such forms, it follows that \(\overline{A} w=\tau(A) \cdot w\) , where \(\tau(A)\) is a scalar. Prove the following assertions about \(\tau\) . (A numerical sketch of \(\tau\) follows the list.)

  1. \(\tau(0)=0\) .
  2. \(\tau(1)=n\) .
  3. \(\tau(A+B)=\tau(A)+\tau(B)\) .
  4. \(\tau(\alpha A)=\alpha \tau(A)\) .
  5. If the scalar field has characteristic zero and if \(A\) is a projection, then \(\tau(A)=\rho(A)\) .
  6. If \((\alpha_{i j})\) is the matrix of \(A\) in some coordinate system, then \(\tau(A)=\sum_{i} \alpha_{i i}\) .
  7. \(\tau(A^{\prime})=\tau(A)\) .
  8. \(\tau(A B)=\tau(B A)\) .
  9. For which permutations \(\pi\) of the integers \(1, \ldots, k\) is it true that \[\tau(A_{1} \ldots A_{k})=\tau(A_{\pi(1)} \ldots A_{\pi(k)})\] for all \(k\) -tuples \((A_{1}, \ldots, A_{k})\) of linear transformations?
  10. If the field of scalars is algebraically closed, then \(\tau(A)=\operatorname{tr} A\) . (For this reason trace is usually defined to be \(\tau\) ; the most popular procedure is to use (6) as the definition.)
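
As a check on (6) and (10), one may take for \(w\) the determinant (regarded as an alternating \(n\) -linear form of the column vectors) and evaluate \((\overline{A} w)(e_{1}, \ldots, e_{n})\) directly. A numerical sketch of ours, in Python:

```python
import numpy as np

def tau(A):
    """Evaluate (A-bar w)(e_1, ..., e_n) with w = det, an alternating
    n-linear form of the column vectors."""
    n = A.shape[0]
    total = 0.0
    for i in range(n):
        M = np.eye(n)
        M[:, i] = A[:, i]   # the i-th argument e_i is replaced by A e_i
        total += np.linalg.det(M)
    return total

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 5))
assert np.isclose(tau(A), np.trace(A))   # item (6): tau is the diagonal sum
```

Each summand is the determinant of the identity with one column replaced by the corresponding column of \(A\) , which is just the diagonal entry \(\alpha_{ii}\) ; the sum is therefore the trace.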

Exercise 5. 

  1. Suppose that the scalar field has characteristic zero. Prove that if \(E_{1}, \ldots, E_{k}\) and \(E_{1}+\cdots+E_{k}\) are projections, then \(E_{i} E_{j}=0\) whenever \(i \neq j\) . (Hint: from the fact that \[\operatorname{tr}(E_{1}+\cdots+E_{k})=\operatorname{tr}(E_{1})+\cdots+\operatorname{tr}(E_{k}),\] together with (5) of Exercise 4 (in characteristic zero the trace of a projection is its rank), conclude that the range of \(E_{1}+\cdots+E_{k}\) is the direct sum of the ranges of \(E_{1}, \ldots, E_{k}\) .) A numerical check of a small example follows the list.
  2. If \(A_{1}, \ldots, A_{k}\) are linear transformations on an \(n\) -dimensional vector space, and if \(A_{1}+\cdots+A_{k}=1\) and \(\rho(A_{1})+\cdots+\rho(A_{k}) \leq n\) , then each \(A_{i}\) is a projection and \(A_{i} A_{j}=0\) whenever \(i \neq j\) . (Start with \(k=2\) and proceed by induction; use a direct sum argument as in (1).)
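
A small sanity check of (1) for \(k=2\) , with a non-orthogonal pair of projections whose sum is the identity (a numerical sketch of ours, of course not a proof):

```python
import numpy as np

E1 = np.array([[1.0, 1.0], [0.0, 0.0]])   # E1 @ E1 == E1: a projection
E2 = np.array([[0.0, -1.0], [0.0, 1.0]])  # likewise a projection
assert np.allclose(E1 @ E1, E1) and np.allclose(E2 @ E2, E2)
assert np.allclose(E1 + E2, np.eye(2))    # the sum is a projection (it is 1)
assert np.allclose(E1 @ E2, 0) and np.allclose(E2 @ E1, 0)   # the conclusion
```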

Exercise 6. 

  1. If \(A\) is a linear transformation on a finite-dimensional vector space over a field of characteristic zero, and if \(\operatorname{tr} A=0\) , then there exists a basis \(\mathcal{X}\) such that if \([A; \mathcal{X}]=(\alpha_{i j})\) , then \(\alpha_{i i}=0\) for all \(i\) . (Hint: if \(A\) is a scalar, then \(\operatorname{tr} A=0\) forces \(A=0\) , and any basis will do; if \(A\) is not a scalar, prove first that there exists a vector \(x\) such that \(x\) and \(A x\) are linearly independent. This proves that \(\alpha_{11}\) can be made to vanish; proceed by induction.)
  2. Show that if the characteristic is not zero, the conclusion of (1) is false. (Hint: if the characteristic is \(2\) , compute \(B C-C B\) , where \(B=\big[\begin{smallmatrix} 0 & 1 \\ 0 & 0 \end{smallmatrix}\big]\) and \(C=\big[\begin{smallmatrix} 0 & 0 \\ 1 & 0 \end{smallmatrix}\big]\) . A numerical check follows.)
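
The computation in the hint can be checked directly; in the sketch below (ours), integer NumPy arrays reduced modulo \(2\) stand in for matrices over a field of characteristic \(2\) :

```python
import numpy as np

B = np.array([[0, 1], [0, 0]])
C = np.array([[0, 0], [1, 0]])
A = (B @ C - C @ B) % 2   # reduce mod 2: arithmetic of characteristic 2
print(A)                  # [[1 0] [0 1]]: the identity transformation
print(np.trace(A) % 2)    # 0: the trace 1 + 1 vanishes in characteristic 2
# A = 1 is similar only to itself, so no choice of basis can make its
# diagonal entries vanish, although tr A = 0.
```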