We are now ready to prove the main theorem of this book, the theorem of which many of the other results of this chapter are immediate corollaries. To some extent what we have been doing up to now was a matter of sport (useful, however, for generalizations); we wanted to show how much can conveniently be done with spectral theory before proving the spectral theorem. In the complex case, incidentally, the spectral theorem can be made to follow from the triangularization process we have already described; because of the importance of the theorem we prefer to give below its (quite easy) direct proof. The reader may find it profitable to adapt the method of proof (not the result) of Section: Triangular form, Theorem 2, to prove as much as he can of the spectral theorem and its consequences.
Theorem 1. To every self-adjoint linear transformation \(A\) on a finite-dimensional inner product space there correspond real numbers \(\alpha_{1}, \ldots, \alpha_{r}\) and perpendicular projections \(E_{1}, \ldots, E_{r}\) (where \(r\) is a strictly positive integer, not greater than the dimension of the space) so that
- (1) the \(\alpha_{j}\) are pairwise distinct;
- (2) the \(E_{j}\) are pairwise orthogonal and different from \(0\);
- (3) \(\sum_{j} E_{j}=1\);
- (4) \(\sum_{j} \alpha_{j} E_{j}=A\).
Proof. Let \(\alpha_{1}, \ldots, \alpha_{r}\) be the distinct proper values of \(A\), and let \(E_{j}\) be the perpendicular projection on the subspace consisting of all solutions of \(A x=\alpha_{j} x\) (\(j=1, \ldots, r\)). Condition (1) is then satisfied by definition; the fact that the \(\alpha\)’s are real follows from Section: Characterization of spectra, Theorem 1. Condition (2) follows from Section: Characterization of spectra, Theorem 4. From the orthogonality of the \(E_{j}\) we infer that if \(E=\sum_{j} E_{j}\), then \(E\) is a perpendicular projection. The dimension of the range of \(E\) is the sum of the dimensions of the ranges of the \(E_{j}\), and consequently, by Section: Characterization of spectra, Theorem 6, the dimension of the range of \(E\) is equal to the dimension of the entire space; this implies (3). (Alternatively, if \(E \neq 1\), then \(A\) considered on the range of \(1-E\) would be a self-adjoint transformation with no proper values, and that is impossible.) To prove (4), take any vector \(x\) and write \(x_{j}=E_{j} x\); it follows that \(A x_{j}=\alpha_{j} x_{j}\) and hence that \begin{align} A x &= A\Big(\sum_{j} E_{j} x\Big)\\ &= \sum_{j} A x_{j}\\ &= \sum_{j} \alpha_{j} x_{j}\\ &= \sum_{j} \alpha_{j} E_{j} x. \end{align} This completes the proof of the spectral theorem. ◻
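For instance, on a two-dimensional space, the self-adjoint transformation \(A\) whose matrix is \(\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}\) has the distinct proper values \(\alpha_{1}=1\) and \(\alpha_{2}=3\), and the corresponding perpendicular projections (onto the spans of \((1,-1)\) and \((1,1)\) respectively) are \[E_{1}=\frac{1}{2}\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}, \qquad E_{2}=\frac{1}{2}\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}.\] Direct computation shows that \(E_{1} E_{2}=0\), that \(E_{1}+E_{2}=1\), and that \(1 \cdot E_{1}+3 \cdot E_{2}=A\); this exhibits for this \(A\) the representation promised by the theorem.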
The representation \(A=\sum_{j} \alpha_{j} E_{j}\) (where the \(\alpha\)’s and the \(E\)’s satisfy the conditions (1)-(3) of Theorem 1) is called a spectral form of \(A\); the main effect of the following result is to prove the uniqueness of the spectral form.
Theorem 2. If \(\sum_{j=1}^{r} \alpha_{j} E_{j}\) is the spectral form of a self-adjoint transformation \(A\) on a finite-dimensional inner product space, then the \(\alpha\)’s are all the distinct proper values of \(A\). If, moreover, \(1 \leq k \leq r\), then there exists a polynomial \(p_{k}\), with real coefficients, such that \(p_{k}(\alpha_{j})=0\) whenever \(j \neq k\) and such that \(p_{k}(\alpha_{k})=1\); every such polynomial satisfies \(p_{k}(A)=E_{k}\).
Proof. Since \(E_{j} \neq 0\), there exists a vector \(x \neq 0\) in the range of \(E_{j}\). Since \(E_{j} x=x\) and \(E_{i} x=0\) whenever \(i \neq j\), it follows that \[A x=\sum_{i} \alpha_{i} E_{i} x=\alpha_{j} E_{j} x=\alpha_{j} x,\] so that each \(\alpha_{j}\) is a proper value of \(A\). If, conversely, \(\lambda\) is any proper value of \(A\), say \(A x=\lambda x\) with \(x \neq 0\), then we write \(x_{j}=E_{j} x\); since \(\sum_{j} E_{j}=1\) (condition (3)), we have \(x=\sum_{j} x_{j}\), and we see that \[A x=\lambda x=\lambda \sum_{j} x_{j}\] and \[A x=A \sum_{j} x_{j}=\sum_{j} \alpha_{j} x_{j},\] so that \(\sum_{j}(\lambda-\alpha_{j}) x_{j}=0\). Since the \(x_{j}\) are pairwise orthogonal, those among them that are not zero form a linearly independent set. It follows that, for each \(j\), either \(x_{j}=0\) or else \(\lambda=\alpha_{j}\). Since \(x \neq 0\), we must have \(x_{j} \neq 0\) for some \(j\), and consequently \(\lambda\) is indeed equal to one of the \(\alpha\)’s.
Since \(E_{i} E_{j}=0\) if \(i \neq j\), and \(E_{j}^{2}=E_{j}\), it follows that \begin{align} A^{2} &= \Big(\sum_{i} \alpha_{i} E_{i}\Big)\Big(\sum_{j} \alpha_{j} E_{j}\Big)\\ &= \sum_{i} \sum_{j} \alpha_{i} \alpha_{j} E_{i} E_{j} \\ &= \sum_{j} \alpha_{j}^{2} E_{j}. \end{align} Similarly \[A^{n}=\sum_{j} \alpha_{j}^{n} E_{j}\] for every positive integer \(n\) (in case \(n=0\), use (3)), and hence \[p(A)=\sum_{j} p(\alpha_{j}) E_{j}\] for every polynomial \(p\). To conclude the proof of the theorem, all we need to do is to exhibit a (real) polynomial \(p_{k}\) such that \(p_{k}(\alpha_{j})=0\) whenever \(j \neq k\) and such that \(p_{k}(\alpha_{k})=1\). If we write \[p_{k}(t)=\prod_{j \neq k} \frac{t-\alpha_{j}}{\alpha_{k}-\alpha_{j}},\] then \(p_{k}\) is a polynomial with all the required properties; its coefficients are real because the \(\alpha\)’s are. ◻
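In the two-dimensional example above (\(\alpha_{1}=1\), \(\alpha_{2}=3\)) the polynomials of Theorem 2 are \[p_{1}(t)=\frac{t-3}{1-3}=\frac{3-t}{2}, \qquad p_{2}(t)=\frac{t-1}{3-1}=\frac{t-1}{2},\] and, just as the theorem predicts, \[p_{1}(A)=\frac{1}{2}(3 \cdot 1-A)=\frac{1}{2}\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}=E_{1}, \qquad p_{2}(A)=\frac{1}{2}(A-1)=\frac{1}{2}\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}=E_{2}.\]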
Theorem 3. If \(\sum_{j=1}^{r} \alpha_{j} E_{j}\) is the spectral form of a self-adjoint transformation \(A\) on a finite-dimensional inner product space, then a necessary and sufficient condition that a linear transformation \(B\) commute with \(A\) is that it commute with each \(E_{j}\) .
Proof. The sufficiency of the condition is trivial; if \(A=\sum_{j} \alpha_{j} E_{j}\) and \(E_{j} B=B E_{j}\) for all \(j\), then \(A B=B A\). Necessity follows from Theorem 2: if \(B\) commutes with \(A\), then \(B\) commutes with every polynomial in \(A\), and each \(E_{j}\) is such a polynomial (namely \(p_{j}(A)\)), so that \(B\) commutes with each \(E_{j}\). ◻
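To illustrate with the example above: the transformation \(B\) whose matrix is \(\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\) commutes with \(A\) (indeed \(A=2 \cdot 1+B\)), and, just as Theorem 3 requires, it commutes with both projections; a short computation gives \(B E_{1}=E_{1} B=-E_{1}\) and \(B E_{2}=E_{2} B=E_{2}\).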
Before exploiting the spectral theorem any further, we remark on its matricial interpretation. If we choose an orthonormal basis in the range of each \(E_{j}\), then the totality of the vectors in these little bases is a basis for the whole space; expressed in this basis the matrix of \(A\) will be diagonal. The fact that by a suitable choice of an orthonormal basis the matrix of a self-adjoint transformation can be made diagonal, or, equivalently, that any self-adjoint matrix can be isometrically transformed (that is, replaced by \([U]^{-1}[A][U]\), where \(U\) is an isometry) into a diagonal matrix, already follows (in the complex case) from the theory of the triangular form. We gave the algebraic version for two reasons. First, it is this version that generalizes easily to the infinite-dimensional case, and, second, even in the finite-dimensional case, writing \(\sum_{j} \alpha_{j} E_{j}\) often has great notational and typographical advantages over the matrix notation.
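In the running example, a normalized proper vector chosen from the range of each \(E_{j}\) yields the orthonormal basis \(\frac{1}{\sqrt{2}}(1,-1)\), \(\frac{1}{\sqrt{2}}(1,1)\), and with the corresponding isometry \[U=\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}\] we obtain \[U^{-1} A U=\begin{bmatrix} 1 & 0 \\ 0 & 3 \end{bmatrix},\] a diagonal matrix whose diagonal entries are the proper values, each repeated as often as the dimension of the range of the corresponding \(E_{j}\).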
We shall make use of the fact that a not necessarily self-adjoint transformation \(A\) is isometrically diagonable (that is, that its matrix with respect to a suitable orthonormal basis is diagonal) if and only if conditions (1)-(4) of Theorem 1 hold for it, with the \(\alpha\)’s now allowed to be arbitrary scalars instead of real numbers. Indeed, if we have (1)-(4), then the proof of diagonability, given for self-adjoint transformations, applies; the converse we leave as an exercise for the reader.
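A concrete instance: on a two-dimensional complex space, the transformation whose matrix is \(A=\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}\) is not self-adjoint (in fact \(A^{*}=-A\)), but \[A=i \cdot \frac{1}{2}\begin{bmatrix} 1 & i \\ -i & 1 \end{bmatrix}+(-i) \cdot \frac{1}{2}\begin{bmatrix} 1 & -i \\ i & 1 \end{bmatrix},\] and the two matrices displayed are perpendicular projections satisfying (1)-(4) with \(\alpha_{1}=i\) and \(\alpha_{2}=-i\); consequently \(A\) is isometrically diagonable, even though its proper values are not real.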
EXERCISES
Exercise 1. Suppose that \(A\) is a linear transformation on a complex inner product space. Prove that if \(A\) is Hermitian, then the linear factors of the minimal polynomial of \(A\) are distinct. Is the converse true?
Exercise 2.
- Two linear transformations \(A\) and \(B\) on a unitary space are unitarily equivalent if there exists a unitary transformation \(U\) such that \(A=U^{-1} B U\). (The corresponding concept in the real case is called orthogonal equivalence.) Prove that unitary equivalence is an equivalence relation.
- Are \(A^{*} A\) and \(A A^{*}\) always unitarily equivalent?
- Are \(A\) and \(A^{*}\) always unitarily equivalent?
Exercise 3. Which of the following pairs of matrices are unitarily equivalent?
- \(\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}\) and \(\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}\) .
- \(\begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}\) and \(\begin{bmatrix} 1/2 & 1/2 & 0 \\ 1/2 & 1/2 & 0 \\ 0 & 0 & -1 \end{bmatrix}\) .
- \(\begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & -1 \end{bmatrix}\) and \(\begin{bmatrix} -1 & 0 & 0 \\ 0 & i & 0 \\ 0 & 0 & i \end{bmatrix}\) .
- \(\begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & -1 \end{bmatrix}\) and \(\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}\) .
Exercise 4. If two linear transformations are unitarily equivalent, then they are similar, and they are congruent; if two linear transformations are either similar or congruent, then they are equivalent. Show by examples that these implication relations are the only ones that hold among these concepts.