The easiest (and at the same time the most useful) generalizations of the spectral theorem apply to complex inner product spaces (that is, unitary spaces). In order to avoid irrelevant complications, in this section we exclude the real case and concentrate attention on unitary spaces only.
We have seen that every Hermitian transformation is diagonable, and that an arbitrary transformation \(A\) may be written in the form \(B+i C\) , with \(B\) and \(C\) Hermitian; why isn’t it true that simply by diagonalizing \(B\) and \(C\) separately we can diagonalize \(A\) ? The answer is, of course, that diagonalization involves the choice of a suitable orthonormal basis, and there is no reason to expect that a basis that diagonalizes \(B\) will have the same effect on \(C\) . It is of considerable importance to know the precise class of transformations for which the spectral theorem is valid, and fortunately this class is easy to describe.
We shall call a linear transformation \(A\) normal if it commutes with its adjoint, \(A^{*} A=A A^{*}\) . (This definition makes sense, and is used, in both real and complex inner product spaces; we shall, however, continue to use techniques that are inextricably tied up with the complex case.) We point out first that \(A\) is normal if and only if its real and imaginary parts commute. Suppose, indeed, that \(A\) is normal and that \(A=B+i C\) with \(B\) and \(C\) Hermitian; since \(B=\frac{1}{2}(A+A^{*})\) and \(C=\frac{1}{2 i}(A-A^{*})\) , it is clear that \(B C=C B\) . If, conversely, \(B C=C B\) , then the two relations \(A=B+i C\) and \(A^{*}=B-i C\) imply that \(A\) is normal. We observe that Hermitian and unitary transformations are normal.
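The assertion that "it is clear that \(B C=C B\)" can be made explicit; the following is one way to carry out the computation, using nothing beyond \(A=B+i C\) and \(A^{*}=B-i C\) with \(B\) and \(C\) Hermitian: \begin{align} A^{*} A &= (B-i C)(B+i C) = B^{2}+C^{2}+i(B C-C B),\\ A A^{*} &= (B+i C)(B-i C) = B^{2}+C^{2}-i(B C-C B), \end{align} so that \(A^{*} A-A A^{*}=2 i(B C-C B)\); one side vanishes if and only if the other does. As an illustration of a transformation that fails to be normal, assume that transformations on a two-dimensional unitary space are identified with their matrices with respect to a fixed orthonormal basis (so that the adjoint corresponds to the conjugate transpose), and consider \[A=\begin{pmatrix} 0 & 1\\ 0 & 0 \end{pmatrix}, \qquad B=\frac{1}{2}\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}, \qquad C=\frac{1}{2}\begin{pmatrix} 0 & -i\\ i & 0 \end{pmatrix}.\] Here \[A^{*} A=\begin{pmatrix} 0 & 0\\ 0 & 1 \end{pmatrix}, \qquad A A^{*}=\begin{pmatrix} 1 & 0\\ 0 & 0 \end{pmatrix}, \qquad B C-C B=\frac{i}{2}\begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix},\] and indeed \(2 i(B C-C B)=A^{*} A-A A^{*} \neq 0\).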
The class of transformations possessing a spectral form in the sense of Section: Spectral theorem is precisely the class of normal transformations. Half of this statement is easy to prove: if \(A=\sum_{j} \alpha_{j} E_{j}\) , then \(A^{*}=\sum_{j} \bar{\alpha}_{j} E_{j}\) , and it takes merely a simple computation to show that \(A^{*} A=A A^{*}=\sum_{j}|\alpha_{j}|^{2} E_{j}\) . To prove the converse, that is, to prove that normality implies the existence of a spectral form, we have two alternatives. We could derive this result from the spectral theorem for Hermitian transformations, using the real and imaginary parts, or we could prove that the essential lemmas of Section: Characterization of spectra , on which the proof of the Hermitian case rests, are just as valid for an arbitrary normal transformation. Because its methods are of some interest, we adopt the second procedure. We observe that the machinery needed to prove the lemmas that follow was available to us in Section: Characterization of spectra , so that we could have stated the spectral theorem for normal transformations immediately; the main reason we traveled the present course was to motivate the definition of normality.
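The "simple computation" in the easy half of the statement above may be sketched as follows. We assume, as is part of the definition of a spectral form in the section cited, that the \(E_{j}\) are perpendicular projections with \(E_{j} E_{k}=0\) whenever \(j \neq k\). Then \[A^{*} A=\sum_{j} \sum_{k} \bar{\alpha}_{j} \alpha_{k} E_{j} E_{k}=\sum_{j} \bar{\alpha}_{j} \alpha_{j} E_{j}^{2}=\sum_{j}|\alpha_{j}|^{2} E_{j},\] and the same computation with the two factors interchanged yields the same value for \(A A^{*}\).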
Theorem 1. If \(A\) is normal, then a necessary and sufficient condition that \(x\) be a proper vector of \(A\) is that it be a proper vector of \(A^{*}\) ; if \(A x=\lambda x\) , then \(A^{*} x=\bar{\lambda}x\) .
Proof. We observe that the normality of \(A\) implies that \begin{align} \|A x\|^{2} &= (A x, A x)\\ &= (A^{*} A x, x)\\ &= (A A^{*} x, x) \\ &= (A^{*} x, A^{*} x)\\ &= \|A^{*} x\|^{2}. \end{align}
Since \(A-\lambda\) is normal along with \(A\) , and since \((A-\lambda)^{*}=A^{*}-\bar{\lambda}\) , we obtain the relation \[\|A x-\lambda x\|=\|A^{*} x-\bar{\lambda} x\|,\] from which the assertions of the theorem follow immediately. ◻
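For the reader who wishes to see the details of the last step, here is one way to fill them in (as usual, \(\lambda\) also denotes the scalar transformation \(\lambda \cdot 1\)). Expanding the products gives \begin{align} (A-\lambda)^{*}(A-\lambda) &= A^{*} A-\lambda A^{*}-\bar{\lambda} A+|\lambda|^{2},\\ (A-\lambda)(A-\lambda)^{*} &= A A^{*}-\bar{\lambda} A-\lambda A^{*}+|\lambda|^{2}, \end{align} and the two right sides agree as soon as \(A^{*} A=A A^{*}\). Applying the norm identity established above to \(A-\lambda\) in place of \(A\), we obtain the chain of equivalences \[A x=\lambda x \iff \|A x-\lambda x\|=0 \iff \|A^{*} x-\bar{\lambda} x\|=0 \iff A^{*} x=\bar{\lambda} x.\]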
Theorem 2. If \(A\) is normal, then proper vectors belonging to distinct proper values are orthogonal.
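Proof. If \(A x_{1}=\lambda_{1} x_{1}\) and \(A x_{2}=\lambda_{2} x_{2}\), then \(A^{*} x_{2}=\bar{\lambda}_{2} x_{2}\) by Theorem 1, and therefore \[\lambda_{1}(x_{1}, x_{2})=(A x_{1}, x_{2})=(x_{1}, A^{*} x_{2})=\lambda_{2}(x_{1}, x_{2}).\] If \(\lambda_{1} \neq \lambda_{2}\), it follows that \((x_{1}, x_{2})=0\). ◻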
This theorem generalizes Section: Characterization of spectra , Theorem 4; in the proof of the spectral theorem for Hermitian transformations we needed also Section: Characterization of spectra , Theorems 5 and 6. The following result takes the place of the first of these.
Theorem 3. If \(A\) is normal, \(\lambda\) is a proper value of \(A\) , and \(\mathcal{M}\) is the set of all solutions of \(A x=\lambda x\) , then both \(\mathcal{M}\) and \(\mathcal{M}^{\perp}\) are invariant under \(A\) .
Proof. The fact that \(\mathcal{M}\) is invariant under \(A\) we have seen before; this has nothing to do with normality. To prove that \(\mathcal{M}^{\perp}\) is also invariant under \(A\) , it is sufficient to prove that \(\mathcal{M}\) is invariant under \(A^{*}\) . This is easy; if \(x\) is in \(\mathcal{M}\) , then \[A(A^{*} x)=A^{*}(A x)=\lambda(A^{*} x),\] so that \(A^{*} x\) is also in \(\mathcal{M}\) . ◻
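The reduction used in this proof (invariance of \(\mathcal{M}\) under \(A^{*}\) implies invariance of \(\mathcal{M}^{\perp}\) under \(A\)) is a general fact about adjoints; a brief sketch of the reason runs as follows. If \(y\) is in \(\mathcal{M}^{\perp}\) and \(x\) is in \(\mathcal{M}\), then \(A^{*} x\) is in \(\mathcal{M}\), so that \[(A y, x)=(y, A^{*} x)=0.\] Since \(x\) is an arbitrary vector of \(\mathcal{M}\), this says that \(A y\) is in \(\mathcal{M}^{\perp}\).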
This theorem is much weaker than its correspondent in Section: Characterization of spectra . The important thing to observe, however, is that the proof of Section: Characterization of spectra , Theorem 6, depended only on the correspondingly weakened version of Theorem 5; the only subspaces that need to be considered are the ones of the type mentioned in the preceding theorem.
This concludes the spade work; the spectral theorem for normal operators follows just as before in the Hermitian case. If in the theorems of Section: Spectral theorem we replace the word "self-adjoint" by "normal," delete all references to reality, and insist that the underlying inner product space be complex, the remaining parts of the statements and all the proofs remain unchanged.
It is the theory of normal transformations that is of chief interest in the study of unitary spaces. One of the most useful facts about normal transformations is that spectral conditions of the type given in Section: Characterization of spectra , Theorems 1 and 3, there shown to be necessary for the self-adjoint, positive, and isometric character of a transformation, are in the normal case also sufficient.
Theorem 4. A normal transformation on a finite-dimensional unitary space is (1) Hermitian, (2) positive, (3) strictly positive, (4) unitary, (5) invertible, (6) idempotent if and only if all its proper values are \((1')\) real, \((2')\) positive, \((3')\) strictly positive, \((4')\) of absolute value one, \((5')\) different from zero, \((6')\) equal to zero or one.
Proof. The fact that (1), (2), (3), and (4) imply \((1')\), \((2')\), \((3')\), and \((4')\), respectively, follows from Section: Characterization of spectra. If \(A\) is invertible and \(A x=\lambda x\), with \(x \neq 0\), then \[x=A^{-1} A x=\lambda A^{-1} x,\] and therefore \(\lambda \neq 0\); this proves that (5) implies \((5')\). If \(A\) is idempotent and \(A x=\lambda x\), with \(x \neq 0\), then \[\lambda x=A x=A^{2} x=\lambda^{2} x,\] so that \((\lambda-\lambda^{2}) x=0\) and therefore \(\lambda=\lambda^{2}\); this proves that (6) implies \((6')\). Observe that these proofs are valid for an arbitrary inner product space (not even necessarily finite-dimensional) and that the auxiliary assumption that \(A\) is normal is also superfluous.
Suppose now that the spectral form of \(A\) is \(\sum_{j} \alpha_{j} E_{j}\). Since \(A^{*}=\sum_{j} \bar{\alpha}_{j} E_{j}\), we see that \((1')\) implies (1). Since \[(A x, x)=\sum_{j} \alpha_{j}(E_{j} x, x)=\sum_{j} \alpha_{j}\|E_{j} x\|^{2},\] it follows that \((2')\) implies (2). If \(\alpha_{j}>0\) for all \(j\) and if \((A x, x)=0\), then we must have \(E_{j} x=0\) for all \(j\), and therefore \(x=\sum_{j} E_{j} x=0\); this proves that \((3')\) implies (3). The implication from \((4')\) to (4) follows from the relation \[A^{*} A=\sum_{j}|\alpha_{j}|^{2} E_{j}.\] If \(\alpha_{j} \neq 0\) for all \(j\), we may form the linear transformation \(B=\sum_{j} \frac{1}{\alpha_{j}} E_{j}\); since \(A B=B A=1\), it follows that \((5')\) implies (5). Finally \(A^{2}=\sum_{j} \alpha_{j}^{2} E_{j}\); from this we infer that \((6')\) implies (6).
We observe that the implication relations \((5) \implies (5')\), \((2) \implies (2')\), and \((3') \implies (3)\) together fulfill a promise we made in Section: Positive transformations; if \(A\) is positive and invertible, then \(A\) is strictly positive. ◻
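The hypothesis of normality cannot be dropped from the "if" half of Theorem 4. As a hedged illustration, assume again that transformations on a two-dimensional unitary space are identified with their matrices with respect to an orthonormal basis, and consider \[A=\begin{pmatrix} 1 & 1\\ 0 & 1 \end{pmatrix}, \qquad A^{*} A=\begin{pmatrix} 1 & 1\\ 1 & 2 \end{pmatrix}, \qquad A A^{*}=\begin{pmatrix} 2 & 1\\ 1 & 1 \end{pmatrix}.\] The only proper value of \(A\) is \(1\), which is real, strictly positive, of absolute value one, different from zero, and equal to zero or one. The transformation \(A\) is invertible, but it is neither Hermitian (since \(A^{*} \neq A\)), nor unitary (since \(A^{*} A \neq 1\)), nor idempotent, since \[A^{2}=\begin{pmatrix} 1 & 2\\ 0 & 1 \end{pmatrix} \neq A.\] There is no conflict with the theorem, because \(A^{*} A \neq A A^{*}\), so that \(A\) is not normal.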
EXERCISES
Exercise 1. Give an example of a normal transformation that is neither Hermitian nor unitary.
Exercise 2.
- If \(A\) is an arbitrary linear transformation (on a finite-dimensional unitary space), and if \(\alpha\) and \(\beta\) are complex numbers such that \(|\alpha|=|\beta|=1\) , then \(\alpha A + \beta A^{*}\) is normal.
- If \(\|A x\|=\|A^{*} x\|\) for all \(x\) , then \(A\) is normal.
- Is the sum of two normal transformations always normal?
Exercise 3. If \(A\) is a normal transformation on a finite-dimensional unitary space and if \(\mathcal{M}\) is a subspace invariant under \(A\), then the restriction of \(A\) to \(\mathcal{M}\) is also normal.
Exercise 4. A linear transformation \(A\) on a finite-dimensional unitary space \(\mathcal{V}\) is normal if and only if \(A \mathcal{M} \subset \mathcal{M}\) implies \(A \mathcal{M}^{\perp} \subset \mathcal{M}^{\perp}\) for every subspace \(\mathcal{M}\) of \(\mathcal{V}\) .
Exercise 5.
- If \(A\) is normal and idempotent, then it is self-adjoint.
- If \(A\) is normal and nilpotent, then it is zero.
- If \(A\) is normal and \(A^{3}=A^{2}\) , then \(A\) is idempotent. Does the conclusion remain true if the assumption of normality is omitted?
- If \(A\) is self-adjoint and if \(A^{k}=1\) for some strictly positive integer \(k\) , then \(A^{2}=1\) .
Exercise 6. If \(A\) and \(B\) are normal and if \(A B=0\) , does it follow that \(B A=0\) ?
Exercise 7. Suppose that \(A\) is a linear transformation on an \(n\) -dimensional unitary space; let \(\lambda_{1}, \ldots, \lambda_{n}\) be the proper values of \(A\) (each occurring a number of times equal to its algebraic multiplicity). Prove that \[\sum_{i}|\lambda_{i}|^{2} \leq \operatorname{tr}(A^{*} A),\] and that \(A\) is normal if and only if equality holds.
Exercise 8. The numerical range of a linear transformation \(A\) on a finite-dimensional unitary space is the set \(W(A)\) of all complex numbers of the form \((A x, x)\) , with \(\|x\|=1\) .
- (a) If \(A\) is normal, then \(W(A)\) is convex. (This means that if \(\xi\) and \(\eta\) are in \(W(A)\) and if \(0 \leq \alpha \leq 1\), then \(\alpha \xi+(1-\alpha) \eta\) is also in \(W(A)\).)
- (b) If \(A\) is normal, then every extreme point of \(W(A)\) is a proper value of \(A\). (An extreme point is one that does not have the form \(\alpha \xi+(1-\alpha) \eta\) for any \(\xi\) and \(\eta\) in \(W(A)\) and for any \(\alpha\) properly between \(0\) and \(1\).)
- (c) It is known that the conclusion of (a) remains true even if normality is not assumed. This fact can be phrased as follows: if \(A_{1}\) and \(A_{2}\) are Hermitian transformations, then the set of all points of the form \(\big((A_{1} x, x),(A_{2} x, x)\big)\) in the real coordinate plane (with \(\|x\|=1\)) is convex. Show that the generalization of this assertion to more than two Hermitian transformations is false.
- (d) Prove that the conclusion of (b) may be false for non-normal transformations.