Appendix: Hilbert Space
Probably the most useful and certainly the best developed generalization of the theory of finite-dimensional inner product spaces is the theory of Hilbert space. Without going into details and entirely without proofs we shall now attempt to indicate how this generalization proceeds and what are the main difficulties that have to be overcome.
The definition of Hilbert space is easy: it is an inner product space satisfying one extra condition. That this condition (namely, completeness) is automatically satisfied in the finite-dimensional case is proved in elementary analysis. In the infinite-dimensional case it may happen that for a sequence \((x_{n})\) of vectors \(\|x_{n}-x_{m}\| \to 0\) as \(n, m \to \infty\), but still there is no vector \(x\) for which \(\|x_{n}-x\| \to 0\); the only effective way of ruling out this possibility is explicitly to assume its opposite. In other words: a Hilbert space is a complete inner product space. (Sometimes the concept of Hilbert space is restricted by additional conditions, whose purpose is to limit the size of the space from both above and below. The most usual conditions require that the space be infinite-dimensional and separable. In recent years, ever since the realization that such additional restrictions do not pay for themselves in results, it has become customary to use "Hilbert space" for the concept we defined.)
It is easy to see that the space \(\mathcal{P}\) of polynomials with the inner product defined by \((x, y)=\int_{0}^{1} x(t) \overline{y(t)}\,dt\) is not complete. In connection with the completeness of certain particular Hilbert spaces there is quite an extensive mathematical lore. Thus, for instance, the main assertion of the celebrated Riesz-Fischer theorem is that the space manufactured out of the set of all those functions \(x\) for which \(\int_{0}^{1}|x(t)|^{2} \,dt < \infty\) (in the sense of Lebesgue integration) is a Hilbert space (with formally the same definition of inner product as for polynomials). Another popular Hilbert space, reminiscent in its appearance of finite-dimensional coordinate space, is the space of all those sequences \((\xi_{n})\) of numbers (real or complex, as the case may be) for which \(\sum_{n}|\xi_{n}|^{2}\) converges.
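By way of illustration, here is one way to see the incompleteness of \(\mathcal{P}\) (a standard example, spelled out for the reader): let \(x_{n}\) be the \(n\)-th partial sum of the exponential series, \[x_{n}(t)=\sum_{k=0}^{n} \frac{t^{k}}{k!}.\] Since \(x_{n}\) converges to \(e^{t}\) uniformly on \([0,1]\), we have \(\|x_{n}-x_{m}\| \to 0\) as \(n, m \to \infty\); but there is no polynomial \(x\) with \(\|x_{n}-x\| \to 0\), since such an \(x\) would have to agree with \(e^{t}\) (both functions being continuous), and \(e^{t}\) is not a polynomial.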
Using completeness in order to discuss intelligently the convergence of some infinite sums, one can proceed for quite some time in building the theory of Hilbert spaces without meeting any difficulties due to infinite-dimensionality. Thus, for instance, the notions of orthogonality and of complete orthonormal sets can be defined in the general case exactly as we defined them. Our proofs of Bessel's inequality and of the equivalence of the various possible formulations of completeness for orthonormal sets have to undergo slight verbal changes only. (The convergence of the various infinite sums that enter is an automatic consequence of Bessel's inequality.) Our proof of Schwarz's inequality is valid, as it stands, in the most general case. Finally, the proof of the existence of complete orthonormal sets parallels closely the proof in the finite case. In the unconstructive proof Zorn's lemma (or transfinite induction) replaces ordinary induction, and even the constructive steps of the Gram-Schmidt process are easily carried out.
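For the reader's convenience we record the general statement of Bessel's inequality (in the same form as in the finite-dimensional discussion): if \(\{e_{1}, e_{2}, e_{3}, \ldots\}\) is an orthonormal set (countable, say) and \(x\) is any vector, then \[\sum_{n}|(x, e_{n})|^{2} \leq \|x\|^{2},\] so that, in particular, the series on the left always converges; the validity of the corresponding equality for every \(x\) is one of the equivalent formulations of completeness for the orthonormal set.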
In the discussion of manifolds, functionals, and transformations the situation becomes uncomfortable if we do not make a concession to the topology of Hilbert space. Good generalizations of all our statements for the finite-dimensional case can be proved if we consider closed linear manifolds, continuous linear functionals, and bounded linear transformations. (In a finite-dimensional space every linear manifold is closed, every linear functional is continuous, and every linear transformation is bounded.) If, however, we do agree to make these concessions, then once more we can coast on our finite-dimensional proofs without any change most of the time, and with only the insertion of an occasional \(\epsilon\) the rest of the time. Thus once more we obtain that \(\mathcal{V} = \mathcal{M} \oplus \mathcal{M}^{\perp}\) , that \(\mathcal{M} = \mathcal{M}^{\perp \perp}\) , and that every linear functional of \(x\) has the form \((x, y)\) ; our definitions of self-adjoint and of positive transformations still make sense, and all our theorems about perpendicular projections (as well as their proofs) carry over without change.
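Stated explicitly (for reference), the assertion about functionals is the Riesz representation theorem: to every continuous linear functional \(f\) on a Hilbert space there corresponds a unique vector \(y\) such that \[f(x)=(x, y)\] for all \(x\), and, moreover, \(\|y\|\) equals the bound of \(f\); this is the exact analogue of the finite-dimensional result.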
The first hint of how things can go wrong comes from the study of orthogonal and unitary transformations. We still call a transformation \(U\) orthogonal or unitary (according as the space is real or complex) if \(UU^{*}=U^{*}U=1\), and it is still true that such a transformation is isometric, that is, that \(\|U x\|=\|x\|\) for all \(x\), or, equivalently, \((U x, U y)=(x, y)\) for all \(x\) and \(y\). It is, however, easy to construct an isometric transformation that is not unitary: because of its importance in the construction of counterexamples we shall describe one such transformation. We consider a Hilbert space in which there is a countable complete orthonormal set, say \(\{x_{0}, x_{1}, x_{2}, \ldots\}\). A unique bounded linear transformation \(U\) is defined by the conditions \(U x_{n}=x_{n+1}\) for \(n=0,1,2, \ldots\). This \(U\) is isometric (\(U^{*} U=1\)), but, since \(U U^{*} x_{0}=0\), it is not true that \(U U^{*}=1\).
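To verify these assertions (a routine computation, included here for completeness), observe that the adjoint \(U^{*}\) acts as a backward shift: \[U^{*} x_{0}=0, \qquad U^{*} x_{n}=x_{n-1} \quad (n=1,2,3, \ldots),\] so that \(U^{*} U x_{n}=x_{n}\) for every \(n\) (whence \(U^{*} U=1\)), while \(U U^{*} x_{0}=0 \neq x_{0}\) (whence \(U U^{*} \neq 1\)); this \(U\) is the transformation usually called the unilateral shift.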
It is when we come to spectral theory that the whole flavor of the development changes radically. The definition of proper value as a number \(\lambda\) for which \(A x=\lambda x\) has a non-zero solution still makes sense, and our theorem about the reality of the proper values of a self-adjoint transformation is still true. The notion of proper value loses, however, much of its significance. Proper values are so very useful in the finite-dimensional case because they are a handy way of describing the fact that something goes wrong with the inverse of \(A-\lambda\), and the only thing that can go wrong is that the inverse refuses to exist. Essentially different things can happen in the infinite-dimensional case; just to illustrate the possibilities, we mention, for example, that the inverse of \(A-\lambda\) may exist but be unbounded. That there is no useful generalization of determinant, and hence of the characteristic equation, is the least of our worries. The whole theory, in fact, attained its full beauty and maturity only after the slavish imitation of such finite-dimensional methods was given up.
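A concrete example of the last phenomenon (not needed for the sequel, but perhaps illuminating): on the sequence space mentioned above, let \(A\) be defined by \[A(\xi_{1}, \xi_{2}, \xi_{3}, \ldots)=\left(\xi_{1}, \tfrac{1}{2} \xi_{2}, \tfrac{1}{3} \xi_{3}, \ldots\right).\] This \(A\) is self-adjoint and bounded, and \(\lambda=0\) is not a proper value (if \(A x=0\), then \(x=0\)); the inverse of \(A-0\) therefore exists, but it multiplies the \(n\)-th coordinate by \(n\), so that it is unbounded and cannot be defined on the whole space.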
After some appreciation of the fact that the infinite-dimensional case has to overcome great difficulties, it comes as a pleasant surprise that the spectral theorem for self-adjoint transformations (and, in the complex case, even for normal ones) does have a very beautiful and powerful generalization. (Although we describe the theorem for bounded transformations only, there is a large class of unbounded ones for which it is valid.) In order to be able to understand the analogy, let us re-examine the finite-dimensional case.
Let \(A\) be a self-adjoint linear transformation on a finite-dimensional inner product space, and let \(A=\sum_{j} \lambda_{j} F_{j}\) be its spectral form. If \(M\) is an interval in the real axis, we write \(E(M)\) for the sum of all those \(F_{j}\) for which \(\lambda_{j}\) belongs to \(M\) . It is clear that \(E(M)\) is a perpendicular projection for each \(M\) . The following properties of the projection-valued interval-function \(E\) are the crucial ones: if \(M\) is the union of a countable collection \(\{M_{n}\}\) of disjoint intervals, then \[E(M)=\sum_{n} E(M_{n}), \tag{1}\] and if \(M\) is the improper interval consisting of all real numbers, then \(E(M)=1\) . The relation between \(A\) and \(E\) is described by the equation \[A=\sum_{j} \lambda_{j} E(\{\lambda_{j}\}),\] where, of course, \(\{\lambda_{j}\}\) is the degenerate interval consisting of the single number \(\lambda_{j}\) . Those familiar with Lebesgue-Stieltjes integration will recognize the last written sum as a typical approximating sum to an integral of the form \(\int \lambda \,dE(\lambda)\) and will therefore see how one may expect the generalization to go. The algebraic concept of summation is to be replaced by the analytic concept of integration; the generalized relation between \(A\) and \(E\) is described by the equation \[A=\int \lambda \,dE(\lambda). \tag{2}\] Except for this formal alteration, the spectral theorem for self-adjoint transformations is true in Hilbert space. We have, of course, to interpret correctly the meaning of the limiting operations involved in (1) and (2). Once more we are faced with the three possibilities mentioned in Section 91. They are called uniform, strong, and weak convergence respectively, and it turns out that both (1) and (2) may be given the strong interpretation. (The reader deduces, of course, from our language that in an infinite-dimensional Hilbert space the three possibilities are indeed distinct.)
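A small worked example may help fix these ideas (ours, not part of the general argument): if \(A\) is a transformation on a two-dimensional space whose spectral form is \(A=1 \cdot F_{1}+2 \cdot F_{2}\), then \[E(M)=\begin{cases}F_{1}+F_{2}=1 & \text{if } 1 \in M \text{ and } 2 \in M, \\ F_{1} & \text{if } 1 \in M \text{ and } 2 \notin M, \\ F_{2} & \text{if } 2 \in M \text{ and } 1 \notin M, \\ 0 & \text{otherwise,}\end{cases}\] and the relation \(A=1 \cdot E(\{1\})+2 \cdot E(\{2\})\) is immediate.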
We have seen that the projections \(F_{j}\) entering into the spectral form of \(A\) in the finite-dimensional case are very simple functions of \(A\) (Section 82). Since the \(E(M)\) are obtained from the \(F_{j}\) by summation, they also are functions of \(A\) , and it is quite easy to describe what functions. We write \(g_{M}(\zeta)= 1\) if \(\zeta\) is in \(M\) and \(g_{M}(\zeta)=0\) otherwise; then \(E(M)=g_{M}(A)\) . This fact gives the main clue to a possible proof of the general spectral theorem. The usual process is to discuss the functional calculus for polynomials, and, by limiting processes, to extend it to a class of functions that includes all the functions \(g_M\) . Once this is done, we may define the interval-function \(E\) by writing \(E(M)=g_{M}(A)\) ; there is no particular difficulty in establishing that \(E\) and \(A\) satisfy (1) and (2).
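In the finite-dimensional notation this is simply the computation (carried out here for emphasis) \[g_{M}(A)=\sum_{j} g_{M}(\lambda_{j}) F_{j}=\sum_{\lambda_{j} \in M} F_{j}=E(M),\] the first equality being the rule (Section 82) for forming functions of \(A\) from its spectral form.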
After the spectral theorem is proved, it is easy to deduce from it the generalized versions of our theorems concerning square roots, the functional calculus, the polar decomposition, and properties of commutativity, and, in fact, to answer practically every askable question about bounded normal transformations.
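As an indication of how these deductions run (and only as an indication): the functional calculus takes the form \[f(A)=\int f(\lambda)\, d E(\lambda),\] and, in particular, the positive square root of a positive transformation \(A\) may be obtained by choosing \(f(\lambda)=\sqrt{\lambda}\); the uniqueness arguments go through essentially as in the finite-dimensional case.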
The chief difficulties that remain are the considerations of non-normal and of unbounded transformations. Concerning general non-normal transformations, it is quite easy to describe the state of our knowledge; it is non-existent. Not even an unsatisfactory generalization exists of the triangular form, of the Jordan canonical form, or of the theory of elementary divisors. Very different is the situation concerning normal (and particularly self-adjoint) unbounded transformations. (The reader will sympathize with the desire to treat such transformations if he recalls that the first and most important functional operation that most of us learn is differentiation.) In this connection we shall barely hint at the main obstacle the theory faces. It is not very difficult to show that if a self-adjoint linear transformation is defined for all vectors of Hilbert space, then it is bounded. In other words, the first requirement concerning transformations that we are forced to give up is that they be defined everywhere. The discussion of the precise domain on which a self-adjoint transformation may be defined and of the extent to which this domain may be enlarged is the chief new difficulty encountered in the study of unbounded transformations.
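To see why differentiation forces these questions upon us (an illustration we add here), consider the functions \(x_{n}(t)=\sin n \pi t\) in the space of square-integrable functions on \((0,1)\): \[\|x_{n}\|^{2}=\int_{0}^{1} \sin ^{2} n \pi t \, dt=\tfrac{1}{2}, \qquad \left\|\frac{d x_{n}}{d t}\right\|^{2}=\int_{0}^{1} n^{2} \pi^{2} \cos ^{2} n \pi t \, dt=\frac{n^{2} \pi^{2}}{2},\] so that no bounded transformation can act by differentiation on all these functions; any transformation that differentiates must therefore be unbounded, and questions about its precise domain cannot be avoided.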