Theorem 1. If \(\mathcal{V}\) is an \(n\)-dimensional inner product space, then there exist complete orthonormal sets in \(\mathcal{V}\), and every complete orthonormal set in \(\mathcal{V}\) contains exactly \(n\) elements. The orthogonal dimension of \(\mathcal{V}\) is the same as its linear dimension.
Proof. To people not fussy about hunting for an element in a possibly uncountable set, the existence of complete orthonormal sets is obvious. Indeed, we have already seen that orthonormal sets exist, so we choose one; if it is not complete, we may enlarge it, and if the resulting orthonormal set is still not complete, we enlarge it again, and we proceed in this way by induction. Since an orthonormal set may contain at most \(n\) elements, in at most \(n\) steps we shall reach a complete orthonormal set. This set spans the whole space (see Section: Completeness, Theorem 2, (1) \(\implies\) (3)), and, since it is also linearly independent, it is a basis and therefore contains precisely \(n\) elements. This proves the first assertion of the theorem; the second assertion is now obvious from the definitions. ◻
There is a constructive method of avoiding this crude induction, and since it sheds further light on the notions involved, we reproduce it here as an alternative proof of the theorem.
Let \(\mathcal{X}=\{x_{1}, \ldots, x_{n}\}\) be any basis in \(\mathcal{V}\). We shall construct a complete orthonormal set \(\mathcal{Y}=\{y_{1}, \ldots, y_{n}\}\) with the property that each \(y_{j}\) is a linear combination of \(x_{1}, \ldots, x_{j}\). To begin the construction, we observe that \(x_{1} \neq 0\) (since \(\mathcal{X}\) is linearly independent) and we write \(y_{1}=x_{1}/\|x_{1}\|\). Suppose now that \(y_{1}, \ldots, y_{r}\) have been found so that they form an orthonormal set and so that each \(y_{j}\) (\(j=1, \ldots, r\)) is a linear combination of \(x_{1}, \ldots, x_{j}\). We write \[z=x_{r+1}-(\alpha_{1} y_{1}+\cdots+\alpha_{r} y_{r}),\] where the values of the scalars \(\alpha_{1}, \ldots, \alpha_{r}\) are still to be determined. Since \[(z, y_{j})=\Big(x_{r+1}-\sum_{i} \alpha_{i} y_{i}, y_{j}\Big)=(x_{r+1}, y_{j})-\alpha_{j}\] for \(j=1, \ldots, r\), it follows that if we choose \(\alpha_{j}=(x_{r+1}, y_{j})\), then \((z, y_{j})=0\) for \(j=1, \ldots, r\). Since, moreover, \(z\) is a linear combination of \(x_{r+1}\) and \(y_{1}, \ldots, y_{r}\), it is also a linear combination of \(x_{r+1}\) and \(x_{1}, \ldots, x_{r}\). Finally, \(z\) is different from zero, since \(x_{1}, \ldots, x_{r}, x_{r+1}\) are linearly independent and the coefficient of \(x_{r+1}\) in the expression for \(z\) is not zero. We write \(y_{r+1}=z/\|z\|\); clearly \(\{y_{1}, \ldots, y_{r}, y_{r+1}\}\) is again an orthonormal set with all the desired properties, and the induction step is accomplished. We shall make use of the fact that not only is each \(y_{j}\) a linear combination of the \(x\)'s with indices between \(1\) and \(j\), but, vice versa, each \(x_{j}\) is a linear combination of the \(y\)'s with indices between \(1\) and \(j\). The method of converting a linear basis into a complete orthonormal set that we just described is known as the Gram-Schmidt orthogonalization process.
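The inductive step just described translates directly into an algorithm. The following is a minimal sketch in Python, for the standard inner product on \(\mathbb{R}^{n}\); the function names are ours, not the text's, and the computation follows the construction literally: subtract from \(x_{r+1}\) its components \(\alpha_{j}=(x_{r+1}, y_{j})\) along the \(y\)'s found so far, then normalize.

```python
def dot(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return dot(u, u) ** 0.5

def gram_schmidt(xs):
    """Convert a linearly independent list of vectors xs into an
    orthonormal list ys, with ys[j] a linear combination of xs[0..j].

    Mirrors the inductive step in the text: form
        z = x_{r+1} - sum_j (x_{r+1}, y_j) y_j,
    then set y_{r+1} = z / ||z||.
    """
    ys = []
    for x in xs:
        z = list(x)
        for y in ys:
            c = dot(x, y)  # alpha_j = (x_{r+1}, y_j); valid since the y's are orthonormal
            z = [zi - c * yi for zi, yi in zip(z, y)]
        n = norm(z)
        if n == 0:
            # z = 0 would contradict linear independence of the x's
            raise ValueError("input vectors are linearly dependent")
        ys.append([zi / n for zi in z])
    return ys
```

For example, `gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])` returns three mutually orthogonal unit vectors, the first of which is \(x_{1}/\|x_{1}\| = (1/\sqrt{2}, 1/\sqrt{2}, 0)\).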
We shall find it convenient and natural, in inner product spaces, to work exclusively with such bases as are also complete orthonormal sets. We shall call such a basis an orthonormal basis or an orthonormal coordinate system; in the future, whenever we discuss bases that are not necessarily orthonormal, we shall emphasize this fact by calling them linear bases.
EXERCISES
Exercise 1. Convert \(\mathcal{P}_{2}\) into an inner product space by writing \((x, y)=\int_{0}^{1} x(t) \overline{y(t)}\,dt\) whenever \(x\) and \(y\) are in \(\mathcal{P}_{2}\), and find a complete orthonormal set in that space.
Exercise 2. If \(x\) and \(y\) are orthogonal unit vectors (that is, \(\{x, y\}\) is an orthonormal set), what is the distance between \(x\) and \(y\) ?
Exercise 3. Prove that if \(|(x, y)|=\|x\| \cdot\|y\|\) (that is, if the Schwarz inequality reduces to an equality), then \(x\) and \(y\) are linearly dependent.
Exercise 4.
- (a) Prove that the Schwarz inequality remains true if, in the definition of an inner product, "strictly positive" is replaced by "non-negative."
- (b) Prove that for a "non-negative" inner product of the type mentioned in (a), the set of all those vectors \(x\) for which \((x, x)=0\) is a subspace.
- (c) Form the quotient space modulo the subspace mentioned in (b) and show that the given "inner product" induces on that quotient space, in a natural manner, an honest (strictly positive) inner product.
- (d) Do the considerations in (a), (b), and (c) extend to normed spaces (with possibly no inner product)?
Exercise 5.
- (a) Given a strictly positive number \(\alpha\), try to define a norm in \(\mathbb{R}^{2}\) by writing \[\|x\|=(|\xi_{1}|^{\alpha}+|\xi_{2}|^{\alpha})^{1 / \alpha}\] whenever \(x=(\xi_{1}, \xi_{2})\). Under what conditions on \(\alpha\) does this equation define a norm?
- (b) Prove that the equation \[\|x\|=\max \{|\xi_{1}|,|\xi_{2}|\}\] defines a norm in \(\mathbb{R}^{2}\).
- (c) To which ones among the norms defined in (a) and (b) does there correspond an inner product in \(\mathbb{R}^{2}\) such that \(\|x\|^{2}=(x, x)\) for all \(x\) in \(\mathbb{R}^{2}\)?
Exercise 6.
- (a) Prove that a necessary and sufficient condition on a real normed space that there exist an inner product satisfying the equation \(\|x\|^{2}=(x, x)\) for all \(x\) is that \[\|x+y\|^{2}+\|x-y\|^{2}=2\|x\|^{2}+2\|y\|^{2}\] for all \(x\) and \(y\).
- (b) Discuss the corresponding assertion for complex spaces.
- (c) Prove that a necessary and sufficient condition on a norm in \(\mathbb{R}^{2}\) that there exist an inner product satisfying the equation \(\|x\|^{2}=(x, x)\) for all \(x\) in \(\mathbb{R}^{2}\) is that the locus of the equation \(\|x\|=1\) be an ellipse.
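The identity in Exercise 6 (the parallelogram law) is easy to check numerically: the Euclidean norm on \(\mathbb{R}^{2}\) satisfies it for every pair of vectors, while the max norm of Exercise 5 fails it for, say, \(x=(1,0)\) and \(y=(0,1)\). The following sketch (our own illustration, not part of the exercises, and of course no substitute for a proof) computes the defect \(\|x+y\|^{2}+\|x-y\|^{2}-2\|x\|^{2}-2\|y\|^{2}\) for both norms.

```python
def euclid(x):
    """Euclidean norm on R^2; it comes from the usual inner product."""
    return (x[0] ** 2 + x[1] ** 2) ** 0.5

def maxnorm(x):
    """The norm ||x|| = max(|xi_1|, |xi_2|) of Exercise 5; it does not
    come from any inner product."""
    return max(abs(x[0]), abs(x[1]))

def parallelogram_defect(norm, x, y):
    """||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2.

    Zero for every x, y exactly when the parallelogram law holds.
    """
    s = (x[0] + y[0], x[1] + y[1])
    d = (x[0] - y[0], x[1] - y[1])
    return norm(s) ** 2 + norm(d) ** 2 - 2 * norm(x) ** 2 - 2 * norm(y) ** 2
```

With \(x=(1,0)\), \(y=(0,1)\), the Euclidean defect is \(2+2-2-2=0\), while the max-norm defect is \(1+1-2-2=-2\), so the max norm cannot satisfy \(\|x\|^{2}=(x, x)\) for any inner product.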
Exercise 7. If \(\{x_{1}, \ldots, x_{n}\}\) is a complete orthonormal set in an inner product space, and if \(y_{j}=\sum_{i=1}^{j} x_{i}\), \(j=1, \ldots, n\), express in terms of the \(x\)'s the vectors obtained by applying the Gram-Schmidt orthogonalization process to the \(y\)'s.