Although what we have been doing with linear transformations so far may have been complicated, it was to a large extent automatic. Having introduced the new concept of linear transformation, we merely let some of the preceding concepts suggest ways in which they are connected with linear transformations. We now begin the proper study of linear transformations. As a first application of the theory we shall solve the problems arising from a change of basis. These problems can be formulated without mentioning linear transformations, but their solution is most effectively given in terms of linear transformations.
Let \(\mathcal{V}\) be an \(n\)-dimensional vector space and let \(\mathcal{X}=\{x_{1}, \ldots, x_{n}\}\) and \(\mathcal{Y}=\{y_{1}, \ldots, y_{n}\}\) be two bases in \(\mathcal{V}\). We may ask the following two questions.
Question I. If \(x\) is in \(\mathcal{V}\), \(x=\sum_{i} \xi_{i} x_{i}=\sum_{i} \eta_{i} y_{i}\), what is the relation between its coordinates \((\xi_{1}, \ldots, \xi_{n})\) with respect to \(\mathcal{X}\) and its coordinates \((\eta_{1}, \ldots, \eta_{n})\) with respect to \(\mathcal{Y}\)?
Question II. If \((\xi_{1}, \ldots, \xi_{n})\) is an ordered set of \(n\) scalars, what is the relation between the vectors \(x=\sum_{i} \xi_{i} x_{i}\) and \(y=\sum_{i} \xi_{i} y_{i}\)?
Both these questions are easily answered in the language of linear transformations. We consider, namely, the linear transformation \(A\) defined by \(A x_{i}=y_{i}\), \(i=1, \ldots, n\). More explicitly: \[A\Big(\sum_{i} \xi_{i} x_{i}\Big)=\sum_{i} \xi_{i} y_{i}.\] Let \((\alpha_{i j})\) be the matrix of \(A\) in the basis \(\mathcal{X}\), that is, \(y_{j}=A x_{j}=\sum_{i} \alpha_{i j} x_{i}\). We observe that \(A\) is invertible: if \(Ax=\sum_{i} \xi_{i} y_{i}=0\), then, since the \(y_{i}\) are linearly independent, \(\xi_{1}=\xi_{2}=\cdots=\xi_{n}=0\), so that \(x=0\).
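As a concrete illustration (the particular bases and numbers here are chosen only for the sake of example), suppose that \(n=2\), \(y_{1}=x_{1}+x_{2}\), and \(y_{2}=x_{1}+2 x_{2}\). Reading off the coordinates of \(y_{1}\) and \(y_{2}\) with respect to \(\mathcal{X}\), column by column, we get \[(\alpha_{i j})=\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix},\] and \(A\) is invertible because \(y_{1}\) and \(y_{2}\) are linearly independent.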
Answer to question I. Since \begin{align} \sum_{j} \eta_{j} y_{j} &= \sum_{j} \eta_{j} A x_{j}\\ &= \sum_{j} \eta_{j} \sum_{i} \alpha_{i j} x_{i} \\ &= \sum_{i}\Big(\sum_{j} \alpha_{i j} \eta_{j}\Big) x_{i}, \end{align} and since the coordinates of \(x\) with respect to \(\mathcal{X}\) are uniquely determined, we have \[\xi_{i}=\sum_{j} \alpha_{i j} \eta_{j}. \tag{1}\]
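In the two-dimensional illustration above, a vector whose coordinates with respect to \(\mathcal{Y}\) are \((\eta_{1}, \eta_{2})\) is \[x=\eta_{1}(x_{1}+x_{2})+\eta_{2}(x_{1}+2 x_{2})=(\eta_{1}+\eta_{2}) x_{1}+(\eta_{1}+2 \eta_{2}) x_{2},\] so that \(\xi_{1}=\eta_{1}+\eta_{2}\) and \(\xi_{2}=\eta_{1}+2 \eta_{2}\), in agreement with (1); for instance, \((\eta_{1}, \eta_{2})=(1,1)\) gives \((\xi_{1}, \xi_{2})=(2,3)\).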
Answer to question II. \[y=A x. \tag{2}\]
Roughly speaking, the invertible linear transformation \(A\) (or, more properly, the matrix \((\alpha_{i j})\)) may be considered as a transformation of coordinates (as in (1)), or it may be considered (as we usually consider it, in (2)) as a transformation of vectors.
In classical treatises on vector spaces it is customary to treat vectors as numerical \(n\)-tuples, rather than as abstract entities; this necessitates the introduction of some cumbersome terminology. We give here a brief glossary of some of the more baffling terms and notations that arise in connection with dual spaces and adjoint transformations.
If \(\mathcal{V}\) is an \(n\)-dimensional vector space, a vector \(x\) is given by its coordinates with respect to some preferred, absolute coordinate system; these coordinates form an ordered set of \(n\) scalars. It is customary to write this set of scalars in a column, \[x=\begin{bmatrix} \xi_{1} \\ \xi_{2} \\ \vdots \\ \xi_{n} \end{bmatrix}.\] Elements of the dual space \(\mathcal{V}^{\prime}\) are written as rows, \(x^{\prime}=(\xi_{1}^{\prime}, \ldots, \xi_{n}^{\prime})\). If we think of \(x\) as a (rectangular) \(n\)-by-one matrix, and of \(x^{\prime}\) as a one-by-\(n\) matrix, then the matrix product \(x^{\prime} x\) is a one-by-one matrix, that is, a scalar; in our notation this scalar is \([x, x^{\prime}]=\xi_{1} \xi_{1}^{\prime}+\cdots+\xi_{n} \xi_{n}^{\prime}\).

The trick of considering vectors as thin matrices works even when we consider the full-grown matrices of linear transformations. Thus the matrix product of \((\alpha_{i j})\) with the column \((\xi_{j})\) is the column whose \(i\)-th element is \(\eta_{i}=\sum_{j} \alpha_{i j} \xi_{j}\). Instead of worrying about dual bases and adjoint transformations, we may similarly form the product of the row \((\xi_{j}^{\prime})\) with the matrix \((\alpha_{i j})\), in the order \((\xi_{j}^{\prime})(\alpha_{i j})\); the result is the row that we earlier denoted by \(y^{\prime}=A^{\prime} x^{\prime}\). The expression \([A x, x^{\prime}]\) is now abbreviated as \(x^{\prime} \cdot A \cdot x\); both dots denote ordinary matrix multiplication.

The vectors \(x\) in \(\mathcal{V}\) are called contravariant and the vectors \(x^{\prime}\) in \(\mathcal{V}^{\prime}\) are called covariant. Since the notion of the product \(x^{\prime} \cdot x\) (that is, \([x, x^{\prime}]\)) depends, from this point of view, on the coordinates of \(x\) and \(x^{\prime}\), it becomes relevant to ask the following question: if we change basis in \(\mathcal{V}\), in accordance with the invertible linear transformation \(A\), what must we do in \(\mathcal{V}^{\prime}\) to preserve the product \(x^{\prime} \cdot x\)? In our notation: if \([x, x^{\prime}]=[y, y^{\prime}]\), where \(y=A x\), then how is \(y^{\prime}\) related to \(x^{\prime}\)? Answer: \(y^{\prime}=(A^{\prime})^{-1} x^{\prime}\). To express this whole tangle of ideas, the classical terminology says that the vectors \(x\) vary cogrediently whereas the \(x^{\prime}\) vary contragrediently.
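One way to verify this answer uses nothing but the defining property \([A x, x^{\prime}]=[x, A^{\prime} x^{\prime}]\) of the adjoint: if \(y=A x\) and \(y^{\prime}=(A^{\prime})^{-1} x^{\prime}\), then \[[y, y^{\prime}]=[A x,(A^{\prime})^{-1} x^{\prime}]=[x, A^{\prime}(A^{\prime})^{-1} x^{\prime}]=[x, x^{\prime}].\] In the matrix notation of this glossary the same computation reads as follows: if the column \(x\) is replaced by \(A x\) and the row \(x^{\prime}\) by \(x^{\prime} A^{-1}\), then \((x^{\prime} A^{-1})(A x)=x^{\prime}(A^{-1} A) x=x^{\prime} x\).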