Let us now pick up the loose threads; having introduced the new concept of linear transformation, we must now find out what it has to do with the old concepts of bases, linear functionals, etc.

One of the most important tools in the study of linear transformations on finite-dimensional vector spaces is the concept of a matrix. Since this concept usually has no decent analogue in infinite-dimensional spaces, and since it is possible in most considerations to do without it, we shall try not to use it in proving theorems. It is, however, important to know what a matrix is; we enter now into the detailed discussion.

Definition 1. Let \(\mathcal{V}\) be an \(n\)-dimensional vector space, let \(\mathcal{X}=\{x_{1}, \ldots, x_n\}\) be any basis of \(\mathcal{V}\), and let \(A\) be a linear transformation on \(\mathcal{V}\). Since every vector is a linear combination of the \(x_{i}\), we have in particular \[A x_{j}=\sum_{i} \alpha_{i j} x_{i}\] for \(j=1, \ldots, n\). The set \((\alpha_{i j})\) of \(n^{2}\) scalars, indexed with the double subscript \(i\), \(j\), is the matrix of \(A\) in the coordinate system \(\mathcal{X}\); we shall generally denote it by \([A]\), or, if it becomes necessary to indicate the particular basis \(\mathcal{X}\) under consideration, by \([A; \mathcal{X}]\). A matrix \((\alpha_{i j})\) is usually written in the form of a square array: \[[A]=\begin{bmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1 n} \\ \alpha_{21} & \alpha_{22} & \cdots & \alpha_{2 n} \\ \vdots & \vdots & \ddots & \vdots \\ \alpha_{n 1} & \alpha_{n 2} & \cdots & \alpha_{n n} \end{bmatrix};\] the scalars \((\alpha_{i 1}, \ldots, \alpha_{i n})\) form a row, and \((\alpha_{1 j}, \ldots, \alpha_{n j})\) a column, of \([A]\).
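To fix the ideas, here is a small supplementary illustration (not part of the definition itself). Take \(n=2\) and let \(A\) be the linear transformation determined by \(A x_{1}=x_{1}\) and \(A x_{2}=x_{1}+x_{2}\). Writing each \(A x_{j}\) in the form \(\sum_{i} \alpha_{i j} x_{i}\) gives \(\alpha_{11}=1\), \(\alpha_{21}=0\), \(\alpha_{12}=1\), \(\alpha_{22}=1\), so that \[[A]=\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix};\] the coefficients of \(A x_{2}\) appear as the second column.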

This definition does not define "matrix"; it defines "the matrix associated under certain conditions with a linear transformation." It is often useful to consider a matrix as something existing in its own right as a square array of scalars; in general, however, a matrix in this book will be tied up with a linear transformation and a basis.

We comment on notation. It is customary to use the same symbol, say, \(A\), for the matrix as for the transformation. The justification for this is to be found in the discussion below (of properties of matrices). We do not follow this custom here, because one of our principal aims, in connection with matrices, is to emphasize that they depend on a coordinate system (whereas the notion of linear transformation does not), and to study how the relation between matrices and linear transformations changes as we pass from one coordinate system to another.

We call attention also to a peculiarity of the indexing of the elements \(\alpha_{i j}\) of a matrix \([A]\). A basis is a basis, and so far, although we usually indexed its elements with the first \(n\) positive integers, the order of the elements in it was entirely immaterial. It is customary, however, when speaking of matrices, to refer to, say, the first row or the first column. This language is justified only if we think of the elements of the basis \(\mathcal{X}\) as arranged in a definite order. Since in the majority of our considerations the order of the rows and the columns of a matrix is as irrelevant as the order of the elements of a basis, we did not include this aspect of matrices in our definition. It is important, however, to realize that the appearance of the square array associated with \([A]\) varies with the ordering of \(\mathcal{X}\).

Everything we shall say about matrices can, accordingly, be interpreted from two different points of view; either in strict accordance with the letter of our definition, or else following a modified definition which makes correspond a matrix (with ordered rows and columns) not merely to a linear transformation and a basis, but also to an ordering of the basis.

One more word to those in the know. It is a perversity not of the author, but of nature, that makes us write

\[A x_{j}=\sum_{i} \alpha_{i j} x_{i}\] 

instead of the more usual equation

\[A x_{i}=\sum_{j} \alpha_{i j} x_{j}.\] 

The reason is that we want the formulas for matrix multiplication and for the application of matrices to numerical vectors (that is, vectors \((\xi_{1}, \ldots, \xi_{n})\) in \(\mathbb{C}^{n}\)) to appear normal, and somewhere in the process of passing from vectors to their coordinates the indices turn around. To state our rule explicitly: write \(A x_{j}\) as a linear combination of \(x_{1}, \ldots, x_{n}\), and write the coefficients so obtained as the \(j\)-th column of the matrix \([A]\). (The first index on \(\alpha_{i j}\) is always the row index; the second one, the column index.)
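To see why this convention makes the numerical formulas come out right (a verification not spelled out above, but implicit in the rule), write \(x=\sum_{j} \xi_{j} x_{j}\) and compute \[A x=\sum_{j} \xi_{j} A x_{j}=\sum_{j} \xi_{j} \sum_{i} \alpha_{i j} x_{i}=\sum_{i}\Bigl(\sum_{j} \alpha_{i j} \xi_{j}\Bigr) x_{i},\] so that the coordinates \(\eta_{i}\) of \(A x\) are given by \(\eta_{i}=\sum_{j} \alpha_{i j} \xi_{j}\); this is exactly the familiar rule for applying the square array \([A]\) to the column of coordinates \((\xi_{1}, \ldots, \xi_{n})\).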

For an example we consider the differentiation transformation \(D\) on the space \(\mathcal{P}_{n}\), and the basis \(\{x_{1}, \ldots, x_{n}\}\) defined by \(x_{i}(t)=t^{i-1}\), \(i=1, \ldots, n\). What is the matrix of \(D\) in this basis? We have \begin{align} D x_{1} &= 0\,x_{1} + 0\,x_{2} + \cdots + 0\,x_{n-1} + 0\,x_{n} \nonumber\\ D x_{2} &= 1\,x_{1} + 0\,x_{2} + \cdots + 0\,x_{n-1} + 0\,x_{n} \nonumber\\ D x_{3} &= 0\,x_{1} + 2\,x_{2} + \cdots + 0\,x_{n-1} + 0\,x_{n} \tag{1}\\ &\;\;\vdots \nonumber\\ D x_{n} &= 0\,x_{1} + 0\,x_{2} + \cdots + (n-1)\,x_{n-1} + 0\,x_{n}, \nonumber \end{align} so that \[[D]=\begin{bmatrix} 0 & 1 & 0 & \cdots & 0 & 0\\ 0 & 0 & 2 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & n-1 \\ 0 & 0 & 0 & \cdots & 0 & 0 \end{bmatrix}. \tag{2}\]

The unpleasant phenomenon of indices turning around is seen by comparing (1) and (2).
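As a check (a small case worked out here for concreteness, not in the original example), take \(n=3\). A typical element of \(\mathcal{P}_{3}\) is \(x(t)=\xi_{1}+\xi_{2} t+\xi_{3} t^{2}\), with coordinates \((\xi_{1}, \xi_{2}, \xi_{3})\), and its derivative is \(\xi_{2}+2 \xi_{3} t\); correspondingly \[\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} \xi_{1} \\ \xi_{2} \\ \xi_{3} \end{bmatrix}=\begin{bmatrix} \xi_{2} \\ 2 \xi_{3} \\ 0 \end{bmatrix},\] in agreement with (2).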