The following two questions are closely related to those of the preceding section.

Question III. If \(B\) is a linear transformation on \(\mathcal{V}\) , what is the relation between its matrix \((\beta_{i j})\) with respect to \(\mathcal{X}\) and its matrix \((\gamma_{i j})\) with respect to \(\mathcal{Y}\) ?

Question IV. If \((\beta_{i j})\) is a matrix, what is the relation between the linear transformations \(B\) and \(C\) defined, respectively, by \(B x_{j}=\sum_{i} \beta_{i j} x_{i}\) and \(C y_{j}=\sum_{i} \beta_{i j} y_{i}\) ?

Questions III and IV are explicit formulations of a problem we raised before: to one transformation there correspond (in different coordinate systems) many matrices (question III) and to one matrix there correspond many transformations (question IV).

Answer to question III. We have \[B x_{j}=\sum_{i} \beta_{i j} x_{i} \tag{1}\] and \[B y_{j}=\sum_{i} \gamma_{i j} y_{i}. \tag{2}\] Using the linear transformation \(A\) defined in the preceding section, we may write \begin{align} B y_{j} &= B A x_{j} \\ &= B\Big(\sum_{k} \alpha_{k j} x_{k}\Big)\\ &= \sum_{k} \alpha_{k j} B x_{k}\\ &= \sum_{k} \alpha_{k j} \sum_{i} \beta_{i k} x_{i}\\ &= \sum_{i}\Big(\sum_{k} \beta_{i k} \alpha_{k j}\Big) x_{i} \tag{3} \end{align} and \begin{align} \sum_{k} \gamma_{k j} y_{k} &= \sum_{k} \gamma_{k j} A x_{k}\\ &= \sum_{k} \gamma_{k j} \sum_{i} \alpha_{i k} x_{i}\\ &= \sum_{i} \Big(\sum_{k} \alpha_{i k} \gamma_{k j}\Big) x_{i}. \tag{4} \end{align} Comparing (2), (3), and (4), we see that \[\sum_{k} \alpha_{i k} \gamma_{k j}=\sum_{k} \beta_{i k} \alpha_{k j}.\] Using matrix multiplication, and writing \([A]\), \([B]\), and \([C]\) for the matrices \((\alpha_{i j})\), \((\beta_{i j})\), and \((\gamma_{i j})\), respectively, we may express this in the dangerously simple form \[[A][C]=[B][A]. \tag{5}\] The danger lies in the fact that three of the four matrices in (5) correspond to their linear transformations in the basis \(\mathcal{X}\); the fourth one – namely, the one we denoted by \([C]\) – corresponds to \(B\) in the basis \(\mathcal{Y}\). With this understanding, however, (5) is correct. A more usual form of (5), adapted, in principle, to computing \([C]\) when \([A]\) and \([B]\) are known, is \[[C]=[A]^{-1}[B][A]. \tag{6}\] 
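
A small numerical illustration of (5) and (6) may be helpful; the particular bases and transformation below are chosen only for convenience. Suppose that \(\mathcal{V}=\mathbb{C}^{2}\), that \(\mathcal{X}\) is the basis \(\{(1,0),(0,1)\}\), and that \(\mathcal{Y}\) is the basis \(\{y_{1}, y_{2}\}\) with \(y_{1}=x_{1}\) and \(y_{2}=x_{1}+x_{2}\), so that \[[A]=\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}.\] If \(B\) is the transformation that interchanges \(x_{1}\) and \(x_{2}\), so that \[[B]=\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix},\] then \(B y_{1}=x_{2}=-y_{1}+y_{2}\) and \(B y_{2}=x_{1}+x_{2}=y_{2}\), whence \[[C]=\begin{bmatrix} -1 & 0 \\ 1 & 1 \end{bmatrix};\] the same matrix is obtained by computing \([A]^{-1}[B][A]\) directly.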

Answer to question IV. To bring out the essentially geometric character of this question and its answer, we observe that \[C y_{j}=C A x_{j}\] and \begin{align} \sum_{i} \beta_{i j} y_{i} &= \sum_{i} \beta_{i j} A x_{i}\\ &= A\Big(\sum_{i} \beta_{i j} x_{i}\Big)\\ &= A B x_{j}. \end{align} Hence \(C\) is such that \[C A x_{j}=A B x_{j},\] or, finally, \[C=A B A^{-1}. \tag{7}\] There is no trouble with (7) similar to the one that caused us to make a reservation about the interpretation of (6); to find the linear transformation (not matrix) \(C\) , we multiply the transformations \(A\) , \(B\) , and \(A^{-1}\) , and nothing needs to be said about coordinate systems. Compare, however, the formulas (6) and (7), and observe once more the innate perversity of mathematical symbols. This is merely another aspect of the facts already noted in Sections 37 and 38.
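
The example used above to illustrate (6) serves also to illustrate (7). With \(A\) and \(B\) as before (so that \(y_{1}=x_{1}\), \(y_{2}=x_{1}+x_{2}\), and \(B\) interchanges \(x_{1}\) and \(x_{2}\)), the transformation \(C=A B A^{-1}\) satisfies \[C y_{1}=A B A^{-1} y_{1}=A B x_{1}=A x_{2}=y_{2},\] and, similarly, \(C y_{2}=A B x_{2}=A x_{1}=y_{1}\); in other words, \(C\) does to the \(y\)'s exactly what \(B\) does to the \(x\)'s, as the defining equation \(C y_{j}=\sum_{i} \beta_{i j} y_{i}\) requires.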

Two matrices \([B]\) and \([C]\) are called similar if there exists an invertible matrix \([A]\) satisfying (6); two linear transformations \(B\) and \(C\) are called similar if there exists an invertible transformation \(A\) satisfying (7). In this language the answers to questions III and IV can be expressed very briefly; in both cases the answer is that the given matrices or transformations must be similar.

Having obtained the answer to question IV, we see now that there are too many subscripts in its formulation. The validity of (7) is a geometric fact quite independent of linearity, finite-dimensionality, or any other accidental property that \(A, B\) , and \(C\) may possess; the answer to question IV is also the answer to a much more general question. This geometric question, a paraphrase of the analytic formulation of question IV, is this: If \(B\) transforms \(\mathcal{V}\) , and if \(C\) transforms \(A \mathcal{V}\) the same way, what is the relation between \(B\) and \(C\) ? The expression "the same way" is not so vague as it sounds; it means that if \(B\) takes \(x\) into, say, \(u\) , then \(C\) takes \(A x\) into \(A u\) . The answer is, of course, the same as before: since \(B x=u\) and \(C y=v\) (where \(y=A x\) and \(v=A u\) ), we have \[A B x=A u=v=C y=C A x.\] The situation is conveniently summed up in the following mnemonic diagram: \begin{align} \begin{array}{rcl} \ & B & \ \\ x & \longrightarrow & u\\ A \,\, \big\downarrow & \ & \big\downarrow \,\, A\\ y & \longrightarrow & v \\ \ & C & \ \end{array} \end{align} We may go from \(y\) to \(v\) by using the short cut \(C\) , or by going around the block; in other words \(C=A B A^{-1}\) . Remember that \(A B A^{-1}\) is to be applied to \(y\) from right to left: first \(A^{-1}\) , then \(B\) , then \(A\) .

We have seen that the theory of changing bases is coextensive with the theory of invertible linear transformations. An invertible linear transformation is an automorphism, where by an automorphism we mean an isomorphism of a vector space with itself (see Section: Isomorphism). We observe that, conversely, every automorphism is an invertible linear transformation.

We hope that the relation between linear transformations and matrices is by now sufficiently clear that the reader will not object if in the sequel, when we wish to give examples of linear transformations with various properties, we content ourselves with writing down a matrix. The interpretation always to be placed on this procedure is that we have in mind the concrete vector space \(\mathbb{C}^{n}\) (or one of its generalized versions \(\mathbb{F}^{n}\) ) and the concrete basis \(\mathcal{X}=\{x_{1}, \ldots, x_{n}\}\) defined by \(x_{i}=(\delta_{i 1}, \ldots, \delta_{i n})\) . With this understanding, a matrix \((\alpha_{i j})\) defines, of course, a unique linear transformation \(A\) , given by the usual formula \[A\Big(\sum_{i} \xi_{i} x_{i}\Big)=\sum_{i}\Big(\sum_{j} \alpha_{i j} \xi_{j}\Big) x_{i}.\] 
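
For example (to take a matrix at random), on \(\mathbb{C}^{2}\) with the basis just described, the matrix \[\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\] defines the transformation \(A\) given by \(A(\xi_{1}, \xi_{2})=(\xi_{1}+2 \xi_{2},\; 3 \xi_{1}+4 \xi_{2})\).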

EXERCISES

Exercise 1. If \(A\) is a linear transformation from a vector space \(\mathcal{U}\) to a vector space \(\mathcal{V}\), then corresponding to each fixed \(y\) in \(\mathcal{V}^{\prime}\) there exists a vector in \(\mathcal{U}^{\prime}\), which might as well be denoted by \(A^{\prime} y\), such that \[[A x, y]=[x, A^{\prime} y]\] for all \(x\) in \(\mathcal{U}\). Prove that \(A^{\prime}\) is a linear transformation from \(\mathcal{V}^{\prime}\) to \(\mathcal{U}^{\prime}\). (The transformation \(A^{\prime}\) is called the adjoint of \(A\).) Interpret and prove as many as possible of the equations (2)-(8) of Section: Adjoints for this concept of adjoint.

Exercise 2. 

  1. Prove that similarity of linear transformations on a vector space is an equivalence relation (that is, it is reflexive, symmetric, and transitive).
  2. If \(A\) is similar to a scalar \(\alpha\) , then \(A=\alpha\) .
  3. If \(A\) and \(B\) are similar, then so also are \(A^{2}\) and \(B^{2}\) , \(A^{\prime}\) and \(B^{\prime}\) , and, in case \(A\) and \(B\) are invertible, \(A^{-1}\) and \(B^{-1}\) .
  4. Generalize the concept of similarity to two transformations defined on different vector spaces. Which of the preceding results remain valid for the generalized concept?

Exercise 3. 

  1. If \(A\) and \(B\) are linear transformations on the same vector space and if at least one of them is invertible, then \(A B\) and \(B A\) are similar.
  2. Does the conclusion of (1) remain valid if neither \(A\) nor \(B\) is invertible?

Exercise 4. If the matrix of a linear transformation \(A\) on \(\mathbb{C}^{2}\), with respect to the basis \(\{(1,0),(0,1)\}\), is \[\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix},\] what is the matrix of \(A\) with respect to the basis \(\{(1,1), (1,-1)\}\)? What about the basis \(\{(1,0),(1,1)\}\)?

Exercise 5. If the matrix of a linear transformation \(A\) on \(\mathbb{C}^{3}\), with respect to the basis \(\{(1,0,0),(0,1,0),(0,0,1)\}\), is \[\begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & -1 \\ -1 & -1 & 0 \end{bmatrix},\] what is the matrix of \(A\) with respect to the basis \(\{(0,1,-1),(1,-1,1),(-1,1,0)\}\)?

Exercise 6. 

  1. The construction of a matrix associated with a linear transformation depends on two bases, not one. Indeed, if \(\mathcal{X}=\{x_{1}, \ldots, x_{n}\}\) and \(\overline{\mathcal{X}}=\{\bar{x}_{1}, \ldots, \bar{x}_{n}\}\) are bases of \(\mathcal{V}\) , and if \(A\) is a linear transformation on \(\mathcal{V}\) , then the matrix \([A; \mathcal{X}, \overline{\mathcal{X}}]\) of \(A\) with respect to \(\mathcal{X}\) and \(\overline{\mathcal{X}}\) should be defined by \[A x_{j}=\sum_{i} \alpha_{i j} \bar{x}_{i}.\] The definition adopted in the text corresponds to the special case in which \(\overline{\mathcal{X}}=\mathcal{X}\) . The special case leads to the definition of similarity ( \(B\) and \(C\) are similar if there exist bases \(\mathcal{X}\) and \(\mathcal{Y}\) such that \([B; \mathcal{X}]=[C; \mathcal{Y}]\) ). The analogous relation suggested by the general case is called equivalence; \(B\) and \(C\) are equivalent if there exist basis pairs \((\mathcal{X}, \overline{\mathcal{X}})\) and \((\mathcal{Y}, \overline{\mathcal{Y}})\) such that \([B; \mathcal{X}, \overline{\mathcal{X}}] = [C; \mathcal{Y}, \overline{\mathcal{Y}}]\) . Prove that this notion is indeed an equivalence relation.
  2. Two linear transformations \(B\) and \(C\) are equivalent if and only if there exist invertible linear transformations \(P\) and \(Q\) such that \(P B=C Q\) .
  3. If \(A\) and \(B\) are equivalent, then so also are \(A^{\prime}\) and \(B^{\prime}\) .
  4. Does there exist a linear transformation \(A\) such that \(A\) is equivalent to a scalar \(\alpha\) , but \(A \neq \alpha\) ?
  5. Do there exist linear transformations \(A\) and \(B\) such that \(A\) and \(B\) are equivalent, but \(A^{2}\) and \(B^{2}\) are not?
  6. Generalize the concept of equivalence to two transformations defined on different vector spaces. Which of the preceding results remain valid for the generalized concept?