It is, of course, possible to generalize the considerations of the preceding section to multilinear forms and multiple tensor products. Instead of entering into that part of multilinear algebra, we proceed in a different direction; we go directly after determinants.

Suppose that \(A\) is a linear transformation on an \(n\)-dimensional vector space \(\mathcal{V}\) and let \(w\) be an alternating \(n\)-linear form on \(\mathcal{V}\). If we write \(\overline{A} w\) for the function defined by \[(\overline{A} w)(x_{1}, \ldots, x_{n}) = w(A x_{1}, \ldots, A x_{n}),\] then \(\overline{A} w\) is an alternating \(n\)-linear form on \(\mathcal{V}\), and, in fact, \(\overline{A}\) is a linear transformation on the space of such forms. Since (see Section: Alternating forms of maximal degree) that space is one-dimensional, it follows that \(\overline{A}\) is equal to multiplication by an appropriate scalar. In other words, there exists a scalar \(\delta\) such that \(\overline{A} w = \delta w\) for every alternating \(n\)-linear form \(w\). By this somewhat roundabout procedure (from \(A\) to \(\overline{A}\) to \(\delta\)) we have associated a uniquely determined scalar \(\delta\) with every linear transformation \(A\) on \(\mathcal{V}\); we call \(\delta\) the determinant of \(A\), and we write \(\delta = \det A\). Observe that \(\det\) is neither a scalar nor a transformation, but a function that associates a scalar with each linear transformation.
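For readers who want to experiment, here is a minimal numerical sketch (in Python with NumPy; the choice of form and matrix is ours, not the text's). In two dimensions \(w(x, y) = x_{1} y_{2} - x_{2} y_{1}\) is an alternating \(2\)-linear form, and the scalar \(\delta\) defined by \(\overline{A} w = \delta w\) can be read off from a single evaluation; it should agree with the library's determinant.

```python
import numpy as np

def w(x, y):
    # an alternating 2-linear form on R^2 (unique up to a scalar factor)
    return x[0] * y[1] - x[1] * y[0]

A = np.array([[2.0, 1.0], [3.0, 5.0]])
x, y = np.array([1.0, 4.0]), np.array([2.0, -1.0])

delta = w(A @ x, A @ y) / w(x, y)   # the scalar by which A-bar multiplies w
print(np.isclose(delta, np.linalg.det(A)))  # True
```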

Our immediate purpose is to study the function \(\det\). We begin by finding the determinants of the simplest linear transformations, that is, the multiplications by scalars. If \(A x = \alpha x\) for every \(x\) in \(\mathcal{V}\), then \begin{align} (\overline{A} w)(x_{1}, \ldots, x_{n}) &= w(\alpha x_{1}, \ldots, \alpha x_{n})\\ &= \alpha^{n} w(x_{1}, \ldots, x_{n}) \end{align} for every alternating \(n\)-linear form \(w\); it follows that \(\det A = \alpha^{n}\). We note, in particular, that \(\operatorname{det} 0 = 0\) and \(\operatorname{det} 1 = 1\).
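A one-line check of this computation, in the same sketch style as above (the matrix of the transformation \(x \mapsto \alpha x\) is \(\alpha\) times the identity):

```python
import numpy as np

n, alpha = 4, 3.0
A = alpha * np.eye(n)                # the transformation x -> alpha x
print(np.isclose(np.linalg.det(A), alpha ** n))  # True: det A = alpha^n
```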

Next we ask about the multiplicative properties of \(\det\). Suppose that \(A\) and \(B\) are linear transformations on \(\mathcal{V}\), and write \(C = A B\). If \(w\) is an alternating \(n\)-linear form, then \begin{align} (\overline{C} w)(x_{1}, \ldots, x_{n}) &= w(A B x_{1}, \ldots, A B x_{n})\\ &= (\overline{A} w)(B x_{1}, \ldots, B x_{n})\\ &= (\overline{B} \overline{A} w)(x_{1}, \ldots, x_{n}), \end{align} so that \(\overline{C} = \overline{B} \overline{A}\). Since \[\overline{C} w = (\operatorname{det} C) w\] and \[\overline{B} \overline{A} w = (\operatorname{det} B) \overline{A} w = (\operatorname{det} B)(\operatorname{det} A) w,\] it follows that \[\operatorname{det}(A B) = (\operatorname{det} A)(\operatorname{det} B).\] (The values of \(\det\) are scalars, and therefore commute with each other.)
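The multiplicative property is easy to observe numerically; a sketch (random matrices stand in for arbitrary transformations):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
# det(AB) = (det A)(det B), up to floating-point error
print(np.isclose(np.linalg.det(A @ B),
                 np.linalg.det(A) * np.linalg.det(B)))  # True
```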

A linear transformation \(A\) is called singular if \(\operatorname{det} A = 0\) and non-singular otherwise. Our next result is that \(A\) is invertible if and only if it is non-singular. Indeed, if \(A\) is invertible, then \[1 = \operatorname{det} 1 = \operatorname{det}(A A^{-1}) = (\operatorname{det} A)(\operatorname{det} A^{-1}),\] and therefore \(\operatorname{det} A \neq 0\). Suppose, on the other hand, that \(\operatorname{det} A \neq 0\). If \(\{x_{1}, \ldots, x_{n}\}\) is a basis in \(\mathcal{V}\), and if \(w\) is a non-zero alternating \(n\)-linear form on \(\mathcal{V}\), then \(w(A x_{1}, \ldots, A x_{n}) = (\operatorname{det} A)\, w(x_{1}, \ldots, x_{n}) \neq 0\), since \(w(x_{1}, \ldots, x_{n}) \neq 0\) by Section: Alternating forms, Theorem 3. This implies, by Section: Alternating forms, Theorem 2, that the set \(\{A x_{1}, \ldots, A x_{n}\}\) is linearly independent (and therefore a basis); from this, in turn, we infer that \(A\) is invertible.
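A numerical illustration of the equivalence (the example matrix is ours): a matrix with a repeated column direction is not invertible, and its determinant vanishes.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])   # second column is twice the first: singular
print(np.isclose(np.linalg.det(A), 0.0))      # True: det A = 0
print(np.linalg.matrix_rank(A) < A.shape[0])  # True: A is not invertible
```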

In the classical literature the determinant is defined as a function of matrices (not linear transformations); we are now in a position to make contact with that approach. We shall derive an expression for \(\operatorname{det} A\) in terms of the elements \(\alpha_{i j}\) of the matrix corresponding to \(A\) in some coordinate system \(\{x_{1}, \ldots, x_{n}\}\). Let \(w\) be a non-zero alternating \(n\)-linear form; we know that \[(\operatorname{det} A) w(x_{1}, \ldots, x_{n}) = w(A x_{1}, \ldots, A x_{n}). \tag{1}\] If we replace each \(A x_{j}\) in the right side of (1) by \(\sum_{i} \alpha_{i j} x_{i}\) and expand the result by multilinearity, we obtain a long linear combination of terms such as \(w(z_{1}, \ldots, z_{n})\), where each \(z\) is one of the \(x\)'s. (Compare this part of the argument with the proof of Section: Alternating forms, Theorem 3.) If, in such a term, two of the \(z\)'s coincide, then, since \(w\) is alternating, that term must vanish. If, on the other hand, all the \(z\)'s are distinct, then \(w(z_{1}, \ldots, z_{n}) = w(x_{\pi(1)}, \ldots, x_{\pi(n)})\) for some permutation \(\pi\), and, moreover, every permutation \(\pi\) occurs in this way. The coefficient of the term \(w(x_{\pi(1)}, \ldots, x_{\pi(n)})\) is the product \(\alpha_{\pi(1), 1} \ldots \alpha_{\pi(n), n}\). Since (Section: Alternating forms, Theorem 1) \(w\) is skew-symmetric, \(w(x_{\pi(1)}, \ldots, x_{\pi(n)}) = (\operatorname{sgn} \pi)\, w(x_{1}, \ldots, x_{n})\), and it follows that \[\operatorname{det} A = \sum_{\pi}(\operatorname{sgn} \pi)\, \alpha_{\pi(1), 1} \ldots \alpha_{\pi(n), n}, \tag{2}\] where the summation is extended over all permutations \(\pi\) in \(\mathcal{S}_{n}\). (Recall that \(w(x_{1}, \ldots, x_{n}) \neq 0\), by Section: Alternating forms, Theorem 3, so that division by \(w(x_{1}, \ldots, x_{n})\) is legitimate.)
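Equation (2) can be transcribed directly into code; the sketch below (0-based indices replace the text's 1-based ones, and the sign of \(\pi\) is computed by counting inversions) agrees with a library determinant on a sample matrix.

```python
import itertools
import numpy as np

def det_by_permutations(a):
    # equation (2): sum over pi of sgn(pi) * alpha_{pi(1),1} ... alpha_{pi(n),n}
    n = a.shape[0]
    total = 0.0
    for pi in itertools.permutations(range(n)):
        inversions = sum(pi[i] > pi[j]
                         for i in range(n) for j in range(i + 1, n))
        term = (-1.0) ** inversions          # sgn(pi)
        for j in range(n):
            term *= a[pi[j], j]              # alpha_{pi(j), j}
        total += term
    return total

A = np.array([[1.0, 2.0, 0.0], [3.0, 1.0, 4.0], [0.0, 2.0, 5.0]])
print(np.isclose(det_by_permutations(A), np.linalg.det(A)))  # True
```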

From this classical equation (2) we could derive many special properties of determinants by straightforward computation. Here is one example. If \(\sigma\) and \(\pi\) are permutations (in \(\mathcal{S}_{n}\)), then, since the indices \(\sigma(1), \ldots, \sigma(n)\) run through all of \(1, \ldots, n\), the products \(\alpha_{\pi(1), 1} \ldots \alpha_{\pi(n), n}\) and \(\alpha_{\pi \sigma(1), \sigma(1)} \ldots \alpha_{\pi \sigma(n), \sigma(n)}\) differ in the order of their factors only. If, for each \(\pi\), we take \(\sigma = \pi^{-1}\), and then alter each summand in (2) accordingly, we obtain \[\operatorname{det} A = \sum_{\pi}(\operatorname{sgn} \pi)\, \alpha_{1, \pi(1)} \ldots \alpha_{n, \pi(n)}.\] (Note that \(\operatorname{sgn} \pi = \operatorname{sgn} \pi^{-1}\) and that the sum over all \(\pi\) is the same as the sum over all \(\pi^{-1}\).) Since this last sum is just like the sum in (2), except that \(\alpha_{i, \pi(i)}\) appears in place of \(\alpha_{\pi(i), i}\), it follows from an application of (2) to \(A^{\prime}\) in place of \(A\) that \[\operatorname{det} A = \operatorname{det} A^{\prime}.\]
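The conclusion \(\operatorname{det} A = \operatorname{det} A^{\prime}\) is equally visible numerically (a sketch; the transpose of the matrix of \(A\) represents \(A^{\prime}\)):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
print(np.isclose(np.linalg.det(A), np.linalg.det(A.T)))  # True
```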

Here is another useful fact about determinants. If \(\mathcal{M}\) is a subspace invariant under \(A\) , if \(B\) is the transformation \(A\) considered on \(\mathcal{M}\) only, and if \(C\) is the quotient transformation \(A / \mathcal{M}\) , then \(\operatorname{det} A=\operatorname{det} B \cdot \operatorname{det} C\) . This multiplicative relation holds if, in particular, \(A\) is the direct sum of two transformations \(B\) and \(C\) . The proof can be based directly on the definition of determinants, or, alternatively, on the expansion obtained in the preceding paragraph.
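A sketch of the special case in which the invariant subspace \(\mathcal{M}\) is spanned by the first two vectors of a basis of a four-dimensional space, so that the matrix of \(A\) is block upper-triangular (the particular blocks below are ours):

```python
import numpy as np

B = np.array([[2.0, 1.0], [0.0, 3.0]])   # A restricted to M
C = np.array([[4.0, 0.0], [1.0, 5.0]])   # the quotient transformation A / M
X = np.array([[7.0, 8.0], [9.0, 6.0]])   # arbitrary coupling block
A = np.block([[B, X],
              [np.zeros((2, 2)), C]])
print(np.isclose(np.linalg.det(A),
                 np.linalg.det(B) * np.linalg.det(C)))  # True
```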

If, for a fixed linear transformation \(A\), we write \(p(\lambda) = \operatorname{det}(A - \lambda)\), then \(p\) is a function of the scalar \(\lambda\); we assert that it is, in fact, a polynomial of degree \(n\) in \(\lambda\), and that the coefficient of \(\lambda^{n}\) is \((-1)^{n}\). For the proof we may use the notation of (1). It is easy to see that \(w((A - \lambda) x_{1}, \ldots, (A - \lambda) x_{n})\) is a sum of terms of the form \((-\lambda)^{k} w(y_{1}, \ldots, y_{n})\), where \(y_{i} = x_{i}\) for exactly \(k\) values of \(i\) and \(y_{i} = A x_{i}\) for the remaining \(n - k\) values of \(i\) (\(k = 0, 1, \ldots, n\)). The term with \(k = n\) is \((-\lambda)^{n} w(x_{1}, \ldots, x_{n})\); division by \(w(x_{1}, \ldots, x_{n})\) therefore exhibits \(p\) as a polynomial of degree \(n\) with leading coefficient \((-1)^{n}\). The polynomial \(p\) is called the characteristic polynomial of \(A\); the equation \(p = 0\), that is, \(\operatorname{det}(A - \lambda) = 0\), is the characteristic equation of \(A\). The roots of the characteristic equation of \(A\) (that is, the scalars \(\alpha\) such that \(\operatorname{det}(A - \alpha) = 0\)) are called the characteristic roots of \(A\).
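A numerical sketch of these assertions (NumPy's `poly` returns the coefficients of \(\operatorname{det}(\lambda - A)\), which differs from \(p\) by the factor \((-1)^{n}\); the example matrix is ours):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])
n = A.shape[0]
coeffs = (-1) ** n * np.poly(A)     # coefficients of p, highest power first
lam = 1.5
p_direct = np.linalg.det(A - lam * np.eye(n))
print(np.isclose(np.polyval(coeffs, lam), p_direct))  # True: p(lam) matches
print(coeffs[0] == (-1) ** n)       # True: leading coefficient is (-1)^n
```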

EXERCISES

Exercise 1. Use determinants to get a new proof of the fact that if \(A\) and \(B\) are linear transformations on a finite-dimensional vector space, and if \(A B = 1\), then both \(A\) and \(B\) are invertible.

Exercise 2. If \(A\) and \(B\) are linear transformations such that \(A B = 0\), \(A \neq 0\), \(B \neq 0\), then \(\operatorname{det} A = \operatorname{det} B = 0\).

Exercise 3. Suppose that \((\alpha_{i j})\) is a non-singular \(n\)-by-\(n\) matrix, and suppose that \(A_{1}, \ldots, A_{n}\) are linear transformations (on the same vector space). Prove that if the linear transformations \(\sum_{j} \alpha_{i j} A_{j}\), \(i = 1, \ldots, n\), commute with each other, then the same is true of \(A_{1}, \ldots, A_{n}\).

Exercise 4. If \(\{x_{1}, \ldots, x_{n}\}\) and \(\{y_{1}, \ldots, y_{n}\}\) are bases in the same vector space, and if \(A\) is a linear transformation such that \(A x_{i} = y_{i}\), \(i = 1, \ldots, n\), then \(\operatorname{det} A \neq 0\).

Exercise 5. Suppose that \(\{x_{1}, \ldots, x_{n}\}\) is a basis in a finite-dimensional vector space \(\mathcal{V}\). If \(y_{1}, \ldots, y_{n}\) are vectors in \(\mathcal{V}\), write \(w(y_{1}, \ldots, y_{n})\) for the determinant of the linear transformation \(A\) such that \(A x_{j} = y_{j}\), \(j = 1, \ldots, n\). Prove that \(w\) is an alternating \(n\)-linear form.

Exercise 6. If, in accordance with Section: Determinants, (2), the determinant of a matrix \((\alpha_{i j})\) (not a linear transformation) is defined to be \(\sum_{\pi}(\operatorname{sgn} \pi) \alpha_{\pi(1), 1} \ldots \alpha_{\pi(n), n}\), then, for each linear transformation \(A\), the determinants of the matrices \([A; \mathcal{X}]\) are all equal to each other. (Here \(\mathcal{X}\) is an arbitrary basis.)

Exercise 7. If \((\alpha_{i j})\) is an \(n\)-by-\(n\) matrix such that \(\alpha_{i j} = 0\) for more than \(n^{2} - n\) pairs of values of \(i\) and \(j\), then \(\operatorname{det}(\alpha_{i j}) = 0\).

Exercise 8. If \(A\) and \(B\) are linear transformations on vector spaces of dimensions \(n\) and \(m\), respectively, then \[\operatorname{det}(A \otimes B) = (\operatorname{det} A)^{m} \cdot (\operatorname{det} B)^{n}.\]
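A numerical check of the asserted formula (the Kronecker product of the matrices represents \(A \otimes B\); this verifies an instance, it does not prove the exercise):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 3, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((m, m))
lhs = np.linalg.det(np.kron(A, B))
rhs = np.linalg.det(A) ** m * np.linalg.det(B) ** n
print(np.isclose(lhs, rhs))  # True
```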

Exercise 9. If \(A\), \(B\), \(C\), and \(D\) are matrices such that \(C\) and \(D\) commute and \(D\) is invertible, then (cf. Section: Matrices of transformations, Ex. 19) \[\operatorname{det}\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \operatorname{det}(A D - B C).\] (Hint: multiply on the right by \(\big[\begin{smallmatrix} 1 & 0 \\ X & 1 \end{smallmatrix}\big]\).) What if \(D\) is not invertible? What if \(C\) and \(D\) do not commute?
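A sketch that checks an instance of the identity; \(C\) and \(D\) are built as polynomials in one matrix so that they commute, and \(D\) is shifted to make it (generically) invertible:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((2, 2))
B = rng.standard_normal((2, 2))
M = rng.standard_normal((2, 2))
C = M @ M                      # C and D commute: both are polynomials in M
D = M + 3.0 * np.eye(2)        # generically invertible
block = np.block([[A, B], [C, D]])
print(np.isclose(np.linalg.det(block),
                 np.linalg.det(A @ D - B @ C)))  # True
```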

Exercise 10. Do \(A\) and \(A^{\prime}\) always have the same characteristic polynomial?

Exercise 11. 

  1. If \(A\) and \(B\) are similar, then \(\operatorname{det} A=\operatorname{det} B\) .
  2. If \(A\) and \(B\) are similar, then \(A\) and \(B\) have the same characteristic polynomial.
  3. If \(A\) and \(B\) have the same characteristic polynomial, then \(\operatorname{det} A=\operatorname{det} B\) .
  4. Is the converse of any of these assertions true?

Exercise 12. Determine the characteristic polynomial of the matrix (or, rather, of the linear transformation defined by the matrix) \[\begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ \alpha_{n-1} & \alpha_{n-2} & \alpha_{n-3} & \cdots & \alpha_{0} \end{bmatrix},\] and conclude that every polynomial of degree \(n\) with leading coefficient \((-1)^{n}\) is the characteristic polynomial of some linear transformation.
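A numerical sketch for this exercise (the sample coefficients are ours): for the matrix above, \(\operatorname{det}(\lambda - A)\) should be \(\lambda^{n} - \alpha_{0}\lambda^{n-1} - \cdots - \alpha_{n-1}\).

```python
import numpy as np

alpha = [2.0, -1.0, 3.0, 0.5]     # alpha_0, ..., alpha_{n-1}; here n = 4
n = len(alpha)
A = np.zeros((n, n))
A[:-1, 1:] = np.eye(n - 1)        # the superdiagonal of 1's
A[-1, :] = alpha[::-1]            # last row: alpha_{n-1}, ..., alpha_0
expected = [1.0] + [-a for a in alpha]   # det(lambda - A), highest power first
print(np.allclose(np.poly(A), expected))  # True
```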

Exercise 13. Suppose that \(A\) and \(B\) are linear transformations on the same finite-dimensional vector space.

  1. Prove that if \(A\) is a projection, then \(A B\) and \(B A\) have the same characteristic polynomial. (Hint: choose a basis that makes the matrix of \(A\) as simple as possible and then compute directly with matrices.)
  2. Prove that, in all cases, \(A B\) and \(B A\) have the same characteristic polynomial. (Hint: find an invertible \(P\) such that \(P A\) is a projection and apply part 1 to \(P A\) and \(B P^{-1}\).)
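A numerical check of part 2 (random matrices; it verifies an instance, not the general statement):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
# characteristic polynomial coefficients of AB and BA coincide
print(np.allclose(np.poly(A @ B), np.poly(B @ A)))  # True
```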