We shall now restrict attention to the finite-dimensional case and draw certain easy conclusions from the theorem of the preceding section.
Definition 1. The rank, \(\rho(A)\), of a linear transformation \(A\) on a finite-dimensional vector space is the dimension of \(\mathcal{R}(A)\); the nullity, \(\nu(A)\), is the dimension of \(\mathcal{N}(A)\).
Theorem 1. If \(A\) is a linear transformation on an \(n\)-dimensional vector space, then \(\rho(A)=\rho(A^{\prime})\) and \(\nu(A)=n-\rho(A)\).
Proof. The theorem of the preceding section and Section: Annihilators, Theorem 1, together imply that \[\nu(A^{\prime})=n-\rho(A). \tag{1}\] Let \(\mathcal{X}=\{x_{1}, \ldots, x_{n}\}\) be any basis for which \(x_{1}, \ldots, x_{\nu}\) (where \(\nu=\nu(A)\)) are in \(\mathcal{N}(A)\); then, for any \(x=\sum_{i} \xi_{i} x_{i}\), we have \[A x=\sum_{i} \xi_{i} A x_{i}=\sum_{i=\nu+1}^{n} \xi_{i} A x_{i}.\] In other words, \(A x\) is a linear combination of the \(n-\nu\) vectors \(A x_{\nu+1}, \ldots, A x_{n}\); it follows that \(\rho(A) \leq n-\nu(A)\). Applying this result to \(A^{\prime}\) and using (1), we obtain \[\rho(A^{\prime}) \leq n-\nu(A^{\prime})=\rho(A). \tag{2}\] In (2) we may replace \(A\) by \(A^{\prime}\), obtaining \[\rho(A)=\rho(A^{\prime \prime}) \leq \rho(A^{\prime}); \tag{3}\] (2) and (3) together show that \[\rho(A)=\rho(A^{\prime}), \tag{4}\] and (1) and (4) together show that \[\nu(A^{\prime})=n-\rho(A^{\prime}). \tag{5}\] Replacing \(A\) by \(A^{\prime}\) in (5) gives, finally, \[\nu(A)=n-\rho(A), \tag{6}\] which concludes the proof of the theorem. ◻
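As a numerical illustration (not part of the original text), the identity \(\nu(A)=n-\rho(A)\) can be checked with NumPy on a concrete matrix; the nullity is computed independently of the rank by counting near-zero singular values, so the check is not circular. The matrix below is an illustrative choice.

```python
import numpy as np

# A singular 3x3 matrix: the third column is the sum of the first two,
# so the rank is 2 and the nullity should be 3 - 2 = 1.
A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
n = A.shape[0]

rank = np.linalg.matrix_rank(A)

# Nullity computed independently: the number of (near-)zero singular
# values equals the dimension of the null space N(A).
singular_values = np.linalg.svd(A, compute_uv=False)
nullity = int(np.sum(singular_values < 1e-10))

assert nullity == n - rank  # Theorem 1: nu(A) = n - rho(A)
```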
These results are usually discussed from a slightly different point of view. Let \(A\) be a linear transformation on an \(n\)-dimensional vector space, and let \(\mathcal{X} = \{x_{1}, \ldots, x_{n}\}\) be a basis in that space; let \([A]=(\alpha_{i j})\) be the matrix of \(A\) in the coordinate system \(\mathcal{X}\), so that \[A x_{j}=\sum_{i} \alpha_{i j} x_{i}.\] If \(x=\sum_{j} \xi_{j} x_{j}\), then \(A x=\sum_{j} \xi_{j} A x_{j}\); it follows that every vector in \(\mathcal{R}(A)\) is a linear combination of the \(A x_{j}\), and hence of any maximal linearly independent subset of the \(A x_{j}\). It follows that the maximal number of linearly independent \(A x_{j}\) is precisely \(\rho(A)\). In terms of the coordinates \((\alpha_{1 j}, \ldots, \alpha_{n j})\) of \(A x_{j}\) we may express this by saying that \(\rho(A)\) is the maximal number of linearly independent columns of the matrix \([A]\). Since (Section: Adjoints of projections) the columns of \([A^{\prime}]\) (the matrix being expressed in terms of the dual basis of \(\mathcal{X}\)) are the rows of \([A]\), it follows from Theorem 1 that \(\rho(A)\) is also the maximal number of linearly independent rows of \([A]\). Hence "the row rank of \([A]=\) the column rank of \([A]=\) the rank of \([A]\)."
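The equality of row rank and column rank can likewise be illustrated numerically (again, a supplementary sketch, not part of the original text): transposing a matrix interchanges its rows and columns, and NumPy's rank computation gives the same value for both.

```python
import numpy as np

# Rank equals both the row rank and the column rank: transposing a
# matrix swaps its rows and columns but leaves the rank unchanged.
rng = np.random.default_rng(0)
M = rng.integers(-3, 4, size=(4, 6)).astype(float)

assert np.linalg.matrix_rank(M) == np.linalg.matrix_rank(M.T)
```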
Theorem 2. If \(A\) is a linear transformation on the \(n\)-dimensional vector space \(\mathcal{V}\), and if \(\mathcal{H}\) is any \(h\)-dimensional subspace of \(\mathcal{V}\), then the dimension of \(A\mathcal{H}\) is \(\geq h-\nu(A)\).
Proof. Let \(\mathcal{K}\) be any subspace for which \(\mathcal{V} = \mathcal{H} \oplus \mathcal{K}\), so that if \(k\) is the dimension of \(\mathcal{K}\), then \(k=n-h\). Upon operating with \(A\) we obtain \[A\mathcal{V} = A\mathcal{H} + A\mathcal{K}.\] (The sum is not necessarily a direct sum; see Section: Calculus of subspaces.) Since \(A\mathcal{V}=\mathcal{R}(A)\) has dimension \(n-\nu(A)\), since the dimension of \(A \mathcal{K}\) is clearly \(\leq k=n-h\), and since the dimension of the sum is \(\leq\) the sum of the dimensions, it follows that the dimension of \(A\mathcal{H}\) is \(\geq (n-\nu(A))-(n-h)=h-\nu(A)\), as desired. ◻
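Theorem 2 can be checked numerically on a concrete example (a supplementary sketch; the particular matrix and subspace are illustrative choices): the image \(A\mathcal{H}\) is spanned by the images of a basis of \(\mathcal{H}\), so its dimension is the rank of the matrix whose columns are those images.

```python
import numpy as np

# A 4x4 matrix of rank 2, hence nullity nu(A) = 4 - 2 = 2.
A = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0]])
n = A.shape[0]
nu = n - np.linalg.matrix_rank(A)

# H = span of the first three standard basis vectors, so h = 3;
# its columns form a basis of the subspace H.
H = np.eye(4)[:, :3]
h = H.shape[1]

# dim(A H) is the rank of the matrix whose columns span A H.
dim_AH = np.linalg.matrix_rank(A @ H)

assert dim_AH >= h - nu  # Theorem 2: dim(A H) >= h - nu(A)
```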
Theorem 3. If \(A\) and \(B\) are linear transformations on a finite-dimensional vector space, then \begin{align} & \rho(A+B) \leq \rho(A)+\rho(B), \tag{7}\\ & \rho(A B) \leq \min \{\rho(A), \rho(B)\}, \tag{8} \end{align} and \[\nu(A B) \leq \nu(A)+\nu(B). \tag{9}\] If \(B\) is invertible, then \[\rho(A B)=\rho(B A)=\rho(A). \tag{10}\]
Proof. Since \((A B) x=A(B x)\), it follows that \(\mathcal{R}(A B)\) is contained in \(\mathcal{R}(A)\), so that \(\rho(A B) \leq \rho(A)\), or, in other words, the rank of a product is not greater than the rank of the first factor. Let us apply this auxiliary result to \(B^{\prime} A^{\prime}\); since \((A B)^{\prime}=B^{\prime} A^{\prime}\), this, together with Theorem 1 and what we already know, yields \(\rho(A B)=\rho(B^{\prime} A^{\prime}) \leq \rho(B^{\prime})=\rho(B)\), and hence (8). If \(B\) is invertible, then \[\rho(A)=\rho(A B \cdot B^{-1}) \leq \rho(A B)\] and \[\rho(A)=\rho(B^{-1} \cdot B A) \leq \rho(B A);\] together with (8) this yields (10). The inequality (7) is an immediate consequence of an argument we have already used in the proof of Theorem 2. The proof of (9) we leave as an exercise for the reader. (Hint: apply Theorem 2 with \(\mathcal{H} = B\mathcal{V} = \mathcal{R}(B)\).) Together the two formulas (8) and (9) are known as Sylvester's law of nullity. ◻
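All four relations of Theorem 3 can be exercised numerically (a supplementary sketch, not part of the original text; the factored random matrices below are an illustrative way of producing singular transformations of controlled rank).

```python
import numpy as np

def rho(M):
    """Rank of M."""
    return int(np.linalg.matrix_rank(M))

def nu(M):
    """Nullity of the square matrix M: nu(M) = n - rho(M)."""
    return M.shape[1] - int(np.linalg.matrix_rank(M))

rng = np.random.default_rng(1)
n = 5
# Singular matrices built as products of thin factors: rank(A) <= 2, rank(B) <= 3.
A = rng.standard_normal((n, 2)) @ rng.standard_normal((2, n))
B = rng.standard_normal((n, 3)) @ rng.standard_normal((3, n))

assert rho(A + B) <= rho(A) + rho(B)       # (7)
assert rho(A @ B) <= min(rho(A), rho(B))   # (8)
assert nu(A @ B) <= nu(A) + nu(B)          # (9): Sylvester's law of nullity

# An invertible factor leaves the rank unchanged: (10).
Binv = np.diag(np.arange(2.0, n + 2))      # diagonal, nonzero entries, invertible
assert rho(A @ Binv) == rho(Binv @ A) == rho(A)
```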