Consider a matrix \(X_{n\times p}\) with \(n >> p\). It induces a linear map \(X: \mathbb{R}^p \to \mathbb{R}^n\), \(v \mapsto u = Xv\).
We can find a maximizer of \(f(v)=|Xv|^2\) subject to \(g(v)=|v|^2 = 1\) using Lagrange Multiplier \(\lambda\):
\[X^T X v = \lambda v, \qquad v^T v = 1.\]
Say \(v_1\) is a solution (maybe nonunique) and denote the corresponding multiplier as \(\lambda_1\). Let \(\sigma_1 = |Xv_1|\). Then \(u_1 = \frac{1}{\sigma_1} X v_1\) is a unit vector in the target space. Moreover,
\[\sigma_1^2 = v_1^T X^TXv_1 = \lambda_1 v_1^T v_1 = \lambda_1.\]
Therefore, \(\sigma_1 = \sqrt{\lambda_1}\).
A key observation for the induction step is: for any \(v \perp v_1\):
\[(Xv)\cdot u_1 = v^{T} X^T X v_1 = \lambda_1 v^T v_1 = 0.\]
That is, \(Xv \perp u_1\). Therefore, we have the restricted linear map \(X: \langle v_1\rangle^{\perp} \to \langle u_1\rangle^{\perp}\). By Induction, we find orthogonal vectors \((v_k, \sigma_k, u_k): 1\le k \le p\) such that
\[X(v_1, \dots, v_p) = (\sigma_1 u_1, \dots, \sigma_p u_p), \text{ or equally, } XV= U\Sigma. \]
Next, we extend \((u_1, \dots, u_p)\) to a complete orthonormal basis of the target space, \(U^e = (u_1, \dots, u_p, u_{p+1}, \dots, u_n)\). To adapt the above matrix formulation, we extend the \(p\times p\) diagonal matrix \(\Sigma\) to a \(n\times p\) diagonal matrix \(\Sigma^e\) by appending \((n-p)\times p\) rows of zeros. Then we have
\[XV= U^e \Sigma^e, \text{ or equally, } X= U^e\Sigma^e V^T.\]
For convenience, we usually drop the supscript \(e\) from both notations and write the SVD of \(X\) as \(X= U\Sigma V^T\).
Leave a Reply