Category: OLS


  • 1. Linear regression seeks \(\hat\beta\) minimizing \(\|y – X\beta\|^2\), yielding the OLS estimator: \[\hat\beta = (X^TX)^{-1}X^Ty\] The feature matrix \(X\) encodes \(n\) sample points in \(\mathbb{R}^p\). The matrix \(X^TX\) captures the geometric spread of these points. 2. PCA as a Description of Sample Spread Centering \(\tilde{X} = X – \mathbf{1}\bar{x}^T\), the sample covariance \(S =…