Principal Component Analysis

Intro

PCA is one of the most important and widely used dimensionality reduction algorithms. Other dimensionality reduction algorithms include LDA, LLE, and Laplacian Eigenmaps.
You can regard PCA as a one-layer neural network that maps an n-dimensional input to an m-dimensional output.
PCA projects n-dimensional data down to m dimensions while retaining the maximum variance.
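
As a quick illustration, here is a minimal sketch using scikit-learn's PCA (the data and shapes are made up for demonstration):

```python
import numpy as np
from sklearn.decomposition import PCA

# Toy data: P = 200 samples in n = 5 dimensions (made up for illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))

# Reduce from 5 dimensions to m = 2, keeping the directions of maximum variance
pca = PCA(n_components=2)
Y = pca.fit_transform(X)
print(Y.shape)  # (200, 2)
```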

PCA

We want to reduce the data from n dimensions to m dimensions.
$$Y = AX$$

$$Y \in \mathbb{R}^{m \times 1}, \quad A \in \mathbb{R}^{m \times n}, \quad X \in \mathbb{R}^{n \times 1}$$

$$A = \begin{bmatrix} -a_1- \\ -a_2- \\ \vdots \\ -a_m- \end{bmatrix}, \qquad Y = \begin{bmatrix} a_1(x - \bar{x}) \\ a_2(x - \bar{x}) \\ \vdots \\ a_m(x - \bar{x}) \end{bmatrix}$$

$$\left\{\begin{array}{l} Y = A(X - \bar{X}) \\ \bar{x} = E(X) = \dfrac{1}{P}\sum\limits_{i=1}^{P} x_i \end{array}\right.$$
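
In code, the centering step is straightforward (a numpy sketch with the usual rows-as-samples convention; shapes are illustrative):

```python
import numpy as np

# P samples as rows, each an n-dimensional vector
P, n = 200, 5
rng = np.random.default_rng(0)
X = rng.normal(size=(P, n))

x_bar = X.mean(axis=0)      # \bar{x} = (1/P) * sum_i x_i
X_centered = X - x_bar      # the (X - \bar{X}) term in Y = A(X - \bar{X})
```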

For the first direction $a_1$, write $y_i = a_1(x_i - \bar{x})$ and maximize the variance of the projected data:

$$\max \sum\limits_{i=1}^{P}(y_i - \bar{y})^2$$

The projected mean vanishes,

$$\bar{y} = \frac{1}{P}\sum\limits_{i=1}^{P} y_i = \frac{a_1}{P}\left(\sum\limits_{i=1}^{P} x_i - P\bar{x}\right) = 0,$$

so the objective becomes

$$\sum\limits_{i=1}^{P}(y_i - \bar{y})^2 = \sum\limits_{i=1}^{P}\left[a_1(x_i - \bar{x})\right]^2 = \sum\limits_{i=1}^{P} a_1\left[(x_i - \bar{x})(x_i - \bar{x})^T\right] a_1^T = a_1\left[\sum\limits_{i=1}^{P}(x_i - \bar{x})(x_i - \bar{x})^T\right] a_1^T = a_1 \Sigma a_1^T$$
$\Sigma$ is the covariance matrix.
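
A numpy sketch of $\Sigma$ as used above (the unnormalized scatter form; dividing by $P$ or $P-1$ rescales the eigenvalues but leaves the eigenvectors unchanged):

```python
import numpy as np

P, n = 200, 5
rng = np.random.default_rng(0)
X = rng.normal(size=(P, n))
X_centered = X - X.mean(axis=0)

# Sigma = sum_i (x_i - x_bar)(x_i - x_bar)^T, an n x n symmetric matrix
Sigma = X_centered.T @ X_centered
```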

Our target:
$$\left\{\begin{array}{l} \max \ \ a_1 \Sigma a_1^T \\ \text{s.t.} \ \ \ a_1 a_1^T = \|a_1\|^2 = 1 \end{array}\right.$$
Apply the Lagrange multiplier method:
$$E(a_1) = a_1 \Sigma a_1^T - \lambda_1(a_1 a_1^T - 1)$$

$$\frac{\partial E}{\partial a_1} = \left(\Sigma a_1^T - \lambda_1 a_1^T\right)^T = 0$$

$$\Sigma a_1^T = \lambda_1 a_1^T$$
So $\lambda_1$ is an eigenvalue of $\Sigma$ and $a_1^T$ is the corresponding eigenvector. Left-multiplying by $a_1$ gives $a_1 \Sigma a_1^T = \lambda_1 a_1 a_1^T = \lambda_1$, so the variance is maximized when $\lambda_1$ is the largest eigenvalue of $\Sigma$ and $a_1$ is its unit eigenvector.
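
This can be checked numerically (a sketch; `np.linalg.eigh` handles symmetric matrices and returns eigenvalues in ascending order, with unit-norm eigenvectors):

```python
import numpy as np

P, n = 200, 5
rng = np.random.default_rng(0)
X = rng.normal(size=(P, n))
Xc = X - X.mean(axis=0)
Sigma = Xc.T @ Xc

eigvals, eigvecs = np.linalg.eigh(Sigma)   # ascending eigenvalues
a1 = eigvecs[:, -1]                        # unit eigenvector of the largest one

# a1 Sigma a1^T equals lambda_1, the maximum projected variance
print(np.allclose(a1 @ Sigma @ a1, eigvals[-1]))  # True
```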

For the second direction, we additionally require $a_2$ to be orthogonal to $a_1$:

$$\left\{\begin{array}{l} \max \ \ a_2 \Sigma a_2^T \\ \text{s.t.} \ \ \ a_2 a_2^T = \|a_2\|^2 = 1, \ \ a_2 a_1^T = 0 \end{array}\right.$$
…(same method)
$\lambda_2$ is the second largest eigenvalue of $\Sigma$, and $a_2$ is its eigenvector.
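
Since $\Sigma$ is symmetric, eigenvectors belonging to distinct eigenvalues are automatically orthogonal, so the constraint $a_2 a_1^T = 0$ comes for free from the eigendecomposition. A quick check (sketch, same setup as above):

```python
import numpy as np

rng = np.random.default_rng(0)
Xc = rng.normal(size=(200, 5))
Xc -= Xc.mean(axis=0)
Sigma = Xc.T @ Xc

_, eigvecs = np.linalg.eigh(Sigma)
a1, a2 = eigvecs[:, -1], eigvecs[:, -2]   # top two principal directions

print(np.isclose(a1 @ a2, 0.0))  # True: eigenvectors of symmetric Sigma are orthogonal
```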

Summary

① Compute the covariance matrix $\Sigma$.
② Find the eigenvalues $\lambda_i$ of $\Sigma$, sort them in descending order, and take the eigenvectors $a_i$ of the top $m$ eigenvalues.
③ Normalize each $a_i$ so that $a_i a_i^T = 1$.
④ Stack the normalized eigenvectors as the rows of $A$ and project the centered data:

$$A = \begin{bmatrix} -a_1- \\ -a_2- \\ \vdots \\ -a_m- \end{bmatrix}, \qquad Y = \begin{bmatrix} a_1(x - \bar{x}) \\ a_2(x - \bar{x}) \\ \vdots \\ a_m(x - \bar{x}) \end{bmatrix}$$

$$Y = A(X - \bar{X})$$
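
Putting the summary together as code (a minimal sketch, not a production implementation; the helper `pca` below is hypothetical and assumes rows-as-samples input):

```python
import numpy as np

def pca(X, m):
    """Reduce rows-of-samples X from n to m dimensions (illustrative helper)."""
    x_bar = X.mean(axis=0)
    Xc = X - x_bar                              # center the data
    Sigma = Xc.T @ Xc                           # step 1: covariance matrix
    eigvals, eigvecs = np.linalg.eigh(Sigma)    # step 2: eigendecomposition
    order = np.argsort(eigvals)[::-1]           # sort eigenvalues descending
    A = eigvecs[:, order[:m]].T                 # steps 2-4: top-m unit eigenvectors as rows
    Y = (A @ Xc.T).T                            # Y = A(X - X_bar)
    return A, Y

# Usage on toy data: reduce 5-dimensional samples to 2 dimensions
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
A, Y = pca(X, m=2)
print(A.shape, Y.shape)   # (2, 5) (200, 2)
```

Note that `np.linalg.eigh` already returns unit-norm eigenvectors, so the normalization in step ③ is satisfied automatically.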