The Math Behind PCA (Principal Component Analysis)

A step towards statistical modelling

The Mathematics

Let the original dataset be (k + 1)-dimensional. Ignoring the class-label dimension leaves a k-dimensional dataset. To reduce the dimensionality further to d, such that d < k, the following steps of PCA must be followed (a code sketch of the whole procedure appears after the list):

  • Step 1: Standardize the data so that every feature is centred on its mean.
  • Step 2: Calculate the covariance matrix of the features in the dataset.
  • Step 3: Calculate the eigenvalues and eigenvectors of the covariance matrix.
  • Step 4: Sort the eigenvalues in decreasing order, together with their corresponding eigenvectors.
  • Step 5: Pick the top d eigenvalues and form a matrix from their eigenvectors.
  • Step 6: Transform the original matrix with this eigenvector matrix to obtain the d-dimensional data.
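
These six steps map directly onto a few lines of NumPy. The following is a minimal sketch, assuming a data matrix X of shape (n_samples, k) and a target dimension d; the function name pca and the random example data are illustrative, not taken from the original article.

```python
import numpy as np

def pca(X, d):
    """Reduce an (n_samples, k) data matrix X to d dimensions, with d < k."""
    # Step 1: centre the data (subtract each feature's mean).
    X_centered = X - X.mean(axis=0)
    # Step 2: covariance matrix of the features (k x k).
    cov = np.cov(X_centered, rowvar=False)
    # Step 3: eigenvalues and eigenvectors of the covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: the covariance matrix is symmetric
    # Step 4: sort eigenvalues (and their eigenvectors) in decreasing order.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step 5: keep the eigenvectors of the top d eigenvalues as the projection matrix.
    W = eigvecs[:, :d]
    # Step 6: transform the original (centred) matrix into the d-dimensional space.
    return X_centered @ W

# Illustrative usage: project 4-dimensional points down to 2 dimensions.
X = np.random.rand(100, 4)
print(pca(X, d=2).shape)   # (100, 2)
```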

The Implementation…

  1. Define Dataset
[Table: the data points]
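
As a minimal sketch of this step, a small feature matrix can be defined with NumPy; the eight rows and four feature columns below are made up for illustration and are not the article's original data.

```python
import numpy as np

# Hypothetical dataset: 8 samples, 4 features (class labels already dropped).
X = np.array([
    [2.5, 2.4, 0.5, 1.2],
    [0.5, 0.7, 1.1, 0.3],
    [2.2, 2.9, 0.4, 1.0],
    [1.9, 2.2, 0.6, 0.9],
    [3.1, 3.0, 0.2, 1.5],
    [2.3, 2.7, 0.5, 1.1],
    [2.0, 1.6, 0.8, 0.7],
    [1.0, 1.1, 1.0, 0.4],
])
print(X.shape)   # (8, 4)
```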
  • Covariance: “Covariance indicates the level to which two variables vary together.” Computing it is much like computing the ordinary variance, except that instead of squaring the deviation from the mean for one variable, we multiply the deviations of the two variables:

    cov(X, Y) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
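
A short sketch of this formula in code, with two hypothetical variables whose values are purely illustrative; the hand-computed covariance is compared against NumPy's np.cov.

```python
import numpy as np

# Two hypothetical variables (values are illustrative only).
x = np.array([2.5, 0.5, 2.2, 1.9, 3.1])
y = np.array([2.4, 0.7, 2.9, 2.2, 3.0])

# Covariance by the formula above: multiply the two deviations instead of
# squaring a single one, then average over n - 1.
n = len(x)
cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)

# np.cov returns the full 2x2 covariance matrix; the off-diagonal entry
# agrees with the hand-computed value.
print(cov_xy, np.cov(x, y)[0, 1])
```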
Eigenvectors
  • The first eigenvector points along the direction of the biggest variance.
  • The second eigenvector points along the direction of the second-biggest variance.
  • The third eigenvector points along the direction of the third-biggest variance.
  • The fourth eigenvector points along the direction of the smallest variance.
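
This ordering can be checked numerically by sorting the eigenvalues and expressing each one as the fraction of the total variance its eigenvector explains. The sketch below assumes some synthetic 4-feature data; the data and variable names are illustrative only.

```python
import numpy as np

# Hypothetical 4-feature data with deliberately unequal spread per feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4)) * np.array([3.0, 2.0, 1.0, 0.5])

cov = np.cov(X - X.mean(axis=0), rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)

# Sort in decreasing order: the first eigenvector carries the most variance,
# the last one the least.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Fraction of the total variance explained by each eigenvector.
explained = eigvals / eigvals.sum()
for i, frac in enumerate(explained, start=1):
    print(f"eigenvector {i}: {frac:.1%} of the total variance")
```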
