PCA

Principal Component Analysis - a dimensionality reduction algorithm

  • it identifies the hyperplane that lies closest to the data
    • then projects the data onto it
  • before you can project the training set onto a lower-dimensional hyperplane
    • you need to choose the right hyperplane

      e.g. a simple 2D dataset with 3 different axes (1D hyperplanes)

      • result of the projection of the dataset onto each axis
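The projection step above can be sketched in NumPy. The dataset here is a hypothetical one (points scattered around a diagonal, an assumption for illustration; the original example uses its own figure), projected onto three candidate axes:

```python
import numpy as np

# Hypothetical 2D dataset: points scattered around a diagonal line
# (an assumption for illustration, not the book's actual data).
rng = np.random.default_rng(42)
x = rng.normal(0, 2, 200)
y = 0.5 * x + rng.normal(0, 0.5, 200)
X = np.column_stack([x, y])
X = X - X.mean(axis=0)   # centre the data first

# Project onto three candidate 1D axes (unit vectors at different angles)
variances = {}
for deg in (0, 30, 90):
    a = np.deg2rad(deg)
    axis = np.array([np.cos(a), np.sin(a)])
    coords = X @ axis    # 1D coordinate of each point along the axis
    variances[deg] = coords.var()
    print(f"axis at {deg:3d} deg -> projected variance {variances[deg]:.3f}")
```

Since the data lies roughly along the 30° direction, that axis preserves the most variance, while the axis orthogonal to the spread (90°) preserves the least.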

it should seem reasonable to select the axis that preserves the maximum amount of variance

  • as it will most likely lose less information than the other projections

  • another way to justify this choice: it is the axis that minimizes the mean squared distance between the original dataset and its projection onto that axis
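The two justifications coincide: for centred data, the variance along an axis plus the mean squared distance to that axis is constant, so maximizing one minimizes the other. A small sketch (the dataset is a synthetic assumption for illustration):

```python
import numpy as np

# Synthetic centred 2D dataset (an assumption for illustration)
rng = np.random.default_rng(1)
x = rng.normal(0, 2, 300)
y = 0.5 * x + rng.normal(0, 0.4, 300)
X = np.column_stack([x, y])
X = X - X.mean(axis=0)

def project_stats(angle_deg):
    """Projected variance along the axis, and mean squared distance to it."""
    a = np.deg2rad(angle_deg)
    axis = np.array([np.cos(a), np.sin(a)])
    coords = X @ axis                       # 1D coordinates on the axis
    reconstruction = np.outer(coords, axis)  # points projected back into 2D
    mse = ((X - reconstruction) ** 2).sum(axis=1).mean()
    return coords.var(), mse

# Sweep candidate axes: the variance-maximizing axis is also the
# MSE-minimizing one, since variance + MSE is constant for centred data.
stats = {deg: project_stats(deg) for deg in range(0, 180, 15)}
best_by_variance = max(stats, key=lambda d: stats[d][0])
best_by_mse = min(stats, key=lambda d: stats[d][1])
print(best_by_variance, best_by_mse)
```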

    PCA identifies the axis that accounts for the largest amount of variance in the training set
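That axis (the first principal component) can be computed with the Singular Value Decomposition. A minimal NumPy sketch on synthetic data (the dataset and its 0.5 slope are assumptions for illustration):

```python
import numpy as np

# Synthetic data spread mainly along the direction (1, 0.5)
# (an assumption for illustration).
rng = np.random.default_rng(0)
x = rng.normal(0, 2, 500)
y = 0.5 * x + rng.normal(0, 0.3, 500)
X = np.column_stack([x, y])
X_centered = X - X.mean(axis=0)

# The rows of Vt are the principal components; the first one is the
# axis accounting for the largest amount of variance.
U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)
first_pc = Vt[0]
print("first principal component:", first_pc)
```

scikit-learn's `PCA` class computes the same components via SVD internally, so this is essentially what `PCA(n_components=1).fit(X)` does.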