Principal Component Analysis (PCA) is a technique used to reduce the number of dimensions of your data. In the process you lose some information, but the simpler data can be easier to use.

Basic Logic

Vocabulary

Steps

  1. Scale your numerical variables
  2. Center all of your points around the origin
  3. Find the line through the origin that best fits your points: the line that maximizes the sum of squared distances between the origin and the points' projections onto the line.
  4. This line is your first Principal Component. Its direction (how much weight it puts on each variable) tells you how much each variable contributes to it
  5. Repeat this process, creating as many Principal Components as there are variables, with each new Principal Component perpendicular to the existing ones
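
The steps above can be sketched in Python with NumPy. This is a minimal illustration, not a definitive implementation: the function name `pca_sketch` and the toy data are made up for this example, and it assumes no variable is constant (otherwise scaling divides by zero). Instead of fitting one line at a time, it takes the eigenvectors of the covariance matrix, which give the same mutually perpendicular directions in a single step.

```python
import numpy as np


def pca_sketch(X):
    """Hypothetical helper: PCA on X, where rows are samples and columns are variables."""
    X = np.asarray(X, dtype=float)

    # Steps 1-2: center each variable at the origin and scale it to unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

    # Steps 3-5: the eigenvectors of the covariance matrix are the perpendicular
    # best-fit directions; each eigenvalue is the variance along its direction.
    cov = np.cov(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order

    # Sort so the first Principal Component is the one with the most variance.
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    components = eigvecs[:, order].T  # one Principal Component per row

    explained_ratio = eigvals / eigvals.sum()  # share of total variance per component
    scores = Z @ components.T  # the data expressed in Principal Component coordinates
    return components, explained_ratio, scores


# Toy data (an assumption for illustration): two strongly correlated variables.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([x, 2 * x + rng.normal(scale=0.5, size=200)])

components, explained_ratio, scores = pca_sketch(X)
print("Components (rows):", components)
print("Share of variance:", explained_ratio)
```

To reduce dimensions, keep only the first few columns of `scores`. The values in `explained_ratio` indicate how much of the total variance those columns retain.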

Implementation