PCA and Factor Analysis

A data set with 100 variables and 100 observations (data objects) is summarized by 100 means, 100 variances, and (100 * 100 - 100)/2 = 4950 covariances, for a total of 5150 statistics that together represent the underlying multivariate normal population that was sampled. PCA is a method of simplifying this task. The key is that much of the variability in the data set is not independent, i.e., there is a lot of covariation between the variables. If we could extract from all the variables under consideration two new variables that captured most of the independent variability in the entire data set, a simple binary scatter diagram would reveal most of the information in the data. Data reduction is therefore the primary objective: to extract a few uncorrelated variables that capture most of the variability in the data set, with the new optimal reference axes (the principal components) kept mutually orthogonal. The 1st principal component captures the maximum variation in the data set, the 2nd principal component captures the largest share of the remaining variation, and so on.

  • The coefficients of these new optimal reference axes are called loadings, and the projections of the original data onto these axes are called scores.
  • In a standard principal component analysis, the new reference axes are by default the eigenvectors of the covariance matrix of the data. However, you can use the correlation matrix instead.
  • If you choose to standardize the data prior to the analysis, the new reference axes will be the eigenvectors of the correlation matrix of the data.
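The relationship between loadings, scores, and the choice of covariance or correlation matrix can be sketched with NumPy. This is a minimal illustration, not a description of how Aabel computes its results; the function name `pca`, the `use_correlation` flag, and the synthetic data are assumptions made for the example.

```python
import numpy as np

def pca(data, use_correlation=False):
    """Return (eigenvalues, loadings, scores) for an n-by-p data array."""
    X = np.asarray(data, dtype=float)
    centered = X - X.mean(axis=0)
    if use_correlation:
        # Standardizing first makes the covariance matrix of the
        # transformed data equal to the correlation matrix.
        centered = centered / X.std(axis=0, ddof=1)
    matrix = np.cov(centered, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(matrix)
    order = np.argsort(eigenvalues)[::-1]   # largest variance first
    eigenvalues = eigenvalues[order]
    loadings = eigenvectors[:, order]       # columns are the new axes
    scores = centered @ loadings            # projections onto those axes
    return eigenvalues, loadings, scores

# Synthetic data (hypothetical): two correlated variables and one independent.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
data = np.column_stack([x,
                        2 * x + rng.normal(scale=0.5, size=200),
                        rng.normal(size=200)])

eigenvalues, loadings, scores = pca(data)
explained = eigenvalues / eigenvalues.sum()  # share of variance per component
```

The scores are uncorrelated and their variances equal the eigenvalues, which is why the first few components can stand in for the full data set when their share in `explained` is large.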

The PCA output includes the PC scores (placed in the source worksheets) and a matrix plot, as well as the PC loadings and the correlation or covariance matrix (placed in auto-generated worksheets and displayed graphically using Aabel charts and/or the table editor).

The graphs displayed in the matrix diagram above and the bar charts below are from a PCA of some data from Davis, J.C. (2002).

PCA vs. Factor Analysis

In PCA we assume that all variability in an item (variable) should be used in the analysis, whereas in factor analysis we define a priori the number of factors that we want to extract, and the extracted axes are scaled to the variance along the new axes.
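One common way to obtain factors scaled to the variance along their axes is principal-component factoring of the correlation matrix: keep a chosen number of eigenvectors and multiply each by the square root of its eigenvalue. The sketch below assumes that approach for illustration only; the source does not state which extraction method Aabel uses, and `extract_factors` and `n_factors` are hypothetical names.

```python
import numpy as np

def extract_factors(data, n_factors):
    """Return a p-by-n_factors matrix of factor loadings."""
    X = np.asarray(data, dtype=float)
    R = np.corrcoef(X, rowvar=False)              # correlations between variables
    eigenvalues, eigenvectors = np.linalg.eigh(R)
    order = np.argsort(eigenvalues)[::-1][:n_factors]
    # Scale each unit-length eigenvector by the standard deviation
    # (square root of the eigenvalue) along its axis.
    return eigenvectors[:, order] * np.sqrt(eigenvalues[order])

# Synthetic data (hypothetical): 50 objects, 4 variables.
rng = np.random.default_rng(1)
sample = rng.normal(size=(50, 4))
factor_loadings = extract_factors(sample, n_factors=2)
```

With this scaling, the sum of squared loadings in each column equals the eigenvalue of that factor, and retaining all factors reproduces the correlation matrix exactly.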

  • In factor analysis, it is often difficult to interpret the loadings, because the factors may show intermediate correlations with a large number of variables. Rotation of the extracted factors attempts to put the factors in a simpler position with respect to the original variables, in a manner that either maximizes (moves towards one) or minimizes (moves towards zero) the absolute value of individual variable loadings. Aabel provides the Kaiser varimax rotation, which is the most commonly used.
  • R-mode factor analysis considers the inter-relationships in a matrix of correlations between variables. Q-mode factor analysis considers the relationships between objects.
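The effect of a varimax rotation can be illustrated with the widely used SVD-based iteration. This is a sketch of raw varimax under stated assumptions: it omits Kaiser's row normalization, the names `varimax`, `max_iter`, and `tol` are hypothetical, and the loading matrix is made up for the example. It is not a description of Aabel's implementation.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a p-by-k loading matrix; return (rotated loadings, rotation)."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    R = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        rotated = L @ R
        # Gradient of the varimax criterion with respect to the rotation.
        target = rotated ** 3 - rotated * (np.sum(rotated ** 2, axis=0) / p)
        u, s, vt = np.linalg.svd(L.T @ target)
        R = u @ vt                      # nearest orthogonal matrix
        criterion = s.sum()
        if previous > 0 and criterion <= previous * (1 + tol):
            break                       # no meaningful improvement
        previous = criterion
    return L @ R, R

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return np.sum(np.var(np.asarray(loadings) ** 2, axis=0))

# Hypothetical loadings: a simple structure deliberately rotated by 30 degrees,
# so that every variable loads moderately on both factors.
simple = np.array([[0.9, 0.0], [0.8, 0.0], [0.0, 0.7], [0.0, 0.6]])
angle = np.deg2rad(30)
mix = np.array([[np.cos(angle), -np.sin(angle)],
                [np.sin(angle),  np.cos(angle)]])
unrotated = simple @ mix
rotated, rotation = varimax(unrotated)
```

Because the rotation is orthogonal, the fraction of each variable's variance explained by the factors is unchanged; only the distribution of the loadings across the factors changes.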

Pre-Processing the Data

PCA and factor analysis methods allow optional pre-processing of the data prior to the main analysis. Examples of data transformations that can be used (as part of PCA or factor analysis) are:

  • Standardizing (PCA only)
  • Normalizing (PCA only)
  • Taking logarithms
  • Log centering
  • Mean centering
  • Taking square root
  • Ranking variables individually
  • Ranking variables jointly
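Several of these transformations have conventional definitions that can be sketched column by column with NumPy. The definitions below are common textbook forms and are assumptions, since the exact formulas used by Aabel are not given here; all function names are hypothetical. Normalizing, log centering, and joint ranking are left out because their definitions are not stated in this text.

```python
import numpy as np

def mean_center(X):
    """Subtract each variable's mean."""
    X = np.asarray(X, dtype=float)
    return X - X.mean(axis=0)

def standardize(X):
    """Mean-center and divide by the sample standard deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def log_transform(X):
    """Base-10 logarithm (the base is an assumption); needs positive values."""
    return np.log10(np.asarray(X, dtype=float))

def square_root(X):
    """Square root; needs non-negative values."""
    return np.sqrt(np.asarray(X, dtype=float))

def rank_individually(X):
    """Rank each variable separately, 1 = smallest. Ties are broken arbitrarily."""
    X = np.asarray(X)
    return X.argsort(axis=0).argsort(axis=0) + 1

# Small made-up table: 3 objects (rows), 2 variables (columns).
table = np.array([[3.0, 10.0],
                  [1.0, 30.0],
                  [2.0, 20.0]])
ranks = rank_individually(table)
z = standardize(table)
```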

The factor analysis output includes the scores, the loadings, and the correlation matrices.

To represent the factor loadings graphically, you can, for example, use the binary scatter chart with the option of plotting the X and Y axes through zero and connecting the data points to the origin. You can use the multiple-Y column graphs to compare, for each variable, the fraction of its variance that is explained by the model with the fraction that is not. For a graphical representation of the correlation matrices, you can use the heatmap diagram.
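For standardized variables, the fraction of a variable's variance explained by the factor model is the sum of its squared loadings (its communality), and the remainder is the unexplained fraction. The sketch below assumes that convention; the function name and the loading values are hypothetical and serve only to show the arithmetic behind such a column graph.

```python
import numpy as np

def explained_and_unexplained(loadings):
    """Return (explained, unexplained) variance fractions per variable."""
    L = np.asarray(loadings, dtype=float)
    communality = np.sum(L ** 2, axis=1)   # one value per variable (row)
    return communality, 1.0 - communality

# Hypothetical loadings of three standardized variables on two factors.
example_loadings = np.array([[0.8, 0.1],
                             [0.6, 0.6],
                             [0.2, 0.9]])
explained, unexplained = explained_and_unexplained(example_loadings)
```

The two returned arrays are the pair of series that would be plotted side by side for each variable.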