Lab 9a -- Factor Analysis

 

The area of Factor Analysis is, in many ways, diffuse and large.  We will consider only a portion of the subject.  Factor Analysis is generally used to find parsimony among several variables.  The quest is usually for underlying constructs or factors that each explain the variation among several variables.  We will consider two mathematical models in this lab, though there are many more, and variants of these.

 

SPSS Steps

Enter the variable names in an SPSS spreadsheet as usual in the "Variable View."  Although it is possible, under some circumstances, to use categorical variables in factor analysis, you should use only ordinal, interval, or ratio scaled variables in this lab.  In the "Data View," enter the values of the variables for each case.  Remember that a case is a row and a variable is a column.  Also remember to save your data.

 

For the factor analysis lab, you will need a minimum of 6 variables with about 30 subjects.  This is not enough for a serious factor analysis, but it will be enough for your lab.  The example here uses eight biometric measures: height, armspan, forearm, lowerleg, weight, diameter, girth, and width.  The purpose is to attempt to represent these eight variables by a smaller number of dimensions, or factors.  The example file, "PCPADEMO.sav," is included in this folder.

 

To run the factor analysis lab, go to the top of the spreadsheet and click on “Analyze/Data Reduction/Factor.”  A dialog box will appear.  Highlight the variables in the variable list on the left that you wish to factor analyze, and move them into the “Variables:” box on the right by clicking on the right arrow between the boxes.

 

Now click on the “Descriptives…” button.  In the Descriptives dialog box that appears, check “Univariate descriptives,” “Coefficients,” “Significance levels,” and “KMO and Bartlett's test of sphericity.”  Using the example data set, your screen should look something like this:

 

 

Now click the “Continue” button in the Descriptives dialog box.

 

Now click the “Extraction…” button in the "Factor Analysis" box.  In the Extraction dialog box (below), all you need to do is check “Scree plot.”  Before you click “Continue,” however, note that in this dialog box you can set the minimum eigenvalue for retaining a factor (SPSS uses "Kaiser's Rule" of eigenvalues larger than 1.00 if you do not select another minimum), and you can also specify the exact number of factors to retain regardless of the eigenvalues.  These are features that I leave for your experimentation.  In addition, note that different "Methods:" of factor analysis can be selected here.  The default method, highlighted in the image below, is "Principal components."  As that is by far the most frequently used method, start with it.  When you click on the down arrow, you will see that there are many possible methods.  For contrast, I also include the Principal Axis results, a Common Factor method; they were obtained by exactly the same steps as the Principal Component results, except that the Principal Axis method was selected at this step.  Now click "Continue" in the "Extraction" box and it disappears.

 

 

Now click the “Rotation…” button in the Factor Analysis box.  In the Rotation dialog box (below), click “Varimax” and "Loading plot(s)."  Again, before you click the "Continue" button, note that a variety of rotations are offered.  Now click the “Continue” button in the Rotation box and it disappears.

 

 

Now you are ready to click the “OK” button in the Factor Analysis box, and the analysis will run.

 

The “output” file will appear.  For this example, both the Principal Components output and the Principal Axis (Common Factor) output are shown below.

Principal Components -- Factor Analysis


Descriptive Statistics

            Mean        Std. Deviation  Analysis N
HEIGHT       72.0000    20.03285        305
ARMSPAN      60.0000    15.02465        305
FOREARM      14.0000     5.00822        305
LOWERLEG     18.0000    10.01647        305
WEIGHT      180.0000    30.04926        305
DIAMETER     18.0000     7.01152        305
GIRTH        18.0000     7.01151        305
WIDTH        20.0000     7.01152        305


Correlation Matrix

Correlation
            HEIGHT    ARMSPAN   FOREARM   LOWERLEG  WEIGHT    DIAMETER  GIRTH     WIDTH
HEIGHT      1.000     .846      .805      .859      .473      .398      .301      .382
ARMSPAN     .846      1.000     .881      .826      .376      .326      .277      .415
FOREARM     .805      .881      1.000     .801      .380      .319      .237      .345
LOWERLEG    .859      .826      .801      1.000     .436      .329      .327      .365
WEIGHT      .473      .376      .380      .436      1.000     .762      .730      .629
DIAMETER    .398      .326      .319      .329      .762      1.000     .583      .577
GIRTH       .301      .277      .237      .327      .730      .583      1.000     .539
WIDTH       .382      .415      .345      .365      .629      .577      .539      1.000

Sig. (1-tailed)
            HEIGHT    ARMSPAN   FOREARM   LOWERLEG  WEIGHT    DIAMETER  GIRTH     WIDTH
HEIGHT                .000      .000      .000      .000      .000      .000      .000
ARMSPAN     .000                .000      .000      .000      .000      .000      .000
FOREARM     .000      .000                .000      .000      .000      .000      .000
LOWERLEG    .000      .000      .000                .000      .000      .000      .000
WEIGHT      .000      .000      .000      .000                .000      .000      .000
DIAMETER    .000      .000      .000      .000      .000                .000      .000
GIRTH       .000      .000      .000      .000      .000      .000                .000
WIDTH       .000      .000      .000      .000      .000      .000      .000


KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy.      .845
Bartlett's Test of Sphericity   Approx. Chi-Square    2085.738
                                df                    28
                                Sig.                  .000


Communalities

            Initial   Extraction
HEIGHT      1.000     .877
ARMSPAN     1.000     .903
FOREARM     1.000     .872
LOWERLEG    1.000     .861
WEIGHT      1.000     .850
DIAMETER    1.000     .739
GIRTH       1.000     .717
WIDTH       1.000     .625
Extraction Method: Principal Component Analysis.


Total Variance Explained

           Initial Eigenvalues                     Extraction Sums of Squared Loadings     Rotation Sums of Squared Loadings
Component  Total      % of Variance  Cumulative %  Total      % of Variance  Cumulative %  Total      % of Variance  Cumulative %
1          4.673      58.411         58.411        4.673      58.411         58.411        3.497      43.717         43.717
2          1.771      22.137         80.548        1.771      22.137         80.548        2.947      36.832         80.548
3          .481       6.013          86.561
4          .421       5.268          91.829
5          .233       2.915          94.744
6          .187       2.333          97.078
7          .137       1.716          98.794
8          9.646E-02  1.206          100.000
Extraction Method: Principal Component Analysis.

 


Component Matrix(a)

          Component
                 1       2
HEIGHT        .859   -.372
ARMSPAN       .842   -.441
FOREARM       .813   -.459
LOWERLEG      .840   -.395
WEIGHT        .758    .525
DIAMETER      .674    .533
GIRTH         .617    .580
WIDTH         .671    .418
Extraction Method: Principal Component Analysis.
a. 2 components extracted.


Rotated Component Matrix(a)

          Component
                 1       2
HEIGHT        .900    .260
ARMSPAN       .930    .195
FOREARM       .919    .164
LOWERLEG      .899    .229
WEIGHT        .251    .887
DIAMETER      .181    .840
GIRTH         .107    .840
WIDTH         .251    .750
Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 3 iterations.


Component Transformation Matrix
Component        1       2
1             .771    .636
2            -.636    .771
Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.


 

______________________________________________________________________________________________

Common Factor Analysis

Communalities

            Initial   Extraction
HEIGHT      .816      .838
ARMSPAN     .849      .889
FOREARM     .801      .821
LOWERLEG    .788      .808
WEIGHT      .749      .888
DIAMETER    .604      .640
GIRTH       .562      .583
WIDTH       .478      .492
Extraction Method: Principal Axis Factoring.


Total Variance Explained

           Initial Eigenvalues                     Extraction Sums of Squared Loadings     Rotation Sums of Squared Loadings
Factor     Total      % of Variance  Cumulative %  Total      % of Variance  Cumulative %  Total      % of Variance  Cumulative %
1          4.673      58.411         58.411        4.449      55.611         55.611        3.315      41.438         41.438
2          1.771      22.137         80.548        1.510      18.875         74.486        2.644      33.049         74.486
3          .481       6.013          86.561
4          .421       5.268          91.829
5          .233       2.915          94.744
6          .187       2.333          97.078
7          .137       1.716          98.794
8          9.646E-02  1.206          100.000
Extraction Method: Principal Axis Factoring.


Factor Matrix(a)

          Factor
                 1       2
HEIGHT        .856   -.324
ARMSPAN       .848   -.411
FOREARM       .808   -.409
LOWERLEG      .831   -.342
WEIGHT        .750    .571
DIAMETER      .631    .492
GIRTH         .569    .510
WIDTH         .607    .351
Extraction Method: Principal Axis Factoring.
a. 2 factors extracted. 9 iterations required.


Rotated Factor Matrix(a)

          Factor
                 1       2
HEIGHT        .872    .278
ARMSPAN       .920    .204
FOREARM       .887    .182
LOWERLEG      .864    .248
WEIGHT        .233    .913
DIAMETER      .188    .778
GIRTH         .129    .753
WIDTH         .258    .652
Extraction Method: Principal Axis Factoring.
Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 3 iterations.

 

Interpretation

 

The descriptive information shows the means and standard deviations for all eight variables, as well as all possible bivariate correlations and their p values.  We note that all of the correlations are positive and significant, as might be expected of these variables.

 

Bartlett's test of sphericity is significant; thus the hypothesis that the intercorrelation matrix of these eight variables is an identity matrix is rejected.  From the perspective of Bartlett's test, then, factor analysis is feasible.  As Bartlett's test is almost always significant, a more discriminating index of factor analyzability is the KMO.  For this data set it is .845, which is large (Kaiser described values in the .80s as "meritorious"), so the KMO also supports factor analysis.
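
Both indices can be checked by hand from the correlation matrix.  The following is a minimal Python sketch, assuming NumPy is available and using the usual textbook formulas for Bartlett's chi-square and the overall KMO; it is not SPSS's own code.  The matrix is typed in from the output above, so it is rounded to three decimals and the results should agree with SPSS only approximately.  The variable names (R, bartlett_chi2, kmo, and so on) are illustrative choices.

```python
import numpy as np

# Correlation matrix copied from the SPSS output (rounded to 3 decimals).
R = np.array([
    [1.000, .846, .805, .859, .473, .398, .301, .382],
    [.846, 1.000, .881, .826, .376, .326, .277, .415],
    [.805, .881, 1.000, .801, .380, .319, .237, .345],
    [.859, .826, .801, 1.000, .436, .329, .327, .365],
    [.473, .376, .380, .436, 1.000, .762, .730, .629],
    [.398, .326, .319, .329, .762, 1.000, .583, .577],
    [.301, .277, .237, .327, .730, .583, 1.000, .539],
    [.382, .415, .345, .365, .629, .577, .539, 1.000],
])
n = 305          # "Analysis N" from the Descriptive Statistics table
p = R.shape[0]   # number of variables

# Bartlett's test of sphericity: the null hypothesis is that R is an
# identity matrix, i.e. that its determinant equals 1.
sign, logdet = np.linalg.slogdet(R)
bartlett_chi2 = -(n - 1 - (2 * p + 5) / 6) * logdet
bartlett_df = p * (p - 1) // 2

# KMO: squared correlations relative to squared correlations plus
# squared partial correlations, over the off-diagonal cells only.
R_inv = np.linalg.inv(R)
d = np.sqrt(np.diag(R_inv))
partial = -R_inv / np.outer(d, d)
off_diag = ~np.eye(p, dtype=bool)
sum_r2 = np.sum(R[off_diag] ** 2)
sum_partial2 = np.sum(partial[off_diag] ** 2)
kmo = sum_r2 / (sum_r2 + sum_partial2)

print(f"Bartlett chi-square = {bartlett_chi2:.1f} on {bartlett_df} df")
print(f"KMO = {kmo:.3f}")
```

The KMO is large when the partial correlations are small relative to the ordinary correlations, which is what one expects if the variables share a few common factors.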

 

Kaiser's rule of retaining factors with eigenvalues larger than 1.00 was used in this analysis as the default.  Accordingly, the first two principal components, with eigenvalues of 4.673 and 1.771, were retained.  SPSS makes no distinction between principal component and common factor analysis in deciding dimensionality, which is why the Initial Eigenvalues are identical in the two outputs.
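
The retention decision can be reproduced outside SPSS.  The following is a minimal Python sketch, assuming NumPy is available; it takes the eigenvalues of the correlation matrix as printed above (rounded to three decimals, so agreement with the SPSS table is approximate) and applies Kaiser's rule.  The variable names are illustrative.

```python
import numpy as np

# Correlation matrix copied from the SPSS output (rounded to 3 decimals).
R = np.array([
    [1.000, .846, .805, .859, .473, .398, .301, .382],
    [.846, 1.000, .881, .826, .376, .326, .277, .415],
    [.805, .881, 1.000, .801, .380, .319, .237, .345],
    [.859, .826, .801, 1.000, .436, .329, .327, .365],
    [.473, .376, .380, .436, 1.000, .762, .730, .629],
    [.398, .326, .319, .329, .762, 1.000, .583, .577],
    [.301, .277, .237, .327, .730, .583, 1.000, .539],
    [.382, .415, .345, .365, .629, .577, .539, 1.000],
])

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending
# order; reverse them so the largest comes first, as in the SPSS table.
eigenvalues = np.linalg.eigvalsh(R)[::-1]

# Each variable is standardized, so total variance equals the number of
# variables, and eigenvalue / p is the proportion of variance explained.
pct_variance = 100 * eigenvalues / R.shape[0]

# Kaiser's rule: keep the components whose eigenvalue exceeds 1.00.
n_retained = int(np.sum(eigenvalues > 1.0))

for k, (ev, pct) in enumerate(zip(eigenvalues, pct_variance), start=1):
    note = "retained" if ev > 1.0 else ""
    print(f"{k}  {ev:6.3f}  {pct:7.3f}%  {note}")
print("Components retained by Kaiser's rule:", n_retained)
```

Dividing each eigenvalue by 8, the number of variables, gives the "% of Variance" column of the Total Variance Explained table.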

 

The Principal Component communalities (Extraction, as the Initial are always 1.00) range from .625 to .903; thus most of the variance of these variables was accounted for by this two dimensional factor solution.  One can see that the corresponding Extraction communalities for the Common Factor analysis were generally a bit smaller, as would be expected (WEIGHT is the one exception), but still show the majority of the variance of nearly all variables represented in the two factor solution; WIDTH, at .492, falls just short of half.  Note that the "Initial" communality estimates for the SPSS version of a Principal Axis Common Factor Analysis are the squared multiple correlations (the R² values) from predicting each of the variables from all other variables -- a usual choice.
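
The Extraction communalities can be verified directly from the unrotated loadings: a variable's communality is the sum of its squared loadings across the retained components.  The short Python sketch below uses the Component Matrix values as printed above; the dictionary and function names are illustrative, and small discrepancies in the third decimal come from rounding in the printed loadings.

```python
# Unrotated loadings copied from the Component Matrix (principal components).
pc_loadings = {
    "HEIGHT":   (.859, -.372),
    "ARMSPAN":  (.842, -.441),
    "FOREARM":  (.813, -.459),
    "LOWERLEG": (.840, -.395),
    "WEIGHT":   (.758,  .525),
    "DIAMETER": (.674,  .533),
    "GIRTH":    (.617,  .580),
    "WIDTH":    (.671,  .418),
}


def communality(loadings):
    """Sum of squared loadings of one variable across the components."""
    return sum(a * a for a in loadings)


# Row-wise sums of squares: one communality per variable.
pc_communalities = {name: communality(l) for name, l in pc_loadings.items()}

# Column-wise sums of squares: one value per component.  For principal
# components these reproduce the eigenvalues of the retained components.
sums_of_squared_loadings = [
    sum(l[k] ** 2 for l in pc_loadings.values()) for k in range(2)
]

for name, h2 in pc_communalities.items():
    print(f"{name:<10}{h2:.3f}")
print("Sums of squared loadings:",
      [round(s, 3) for s in sums_of_squared_loadings])
```

Summing the squares by row gives the Communalities table; summing them by column gives the "Extraction Sums of Squared Loadings" in the Total Variance Explained table.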

 

Also note the Scree Plot in the Principal Components output (the same plot is produced in the Common Factor Analysis).  The Scree Plot is a graphic aid proposed by Cattell.  It is simply a plot of the eigenvalues in descending order.  It is intended to help in deciding where the "trivial" dimensions begin.  One might argue that the Kaiser Rule opting for two dimensions is fairly well supported by the Scree Plot.
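
A rough text rendering of a scree plot can be sketched from the Initial Eigenvalues in the Total Variance Explained table.  The Python below is only an illustration: the function name text_scree and the bar width are arbitrary choices, and SPSS draws a line plot rather than bars.

```python
# Initial eigenvalues copied from the Total Variance Explained table.
eigenvalues = [4.673, 1.771, 0.481, 0.421, 0.233, 0.187, 0.137, 0.09646]


def text_scree(values, width=50):
    """Return one text bar per eigenvalue, scaled to the largest one."""
    top = max(values)
    lines = []
    for k, ev in enumerate(values, start=1):
        bar = "#" * max(1, round(width * ev / top))
        lines.append(f"{k}  {ev:5.3f}  {bar}")
    return lines


scree_lines = text_scree(eigenvalues)
print("\n".join(scree_lines))

# Drop from each eigenvalue to the next; the "scree" begins where the
# drops become small and roughly constant.
drops = [a - b for a, b in zip(eigenvalues, eigenvalues[1:])]
print("Successive drops:", [round(d, 3) for d in drops])
```

The drops after the first and second eigenvalues are large, and every later drop is small, which is the pattern that supports stopping at two dimensions.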

 

In the Principal Components output, the Rotated Component Matrix gives the correlation of each variable with each factor.  From the contributions of the variables (their "loadings") we can name these factors something like "Lankiness" and "Heaviness."  One might come up with a variety of other names that are equally descriptive.  You will note that the results of the Common Factor analysis are much the same, with loadings that are generally a bit smaller.  One might argue that the two methods, therefore, give the same result.  However, that conclusion would be dangerous: the agreement depends on the number of variables and their communalities, and we are also restricting both solutions to the same dimensionality in this case.
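
The link between the unrotated and rotated loadings can also be checked by hand: multiplying each row of the Component Matrix by the Component Transformation Matrix should reproduce the Rotated Component Matrix, up to rounding.  The Python sketch below does this with the printed values; the names unrotated, T, rotate, and spss_rotated are illustrative, and the sketch only verifies the SPSS result rather than carrying out the Varimax search itself.

```python
# Unrotated loadings from the Component Matrix.
unrotated = {
    "HEIGHT":   (.859, -.372),
    "ARMSPAN":  (.842, -.441),
    "FOREARM":  (.813, -.459),
    "LOWERLEG": (.840, -.395),
    "WEIGHT":   (.758,  .525),
    "DIAMETER": (.674,  .533),
    "GIRTH":    (.617,  .580),
    "WIDTH":    (.671,  .418),
}

# Component Transformation Matrix (rows and columns as printed by SPSS).
T = ((.771, .636),
     (-.636, .771))

# Rotated loadings as printed in the Rotated Component Matrix.
spss_rotated = {
    "HEIGHT":   (.900, .260),
    "ARMSPAN":  (.930, .195),
    "FOREARM":  (.919, .164),
    "LOWERLEG": (.899, .229),
    "WEIGHT":   (.251, .887),
    "DIAMETER": (.181, .840),
    "GIRTH":    (.107, .840),
    "WIDTH":    (.251, .750),
}


def rotate(row, t):
    """Multiply one row of loadings by the 2 x 2 transformation matrix."""
    return tuple(row[0] * t[0][k] + row[1] * t[1][k] for k in range(2))


rotated = {name: rotate(row, T) for name, row in unrotated.items()}

# Largest absolute difference between the hand calculation and SPSS.
max_diff = max(
    abs(rotated[name][k] - spss_rotated[name][k])
    for name in unrotated for k in range(2)
)

for name, (f1, f2) in rotated.items():
    print(f"{name:<10}{f1:6.3f}{f2:7.3f}")
print(f"Largest difference from the SPSS table: {max_diff:.4f}")
```

Because the transformation is orthogonal, each variable's communality is the same before and after rotation; only the division of that variance between the two factors changes.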