Technical Report #8:
Scoring An Oblique Principal Component
Bruce Ratner, Ph.D.
Oblique Principal Components Analysis (OPCA) is a powerful exploratory data analysis technique that can uncover unexpected relationships among many variables. This report provides a SAS-code program for performing OPCA and for scoring oblique principal components on an external dataset. I provide an illustration of typical OPCA output along with the SAS-code program that produced it. The program should be a welcome addition to the toolkit of data analysts who frequently work with big data.

********** SAS-code Program **********

Data set IN is found in Technical Report #6.

/* Cluster the variables; OUTSTAT= saves the statistics and scoring coefficients */
PROC VARCLUS data=IN MAXC=4 simple outstat=coef;
   var GENDER_F GENDER_M MARITAL_M MARITAL_S;
run;

/* Seeking the Three Cluster Solution */
/* Keep the rows not tied to a cluster count (_ncl_ missing) and the rows for _ncl_ = 3 */
data Coef3;
   set Coef;
   if _ncl_ = . or _ncl_ = 3;
   drop _ncl_;
run;

/* Score data set IN with the three-cluster coefficients */
PROC SCORE data=IN score=Coef3 out=scored;
   var GENDER_F GENDER_M MARITAL_M MARITAL_S;
run;

/*** Assigning the Individual to the Classified Cluster-Segment ***/
/* predictd holds the largest of the three cluster scores */
data scored_classified;
   set scored;
   temp = max(clus1, clus2, clus3);
   if clus1 = temp then predictd = clus1;
   else if clus2 = temp then predictd = clus2;
   else if clus3 = temp then predictd = clus3;
run;

/* segment names the cluster that has the largest score */
data scored_classified (drop=temp);
   set scored_classified;
   temp = max(clus1, clus2, clus3);
   if clus1 = temp then segment = 'clus1';
   else if clus2 = temp then segment = 'clus2';
   else if clus3 = temp then segment = 'clus3';
run;

Contact: 1 800 DM STAT-1, or e-mail br@dmstat1.com.
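
The two classification DATA steps in the program above could also be written as a single pass, followed by a quick check of how many individuals land in each segment. This is only a minimal sketch, not part of the original program. It assumes the data set scored and the score variables clus1, clus2, and clus3 already exist as produced by the PROC SCORE step. The LENGTH statement and the PROC FREQ check are additions made for illustration.

```sas
/* Sketch (assumption): one-pass version of the two classification steps. */
/* Ties go to the first cluster in the IF/ELSE order, as in the original. */
data scored_classified;
   set scored;
   length segment $5;                       /* room for 'clus1'..'clus3' */
   predictd = max(clus1, clus2, clus3);     /* largest cluster score     */
   if clus1 = predictd then segment = 'clus1';
   else if clus2 = predictd then segment = 'clus2';
   else if clus3 = predictd then segment = 'clus3';
run;

/* Added check (assumption): count the individuals in each segment */
proc freq data=scored_classified;
   tables segment;
run;
```

Because predictd is by construction the maximum of the three scores, the sketch assigns it directly rather than through an IF/ELSE chain. The segment labels follow the same logic as the original program.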