DM Stat-1 Articles

Technical Report #8:
Scoring An Oblique Principal Component
Bruce Ratner, Ph.D.

Oblique Principal Components Analysis (OPCA) is a powerful exploratory data analysis technique for uncovering unexpected relationships among many variables. This report provides a SAS program for performing OPCA and for scoring oblique principal components on an external dataset.

I provide an illustration of typical OPCA output along with the SAS program that produced it. The program should be a welcome addition to the toolkit of data analysts who frequently work with big data.

********** SAS-code Program **********

Data set IN is found in Technical Report #6.


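/* Cluster the variables, allowing up to four clusters; save the statistics and scoring coefficients in COEF */
PROC VARCLUS data=IN MAXC=4 simple outstat=coef;
var GENDER_F GENDER_M MARITAL_M MARITAL_S;
run;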

/* Seeking the Three Cluster Solution */
/* Rows with missing _ncl_ (means, standard deviations, etc.) are kept for use by PROC SCORE */
data Coef3;
set Coef;
if _ncl_ = . or _ncl_ = 3;
drop _ncl_;
run;

PROC SCORE data=IN score=Coef3 out=scored;
var GENDER_F GENDER_M MARITAL_M MARITAL_S;
run;

/*** Assigning the Individual to the Classified Cluster-Segment ***/
/* predictd holds the largest of the three cluster scores */
data scored_classified;
set scored;

temp = max(clus1, clus2, clus3);
if clus1 = temp then predictd = clus1;
else if clus2 = temp then predictd = clus2;
else if clus3 = temp then predictd = clus3;
run;

/* segment names the cluster with the largest score; ties go to the lowest-numbered cluster */
data scored_classified (drop=temp);
set scored_classified;

temp = max(clus1, clus2, clus3);
if clus1 = temp then segment = 'clus1';
else if clus2 = temp then segment = 'clus2';
else if clus3 = temp then segment = 'clus3';
run;
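
The PROC SCORE step above scores IN itself, the data set on which the clusters were built. To score an external data set, the same Coef3 data set can be applied to a different DATA= input. The following is only a minimal sketch, not part of the original program: NEW_DATA is a hypothetical external data set assumed to contain the same four variables, and scored_new and scored_new_classified are illustrative names. Because Coef3 carries the means and standard deviations from IN, PROC SCORE standardizes the new records with those values before applying the scoring coefficients.

```sas
/* NEW_DATA is a hypothetical external data set (an assumption for this sketch) */
/* that contains the same four variables used to build the clusters.            */
proc score data=NEW_DATA score=Coef3 out=scored_new;
   var GENDER_F GENDER_M MARITAL_M MARITAL_S;
run;

/* Assign each record to the cluster with the largest score. */
data scored_new_classified (drop=temp);
   set scored_new;
   length segment $5;
   temp = max(clus1, clus2, clus3);
   if missing(temp) then segment = ' ';          /* no score could be computed */
   else if clus1 = temp then segment = 'clus1';
   else if clus2 = temp then segment = 'clus2';
   else segment = 'clus3';
run;

/* Tabulate how many records fall into each segment. */
proc freq data=scored_new_classified;
   tables segment / missing;
run;
```

Unlike the program above, this sketch leaves segment blank when all three scores are missing, rather than letting such records fall into clus1. Whether that treatment is appropriate depends on how missing values should be handled in the application.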



For more information about this article, call Bruce Ratner at 516.791.3544,
1 800 DM STAT-1, or e-mail at br@dmstat1.com.