A Very Automatic Coding of Dummy Variables for Database Response Modeling
Bruce Ratner, Ph.D.

Qualitative variables, such as gender and marital status, often carry valuable information for the modeling process. However, most modeling techniques cannot directly accept the text values (e.g., male or female; married or single) of qualitative variables. Dummy variable coding is the method used to transform qualitative variables into numerical "dummy" variables ready for the modeling process. Manually coding qualitative variables into dummy variables is a tedious task. This article provides a SAS-code program that very automatically creates dummy variables. The program should be a welcome addition to the toolkit of data analysts who frequently work with qualitative data.

Illustration

/* Sample data: ID in columns 1-2, GENDER in column 3, MARITAL in column 4. */
data IN;
input ID 2.0 GENDER $1. MARITAL $1.;
cards;
01MS
02MM
03M
04
05FS
06FM
07F
08 M
09 S
10MD
;
run;

/* Recode missing values to 'x' so that they form an explicit level. */
data IN;
set IN;
GENDER_ = GENDER;
if GENDER = ' ' then GENDER_ = 'x';
MARITAL_ = MARITAL;
if MARITAL = ' ' then MARITAL_ = 'x';
run;

/* DESIGN codes the class variable without fitting a model. */
/* ZERO='x' makes the missing level the reference level, so it gets no dummy variable. */
proc transreg data=IN DESIGN;
model class (GENDER_ / ZERO='x');
output out=GENDER_ (drop=Intercept _NAME_ _TYPE_);
id ID;
run;

proc print;
run;

/* Merge the GENDER dummy variables back onto IN by ID. */
proc sort data=GENDER_; by ID;
proc sort data=IN; by ID;
run;

data IN;
merge IN GENDER_;
by ID;
run;

proc print data=IN;
run;

/* Repeat the same steps for MARITAL. */
proc transreg data=IN DESIGN;
model class (MARITAL_ / ZERO='x');
output out=MARITAL_ (drop=Intercept _NAME_ _TYPE_);
id ID;
run;

proc print;
run;

proc sort data=MARITAL_; by ID;
proc sort data=IN; by ID;
run;

data IN;
merge IN MARITAL_;
by ID;
run;

proc print data=IN;
run;

1 800 DM STAT-1, or e-mail at br@dmstat1.com.
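
For comparison, the manual coding that the program is meant to replace can be sketched as a single DATA step. This is a minimal sketch rather than part of the author's program: it assumes the IN data set from the illustration (with ID, GENDER, and MARITAL), the output name IN_MANUAL is hypothetical, and the dummy variable names are chosen to resemble the variable-name-plus-level names that PROC TRANSREG is expected to generate.

```sas
/* Manual dummy coding, shown only for comparison with the automatic program. */
/* Assumes the IN data set from the illustration; IN_MANUAL is a hypothetical name. */
data IN_MANUAL;
  set IN (keep = ID GENDER MARITAL);
  /* A comparison in SAS evaluates to 1 when true and 0 when false. */
  GENDER_F  = (GENDER  = 'F');
  GENDER_M  = (GENDER  = 'M');
  MARITAL_D = (MARITAL = 'D');
  MARITAL_M = (MARITAL = 'M');
  MARITAL_S = (MARITAL = 'S');
run;

proc print data=IN_MANUAL;
run;
```

A record with a missing GENDER or MARITAL receives 0 on every dummy variable for that variable, which corresponds to treating the missing level as the reference level. Each new level of a variable requires another hand-written assignment, which is the tedium the automatic approach avoids.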