Simple Data Mining Method for Variable Assessment
Bruce Ratner, Ph.D.

Determining the relationship between a predictor variable and the target variable is an
essential task in the model building process. If the relationship is
found to be significant, then the predictor variable is included in the
model in a form corresponding to the uncovered relationship. Most
methods of variable assessment are based on the well-known correlation
coefficient, which is often misused. The purpose of this article is to
present a new, simple data mining method for assessing the relationship
between predictor and target variables.
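
As a small illustration of how the correlation coefficient can mislead, the sketch below computes the Pearson correlation coefficient for two made-up datasets. The function name pearson_r and the data are illustrative assumptions, not taken from this article, and the pitfall shown (a strong but nonlinear relationship producing a coefficient of zero) is one well-known example of misreading the coefficient, not necessarily the misuse the author has in mind.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences.

    Hypothetical helper written for this illustration only.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)

    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")

    return sxy / math.sqrt(sxx * syy)


# Made-up predictor values, symmetric around zero.
x = [-3, -2, -1, 0, 1, 2, 3]

# A perfectly linear target and a perfectly determined but curved target.
y_linear = [2 * v + 1 for v in x]
y_curved = [v * v for v in x]

r_linear = pearson_r(x, y_linear)
r_curved = pearson_r(x, y_curved)

print(f"linear target: r = {r_linear:.3f}")
print(f"curved target: r = {r_curved:.3f}")
```

In this toy data the curved target is completely determined by the predictor, yet its coefficient is zero because the coefficient measures only the strength of a straight-line relationship. A predictor screened out on that basis alone would be discarded despite carrying real information.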

I first provide a brief review of the correlation coefficient because it
is the centerpiece of variable assessment. Then, I present the smoothed
scatterplot and a nonparametric test as the proposed method for assessing
the relationship between two variables.
1 800 DM STAT-1, or e-mail at br@dmstat1.com.