A Simple Data Mining Method for Variable Assessment
Bruce Ratner, Ph.D.

Determining the relationship between a predictor variable and the target variable is an essential task in the model-building process. If the relationship is found to be significant, the predictor variable is included in the model in a form that reflects the uncovered relationship. Most methods of variable assessment are based on the well-known correlation coefficient, which is often misused. The purpose of this article is to present a new, simple data mining method for assessing the relationship between predictor and target variables.

I first provide a brief review of the correlation coefficient because it is the centerpiece of variable assessment. Then, I present the smoothed scatterplot and a nonparametric test as the proposed method for assessing the relationship between two variables.

