DM Stat-1 Articles

A Simple Bootstrap Variable Selection Method for Building
Database Marketing Models
Bruce Ratner, Ph.D.

Variable selection, that is, determining which independent variables to include in a model, is a vital part of the model building process. Most data analysts use the well-known variable selection approaches, such as forward selection, which adds variables one at a time as long as each contributes to the prediction of the target variable (binary/response for logistic regression; continuous/profit for ordinary least squares regression), and stops when no additional variable significantly improves the model's prediction. Less well known is that these variable selection methods can produce suboptimal models: they either omit an important (necessary) predictor variable, which produces biased predictions, or include an unnecessary variable, which produces large (unstable) prediction errors. The purpose of this article is to use the bootstrap in tandem with the variable selection methods to obtain a less biased and more stable variable selection methodology. Two case studies are presented, one using a response model and one using a profit model from database marketing.
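This summary does not spell out the procedure itself, so the following is only a minimal sketch of one common way to combine the bootstrap with forward selection: rerun forward selection on many bootstrap resamples and tally how often each variable is chosen. Everything in it is an assumption made for illustration: the function names, the synthetic data, the ordinary least squares profit-style target, and the stopping rule (a minimum relative reduction in the residual sum of squares, used here in place of a formal significance test).

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def sse(rows, y, cols):
    """Residual sum of squares of an OLS fit of y on an intercept plus cols."""
    design = [[1.0] + [row[c] for c in cols] for row in rows]
    k = len(cols) + 1
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, d))) ** 2
               for d, t in zip(design, y))

def forward_select(rows, y, min_gain=0.05):
    """Add variables one at a time until the best candidate cuts the SSE
    by less than min_gain (hypothetical threshold) of its current value."""
    chosen, remaining = [], list(range(len(rows[0])))
    current = sse(rows, y, chosen)
    while remaining:
        best_sse, best = min((sse(rows, y, chosen + [c]), c) for c in remaining)
        if current - best_sse < min_gain * current:
            break
        chosen.append(best)
        remaining.remove(best)
        current = best_sse
    return chosen

def bootstrap_selection_frequency(rows, y, n_boot=50, seed=0):
    """Share of bootstrap resamples in which each variable is selected."""
    rng = random.Random(seed)
    n = len(rows)
    counts = [0] * len(rows[0])
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        for c in forward_select([rows[i] for i in idx], [y[i] for i in idx]):
            counts[c] += 1
    return [c / n_boot for c in counts]

# Synthetic data: only variables 0 and 1 drive the target; 2 and 3 are noise.
data_rng = random.Random(1)
rows = [[data_rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]
y = [3 * r[0] - 2 * r[1] + data_rng.gauss(0, 1) for r in rows]

freq = bootstrap_selection_frequency(rows, y)
print(freq)
```

In a sketch like this, variables that are selected in nearly every resample are the stable candidates for the final model, while variables that appear only occasionally are likely artifacts of a particular sample. The cutoff frequency for keeping a variable is a judgment call and is not taken from the article.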

Related Articles:
1. When Data Are Too Large to Handle in the Memory of Your Computer
2. Creating A Bootstrap Sample

For more information about this article, call Bruce Ratner at 516.791.3544 or 1 800 DM STAT-1; or e-mail at