Logistic Regression: An Overview
Bruce Ratner, Ph.D.

Logistic regression is a popular technique for classifying individuals into two mutually exclusive and exhaustive categories, for example, buy versus not buy, or responder versus non-responder. It is the workhorse of response modeling, as its results are considered the gold standard. Moreover, it is used as the benchmark for assessing the superiority of newer techniques, such as GenIQ, a genetic model, and older techniques, such as CHAID (1) (2), which is a regression tree. In database marketing, response to a prior solicitation is the binary class variable (defined by responder and non-responder), and the logistic regression model is built to classify an individual as either most likely or least likely to respond to a future solicitation.

To explain logistic regression, I first provide a brief overview of the technique and include the program code for the widely used SAS® system for building and scoring a logistic regression model. The code is a welcome addition to the techniques used by data analysts working on the two-group classification problem. Next, I present a case study demonstrating the building of a response model for an investment product solicitation. The case study fortuitously allows for the illustration of a host of data mining techniques.

1 800 DM STAT-1; or e-mail at br@dmstat1.com.
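As a rough illustration of what scoring with a logistic regression model involves, here is a minimal sketch in Python (not the SAS code the article refers to). The intercept, coefficients, predictor values, and the 0.5 cutoff are all hypothetical values chosen for illustration. A fitted model would supply its own estimates, and the cutoff would be a business decision.

```python
import math

def response_probability(intercept, coefficients, predictors):
    """Turn a linear score (the logit) into a probability of response."""
    logit = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    return 1.0 / (1.0 + math.exp(-logit))

def classify(probability, cutoff=0.5):
    """Assign one of the two classes; the cutoff is an assumption."""
    return "responder" if probability >= cutoff else "non-responder"

# Hypothetical model: these numbers are made up, not estimated from data.
intercept = -2.0
coefficients = [0.8, 0.05]   # e.g. prior-purchase flag, months on file
individual = [1.0, 20.0]     # one individual's predictor values

p = response_probability(intercept, coefficients, individual)
print(round(p, 3), classify(p))
```

In a response-modeling setting the probabilities are often used to rank individuals from most to least likely to respond, in which case the `classify` step would be replaced by a sort on `p`.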