
Expanding Your Statistical Computing Toolbox
Bruce Ratner, Ph.D.
Typically, the data analyst approaches a problem directly with an (inflexible) procedure designed specifically for that purpose. For example, the everyday statistical problems of classification (i.e., assigning class membership for a categorical target variable) and prediction of a continuous target variable (e.g., sales or profit) are solved by the “old” standards: the binary or multinomial logistic regression (LR) model and the ordinary least-squares (OLS) regression model, respectively.

This is in stark contrast to the newer machine learning “algorithmic” methods, which are only nominally statistical models – or, more aptly, nonstatistical models – in that no effort is made to represent how the data were generated. These are nonparametric, assumption-free, “flexible” procedures that let the data define the form of the model itself.

The working assumption that today’s (big) data fit the OLS and LR models – which were formulated within the small-data settings of their day, over 200 years ago and 50 years ago, respectively – is not tenable. A flexible, any-size data model that is self-defining clearly offers the potential for building a reliable, highly predictive model, something unimaginable two centuries ago, or even a half century ago.

The purpose of this article is to present the algorithmic GenIQ Model©, a flexible, any-size data method that lets the data alone define the model. I use the GenIQ Model in a real case study to show why it belongs in your statistical toolbox.

For more information, call 1-800 DM STAT1, or email br@dmstat1.com.
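The GenIQ Model itself is proprietary, so the sketch below is not GenIQ; it is a minimal, generic genetic-programming (symbolic regression) loop of the kind such algorithmic methods are built on, shown only to make concrete what "letting the data define the form of the model" means. All names, the toy data, and the parameter choices (population size, depth, mutation rate) are illustrative assumptions, not details from the article.

```python
# Illustrative sketch only: a tiny symbolic-regression loop in which the
# functional form of the model is evolved from the data, rather than fixed
# in advance as in OLS or LR. This is NOT the GenIQ Model's implementation.
import random
import operator

random.seed(0)

# Toy data whose underlying "truth" (y = x^2 + x) the search must discover.
xs = [i / 10 for i in range(-20, 21)]
ys = [x * x + x for x in xs]

OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

def random_tree(depth=3):
    # A model is a random expression tree: the terminal 'x', a constant,
    # or (operator, left subtree, right subtree).
    if depth <= 0 or random.random() < 0.3:
        return 'x' if random.random() < 0.7 else random.uniform(-2, 2)
    op = random.choice(list(OPS))
    return (op, random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):
    if tree == 'x':
        return x
    if isinstance(tree, float):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def fitness(tree):
    # Mean squared error on the data; lower is better.
    return sum((evaluate(tree, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def mutate(tree, depth=3):
    # Replace a random subtree with a fresh random one.
    if random.random() < 0.2 or not isinstance(tree, tuple):
        return random_tree(depth)
    op, left, right = tree
    if random.random() < 0.5:
        return (op, mutate(left, depth - 1), right)
    return (op, left, mutate(right, depth - 1))

# Evolve: keep the fittest trees each generation, fill the rest with mutants.
pop = [random_tree() for _ in range(60)]
for gen in range(40):
    pop.sort(key=fitness)
    pop = pop[:20] + [mutate(random.choice(pop[:20])) for _ in range(40)]

best = min(pop, key=fitness)
print("best model's MSE:", fitness(best))
```

The point of the sketch is the contrast with OLS and LR: here no equation form is assumed up front, and the data alone select which expression trees survive.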
