DM Stat-1 Articles
Expanding Your Statistical Computing Toolbox
Bruce Ratner, Ph.D.

Typically, the data analyst approaches a problem directly with an inflexible procedure designed specifically for that purpose. For example, the everyday statistical problems of classification (i.e., assigning class membership with a categorical target variable) and prediction of a continuous target variable (e.g., sales or profit) are solved by the “old” standards: the binary or polytomous logistic regression (LR) model and the ordinary least-squares (OLS) regression model, respectively. This is in stark contrast to the newer machine learning “algorithmic” methods, which are statistical models in name only, or more aptly non-statistical models, in that no effort is made to represent how the data were generated. These are nonparametric, assumption-free, “flexible” procedures that let the data define the form of the model itself. The working assumption that today’s (big) data fit the OLS and LR models, which were formulated within the small-data settings of their day (over 200 years ago and 50 years ago, respectively), is not tenable. A flexible, any-size data model that is self-defining clearly offers the potential for building a reliable, highly predictive model, something that was unimaginable two centuries ago, or even half a century ago.
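To make the contrast concrete, here is a minimal sketch in Python of a fixed-form OLS line next to a simple nonparametric smoother (a k-nearest-neighbor average). The smoother is a stand-in chosen for illustration only; it is not the GenIQ Model. The toy data, the function names, and the choice of k = 3 are all assumptions, and the errors are computed in-sample, so the numbers show only how a preset linear form can miss a curved relationship that a data-defined form follows.

```python
# Illustrative contrast only: a fixed-form OLS line vs. a simple
# nonparametric k-nearest-neighbor smoother. This is NOT the GenIQ Model.

def fit_ols(xs, ys):
    """Closed-form simple linear regression: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def knn_predict(xs, ys, x0, k=3):
    """Average the targets of the k training points nearest to x0."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]
    return sum(ys[i] for i in nearest) / k

def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical toy data: a curved relationship a straight line cannot follow.
xs = [i / 2 for i in range(-6, 7)]   # -3.0, -2.5, ..., 3.0
ys = [x ** 2 for x in xs]

intercept, slope = fit_ols(xs, ys)
ols_pred = [intercept + slope * x for x in xs]
knn_pred = [knn_predict(xs, ys, x) for x in xs]

# In-sample errors, for illustration only (no holdout validation here).
ols_mse = mse(ys, ols_pred)
knn_mse = mse(ys, knn_pred)
print(f"OLS MSE: {ols_mse:.3f}  kNN MSE: {knn_mse:.3f}")
```

On this symmetric toy data the fitted OLS slope is essentially zero, so the line reduces to predicting the mean, while the local averages track the curve. A real comparison would use holdout data and a properly tuned flexible method.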

The purpose of this article is to present the algorithmic GenIQ Model©, a flexible, any-size data method that lets the data alone define the model. I use the GenIQ Model in a real case study to show why it belongs in your statistical toolbox.



For more information about this article, call Bruce at 516.791.3544,
1 800 DM STAT-1, or e-mail at br@dmstat1.com.