Model Selection Is A Problem
Bruce Ratner, Ph.D.

Model selection is a fundamental problem the data analyst faces, regardless of her background, e.g., statistics, econometrics, or machine learning. There are many popular predictive methods from which she can choose, say, linear regression, log-linear models, neural networks, and some newer methods such as support vector machines. The model selection paradigm is: 1) Given training data consisting of {predictor, target} variable pairs, a model is built to predict the target variable from a set of predictor variables by “fitting adjustable parameters.” 2) The optimal model is the one that performs best on the testing (hold-out) data and also produces the least “shrinkage,” namely, the smallest difference between its results on the training data and its results on the hold-out data.

But model selection is a problem, because fitting parameters is the weak spot of parametric methods such as those mentioned above. The parameters are hard to “fix up” because they vary freely when the model is applied to new, larger, and perhaps shifting data. The working assumption is that the parameters will “hold up” inasmuch as the new data are like the training/hold-out data. Parameters that inherently vary when the model is applied to new data can be expected to produce model shrinkage. Time and again parametric models hold up reasonably well, but the data analyst never knows in advance when they will, which is also part of the model selection problem.

The purpose of this article is to introduce the new assumption-free, nonparametric GenIQ method, whose model selection paradigm (inspired by Darwin's Principle of Survival of the Fittest) is: fitness begets structure, which is the element that wholly defines the model itself. Seemingly, GenIQ has a potential advantage over parametric models, as it has no parametric weak spot. GenIQ invites the data analyst to rethink parametric models.
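The two-step parametric selection paradigm described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the article's case study and not GenIQ: the synthetic data, the three candidate models (a mean-only baseline, a least-squares line, and a 1-nearest-neighbor lookup), the use of R-squared as the performance measure, and the rule of breaking ties on hold-out performance by smaller shrinkage are all hypothetical choices made for the example.

```python
import random

def r_squared(y_true, y_pred):
    """Proportion of variance explained; 1.0 is a perfect fit."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Each fit_* function "fits adjustable parameters" on the training data
# and returns a predictor function x -> predicted y.
def fit_mean(xs, ys):
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def fit_nearest_neighbor(xs, ys):
    # Memorizes the training data, so it fits the training set perfectly.
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def evaluate(candidates, train, holdout):
    """Step 1: fit on training data. Step 2: score on both data sets."""
    x_tr, y_tr = train
    x_ho, y_ho = holdout
    results = []
    for name, fit in candidates.items():
        predict = fit(x_tr, y_tr)
        r2_train = r_squared(y_tr, [predict(x) for x in x_tr])
        r2_holdout = r_squared(y_ho, [predict(x) for x in x_ho])
        results.append({
            "name": name,
            "train": r2_train,
            "holdout": r2_holdout,
            # Shrinkage: how much performance drops from training to hold-out.
            "shrinkage": r2_train - r2_holdout,
        })
    return results

def select(results):
    # Assumed rule: best hold-out score first, then least shrinkage.
    return max(results, key=lambda r: (r["holdout"], -r["shrinkage"]))

# Synthetic data (an assumption for illustration): y = 2x + 1 + noise.
random.seed(0)
xs = [random.uniform(0, 10) for _ in range(200)]
ys = [2 * x + 1 + random.gauss(0, 2) for x in xs]
train = (xs[:140], ys[:140])
holdout = (xs[140:], ys[140:])

candidates = {
    "mean": fit_mean,
    "line": fit_line,
    "nearest_neighbor": fit_nearest_neighbor,
}
results = evaluate(candidates, train, holdout)
best = select(results)

for r in results:
    print(f'{r["name"]:>17}: train={r["train"]:.3f} '
          f'holdout={r["holdout"]:.3f} shrinkage={r["shrinkage"]:.3f}')
print("selected:", best["name"])
```

The nearest-neighbor candidate is included because it reproduces the training data exactly, so its gap between training and hold-out performance makes the idea of shrinkage easy to see. Note that the sketch can only measure shrinkage on the hold-out data it was given; it says nothing about how the fitted parameters behave on later, shifted data, which is the part of the problem the article emphasizes.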
A case study illustrates the potential of the new method for model selection, using the GenIQ Software implementation. A preview of the 9-step GenIQ modeling process and a list of GenIQ FAQs accompany this article. Contact: 1 800 DM STAT-1, or e-mail br@dmstat1.com.