Using the GenIQ Model to Ensure the Validation of a Model Is Unbiased

The traditional and most popular method of validating a model is to randomly split the original sample into two mutually exclusive parts: a training subsample for developing the model, and a validation or hold-out subsample for assessing the reliability of the model. The working assumption for this type of “split-sample” validation is that the original sample is homogeneous enough to yield two “identical” subsamples. If the working assumption is not tenable, the analyst may be lucky and obtain a hold-out subsample with favorable characteristics, resulting in a biased, better-than-true validation. Conversely, the analyst may be unlucky and obtain a hold-out subsample with unfavorable characteristics, resulting in a biased, worse-than-true validation. Of course, if the working assumption is tenable, then the validation is assumed to be unbiased. The purpose of this article is to present the GenIQ Model© as a tool for detecting whether the training and hold-out subsamples represent the same universe, and thus to ensure that the validation of a model is unbiased, or at least honest.
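To make the split-sample idea concrete, here is a minimal Python sketch of a random split followed by a crude comparison of the two subsamples. It is not the GenIQ Model: the check used here, a standardized mean difference on a single numeric variable, is a simple stand-in diagnostic chosen only for illustration. The function names, the simulated data, and the 30% hold-out fraction are all assumptions, not part of the article.

```python
import math
import random
import statistics


def split_sample(records, holdout_fraction=0.5, seed=0):
    """Randomly split records into mutually exclusive training and hold-out lists."""
    rng = random.Random(seed)
    shuffled = list(records)          # copy so the caller's list is not reordered
    rng.shuffle(shuffled)
    n_holdout = int(round(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]


def standardized_mean_difference(train_values, holdout_values):
    """Difference in means divided by the pooled standard deviation.

    A stand-in diagnostic (an assumption of this sketch), not the GenIQ Model.
    """
    mean_diff = statistics.mean(train_values) - statistics.mean(holdout_values)
    pooled_var = (statistics.variance(train_values)
                  + statistics.variance(holdout_values)) / 2
    if pooled_var == 0:
        # Both subsamples are constant: identical means give 0, otherwise infinite.
        return 0.0 if mean_diff == 0 else math.copysign(math.inf, mean_diff)
    return mean_diff / math.sqrt(pooled_var)


# Hypothetical sample: one simulated numeric variable per record.
data_rng = random.Random(42)
sample = [data_rng.gauss(100, 15) for _ in range(1000)]

train, holdout = split_sample(sample, holdout_fraction=0.3, seed=1)
smd = standardized_mean_difference(train, holdout)
print(len(train), len(holdout), round(smd, 3))
```

A value near zero suggests the two subsamples look alike on that one variable. It says nothing about the joint behavior of several variables, which is the kind of question a model-based check is meant to address.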


Bruce Ratner, Ph.D.
For more information about this article, call Bruce Ratner at 516.791.3544 or 1 800 DM STAT-1, or e-mail br@dmstat1.com.
DM STAT-1 CONSULTING / 574 Flanders Drive / North Woodmere, NY 11581 / USA
Voice 1-516-791-3544 / Fax 1-516-791-5075 / Toll Free 1 800 DM STAT-1