DM Stat-1 Articles

Technical Report #3:
Creating A Bootstrap Sample
Bruce Ratner, Ph.D.


The term bootstrapping alludes to a German legend about Baron Münchhausen, who lifted himself out of a swamp by pulling on his own hair. In later versions he used his own bootstraps to pull himself out of the sea, which gave rise to the term. A bootstrap is a loop of leather sewn onto the back of a boot to hold onto when pulling the boot on. Bootstraps were still in use on leather boots in the early 20th century. In popular fiction, a poor boy who became wealthy through his own efforts was said to have "pulled himself up by his own bootstraps." The metaphor carried over into business financing, where a highly profitable business might grow rapidly without external financing. [From Wikipedia.]

In statistics, the bootstrap is a method for assessing the trustworthiness of a statistic, much as the standard error of the mean measures the trustworthiness (variability) of a sample mean. In other words, the bootstrap is a general-purpose procedure for assessing the trustworthiness of any statistic.

The bootstrap is a computer-intensive approach to statistical inference. It is the most popular resampling method: it uses the computer to repeatedly resample the sample at hand. Each bootstrap sample is the same size as the original sample and differs slightly from the others. Because observations are selected at random with replacement, some individuals occur more than once in a bootstrap sample and some do not occur at all. This variation makes it possible to build an empirical sampling distribution of the desired statistic, from which estimates of bias and variability are determined. Suffice it to say that the bootstrap has considerable value.
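The procedure described above can be sketched in a few lines. The following is a minimal illustration in Python rather than SAS, and it is not part of the report's program: the function name `bootstrap_estimates`, the number of replicates, and the data values are all assumptions made up for the example. It draws same-size samples with replacement, recomputes the statistic on each, and reads bias and variability off the resulting empirical sampling distribution.

```python
import random
import statistics

def bootstrap_estimates(sample, statistic, n_reps=2000, seed=36830):
    """Return (observed, bias, std_error) for `statistic` via the bootstrap.

    Hypothetical helper for illustration; n_reps and seed are arbitrary.
    """
    rng = random.Random(seed)
    n = len(sample)
    observed = statistic(sample)
    replicates = []
    for _ in range(n_reps):
        # One same-size bootstrap sample: n draws with replacement.
        resample = rng.choices(sample, k=n)
        replicates.append(statistic(resample))
    # The replicates form the empirical sampling distribution.
    bias = statistics.mean(replicates) - observed
    std_error = statistics.stdev(replicates)
    return observed, bias, std_error

# Made-up data, for illustration only.
data = [12.1, 9.8, 15.3, 11.0, 8.7, 14.2, 10.5, 13.9, 9.1, 12.6]
observed, bias, std_error = bootstrap_estimates(data, statistics.median)
print(f"median={observed:.2f}  bias={bias:.3f}  std_error={std_error:.3f}")
```

The median is used here because it has no simple textbook formula for its variability, which is the kind of statistic for which the bootstrap is most useful. Any other function of the sample could be passed in its place.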

This report provides a SAS program for creating a bootstrap sample.


Solution

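data sample (drop = i sample_size);
   /* Bootstrap sample size; set it to the number of rows in data_in for a same-size sample */
   sample_size = 25;
   do i = 1 to sample_size;
      /* Pick a row number uniformly from 1 to n (seed 36830 makes the draw repeatable) */
      choice = int(ranuni(36830)*n) + 1;
      /* Read that row directly; NOBS= supplies n, the number of rows in data_in */
      set data_in point = choice nobs = n;
      output;
   end;
   stop;   /* POINT= never reaches end of file, so the step must be stopped explicitly */
run;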
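The row-selection arithmetic in the program, int(ranuni(seed)*n) + 1, maps a uniform draw on [0, 1) to a whole row number between 1 and n. As a rough illustration only, the same arithmetic can be mirrored in Python to see which rows are drawn more than once and which are never drawn. The function name `draw_row_numbers` and the value n = 25 are assumptions for the example, and Python's generator does not reproduce the SAS RANUNI stream, so the specific rows will differ from those SAS selects.

```python
import random
from collections import Counter

def draw_row_numbers(n, sample_size, seed=36830):
    """Draw 1-based row numbers with replacement, as int(u * n) + 1.

    Hypothetical helper; mirrors the arithmetic, not the SAS random stream.
    """
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so int(u * n) lies in 0..n-1.
    return [int(rng.random() * n) + 1 for _ in range(sample_size)]

n = 25  # assumed number of rows in the input data set
rows = draw_row_numbers(n, sample_size=25)
counts = Counter(rows)
repeated = sorted(r for r, c in counts.items() if c > 1)
omitted = sorted(set(range(1, n + 1)) - set(rows))

print("rows drawn:", rows)
print("drawn more than once:", repeated)
print("never drawn:", omitted)
```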



Related Articles:
1. When Data Are Too Large to Handle in the Memory of Your Computer 
2. A Simple Bootstrap Variable Selection Method for Building Database Marketing Models 
3. Bootstrapping In Direct Marketing: A New Approach for Validating Response Models



For more information about this article, call Bruce Ratner at 516.791.3544,
1 800 DM STAT-1, or e-mail at br@dmstat1.com.