Next: Discrete Populations (Probability Models) Up: Resampling Previous: Introduction

Class Code for Resampling

Using resampling effectively to estimate the probabilities of desired events requires many trials, say 1,000 to 10,000. No sane person is going to do this with a random number table, but the computer will not get bored doing 10,000 trials. Further, by setting up the model as we have been doing, we already have a correct algorithm for the resampling. We could try to learn to code these algorithms into a computer program, but that would require learning a programming language, which is not the purpose of this class.

So what can we do? Setting up the model as we have been doing is the most important thing here. If you can set up the model correctly, then you understand the problem. This is the important idea. Thinking back on the problems we have solved, the tedious part is using the random number table to do the trials. So we have constructed class code that will do all this work for us. It requires input, but if you have modeled the problem correctly this will be easy.

Let's go back to the simple example in the last section. Here are the problem and our resampling model:

On the roll of a fair 6-sided die, determine the probability that a 1 or 2 is the upface. Here are the first 3 steps of the resampling experiment:

1.
Use single digit random numbers 0 through 9. Discard (actually skip) digits 0, 7, 8, and 9.
2.
The event A is a 1 or 2.
3.
Pick at random a starting point in the 10 digit random number table. This is the first outcome. Read the succeeding outcomes one after another going down that column to the end. Then move to the top of the next column and continue until we have N trials.
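The three steps above can be sketched in a few lines of Python. This is only an illustration, not the class code: the function name die_outcomes_from_digits is made up for this sketch, and the digits come from Python's random module as a stand-in for a printed random number table.

```python
import random

def die_outcomes_from_digits(digits, n_trials):
    """Turn a stream of single digits into die outcomes (hypothetical helper)."""
    outcomes = []
    for d in digits:
        if 1 <= d <= 6:              # step 1: keep 1..6, skip 0, 7, 8 and 9
            outcomes.append(d)
            if len(outcomes) == n_trials:
                break
    return outcomes

# Assumption: generated digits stand in for a column of the random number table.
rng = random.Random(2001)
table_digits = [rng.randrange(10) for _ in range(200)]

outcomes = die_outcomes_from_digits(table_digits, 20)
successes = sum(1 for x in outcomes if x in (1, 2))   # step 2: event A is a 1 or 2
print(outcomes)
print("successes:", successes, "out of", len(outcomes))
```

Skipping the unwanted digits, rather than mapping them onto 1 through 6, keeps the six outcomes equally likely.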
Now let's obtain 20 trials of our resampling experiment, using the class code. We need the following input:
1.
Number of trials: let's just do 20 the first time.
2.
Minimum value of desired random numbers: 1.
3.
Maximum value of desired random numbers: 6.
4.
Number to be drawn (length of the trial): 1.
5.
With or Without Replacement: With Replacement (although, since the length is one it doesn't matter).
Simply click on the class code (Random number generation for resampling trials) and input these items.
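For readers who want to see what such a generator might look like, here is a minimal sketch that takes the same five inputs. It is not the actual class code; the function name resampling_trials and its parameter names are assumptions made for this illustration.

```python
import random

def resampling_trials(n_trials, minimum, maximum, length, with_replacement, seed=None):
    """Return a list of trials; each trial is a list of `length` random integers.

    Hypothetical stand-in for the class code, built from the five inputs above.
    """
    rng = random.Random(seed)        # seed=None starts at a new place on each run
    values = range(minimum, maximum + 1)
    trials = []
    for _ in range(n_trials):
        if with_replacement:
            draw = [rng.choice(values) for _ in range(length)]
        else:
            draw = rng.sample(values, length)   # no repeated numbers within a trial
        trials.append(sorted(draw))             # sorted only for readability
    return trials

# The die example: 20 trials, numbers 1 through 6, one number per trial.
for i, trial in enumerate(resampling_trials(20, 1, 6, 1, True), start=1):
    print("Trial", i)
    print(*trial, sep="\t")
```

Passing a fixed seed (for example seed=1) would make a run repeatable, which is handy when comparing answers with a classmate.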

What did you get? Here's what I got. Note that the output is simple: the trial number followed by the outcome of the trial (in this case the upface of a fair die). Our results will differ, since the class code starts at a new place (based on the time of day) for each run.

Trial 1
3
Trial 2
6
Trial 3
1
Trial 4
1
Trial 5
6
Trial 6
3
Trial 7
4
Trial 8
5
Trial 9
4
Trial 10
4
Trial 11
1
Trial 12
1
Trial 13
3
Trial 14
6
Trial 15
6
Trial 16
2
Trial 17
1
Trial 18
1
Trial 19
4
Trial 20
3

I got 7 successes (a 1 or a 2) out of 20 trials. Hence my estimate of the probability of a 1 or a 2 is 7/20 = .35 and my estimate of error is .21.
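The counting can be checked with a short script using the 20 outcomes listed above. The error formula here is an assumption: twice the square root of p(1-p)/n reproduces the reported .21, so that is what the sketch uses.

```python
import math

# The 20 outcomes from the run shown above.
outcomes = [3, 6, 1, 1, 6, 3, 4, 5, 4, 4, 1, 1, 3, 6, 6, 2, 1, 1, 4, 3]

n = len(outcomes)
successes = sum(1 for x in outcomes if x in (1, 2))   # event A: a 1 or a 2
p_hat = successes / n
# Assumed error formula: 2 * sqrt(p_hat * (1 - p_hat) / n)
error = 2 * math.sqrt(p_hat * (1 - p_hat) / n)

print(successes, n, round(p_hat, 2), round(error, 2))   # prints: 7 20 0.35 0.21
```

For comparison, the true probability for a fair die is 2/6, about .33, which is well within the estimated error of .35.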

Note that we still have to examine the trials to see if the desired event came up or didn't. Hence, it is hard to see us doing 1000 trials to get a good estimate. But again, the main point is SET UP A CORRECT MODEL and if you input the right numbers and understand when the event occurs or doesn't occur on a trial then YOU DO UNDERSTAND THE PROBLEM!

Let's do the urn problem with the class code. Recall the problem and our resampling solution from the last section. Here is a resampling model:

1.
Choose two-digit random numbers, 00 through 99. Discard 00 and 81 through 99. The numbers 01 through 30 represent a blue ball, while the numbers 31 through 80 represent a red ball. Select 3 numbers, discarding any repeated number (sampling without replacement).
2.
If we get 3 numbers from 01 through 30, then 3 blue balls were obtained, and if we get 3 numbers from 31 through 80, then 3 red balls were obtained. In either case, 3 of the same color occurred. Count these up.
3.
Pick at random a starting point in the 10 digit random number table. Use 2 columns. This is the first outcome. Read the succeeding outcomes one after another going down those 2 columns to the end. Then move to the top of the next 2 columns and continue until we have N trials.
The input for 20 trials via class code is:
1.
Number of trials: let's just do 20 the first time.
2.
Minimum value of desired random numbers: 1.
3.
Maximum value of desired random numbers: 80.
4.
Number to be drawn (length of the trial): 3.
5.
With or Without Replacement: Without Replacement.
Click on the class code and input these items.
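The urn model with these inputs can also be sketched directly. Again this is an illustration and not the class code; the helper names color and same_color are made up here, and the sketch goes one step further than the class code by marking the trials where the event occurred.

```python
import random

def color(n):
    """Numbers 1 through 30 are blue balls, 31 through 80 are red balls."""
    return "blue" if n <= 30 else "red"

def same_color(trial):
    """True when all balls drawn in the trial have the same color."""
    return len({color(n) for n in trial}) == 1

rng = random.Random()        # unseeded, so each run gives different trials
n_trials = 20
count = 0
for i in range(1, n_trials + 1):
    # Draw 3 distinct numbers from 1..80 (sampling without replacement).
    trial = sorted(rng.sample(range(1, 81), 3))
    hit = same_color(trial)
    count += hit
    print("Trial", i, *trial, "<- same color" if hit else "")

print("estimate:", count / n_trials)
```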

What did you get? Here's what I got. Note that the output is simply the trial number followed by the outcome of the trial (in this case the three balls drawn).

Trial 1
31      73      79
Trial 2
1       30      80
Trial 3
15      42      65
Trial 4
52      53      61
Trial 5
30      46      54
Trial 6
17      24      76
Trial 7
10      34      52
Trial 8
69      74      77
Trial 9
2       18      47
Trial 10
4       32      59
Trial 11
24      26      80
Trial 12
1       22      42
Trial 13
33      48      65
Trial 14
42      48      70
Trial 15
30      65      77
Trial 16
30      67      71
Trial 17
2       24      48
Trial 18
9       32      77
Trial 19
42      65      79
Trial 20
18      59      70

Note that all red came up in trials 1, 4, 8, 13, 14 and 19. All blue never came up. So the estimate of the desired probability is 6/20 = .30 and the standard error of estimation is .204.
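As a check on the bookkeeping, the script below classifies the 20 trials listed above. Two things in it are assumptions: the error is taken to be twice the square root of p(1-p)/n, which gives about .2049 (reported above as .204), and the exact answer is computed for an urn of 30 blue and 50 red balls, as the model implies.

```python
import math

# The 20 trials from the run shown above.
trials = [
    (31, 73, 79), (1, 30, 80), (15, 42, 65), (52, 53, 61), (30, 46, 54),
    (17, 24, 76), (10, 34, 52), (69, 74, 77), (2, 18, 47), (4, 32, 59),
    (24, 26, 80), (1, 22, 42), (33, 48, 65), (42, 48, 70), (30, 65, 77),
    (30, 67, 71), (2, 24, 48), (9, 32, 77), (42, 65, 79), (18, 59, 70),
]

# Trial numbers (starting at 1) where all three balls share a color.
all_red = [i for i, t in enumerate(trials, 1) if all(x >= 31 for x in t)]
all_blue = [i for i, t in enumerate(trials, 1) if all(x <= 30 for x in t)]

n = len(trials)
p_hat = (len(all_red) + len(all_blue)) / n
error = 2 * math.sqrt(p_hat * (1 - p_hat) / n)   # assumed error formula

# Exact probability, assuming 30 blue and 50 red balls, 3 drawn without replacement.
exact = (math.comb(30, 3) + math.comb(50, 3)) / math.comb(80, 3)

print(all_red)    # prints: [1, 4, 8, 13, 14, 19]
print(all_blue)   # prints: []
print(round(p_hat, 2), round(error, 4), round(exact, 3))   # prints: 0.3 0.2049 0.288
```

The estimate of .30 from only 20 trials lands close to the exact value of about .288.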

Exercise 4.2.1
This exercise uses the class code (Random number generation for resampling trials).
1.
Use the class code to obtain 20 trials of your resampling experiment for Problem #1 in the last set of exercises.
2.
Use the class code to obtain 20 trials of your resampling experiment for Problem #2 in the last set of exercises.
3.
Use the class code to obtain 20 trials of your resampling experiment for Problem #3 in the last set of exercises.
4.
Use the class code to obtain 20 trials of your resampling experiment for Problem #4 in the last set of exercises.
5.
Use the class code to obtain 20 trials of your resampling experiment for Problem #5 in the last set of exercises.
6.
1000 parts are shipped into a factory. Your job is to obtain a random sample of 20 (without replacement) of these parts for inspection. If the parts are tagged 1001 through 2000, use the class code to obtain your sample.
7.
For the last problem, suppose your quality control plan rejects the shipment if 5 or more of the sampled parts are defective. Suppose that in fact 20% of the shipped parts are defective. Determine the probability of returning the lot using the quality control plan.

Estimate the desired probability by doing 30 resamplings.

8.
Same as the last problem but now only 10% are defective.
9.
We can solve a problem we have been discussing (opening with a pair in 5-card poker), but the counting is a bit tedious (you need to count by 13's fast). If enough of you do, say, 5 poker hands each, we can combine the results. Use the numbers 1 through 52 to denote the cards. Let
• 1, 14, 27, 40 denote an Ace.
• 2, 15, 28, 41 denote a two.
• Etc.
Now sample 5 numbers (length of trial) from 1 through 52 without replacement. Do this 5 times (5 trials). Count as a success a pair (not 3 nor 4 of a kind).
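If counting by 13's gets tedious, the card numbering can be checked mechanically. The sketch below is only an aid, with made-up helper names rank and has_pair. It assumes that cards 13 apart share a rank (so 1, 14, 27, 40 are the Aces) and that a hand with two pairs also counts as a success; change has_pair if your reading of the rule differs.

```python
import random
from collections import Counter

def rank(card):
    """Map a card number 1..52 to a rank 1..13 (1 = Ace), counting by 13's."""
    return (card - 1) % 13 + 1

def has_pair(hand):
    """True if some rank appears exactly twice and no rank appears 3 or 4 times.

    Assumption: two pairs count as a success under this rule.
    """
    counts = Counter(rank(c) for c in hand)
    return max(counts.values()) == 2

rng = random.Random()        # unseeded, so each run deals different hands
for i in range(1, 6):
    # One trial: 5 distinct cards from 1..52 (sampling without replacement).
    hand = sorted(rng.sample(range(1, 53), 5))
    print("Trial", i, hand, "pair" if has_pair(hand) else "no pair")
```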


2001-01-01