
Randomized Paired Design

Noise is often the villain in the analysis of an experimental design: there is just too much of it to see the target. The design we introduce next is an effective noise reducer. The price is a loss of information (nothing comes free). Also, as you will see, it is often not possible to carry out.

The setup is the same as for the completely randomized design. We have two treatments, T1 and T2, applied to a response, and we still want to test and estimate the treatment effect. The difference is that we can now select pairs of experimental units: for example, identical twins in a study involving humans, the same house in a study of two house paints (halve the north wall), or the same field in a study of two varieties of wheat. As I said, this may be impossible to do.

• Randomized Paired Design: Randomly select n paired experimental units from the reference population. Within each pair, randomly assign one member to Treatment 1 and the other to Treatment 2. The experiment (study) is run for a preassigned time, and during this time all other variables are kept under control. At the end of the assigned time, we measure the responses for the n paired experimental units. Letting X and Y denote the responses for Treatments 1 and 2, respectively, the data are in the paired form: (X1,Y1), ..., (Xn,Yn).
Although the pairs are independent, within a pair there is dependency. In fact, the more dependency within a pair, the greater the noise reduction. Hence, the two-independent-sample analysis of Chapter 9 is out. The key is that the effect is still a typical Y minus a typical X; read that as Yi minus Xi, where the subscript i refers to the ith pair. Thus the sample of interest IS THE DIFFERENCES. That is,

D1 = Y1 - X1, D2 = Y2 - X2, ... , Dn = Yn - Xn
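
The design just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `assign_within_pairs` and `paired_differences`, the unit labels, and the use of a fair coin flip via `random.Random` are all hypothetical choices, not part of the class code.

```python
import random

def assign_within_pairs(pairs, seed=None):
    """For each pair, flip a fair coin to decide which unit gets Treatment 1."""
    rng = random.Random(seed)
    assignments = []
    for first, second in pairs:
        if rng.random() < 0.5:
            assignments.append({"T1": first, "T2": second})
        else:
            assignments.append({"T1": second, "T2": first})
    return assignments

def paired_differences(x, y):
    """Return D_i = Y_i - X_i for each pair i."""
    return [yi - xi for xi, yi in zip(x, y)]

# Hypothetical unit labels, used only to show the assignment step.
pairs = [("unit 1a", "unit 1b"), ("unit 2a", "unit 2b"), ("unit 3a", "unit 3b")]
for row in assign_within_pairs(pairs, seed=1):
    print(row)
```

The coin flip is done separately inside every pair, so each pair always contributes exactly one response to each treatment.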

Too wordy! Let's have an example. This is taken from Siegel, Nonparametric Statistics. Ten pairs of identical twins, age 4, were randomly selected for an experiment to investigate how nursery school affects the social awareness of a 4-year-old. For each pair, one twin was randomly assigned to go to nursery school while the other stayed home. At the end of the time period, all 20 took the same test and their scores were recorded (bigger means more socially aware). The data are: pair number in column 1, response of the nursery school twin (N) in column 2, response of the stay-at-home twin (H) in column 3, and difference in responses (D = N - H) in column 4:
pair     N     H     D

   1    74    63    11
   2    43    33    10
   3    61    41    20
   4    79    67    12
   5    80    65    15
   6    73    80    -7
   7    56    43    13
   8    98    84    14
   9    84    74    10
  10    52    48     4

There are two immediate observations from this data set:
1. The twin who went to nursery school seems to be more socially aware. A dotplot of the differences is given next and a formal analysis is discussed below.

      .                     .           : . . . . .         .
     -----+---------+---------+---------+---------+---------+-
        -5.0       0.0       5.0      10.0      15.0      20.0


2. The pairing has really cut the noise. The range of the nursery school scores is 98 - 43 = 55, the range of the stay-at-home scores is 84 - 33 = 51, but the range of the differences is 20 - (-7) = 27. Hence the noise level has been cut by about half. The reason for this reduction in noise is that four-year-olds are all over the map on social relationships. Some are ready for school, some are far from ready, some talk continuously while others are very shy, etc. The scores reflect this (note the scores 98 and 43 in the nursery school column). But identical twins are alike in social awareness (before the experiment), so if one twin scores high then so does the other, while if one twin scores low so does the other. This certainly makes sense for identical twins. Within a pair the scores are much more similar and, hence, the differences are less spread out.
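
The range arithmetic above can be checked with a short Python sketch. The variable names and the helper `value_range` are illustrative assumptions; only the data come from the table.

```python
# Twin data from the table: nursery school (N) and stay-at-home (H) scores.
nursery = [74, 43, 61, 79, 80, 73, 56, 98, 84, 52]
home = [63, 33, 41, 67, 65, 80, 43, 84, 74, 48]

# Paired differences D = N - H, one per pair of twins.
diffs = [n - h for n, h in zip(nursery, home)]

def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

print("differences:", diffs)
print("range of N:", value_range(nursery))  # 55
print("range of H:", value_range(home))     # 51
print("range of D:", value_range(diffs))    # 27
```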
Alright! I hear you clamoring. This is ad hoc. We want p-values. We WANT estimates and confidence intervals. Put up or shut up.

We can't use Chapter 9, but since we have a single sample, the D's, we can use Chapter 7 for estimates and confidence intervals. For example, we can estimate the effect by the median of the paired differences, which is 11.5. A confidence interval for the median is (10, 14.5), which can be obtained using the class code (One sample bootstrap confidence intervals for the population mean and median) by typing the paired differences into the big data box. Selecting median and submitting produces the bootstrap confidence interval for the median.
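
For readers without access to the class code, here is a minimal sketch of a percentile bootstrap interval for the median. It is an assumption that this resembles what the class code does; the function name `bootstrap_median_ci`, the number of resamples, and the seed are all hypothetical choices, so the interval it prints need not match (10, 14.5) exactly.

```python
import random
import statistics

# Paired differences (N - H) for the ten pairs of twins.
diffs = [11, 10, 20, 12, 15, -7, 13, 14, 10, 4]

def bootstrap_median_ci(sample, n_boot=5000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the median."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and record the median of each resample.
    medians = sorted(
        statistics.median(rng.choices(sample, k=n)) for _ in range(n_boot)
    )
    alpha = 1.0 - level
    lo_idx = round((alpha / 2) * n_boot)
    hi_idx = round((1 - alpha / 2) * n_boot) - 1
    return medians[lo_idx], medians[hi_idx]

estimate = statistics.median(diffs)
low, high = bootstrap_median_ci(diffs)
print("estimated effect (median of differences):", estimate)
print("bootstrap interval:", (low, high))
```

A different seed or number of resamples will move the endpoints slightly, which is normal for a bootstrap interval.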

Exercise 11.3.1
1. Finish the example for the twin data. Recall the data and the paired differences were:
pair     N     H     D

   1    74    63    11
   2    43    33    10
   3    61    41    20
   4    79    67    12
   5    80    65    15
   6    73    80    -7
   7    56    43    13
   8    98    84    14
   9    84    74    10
  10    52    48     4

(a) Obtain the value of the Wilcoxon test statistic. (Actually, determine the number of negative averages (2) and subtract it from 10(11)/2.)
(b) Obtain the p-value for a two-sided test. Use the class code, of course (Wilcoxon for paired designs). Conclude in terms of the problem.
(c) Obtain (from the class code) the estimated effect and the associated confidence interval. Conclude in terms of the problem.
2. From Cushny and Peebles (1905), J. of Physiology: Ten patients were selected for a study. The average number of hours that they slept was determined. There were two parts to the study. In Part 1, each patient was given one of two drugs, Laevo or Dextro, chosen by the flip of a coin, and the average (over a week) number of excess hours of sleep (over their usual average) was recorded. In Part 2 (after a washout period), they were given the other drug, and the average (over a week) number of excess hours (over their usual average) was recorded. The data are:
Patient   Dextro   Laevo

   1        0.7      1.9
   2       -1.6      0.8
   3       -0.2      1.1
   4       -1.2      0.1
   5       -0.1     -0.1
   6        3.4      4.4
   7        3.7      5.5
   8        0.8      1.6
   9        0.0      4.6
  10        2.0      3.4

(a) Obtain the value of the Wilcoxon test statistic (diff = D - L).
(b) Compare it with what you would expect under H0.
(c) Obtain the p-value for a two-sided test. Use the class code (Wilcoxon for paired designs), of course. Conclude in terms of the problem.
(d) Obtain (from the class code) the estimated effect and the associated confidence interval. Conclude in terms of the problem.
3. The data below are some measurements recorded by Charles Darwin in 1878. They consist of 15 pairs of heights (in inches) of cross-fertilized and self-fertilized plants, Zea mays, each pair grown in the same pot.
POT    CROSS     SELF

  1   23.500   17.375
  2   12.000   20.375
  3   21.000   20.000
  4   22.000   20.000
  5   19.125   18.375
  6   21.550   18.625
  7   22.125   18.625
  8   20.375   15.250
  9   18.250   16.500
 10   21.625   18.000
 11   23.250   16.250
 12   21.000   18.000
 13   22.125   12.750
 14   23.000   15.500
 15   12.000   18.000

(a) Obtain the value of the Wilcoxon test statistic (diff = C - S).
(b) Compare it with what you would expect under H0.
(c) Obtain the p-value for a two-sided test. Use the class code (Wilcoxon for paired designs). Conclude in terms of the problem.
(d) Obtain (from the class code) the estimated effect and the associated confidence interval. Conclude in terms of the problem.
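
The hint in Exercise 1(a) counts pairwise averages of the differences, (Di + Dj)/2 with i ≤ j (often called Walsh averages). A minimal sketch of that count follows, shown only on the twin differences. The function names are hypothetical, and counting a zero average as one half is an assumed convention that the class code may handle differently.

```python
from itertools import combinations_with_replacement

def walsh_averages(diffs):
    """All pairwise averages (D_i + D_j) / 2 with i <= j."""
    return [(a + b) / 2 for a, b in combinations_with_replacement(diffs, 2)]

def wilcoxon_statistic(diffs):
    """Number of positive pairwise averages.

    Assumption: an average exactly equal to zero counts as one half.
    """
    averages = walsh_averages(diffs)
    positive = sum(1 for w in averages if w > 0)
    zero = sum(1 for w in averages if w == 0)
    return positive + 0.5 * zero

# Twin differences (N - H) from Exercise 1.
twin_diffs = [11, 10, 20, 12, 15, -7, 13, 14, 10, 4]
n = len(twin_diffs)
total = n * (n + 1) // 2  # 10(11)/2 pairwise averages in all
negatives = sum(1 for w in walsh_averages(twin_diffs) if w < 0)
print(total, negatives, wilcoxon_statistic(twin_diffs))  # 55 2 53.0
```

The same two functions can be applied to the differences in Exercises 2 and 3. With decimal data, exact comparisons with zero are subject to floating-point rounding, so it is worth checking any near-zero averages by hand.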


2001-01-01