This textbook is the online text for Stat 160. It covers basic descriptive statistical and graphical procedures for analyzing data sets. Some simple inferential procedures, both parametric and nonparametric, are also taught. These procedures will enable students to explore data sets. This book is not an introductory statistical methods book, but knowledge of its content will help prepare students for a course in methods.
The quantitative prerequisite for this book is essentially high school algebra (Math 110 at Western Michigan University). The book does not make use of any higher math, and it is not formula-driven. In the course, the computer does the heavy computation, not the student.
Chapter 1 covers basic descriptive statistical and graphical procedures for analyzing data sets. This is a long chapter and an important one. Most people will at some time find themselves either explaining a data set to others or needing to understand someone else's description of one. The plots discussed in this chapter, such as comparison boxplots and dotplots, are used throughout the book when discussing data sets. Outliers are discussed at length, and the concept of robust statistical procedures is introduced. Robustness is used throughout the text.
Chapters 2 through 5 discuss probability and population models. For the most part, our discussion of probability is based on resampling, not on formulas. Resampling has become a very powerful tool in statistics, where it is often called the bootstrap. Using resampling to solve a probability problem requires successfully modeling the problem, which in turn requires understanding it. Such exercises will serve students well throughout their lives. Resampling, though, requires coding. We have sidestepped this difficulty by developing general software for many of the resampling situations called for in the book.
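To give a feel for what solving a probability problem by resampling involves, here is a minimal sketch in Python. It is not the software that accompanies the book; the function names, the dice example, the number of repetitions, and the seed are all assumptions chosen for illustration. The idea is to model one trial of the problem in code, repeat the trial many times, and report the fraction of trials in which the event occurred.

```python
import random


def estimate_probability(event, n_reps=20000, seed=160):
    """Estimate P(event) as the fraction of simulated trials in which it occurs.

    `event` is a function that takes a random number generator, simulates
    one trial, and returns True if the event of interest happened.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(1 for _ in range(n_reps) if event(rng))
    return hits / n_reps


def two_dice_sum_to_seven(rng):
    """One trial: roll two fair dice and check whether they total 7."""
    return rng.randint(1, 6) + rng.randint(1, 6) == 7


estimate = estimate_probability(two_dice_sum_to_seven)
print("Estimated probability:", round(estimate, 3))
print("Exact probability:    ", round(1 / 6, 3))
```

The modeling step is the function that simulates a single trial. Once that function is right, the same counting loop can be reused for any event.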
Chapter 6 is a short discussion of the Central Limit Theorem, which leads directly into Chapter 7 on confidence intervals. Many of the confidence intervals discussed in the book are percentile bootstrap confidence intervals.
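As a rough illustration of a percentile bootstrap confidence interval, the following Python sketch resamples the data with replacement, recomputes a statistic on each resample, and takes percentiles of the resulting values as the interval endpoints. The data values are made up, and the function name, the choice of the median as the statistic, the number of resamples, and the rule for picking the endpoints are assumptions for illustration; the book's software may differ in these details.

```python
import random
import statistics


def percentile_bootstrap_ci(data, stat=statistics.median,
                            n_boot=2000, level=0.95, seed=160):
    """Return (lower, upper) endpoints of a percentile bootstrap interval."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    n = len(data)
    # Draw n_boot resamples of size n with replacement and sort the statistics.
    boot_stats = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    alpha = 1 - level
    # Positions of the alpha/2 and 1 - alpha/2 percentiles in the sorted list,
    # clamped so they always stay inside the list.
    lo_idx = max(0, round((alpha / 2) * n_boot))
    hi_idx = min(n_boot - 1, round((1 - alpha / 2) * n_boot) - 1)
    return boot_stats[lo_idx], boot_stats[hi_idx]


# Hypothetical sample, invented for this illustration only.
sample = [12, 15, 9, 21, 14, 18, 11, 30, 13, 16]
lower, upper = percentile_bootstrap_ci(sample)
print("95% percentile bootstrap interval for the median:", (lower, upper))
```

Because the median is used here, the single large value 30 in the sample has little effect on the interval, which fits the emphasis on robust procedures.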
Hypothesis testing is discussed in Chapter 8 for two-sample location problems. The basic method discussed is the two-sample Wilcoxon test, with observed significance levels determined by resampling. Chapter 9 follows with estimation for two-sample problems.
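The sketch below shows one way a resampling-based observed significance level for the two-sample Wilcoxon test might be computed in Python. It assumes a permutation scheme: the pooled observations are repeatedly shuffled and split into two groups of the original sizes, and the observed significance level is the fraction of shuffles whose rank-sum statistic is at least as far from its expected value as the observed one. This scheme, the two-sided comparison, the function names, and the data are all assumptions for illustration and are not taken from the book's software.

```python
import random


def rank_sum(x, y):
    """Wilcoxon rank-sum statistic: sum of the ranks of y in the combined sample.

    Tied values receive the average of the ranks they occupy (midranks).
    """
    combined = sorted(list(x) + list(y))

    def midrank(value):
        first = combined.index(value) + 1   # 1-based rank of first occurrence
        count = combined.count(value)       # number of tied values
        return first + (count - 1) / 2

    return sum(midrank(value) for value in y)


def resampling_p_value(x, y, n_reps=5000, seed=160):
    """Two-sided observed significance level estimated by random shuffles."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    n_x, n_y = len(x), len(y)
    expected = n_y * (n_x + n_y + 1) / 2    # rank-sum mean when no shift exists
    observed_gap = abs(rank_sum(x, y) - expected)
    pooled = list(x) + list(y)
    extreme = 0
    for _ in range(n_reps):
        rng.shuffle(pooled)                 # reassign observations to groups
        stat = rank_sum(pooled[:n_x], pooled[n_x:])
        if abs(stat - expected) >= observed_gap:
            extreme += 1
    return extreme / n_reps


# Hypothetical samples, invented for this illustration only.
control = [23, 19, 31, 25, 28, 22]
treated = [30, 35, 27, 41, 33, 38]
print("Rank-sum statistic:", rank_sum(control, treated))
print("Estimated observed significance level:",
      resampling_p_value(control, treated))
```

A small observed significance level suggests that the two samples differ in location. Because the statistic depends only on ranks, a single extreme observation cannot dominate the result.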
Chapter 10 presents experimental designs for two-sample situations, both completely randomized and paired designs. Tests of hypotheses and estimation (confidence intervals) are discussed for these designs. Chapter 11 discusses regression designs. Both least squares and a robust procedure are presented.
This text would not have been possible without the help of many individuals, too numerous to thank by name. Certainly students of previous sections of Stat 160 deserve our thanks. Neither the book nor the course would have been possible without the help provided by the Statistical Computation Lab (SCL) at Western Michigan University. Not only has the SCL provided the statistical and computational expertise to develop the statistical software that accompanies the text, but it has also supported the entire web page development of the course. Thanks also go to the TLT Presidential funding program at Western Michigan University. Its grant to Professors Kapenga and McKean provided support in the summer of 2000 for the development of the online part of this course. A grant from Sun Microsystems to Professors Kapenga and McKean was also fundamental to the online development and to the robust content of the course.