Daily Log for Stat 5660
Applied Nonparametric Statistics
Spring Term 2018

  1. Jan. 9: We begin with Chapters 1 and 2 of the text. Chapter 1 is on R, while Chapter 2 is a review of some nonparametric methods, some of which you encountered in your methods course.

    Chapter 1 contains an intro on R.

  2. For Thursday, Jan 11, read Chap 2. Try to get R running on your computer. We will discuss some more R then start working through Chap 2.
    For the class R scripts click here.

    For the first project Click here. It is due 18 Jan 2018. It is a straightforward R project. Do your own work.

  3. For Tuesday, 20180116, we will continue working on Chapter 2.
  4. For Thursday, 20180118, we will continue working on Chapter 2 (Sections 2.2, 2.3).
    For class discussion do exercises: 2.8.13; 2.8.14; and 2.8.25.
    Either hand in your project during class or e-mail it to Guohao, guohao.zhu@wmich.edu
  5. For the second project Click here. It is due 25 Jan 2018.

    For class discussion on Tuesday, do Problem 2.8.5 by computing the CI as we have discussed. Here is a simple story on the bootstrap: click here. Read the story and the book, and try to do a percentile bootstrap CI.
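
    As a rough illustration of the percentile bootstrap CI mentioned above, here is a minimal sketch in base R. The sample x, the choice of the median as the parameter, and the number of resamples B are all assumptions made for illustration; they are not taken from Problem 2.8.5.

```r
# Percentile bootstrap CI for the median (illustrative sketch only).
set.seed(5660)
x <- rexp(30, rate = 1/10)   # hypothetical sample, NOT the data of Problem 2.8.5
B <- 2000                    # number of bootstrap resamples (an arbitrary choice)

# Resample x with replacement B times and record the median of each resample.
theta_star <- replicate(B, median(sample(x, replace = TRUE)))

# The 95% percentile interval is formed by the 2.5% and 97.5% quantiles
# of the bootstrap medians.
ci <- quantile(theta_star, probs = c(0.025, 0.975), names = FALSE)
ci
```

    To bootstrap a different estimate, replace median() with the corresponding function.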

  6. For the second project Click here. It is due 25 Jan 2018.
    Although I forgot to save the R script for Thursday's class, I have put up a script for Problem 2.8.14 along with a two-sided test HERE.
  7. For Thursday, 20180123, for class discussion: 2.8.7, 2.8.8, 2.8.9 (use sign), and the following problem.

    prob1 20180123: Obtain the sample by

     x = 20*rcauchy(50)+50
    1. Boxplot the data.
    2. Use the sign test and the t-test to test that the median of the population is 60 versus the two-sided alternative (the true median is 50). Use the R function
      to do the sign analysis. You can download and source it from the Rfunctions link on the main page.
    3. Obtain estimates and 95% CIs using the sign and t procedures.
    4. Which analysis seems best here? Why?
    We will do the signed rank Wilcoxon on Thursday plus Sections 2.4, 2.5, and 2.6
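
    The sign and t analyses in prob1 can be sketched with base R alone. This is only an approximation of the intended workflow: it uses binom.test as a stand-in for the course's sign-analysis function (which is not reproduced here), and the seed and variable names are assumptions.

```r
set.seed(20180123)               # arbitrary seed so the sketch is reproducible
x <- 20 * rcauchy(50) + 50       # sample as specified in the problem

# Part 1: draw the boxplot only when running interactively.
if (interactive()) boxplot(x, horizontal = TRUE)

# Part 2: sign test of H0: median = 60 vs. two-sided alternative,
# done as an exact binomial test on the number of observations above 60.
theta0 <- 60
n_eff  <- sum(x != theta0)       # drop observations equal to theta0 (if any)
s_plus <- sum(x > theta0)
sign_fit <- binom.test(s_plus, n_eff, p = 0.5, alternative = "two.sided")
t_fit    <- t.test(x, mu = theta0)

# Part 3: estimates and 95% CIs.
est_sign <- median(x)            # estimate associated with the sign procedure
est_t    <- mean(x)              # estimate associated with the t procedure

# Distribution-free CI for the median from order statistics:
# k is the smallest integer with P(S <= k) >= 0.025 for S ~ Binomial(n, 1/2).
n  <- length(x)
k  <- qbinom(0.025, n, 0.5)
xs <- sort(x)
ci_sign <- c(xs[k], xs[n - k + 1])
ci_t    <- t_fit$conf.int

sign_fit$p.value; t_fit$p.value
ci_sign; ci_t
```

    Because of the discreteness of the binomial distribution, the order-statistic interval has confidence at least 95% rather than exactly 95%.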
  8. For the third project Click here. It is due 1 Feb 2018.
    We will finish Chapter 2 and begin Chapter 3 on Tuesday.
    For discussion, Exercise 2.8.12, p. 44.
  9. For Thursday, install two packages in R: Rfit and npsm. They are on CRAN, so on a PC the following commands, typed in R, should work:
    If you are running Linux, get into R with: sudo R

    This info is also at this site.

    Read Chapter 3, sections 3.1 and 3.2.

    For discussion: 2.8.15.

  10. For the fourth project Click here. It is due 8 Feb 2018.
    Here are some notes on IO (input and output) for R: Click here. We will discuss IO on Tuesday, but you may want to start writing out stuff to be used later.
    For Tuesday: Read Chapter 3.
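
    As a small illustration of the kind of IO referred to above, the following sketch writes a data frame to a text file and to an .rda file, then reads both back. The data frame d and the temporary file names are hypothetical and are not taken from the class notes.

```r
# Hypothetical data to write out and read back in.
d <- data.frame(id = 1:5, y = c(2.3, 4.1, 3.8, 5.0, 4.4))

# Plain-text output and input.
txt_file <- tempfile(fileext = ".txt")
write.table(d, file = txt_file, row.names = FALSE, quote = FALSE)
d2 <- read.table(txt_file, header = TRUE)

# Binary R output and input: save() stores the object under its name,
# and load() restores it under that same name.
rda_file <- tempfile(fileext = ".rda")
save(d, file = rda_file)
rm(d)
load(rda_file)    # the object d exists again
```

    In practice you would replace the tempfile() calls with file names in your working directory.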
  11. For Thursday, Feb. 8, continue reading Chapter 3 through Section 3.4.
    For class discussion do Exercises 3.7.1 and 3.7.4 (all but bentscores in Part (c)). For the temperature data by gender, use the Wilcoxon analysis to test whether the temperatures of males and females are the same, and determine the estimate of the shift in temperature with a 95% CI.
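
    The two-sample Wilcoxon analysis described above can be sketched with the base R function wilcox.test. The samples below are simulated placeholders, not the temperature data, and the names male and female are assumptions.

```r
set.seed(3)
male   <- rnorm(20, mean = 98.1, sd = 0.7)   # hypothetical temperatures
female <- rnorm(20, mean = 98.4, sd = 0.7)   # hypothetical temperatures

# Wilcoxon rank-sum test of equal locations, with the estimate of the
# shift (female minus male) and a 95% confidence interval for it.
fit <- wilcox.test(female, male, conf.int = TRUE, conf.level = 0.95)

fit$p.value     # two-sided p-value
fit$estimate    # estimated shift in location
fit$conf.int    # 95% CI for the shift
```

    With the real data, male and female would be the two columns (or subsets) of the temperature data set.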
  12. For the fifth project Click here. It is due 15 Feb 2018.
  13. We should finish Chapter 3 on Tuesday, Feb. 13.
  14. A project is due at midnight on its due date. After that, 10 points will be deducted for each day it is late.
  15. For the sixth project Click here. It is due 22 Feb 2018.
  16. Tuesday 20180220, read through Chap 4. For discussion consider Problem 4.9.6.
    The function rank2.test.R is the corrected version of rank.test.R and it's on the Rfuncs page.
  17. No project this week. Tuesday: questions and continue with Chap 4 (nonpar regression and association).
    For the test on Thursday, March 1: no computer and no phone. You may use your own calculator (no sharing) and a one-page (8.5 by 11) cheat sheet. It will be multiple choice, so bring a #2 pencil.
    Answers to the projects are at click here.
  18. *********************
        MIDTERM: The posted score is the score out of 25!!!   If your score is 20 then that's 80%; i.e., just multiply your score by 4.

    I added 4 points to the midterm test scores and these modified scores are now on e-learning. IT IS THE SCORE OUT OF 25. The person who scored 27 has 108%. Here is the five-number summary of these modified scores:
     > fivenum(x)
    [1]  9 15 19 22 27
    Also the weights for the final grade are:
    • 50% Projects
    • 20% MidTerm
    • 30% Final
    So the projects count the most.
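
    The arithmetic above can be checked in R. The five-number summary and the weights come from this log; the three component scores in the second half are made-up numbers used only to show how the weights combine.

```r
# Five-number summary of the modified midterm scores (out of 25), as posted.
five <- c(9, 15, 19, 22, 27)

# Convert to percentages: multiply by 4 (same as 100 * score / 25).
pct <- 100 * five / 25

# Final-grade weights as stated above.
w <- c(projects = 0.50, midterm = 0.20, final = 0.30)

# Hypothetical component percentages for one student (illustration only).
scores <- c(projects = 90, midterm = 80, final = 85)

# Weighted overall percentage.
overall <- sum(w * scores)
```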

    Enjoy the rest of your break.

    For Tuesday, we will finish association (continuous case) and start ANOVA (Chap 5).

  19. For the seventh project Click here. It is due 22 March 2018.
    For next Tuesday Read Sections 5.1-5.4.
    One student had trouble loading the data for the first problem on free-fatty acid. You can cut and paste the load command here (top line).
  20. For the eighth project Click here. It is due 29 March 2018.
    On Tuesday we will look at ANCOVA and the extended Fligner-Killeen test and ordered alternatives. Then onto Sections 7.1-7.3.
  21. On Thursday, March 29th, we will finish Chap 5 and begin Chap 7. For Chap 7 you need to install the package hbrfit. These commands should work:
    Also, we will use the package npsmReg2:

    May need the CRAN package quantreg.

    Guohao was successful with the commands

  22. Project 9 is here. It is due April 5.
    For Tuesday read sections 7.1-7.3, 7.5, 7.7.
  23. The function power585, used in Project 8, is at Rfuncs web page.
    Read the sections on skew-normals and time-series in Chap 7 for Thursday.
  24. Ian Kapenga successfully installed hbrfit on a Mac using the commands:
  25. For the lecture on the skewed normal family, we will make use of the CRAN package sn.
  26. On the project I meant to assign 7.9.2. So I made this change and made 7.9.3 extra credit in case you worked on it.
  27. Project 10 can be found here. Note that I suggest how to install npsmReg2, if you haven't already done so. If you have any problems installing npsmReg2, let me or Guohao know.
  28. You will need the package jrfit which is at github:
  29. No more projects. New material will of course be on the final.
  30. On Tuesday we will go over several more examples of cluster data, then discuss Section 6.4. Accelerated failure time models are linear models, usually with skewed error distributions. This is the last stop.
  31. For Thursday for class discussion do:
    1. 8.7.7 (data at rda page).
    2. 6.5.11. For Part (c), for a sample of size n the quantiles on the horizontal axis are generated by the R code:
      u = 1:n/(n+1); q = log(log(1/(1-u)))
    3. Key to project 7, here.
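
    For the quantile code in item 2 above, here is a hedged sketch of how those quantiles might be used. The values q are quantiles of the distribution of log(T) when T is standard exponential, since that distribution has cdf 1 - exp(-exp(w)). The simulated sample t_obs and the plot layout are assumptions; the data and the exact wording of Exercise 6.5.11 are not reproduced here.

```r
set.seed(6511)
n <- 40
t_obs <- rexp(n)                  # hypothetical sample, NOT the exercise data

# Quantiles for the horizontal axis, exactly as in the code given above.
u <- 1:n / (n + 1)
q <- log(log(1 / (1 - u)))

# Ordered log observations for the vertical axis.
w <- sort(log(t_obs))

# Draw the q-q plot only when running interactively; for standard
# exponential data the points should fall near the line y = x.
if (interactive()) {
  plot(q, w, xlab = "Theoretical quantiles", ylab = "Ordered log(t)")
  abline(0, 1)
}
```

    A roughly linear pattern in such a plot suggests that a model with this error distribution is reasonable for the log of the data.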