REPORT 1: Hourly Wage and Working Hours

The following data are from a national sample of 6000 households with a male head earning less than $15,000 annually in 1966. The households were classified into 39 demographic groups (e.g. factory workers, farmers, school teachers, etc.). Each row in the spreadsheet below corresponds to one demographic group. Hence, the spreadsheet contains 39 rows.

The study was undertaken in the context of proposals for a guaranteed annual wage (negative income tax). At issue was the response of labor supply (number of hours) to increasing hourly wages. Does the number of labor hours increase or decrease with wage rates? What other factors are relevant in predicting labor hours?

Number of cases: 39

Variable Names:
1. HRS: Average hours worked during the year
2. WAGE: Average hourly wage ($)
3. ERSP: Average yearly earnings of spouse ($)
4. ERNO: Average yearly earnings of other family members ($)
5. NEIN: Average yearly non-earned income ($)
6. ASSET: Average family asset holdings (Bank account, etc.) ($)
7. AGE: Average age of respondent
8. DEP: Average number of dependents
9. RACE: Percent of white respondents
10. SCHOOL: Average highest grade of school completed

The Data:

HRS    WAGE  ERSP  ERNO  NEIN  ASSET   AGE    DEP  RACE  SCHOOL
2157  2.905  1121   291   380   7250  38.5  2.340  32.1    10.5
2174  2.970  1128   301   398   7744  39.3  2.335  31.2    10.5
2062  2.350  1214   326   185   3068  40.1  2.851     .     8.9
2111  2.511  1203    49   117   1632  22.4  1.159  27.5    11.5
2134  2.791  1013   594   730  12710  57.7  1.229  32.5     8.8
2185  3.040  1135   287   382   7706  38.6  2.602  31.4    10.7
2210  3.222  1100   295   474   9338  39.0  2.187  10.1    11.2
2105  2.493  1180   310   255   4730  39.9  2.616  71.1     9.3
2267  2.838  1298   252   431   8317  38.9  2.024   9.7    11.1
2205  2.356   885   264   373   6789  38.8  2.662  25.2     9.5
2121  2.922  1251   328   312   5907  39.8  2.287  51.1    10.3
2109  2.499  1207   347   271   5069  39.7  3.193     .     8.9
2108  2.796  1036   300   259   4614  38.2  2.040     .     9.2
2047  2.453  1213   297   139   1987  40.3  2.545     .     9.1
2174  3.582  1141   414   498  10239  40.0  2.064     .    11.7
2067  2.909  1805   290   239   4439  39.1  2.301     .    10.5
2159  2.511  1075   289   308   5621  39.3  2.486  43.6     9.5
2257  2.516  1093   176   392   7293  37.9  2.042     .    10.1
1985  1.423   553   381   146   1866  40.6  3.833     .     6.6
2184  3.636  1091   291   560  11240  39.1  2.328  13.6    11.6
2084  2.983  1327   331   296   5653  39.8  2.208  58.4    10.2
2051  2.573  1194   279   172   2806  40.0  2.362  77.9     9.1
2127  3.262  1226   314   408   8042  39.5  2.259  39.2    10.8
2102  3.234  1188   414   352   7557  39.8  2.019  29.8    10.7
2098  2.280   973   364   272   4400  40.6  2.661  53.6     8.4
2042  2.304  1085   328   140   1739  41.8  2.444  83.1     8.2
2181  2.912  1072   304   383   7340  39.0  2.337  30.2    10.2
2186  3.015  1122    30   352   7292  37.2  2.046  29.5    10.9
2108  2.786  1757     .   506   9658  43.4      .  32.6    10.2
2188  3.010   990   366   374   7325  38.4  2.847  30.9    10.6
2203  3.273     .     .   430   8221  38.2  2.324  22.1    11.0
2077  1.901   350   209    95   1370  37.4  4.158  61.3     8.2
2196  3.009   947   294   342   6888  37.5  3.047  31.8    10.6
2093  1.899   342   311   120   1425  37.5  4.512  62.8     8.1
2173  2.959  1116   296   387   7625  39.2  2.342  31.0    10.5
2179  2.971  1128   312   397   7779  39.4  2.341  31.2    10.5
2200  2.980  1126   204   393   7885  39.2  2.341  31.0    10.6
2052  2.630     .     .   154   3331  40.5      .  45.8    10.3
2197  3.413  1078   300   512  10450  39.1  2.297  15.5    11.3

Instructions:

Conduct an analysis of the response of labor supply (number of hours) to increasing hourly wages. Do labor hours increase or decrease with wage rates? What other factors affect the number of hours that people work? What variables are correlated with each other? How will this affect your model? Are there potentially important variables missing from the data set?

1. Find the best fitting simple linear regression between HRS (Y) and WAGE (X). You may consider transformations of variables to uncover linear relationships.
2. Find the best multiple regression model that you think describes the relationship between HRS and the other variables in the study.

Report recommendations:

1. I recommend the following outline:
   I. Introduction (Statement of problem, brief description of data)
   II. Methods (Description of statistical methodology)
   III. Results (Competing models, interpretation)
   IV. Conclusion (Final model(s), interpretation, recommendations)
2. Aim for 5-15 double-spaced pages. DO NOT submit SAS printouts. You may include important plots as figures in your report, and you may include portions of the SAS printout as tables.
3. Be BRIEF and PRECISE about your statistical justifications. For instance, say "Y versus log(X1) gives a better linear fit than Y versus X1 based on the scatterplot and R-square value" instead of "Y versus log(X1) looks better than Y versus X1".
4. Your report will be evaluated according to
   (a) readability (30%)
       i. did you clearly state your conclusions?
       ii. did you clearly state your justifications?
       iii. did you clearly describe your statistical methods?
       iv. did you make good use of tables and plots?
   (b) strength of conclusions (30%)
       i. did you arrive at reasonable conclusions? (Some decisions are necessarily judgement calls - in these cases the justification is as important as the decision itself.)
       ii. did you miss any important findings?
   (c) statistical correctness (30%)
       i. did you use appropriate tools and models?
       ii. are you interpreting SAS output correctly?
   (d) overall look (neatness, organization, professionalism) (10%)
5. I plan to spend no more than 15 minutes reading each report, so don't bury your main points.
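
Illustration: the kind of comparison described in recommendation 3 can be sketched outside SAS as well. The following is a minimal sketch in Python, not a required part of the assignment. It fits HRS on WAGE and on log(WAGE) by ordinary least squares and prints the R-square of each fit. The helper name fit_line and the choice of log as the candidate transformation are assumptions made for this example only; the (HRS, WAGE) pairs are copied from the data table above, where neither column has missing values.

```python
import math

# (HRS, WAGE) pairs copied from the data table; one pair per demographic group.
DATA = [
    (2157, 2.905), (2174, 2.970), (2062, 2.350), (2111, 2.511), (2134, 2.791),
    (2185, 3.040), (2210, 3.222), (2105, 2.493), (2267, 2.838), (2205, 2.356),
    (2121, 2.922), (2109, 2.499), (2108, 2.796), (2047, 2.453), (2174, 3.582),
    (2067, 2.909), (2159, 2.511), (2257, 2.516), (1985, 1.423), (2184, 3.636),
    (2084, 2.983), (2051, 2.573), (2127, 3.262), (2102, 3.234), (2098, 2.280),
    (2042, 2.304), (2181, 2.912), (2186, 3.015), (2108, 2.786), (2188, 3.010),
    (2203, 3.273), (2077, 1.901), (2196, 3.009), (2093, 1.899), (2173, 2.959),
    (2179, 2.971), (2200, 2.980), (2052, 2.630), (2197, 3.413),
]


def fit_line(x, y):
    """Least squares fit of y = b0 + b1 * x; returns (b0, b1, r_square).

    fit_line is a hypothetical helper written for this illustration.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx                    # slope
    b0 = mean_y - b1 * mean_x         # intercept
    r_square = sxy ** 2 / (sxx * syy)  # squared correlation of x and y
    return b0, b1, r_square


hrs = [h for h, _ in DATA]
wage = [w for _, w in DATA]
log_wage = [math.log(w) for w in wage]  # natural log, one candidate transformation

fits = {
    "WAGE": fit_line(wage, hrs),
    "log(WAGE)": fit_line(log_wage, hrs),
}
for name, (b0, b1, r_square) in fits.items():
    print(f"HRS = {b0:.1f} + {b1:.1f} * {name}    R-square = {r_square:.3f}")
```

R-square alone does not settle the choice of model. The scatterplot and residual plots for each candidate should be examined as well, and the justification in the report should cite both.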