The number in the starting parentheses indicates the corresponding exercise number in “Applied Linear Statistical Models”.
(7 points) (3.12) A student does not understand why the sum of squares SSPE is called a pure error sum of squares “since the formula looks like the one for an ordinary sum of squares”. Explain.
(8 points) (Computer project, 3.3) Refer to the GPA data from the previous homework assignments.
(8 points) (Computer project) The crime rate data set is available here. A criminologist studies the relationship between level of education and crime rate in medium-sized U.S. counties. She collected data from a random sample of 84 counties; X is the percentage of individuals in the county having at least a high-school diploma, and Y is the crime rate (crimes reported per 100,000 residents) last year. A linear regression of Y on X is then fit to these data. Test:
(12 points) For the “toy” example, consider a small data set
X | 0 | 0 | 1 | 2 |
---|---|---|---|---|
Y | 0 | 2 | 2 | 3 |
Try to do as much as you can by hand, without the use of a computer. The numbers are quite simple!
\[SSTot = \sum_i (Y_i - \bar Y)^2\]
\[SSReg = \sum_i (\hat Y_i - \bar Y)^2\]
\[SSErr = SSTot - SSReg = \sum_i (Y_i - \hat Y_i)^2\]
\[SS_{PE} = \sum_j \sum_i (Y_{ij} - \bar Y_j)^2\]
\[SS_{LOF} = SSErr - SS_{PE} = \sum_j \sum_i (\bar Y_j - \hat Y_{ij})^2\]
Then conduct the lack-of-fit test. Explain the result.
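A hand computation can be checked with a short script. The following is a minimal sketch in plain Python, assuming a simple linear regression fit by least squares and grouping observations by identical X values; the function name `lack_of_fit` and the keys of the returned dictionary are illustrative choices, not part of the assignment.

```python
from collections import defaultdict


def lack_of_fit(x, y):
    """Sums of squares and lack-of-fit F statistic for simple linear regression.

    Assumes at least one X level has replicates and at least three
    distinct X levels, so both degrees of freedom are positive.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Least-squares slope and intercept.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]

    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    ss_err = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_reg = ss_tot - ss_err

    # Pure error: deviations of each Y from the mean of its own X group.
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    ss_pe = 0.0
    for ys in groups.values():
        group_mean = sum(ys) / len(ys)
        ss_pe += sum((yi - group_mean) ** 2 for yi in ys)
    ss_lof = ss_err - ss_pe

    c = len(groups)      # number of distinct X levels
    df_lof = c - 2
    df_pe = n - c
    if df_lof <= 0 or df_pe <= 0 or ss_pe == 0:
        raise ValueError("lack-of-fit test needs replicates and >= 3 X levels")

    f_stat = (ss_lof / df_lof) / (ss_pe / df_pe)
    return {
        "b0": b0, "b1": b1,
        "ss_tot": ss_tot, "ss_reg": ss_reg, "ss_err": ss_err,
        "ss_pe": ss_pe, "ss_lof": ss_lof,
        "df_lof": df_lof, "df_pe": df_pe, "f_stat": f_stat,
    }


# Toy data set from the table above.
result = lack_of_fit([0, 0, 1, 2], [0, 2, 2, 3])
for name, value in result.items():
    print(name, value)
```

The statistic `f_stat` is compared with the upper critical value of the F distribution with `df_lof` and `df_pe` degrees of freedom; the sketch leaves the table lookup (or a p-value from a statistics library) to the reader.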