Assignment 3

All questions are worth 7 points. The number in the the starting parenthesis indicate the corresponding exercise number in “Applied Linear Statistical Models”.

  1. (2.10) For each of the following questions, explain whether a confidence interval for a mean response or a prediction interval for a new observation is appropriate.

    1. What will be the humidity level in this greenhouse tomorrow when we set the temperature level at \(31^oC\)?
    2. How much do families whose disposable income is $23,500 spend, on the average, for meals away from home?
    3. How many kilowatt-hours of electricity will be consumed next month by commercial and industrial users in the Twin Cities service area, given that the index of business activity for the area remains at its present level?
  2. (2.11) A person asks if there is a difference between the “mean response at \(X = x\)” and the “mean of \(m\) new observations at \(X = x\)”. Reply.

  3. (Continued from Assignment 2) The time it takes to transmit a file always depends on the file size. Suppose you transmitted 30 files, with the average size of 126 Kbytes and the standard deviation of 35 Kbytes. The average transmittance time was 0.04 seconds with the standard deviation of 0.01 seconds. The correlation coefficient between the time and the size was 0.86. In the previous homework, we fit a regression model that predicted the time it will take to transmit a 400 Kbyte file. According to this model, the standard deviation of responses is \(s = s_y\sqrt{\dfrac{n-1}{n-2}(1-r^2)} = 0.0052\).

    1. Construct a 95% confidence interval for the regression slope.
    2. Based on this interval, is the slope significant at the 5% level?
    3. State the null and alternative hypotheses in (b). Calculate the test statistic and the p-value.
    4. When you answered questions (b) and (c), it was correct to conduct a two-sided a two- sided test. However, in this given example, why does it make more sense to consider a one-sided, right-tail alternative?
  4. At a gas station, 180 drivers were asked to record the mileage of their cars and the number of miles per gallon. The results are summarized in the table.

    Sample mean Standard deviation
    Mileage 24,598 14,634
    Miles per gallon 23.8 3.4

    The sample correlation coefficient is \(r = -0.17\). In the previous homework, we fit a regression model that described how the number of miles per gallon depends on the mileage. According to this model, the standard deviation of responses is estimated by \(s = s_y\sqrt{\dfrac{n-1}{n-2}(1-r^2)} = 3.36\).

    1. You purchase a used car with 35,000 miles on it. Predict the number of miles per gallon. Give a 95% prediction interval for your car and a 95% confidence interval for the average number of miles per gallon of all cars with such a mileage.
    2. Do the given data present a significant evidence that cars with higher mileage are less economic? Formulate appropriate null hypothesis and alternative and conduct the test.
  5. Grade point average, data can be found here. Same as in assignment 2.

    1. Obtain a 99% confidence interval for \(\beta_1\). Does it include zero? Why might the director of admissions be interested in whether the confidence interval includes zero?

    2. Test whether or not a linear association exists between student’s ACT score (X) and GPA at the end of the freshman year (Y). Use a level of significance of 0.01. State the alternatives, decision rule, and conclusion.

    3. What is the P-value of your test in part (b)? How does it support the conclusion reached in part (b)?

    4. Obtain a 95 percent confidence interval for the mean freshman GPA for students whose ACT test score is 28. Interpret your confidence interval.

    5. Mary Jones obtained a score of 28 on the ACT. Predict her freshman GPA using a 95 percent prediction interval. Interpret your prediction interval.

    6. On the same graph, plot

      • the data
      • the least squares regression line for ACT scores
      • the 95 percent confidence band for the true regression line for ACT scores between 20 and 30.

      Does the confidence band suggest that the true regression relation has been precisely estimated? Discuss.