Stat 553.413/613 Midterm Exam 2
- THE EXAM CONTAINS 4 QUESTIONS.
- You will have 75 minutes to complete the exam.
- You will need to have your CAMERA ON in Zoom for the whole duration of the exam. You will need to have your MIC OFF for the whole duration of the exam.
- During this Exam, you can review the Seber & Lee, Kutner, and Faraway texts, your course notes, the lecture videos and lecture notes, and your homework and homework solutions. You are permitted to use a calculator and, where appropriate, R. You are not allowed to consult with other people, to share or discuss Exam topics or Exam questions during the examination period, or to use any educational resources other than the ones listed above.
- If you have technical issues when submitting the exam to Gradescope, email me the screen-shots/images of all your work before the deadline. Then submit the exam once you sort out the technical issues.
- If you have questions during the exam, you can send a private message to the TA monitoring you in the Zoom chat. Do not unmute yourself to speak.
- By submitting this Exam, you certify that 1) you understand the rules and agree to abide by them; and 2) you understand that violating these rules constitutes academic dishonesty.
Question 1 (30 points).
THE QUESTION HAS 4 PARTS: (a)-(d)
Consider the linear model
Y = Xβ + ε,
where Y is an n-by-1 vector of response variables, X is an n-by-p design matrix, β is a p-by-1 vector of coefficients, and ε is multivariate normal, ε ~ N_n(0, σ^2 I_n).
(a) (9pts) Write down the distributions, with the corresponding means and variance-covariance matrices, of the following random vectors. You do not need to justify what you wrote.
(i) Y
(d) (6pts) Does your answer to (b) change if only Gauss-Markov conditions are met? If yes, state the modified results, if no, explain why they do not change.
Question 2 (25 points).
THE QUESTION HAS 2 PARTS: (a)-(b)
A beverage company is interested in finding the effect of milkshakes and other drinks on weight gain. The company performs a designed study in which X is the amount of milkshakes (in pints) consumed per week, Z is the amount of soda (in pints) consumed per week, and W is the amount of coffee (in pints) consumed per week. The response variable Y is the weight gain in kilograms after one month. The number of data points is n = 50. The company then runs a linear regression according to the model
Y_i = β_0 + β_1 W_i + β_2 X_i + β_3 Z_i + ε_i,
where the error terms ε_i are independent normal random variables with mean 0 and constant variance σ^2 (σ^2 is unknown).
(a) (10pts) Specify vectors a_j and scalars d_j for simultaneous 95% confidence intervals for E[Y_j] for all choices of the predictor variables W_j = 0.5, X_j ∈ {1, 2, 3} and Z_j ∈ {3, 5}. That is to say, your 95% confidence intervals for E[Y_j] should cover the cases where (W_j, X_j, Z_j) = (0.5, 1, 3), (W_j, X_j, Z_j) = (0.5, 1, 5), . . . , (W_j, X_j, Z_j) = (0.5, 3, 5) simultaneously. You DO NOT need to evaluate d_j fully; just show the formula and specify all the values involved.
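As an illustration of the setup in this part, the following R sketch builds the six vectors a_j = (1, W_j, X_j, Z_j)' and the critical value for one common choice of simultaneous method, Bonferroni. The choice of Bonferroni, and the names grid, A, k and t_bonf, are assumptions made here for illustration; the exam may intend a different method. Under Bonferroni, d_j = t_bonf * s * sqrt(a_j' (X'X)^{-1} a_j), where s and (X'X)^{-1} come from the fitted model and are not given in this text.

```r
# Assumed illustration: Bonferroni-adjusted simultaneous intervals.
# Grid of predictor settings: W fixed at 0.5, X in {1, 2, 3}, Z in {3, 5}.
grid <- expand.grid(W = 0.5, X = c(1, 2, 3), Z = c(3, 5))

# Row j is a_j' = (1, W_j, X_j, Z_j), in the order beta_0, beta_1, beta_2, beta_3.
A <- cbind(Intercept = 1, as.matrix(grid))

k     <- nrow(A)   # number of simultaneous intervals
n     <- 50        # sample size stated in the question
p     <- 4         # beta_0, ..., beta_3
alpha <- 0.05

# Each interval uses level alpha / k, split over two tails.
t_bonf <- qt(1 - alpha / (2 * k), df = n - p)

print(A)
print(t_bonf)
```

The Bonferroni critical value is larger than the unadjusted qt(0.975, 46), which is what makes the six intervals hold jointly at the 95% level.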
(b) (15pts) To test the following hypothesis
What is the distribution of your test statistic (specify the degrees of freedom if appropriate)? Specify explicitly the numerical values for C, γ, SSE, and fill in the dashed boxes with appropriate numerical values. You don’t have to conduct the test.
Question 3 (30 points).
THE QUESTION HAS 5 PARTS: (a)-(e)
The R output is given below. Answer the following questions.
> lmod=lm(y~x1+x2+x3+x4+x5,data=df)
> summary(lmod)

Call:
lm(formula = y ~ x1 + x2 + x3 + x4 + x5, data = df)

Residuals:
     Min       1Q   Median       3Q      Max
-15.2743  -5.2617   0.5032   4.1198  15.3213

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 66.91518   10.70604   6.250 1.91e-07 ***
x1          -0.17211    0.07030  -2.448  0.01873 *
x2          -0.25801    0.25388  -1.016  0.31546
x3          -0.87094    0.18303  -4.758 2.43e-05 ***
x4           0.10412    0.03526   2.953  0.00519 **
x5           1.07705    0.38172   2.822  0.00734 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 7.165 on 41 degrees of freedom
Multiple R-squared:  0.7067,    Adjusted R-squared:  0.671
F-statistic: 19.76 on 5 and 41 DF,  p-value: 5.594e-10
(a) (4 points) How many data points are there (what is n, the sample size)? What is p, number of unknown parameters?
(b) (5 points) What is the result of the overall model fit test, at level α = 0.05? State the null and alternative hypotheses and the result of the test.
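The degrees of freedom printed in the summary() output link parts (a) and (b). The sketch below, which uses only numbers read off that output, recovers n and p and recomputes the p-value of the overall F test. The variable names are chosen here for illustration.

```r
# Values read off the summary() output above.
res_df <- 41      # residual degrees of freedom, n - p
num_df <- 5       # numerator df of the overall F test, p - 1
f_stat <- 19.76   # reported F statistic

p <- num_df + 1   # number of regression parameters, intercept included
n <- res_df + p   # sample size

# Upper-tail probability of F(num_df, res_df) at the observed statistic.
p_value <- pf(f_stat, df1 = num_df, df2 = res_df, lower.tail = FALSE)

print(c(n = n, p = p))
print(p_value)
```

The recomputed p-value should agree with the one R prints, up to the rounding of the F statistic to two decimals.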
(c) (5 points) Based on all of the R output, do you reject the null hypothesis that β2 = 0 at significance level α = 0.05? Should you conclude that the model y = β0 + β2x2 is not an appropriate model for the data? Explain why you should or why you shouldn’t.
(d) (6 points) Compute SSE from the R output.
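One way to approach part (d), sketched under the assumption that the rounded figures in the output are accurate enough: the residual standard error s satisfies s^2 = SSE / (n - p), so SSE = s^2 * (n - p). The same output also gives a consistency check through R-squared and the F statistic. The variable names are chosen here for illustration.

```r
# Values read off the summary() output above.
s      <- 7.165    # residual standard error
res_df <- 41       # residual degrees of freedom, n - p
r2     <- 0.7067   # multiple R-squared

# s^2 = SSE / (n - p), so SSE = s^2 * (n - p).
sse <- s^2 * res_df

# R^2 = 1 - SSE / SST, so SST = SSE / (1 - R^2).
sst <- sse / (1 - r2)

# Consistency check: rebuild the overall F statistic from SSE and SST.
f_check <- ((sst - sse) / 5) / (sse / res_df)

print(sse)      # about 2104.83
print(f_check)  # should be close to the reported 19.76
```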
Question 4 (20 points).
THE QUESTION HAS 3 PARTS: (a)-(c)
(a) (12pts) Write the formulas for internally studentized residuals and externally studentized residuals. Explain each term being used.
(b) (4pts) What do we use internally studentized residuals for? What do we use externally studentized residuals for?
(c) (4pts) Can we compute externally studentized residuals based on the output provided in Question 3? Explain why yes, or why no. You do not need to compute anything.
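For reference, the two kinds of residuals in this question can be computed by hand in R and compared with the built-in functions rstandard() and rstudent(). This is a minimal sketch on R's built-in cars data set, which is not the data behind Question 3 and is used here only as an assumption for illustration. It shows that both quantities need the individual residuals e_i and leverages h_ii, not just summary statistics.

```r
# Illustration on a built-in data set (not the data from Question 3).
fit <- lm(dist ~ speed, data = cars)

e <- resid(fit)           # raw residuals e_i
h <- hatvalues(fit)       # leverages h_ii, diagonal of the hat matrix
n <- nobs(fit)            # sample size
p <- length(coef(fit))    # number of regression parameters
s <- summary(fit)$sigma   # residual standard error, sqrt(SSE / (n - p))

# Internally studentized: e_i scaled by s from the full fit.
r_int <- e / (s * sqrt(1 - h))

# Externally studentized: s is replaced by s_(i), the estimate with
# observation i deleted; s_(i)^2 = ((n - p) s^2 - e_i^2 / (1 - h_ii)) / (n - p - 1).
s_del <- sqrt(((n - p) * s^2 - e^2 / (1 - h)) / (n - p - 1))
r_ext <- e / (s_del * sqrt(1 - h))

# Compare with R's built-in versions.
print(all.equal(r_int, rstandard(fit), check.attributes = FALSE))
print(all.equal(r_ext, rstudent(fit), check.attributes = FALSE))
```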