STAT0029 STATISTICAL DESIGN OF INVESTIGATIONS — ASSESSMENT 2 (2018/19 SESSION)
- Your solutions are to be handed in by yourself to the Statistical Science Teaching & Learning Office (room 138) by 4pm on TUESDAY, 8th JANUARY 2019.
- Your work should be submitted as a single securely stapled document.
- Before you hand in your work, complete and sign the bottom of this page and attach the page firmly to your work.
- When you submit your work, please make sure that the Teaching & Learning Office records on the list of students that you have handed in your work.
- Late submission will incur a penalty unless there are extenuating circumstances (e.g. medical) supported by appropriate documentation. Penalties are set out in the latest editions of the Statistical Science Department Student Handbooks, available from the departmental web pages.
- Failure to submit this in-course assessment will mean that your overall examination mark is recorded as “non-complete”, i.e. you will not obtain a pass for the course.
- For Part B in this assessment, a maximum word count is specified; submitted work that exceeds this word count will be penalised. Penalties are set out in the latest editions of the Statistical Science Department Student Handbooks.
- To ensure anonymous marking, you must provide only your student ID number on Parts A and B.
- The work for this in-course assessment should be entirely your own. Any plagiarism or collusion will normally result in zero marks for all students involved, and may also mean that your overall examination mark is recorded as non-complete. Guidelines as to what constitutes plagiarism and collusion may be found in the Departmental Student Handbooks. The Turn-It-In plagiarism detection system may be used to scan your submission for evidence of plagiarism and collusion.
- Your grade will be provisional until confirmed by the Statistics Examiners’ Meeting in June 2019.
- During office hours (or at a time arranged by e-mail appointment) you can study the comments written on your work. You should return the marked work to the
Declaration: I am aware of the UCL Statistical Science Department’s regulations on plagiarism for assessed coursework. I have read the guidelines in the student handbook and understand what constitutes plagiarism.
I hereby affirm that the work I am submitting for this in-course assessment is entirely my own.
Write your name in block letters:
Your student ID number:
STAT0029 In-course Assessment 2
This assessment consists of two parts. For Part A you can submit hand-written solutions. For Part B you are required to write a report: use a text-editor and hand in a printed text.
Parts A and B are both marked on a 0–100 scale, but Part A counts for 40% and Part B for 60% towards the final mark for this assessment. Marks for the constituent parts are listed in bold face. For Part A, marks are given for correct answers, but also for clarity of explanation. This assessment as a whole counts for 50% towards the final course mark.
On the cover sheet, write your name and student ID number. To allow anonymous marking, provide your student ID number at the top of each of the sheets that you use for Parts A and B (and not your name).
Part A
1. For a Latin square design, the fixed-effects model is
$Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + \varepsilon_{ijk}$, (1)
for i, j, k = 1, . . . , p. Assume $\varepsilon_{ijk} \sim N(0, \sigma^2)$ independently for all triplets (i, j, k).
(i) Write down the sum-to-zero constraints for the effect parameters. How many parameters in total are specified by the model? Give your answer as a function of p. How many of those are estimated independently given data for this Latin square design?
Data are available for a design with p = 5. Assume that the data are from an experiment regarding the effect of compost on the height of seedlings 30 days after sowing. The five compost treatments A, B, C, D, E are denoted by i = 1, 2, 3, 4, 5, respectively. The seedlings are sown in a large square tray which is divided into 25 little squares of equal size. Compost treatment i for row j and column k is determined by the Latin square design given by
In each little square, 10 seedlings are sown, and the mean height after 30 days is the response for the experiment.
The data are uploaded to Moodle in R format; see QuestionA1 Data.txt. In the data, Y is the response variable, col and row are the columns and rows in the above design, and compost is the treatment variable.
For the following questions, you can use software but do not hand in the code. Handed-in code will be ignored in the marking.
(ii) Fit model (1) and print out two graphs for model diagnostics: (a) residuals against fitted values, and (b) a quantile-quantile plot for the residuals. Hand in these printouts. Briefly discuss the graphs with reference to model assumptions.
(iii) Report the ANOVA table. Assuming a significance level of 5%, is there a significant effect of the treatment? Briefly explain in words what this means for the current experiment.
(iv) Derive the mathematical expression for a 95% confidence interval for the contrast $\alpha_5 - \alpha_4$. Estimate this confidence interval using the data, and explain how you obtained the relevant quantities for this estimation. Interpret the estimated interval for the current experiment taking into account your answer to (iii).
For the following question, a mathematical derivation is expected. Do not hand in numerical output.
(v) Consider the contrast of treatment means $\bar{Y}_{1\cdot\cdot} - \frac{1}{4}(\bar{Y}_{2\cdot\cdot} + \bar{Y}_{3\cdot\cdot} + \bar{Y}_{4\cdot\cdot} + \bar{Y}_{5\cdot\cdot})$. Derive the variance of this contrast as a function of $\sigma^2$. Provide the details of the derivation. Hint: Let $S_i$ be the set with pairs (j, k) for which treatment i is
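As an illustration of the sums-of-squares decomposition behind the ANOVA table for model (1), the following is a minimal Python sketch. The function name `latin_square_anova`, the cyclic design and the noise-free responses are all hypothetical and chosen only for illustration; they are not the design or the data of this question.

```python
def latin_square_anova(y, treatment):
    """Sums of squares and degrees of freedom for a p x p Latin square.

    y[j][k] is the response in row j, column k; treatment[j][k] is the
    treatment index (0..p-1) applied in that cell.
    """
    p = len(y)
    # Latin property: every treatment occurs once in each row and each column.
    for j in range(p):
        assert sorted(treatment[j]) == list(range(p))
        assert sorted(treatment[r][j] for r in range(p)) == list(range(p))

    grand = sum(sum(row) for row in y) / p**2
    row_means = [sum(y[j]) / p for j in range(p)]
    col_means = [sum(y[j][k] for j in range(p)) / p for k in range(p)]
    trt_totals = [0.0] * p
    for j in range(p):
        for k in range(p):
            trt_totals[treatment[j][k]] += y[j][k]
    trt_means = [t / p for t in trt_totals]

    def between_ss(means):
        # Each mean is based on p cells.
        return p * sum((m - grand) ** 2 for m in means)

    ss_total = sum((y[j][k] - grand) ** 2 for j in range(p) for k in range(p))
    sums = {
        "treatment": between_ss(trt_means),
        "row": between_ss(row_means),
        "column": between_ss(col_means),
    }
    sums["error"] = ss_total - sum(sums.values())
    dfs = {"treatment": p - 1, "row": p - 1, "column": p - 1,
           "error": (p - 1) * (p - 2)}
    return sums, dfs


# Hypothetical example: cyclic 5 x 5 design with purely additive effects.
p = 5
design = [[(j + k) % p for k in range(p)] for j in range(p)]
effects = [0.0, 1.0, 2.0, 3.0, 4.0]  # made-up treatment effects
y = [[10.0 + effects[design[j][k]] + 0.5 * j - 0.25 * k for k in range(p)]
     for j in range(p)]
sums, dfs = latin_square_anova(y, design)
```

Because the hypothetical responses contain no noise, the error sum of squares is zero up to rounding. With real data, the F statistic for the treatment would be the ratio of the treatment and error mean squares, each sum of squares divided by its degrees of freedom.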
2. For this question, you are not allowed to use software output to justify your answers.
Data for a one-way ANOVA with random effects are given by
(i) Write down the model equation, stating clearly any assumptions that are needed to complete the model specification. Estimate by hand calculations the residual variance and the variance of the random effect.
(ii) Write down the expression for the statistic for testing whether the variance of the random effect is significantly different from zero. Give the value of this statistic and its reference distribution. Do not report a p-value.
(iii) Assume that the treatment levels are a significant source of variation and that the effects of the levels are on a scale from weak to strong. Why does a strict interpretation of the random-effects model imply that there is always a positive probability (> 0) that there is a treatment level with an effect that is stronger than the levels A, B, C, and D?
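The hand calculations for a balanced one-way random-effects ANOVA can be checked against a short sketch like the one below. It assumes the usual model with a normally distributed random effect and independent normal errors, and uses method-of-moments estimates. The function name `random_effects_anova` and the numbers are hypothetical; they are not the data of this question.

```python
def random_effects_anova(groups):
    """Method-of-moments estimates for a balanced one-way random-effects model.

    groups is a list of equally sized lists, one per treatment level.
    """
    a = len(groups)        # number of levels
    n = len(groups[0])     # observations per level
    assert all(len(g) == n for g in groups), "balanced design assumed"

    means = [sum(g) / n for g in groups]
    grand = sum(means) / a
    ms_between = n * sum((m - grand) ** 2 for m in means) / (a - 1)
    ms_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g) / (a * (n - 1))
    return {
        "sigma2_error": ms_within,
        # May come out negative; it is then commonly reported as zero.
        "sigma2_effect": (ms_between - ms_within) / n,
        "F": ms_between / ms_within,
        "df": (a - 1, a * (n - 1)),
    }


# Hypothetical data for four levels with three observations each.
example = [[1, 2, 3], [2, 3, 4], [4, 5, 6], [5, 6, 7]]
result = random_effects_anova(example)
```

The estimate of the random-effect variance follows from equating the between-level mean square to its expectation, the error variance plus n times the random-effect variance.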
3. A student in a Faculty of Education wants to compare three teaching methods for one-to-one teaching. She decides to investigate the methods by teaching a topic from history to pupils aged 10-12. The faculty has a large representative pool of schools that are willing to participate in such a project. The student randomly selects five schools from this pool. After contacting the schools, she randomly selects three pupils aged 10-12 within each of the five schools. She will teach the topic herself to each pupil in a one-hour session. At each school, there will be three sessions, each with a different teaching method and a different pupil.
At the end of a session, the student will measure what the pupil has learned with a test. All the pupils in the investigation get the same test.
(i) Explain the additional randomisation that is needed in this investigation.
(ii) Why is it not possible in this investigation to measure an interaction effect between teaching method and school?
(iii) Assume that the normal distribution is appropriate for the error term. Choose the appropriate two-way ANOVA for this investigation. Write down the model equation, stating clearly any assumptions that are needed to complete the model specification. Explain how the model parameters should be
4. (i) Consider Theorem 3 in Chapter 4 of the lecture notes.
Using the notation in Section 3.6, provide the final details of the proof of Theorem 3. That is, derive that $E(\bar{Y}_{CL}) = \mu$, and that
is indeed an unbiased estimator of $Var(\bar{Y}_{CL})$. Provide the details of the derivation. You do not have to explain the equations in the proof of Theorem 3 in the lecture notes, but be clear which of the equations you use in your derivation.
(ii) There are ten schools in a particular area. As part of an investigation into teaching standards, an inspection team proposes to visit three of the schools and administer a test to all of the 14-year-old students in each school visited. The school sizes (in hundreds of pupils) are as follows:
School 1 2 3 4 5 6 7 8 9 10
Size 22 18 17 21 11 23 16 22 26 24
(a) Three pseudo-random numbers, distributed uniformly on (0, 1), have been obtained using R. They are 0.821, 0.228 and 0.307. Use these to select a PPS sample of three schools, explaining your procedure.
(b) Suppose that schools 4, 7 and 2 were selected (note that these are not necessarily the schools that would be chosen using the random numbers provided above), and that the average test results (out of 20) for these three schools were 14.5, 16.7 and 13.6 respectively. Use these data to estimate the average test result across all ten schools. Provide an estimated standard error for your estimate.
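One common way to draw a PPS sample is the cumulative-size method, sketched below in Python. The sketch assumes sampling with replacement, so a school can in principle be drawn twice, and it assumes that school size is proportional to the number of students tested. The procedure and the estimator in the lecture notes may differ in detail, and the function names `pps_select` and `pps_mean_and_se` are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate
from math import sqrt

# School sizes (in hundreds of pupils) for schools 1..10, from the table above.
sizes = [22, 18, 17, 21, 11, 23, 16, 22, 26, 24]


def pps_select(sizes, uniforms):
    """Cumulative-size PPS selection with replacement.

    Each uniform number u is scaled to the total size; the school whose
    cumulative-size interval contains u * total is selected.
    Returns 1-based school labels.
    """
    cumulative = list(accumulate(sizes))
    total = cumulative[-1]
    return [bisect_left(cumulative, u * total) + 1 for u in uniforms]


def pps_mean_and_se(cluster_means):
    """Mean of the cluster means and its estimated standard error.

    Assumes PPS sampling with replacement and complete enumeration
    within each selected cluster.
    """
    n = len(cluster_means)
    estimate = sum(cluster_means) / n
    variance = sum((m - estimate) ** 2 for m in cluster_means) / (n * (n - 1))
    return estimate, sqrt(variance)


selected = pps_select(sizes, [0.821, 0.228, 0.307])    # -> [9, 3, 4]
estimate, se = pps_mean_and_se([14.5, 16.7, 13.6])
```

Under these assumptions the selection probabilities cancel the school sizes, so the estimate of the overall average is the unweighted mean of the sampled school averages.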
Part B
For this part you are required to write a brief report evaluating an observational study which is published in a journal. You can download the journal article that is allocated to you from the Moodle page for this ICA (subject heading “Published study for Part B”).
In evaluating the study, you should aim for a report length between 800 and 1000 words (one typed page is typically around 500 words).
Reports longer than 1000 words will be penalised. Your report should:
- Give, in full, the reference for the article
- State clearly what was the aim of the study
- Explain why the investigation was conducted as an observational study rather than an experiment
- Describe the study design, paying particular attention to the issues discussed in Section 3.4 of the lecture notes for this course
- Briefly (i.e. without going into technical details) describe the method of analysis used, and how this relates both to the objectives and to the design of the study
- Summarise succinctly the conclusions of the study
- Highlight any features of the study that you think are particularly good, as well as areas where you think the study could have been done better
- Include a word count
The aim of this question is to see whether you are able to assess critically the quality of an observational study, with emphasis on the study design. You need to engage with the aim of the study and to make sensible suggestions based on your own knowledge and understanding of the subject matter. If you only summarise the content of your article, then you will fail this part of the assessment.
Some of the papers use sophisticated analysis methods that you have not seen before; you do not have to understand all the details of these methods to evaluate the design of the study. You do, however, need to understand broadly the underlying intuition. Therefore, if you feel that either the subject matter or the analysis method prevents you from commenting usefully on your allocated paper, you may ask for an alternative research paper.
Marking criteria: adherence to the above bullet points, and the quality of the presentation (readability, structure, language).
Submission check: make sure that your name is only on the cover sheet (and your student ID is on the other sheets), that you answered all the (sub)questions, and that you hand in the signed cover sheet, Part A (with the graphs), and Part B.