Problem set grading
There are two parts to your problem set. One is largely R-based with short written answers and the other is more focused on writing. We recommend you use a word processing software like Microsoft Word to check for grammar errors in your written work. Note: there can be issues copying from Word to R Markdown so it may be easier to write in this file first and then copy the text to Word. Then you can make any changes flagged in Word directly in this file.
[Question 1]
In this question, you will explore data about whether couples observed kissing in an airport tilt their heads to the left or the right. The data is adapted from a real study, Gunturkun (2003) (link: https://www.nature.com/articles/421711a).
This is the opening of the article:
“I observed kissing couples in public places (international airports, large railway stations, beaches and parks) in the United States, Germany and Turkey. The head-turning behaviour of each couple was recorded for a single kiss, with only the first being counted in instances of multiple kissing. The following criteria had to be met to qualify: lip contact, face-to-face positioning, no hand-held objects (as these might induce a side preference), and an obvious head-turning direction during kissing. Subjects’ ages ranged from about 13–70 years.
Of 124 kissing pairs, 80 (64.5%) turned their heads to the right and 44 (35.5%) turned to the left.”
Here is a data frame with the data from the kissing study:
# Create a data frame
library(tidyverse)  # provides tibble()
direction <- c(rep("right", 80), rep("left", 124 - 80))
kissdata <- tibble(direction)
(a) Are the observations in kissdata the entire population or a sample from a population?
(b) Simulate 1000 bootstrap samples and calculate the proportion of couples who kiss to the left in each of these bootstrap samples. Produce a histogram of the bootstrap sampling distribution of the proportion of people who kiss to the left. Set the seed as the last three digits of your student number.
set.seed(333) # change to the last THREE digits of your student number
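One way the bootstrap step could be sketched in base R is shown here. It is a minimal illustration rather than the expected solution: the vector `direction` is rebuilt so the block runs on its own, and the names `n_boot` and `boot_p_left` are introduced only for this example.

```r
# Recreate the study data as a plain character vector (same counts as kissdata)
direction <- c(rep("right", 80), rep("left", 124 - 80))

set.seed(333)  # placeholder seed taken from the stub above

n_boot <- 1000  # number of bootstrap samples (name assumed for this sketch)

# Each bootstrap sample redraws all 124 couples with replacement,
# then records the proportion that tilted left
boot_p_left <- replicate(n_boot, {
  resample <- sample(direction, size = length(direction), replace = TRUE)
  mean(resample == "left")
})

# Histogram of the bootstrap sampling distribution
hist(boot_p_left,
     main = "Bootstrap distribution of proportion kissing left",
     xlab = "Proportion tilting left")
```

The same idea can be written with tidyverse verbs on `kissdata`; base R is used here only to keep the sketch free of package dependencies.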
(c) Calculate a 95% confidence interval for the proportion of people who kiss to the left based on the bootstrap sampling distribution you generated in (b).
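A percentile-based interval is one common way to turn a bootstrap distribution into a confidence interval. The sketch below assumes that approach (the problem set does not name a specific method) and regenerates the bootstrap proportions so it is self-contained; `boot_p_left` and `ci_95` are illustrative names.

```r
direction <- c(rep("right", 80), rep("left", 124 - 80))

set.seed(333)  # placeholder seed

# 1000 bootstrap proportions of couples tilting left
boot_p_left <- replicate(
  1000,
  mean(sample(direction, replace = TRUE) == "left")
)

# Percentile method: the middle 95% of the bootstrap distribution
ci_95 <- quantile(boot_p_left, probs = c(0.025, 0.975))
ci_95
```

The two cut points, 2.5% and 97.5%, leave 95% of the bootstrap proportions between them; a different confidence level would use different percentiles.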
(d) Indicate whether or not each of the following statements is a correct interpretation of the confidence interval constructed in part (c) and justify your answers.
(i) We are 95% confident that between 27% and 44% of kissing couples in this sample tilt their head to the left when they kiss.
(ii) There is a 95% chance that between 27% and 44% of all kissing couples in the population tilt their head to the left when they kiss.
(iii) We are 95% confident that between 27% and 44% of all kissing couples in the population tilt their head to the left when they kiss.
(iv) If we considered many random samples of 124 couples, and we calculated 95% confidence intervals for each sample, 95% of these confidence intervals will include the true proportion of kissing couples in the population who tilt their heads to the left when they kiss.
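The repeated-sampling idea in statement (iv) can be explored with a small simulation. The sketch below is illustrative only: it assumes a hypothetical population proportion `true_p = 0.35` (in practice the true value is unknown), draws many samples of 124 couples, builds a percentile bootstrap interval from each, and records how often the interval contains `true_p`.

```r
set.seed(333)  # placeholder seed

true_p    <- 0.35  # hypothetical population proportion (assumption for this sketch)
n         <- 124   # couples per sample
n_samples <- 500   # number of repeated random samples
n_boot    <- 1000  # bootstrap resamples per sample

covers <- replicate(n_samples, {
  # Number of left-tilting couples in one random sample from the population
  k <- rbinom(1, size = n, prob = true_p)
  # Resampling n left/right observations with replacement is equivalent to
  # a binomial draw with success probability k / n
  boot_props <- rbinom(n_boot, size = n, prob = k / n) / n
  ci <- quantile(boot_props, probs = c(0.025, 0.975))
  ci[[1]] <= true_p && true_p <= ci[[2]]
})

# Fraction of intervals that captured the assumed true proportion
mean(covers)
```

The printed fraction varies with the seed and the assumed `true_p`; it describes the long-run behaviour of the procedure, not any single interval.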
(e) If we want to be more confident about capturing the proportion of all couples who tilt their heads to the left when kissing, should we use a wider or a narrower confidence interval? Explain your answer.
(f) We could carry out a hypothesis test to investigate whether or not couples are equally likely to tilt their heads to the right or to the left when they kiss. Our hypotheses would be:
H0: p = 0.5
HA: p ≠ 0.5
where p is the proportion of couples who tilt their heads to the left when they kiss. Using Gunturkun’s data, we would get a P-value of 0.003. Do this hypothesis test and the confidence interval you produced in (c) tell a similar story? Why or why not?
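A simulation-based version of this test can be sketched as follows. This is one possible approach, not necessarily the one used to obtain the reported P-value: it simulates the number of left-tilting couples under H0 and counts how often the simulated count is at least as far from 62 (half of 124) as the observed 44. The names `null_counts` and `p_value` are illustrative.

```r
set.seed(333)  # placeholder seed

n        <- 124
observed <- 44      # couples who tilted left in the study
n_sims   <- 10000   # number of simulated samples under H0

# Under H0 each couple tilts left with probability 0.5
null_counts <- rbinom(n_sims, size = n, prob = 0.5)

# Two-sided P-value: share of simulations at least as extreme as the data,
# measured as distance from the expected count n / 2
p_value <- mean(abs(null_counts - n / 2) >= abs(observed - n / 2))
p_value
```

A simulated P-value depends on the seed and the number of simulations, so it need not match the reported 0.003 exactly.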
[Question 2]
Assume the data set auto_claims_population.csv includes ALL claims paid (in USD) to auto insurance claimants 50 years of age and older in a specific year. In other words, it represents a ‘population’ of car insurance claims in that year.
(a) Produce appropriate data summaries (i.e. a summary table and relevant visualization) of paid claims (PAID) and comment on the shape, centre and spread of this distribution.
(b) (i) Select a random sample of size n=10 from the population in (a) and produce appropriate data summaries (summary table and relevant visualization) of the paid claims in this sample. Set the seed as the last three digits of your student ID number.
(ii) Select a random sample of size n=100 from the population in (a) and produce appropriate data summaries of the paid claims in this sample. Again, set the seed as the last three digits of your student ID number.
(iii) How do the distributions in b(i) and b(ii) compare to the distribution in (a)? Which one of b(i) and b(ii) resembles the distribution in (a) more? Why is this so?
(c) Estimate the sampling distributions of the sample median of paid claims by taking 1000 samples of (i) size n=10 and (ii) size n=100 from the distribution in (a) and produce appropriate data summaries. Set the seed as the last three digits of your student number for each set of simulations.
(iii) How do these two estimated sampling distributions (i.e., the one for the sample medians when the sample size is 10 versus the one for sample medians when the sample size is 100) compare?
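The procedure of repeatedly sampling and recording the median can be sketched as below. Because auto_claims_population.csv is not reproduced here, the block uses a synthetic right-skewed stand-in population generated with `rlnorm()`; that stand-in, its parameters, and the helper `sim_medians()` are assumptions made only for this illustration.

```r
set.seed(333)  # placeholder seed

# Stand-in population: NOT the real auto_claims_population.csv,
# just a right-skewed placeholder with a PAID column
population <- data.frame(PAID = rlnorm(5000, meanlog = 7, sdlog = 1))

# Hypothetical helper: draw `reps` random samples of size n (without
# replacement) from the population and return each sample's median
sim_medians <- function(n, reps = 1000) {
  replicate(reps, median(sample(population$PAID, size = n)))
}

set.seed(333)  # reset the seed before each set of simulations
medians_10 <- sim_medians(10)

set.seed(333)
medians_100 <- sim_medians(100)

# Compare the spread of the two estimated sampling distributions
c(sd_n10 = sd(medians_10), sd_n100 = sd(medians_100))

# Side-by-side histograms of the sample medians
par(mfrow = c(1, 2))
hist(medians_10,  main = "Sample medians, n = 10",  xlab = "Median PAID")
hist(medians_100, main = "Sample medians, n = 100", xlab = "Median PAID")
```

With the real data, `population` would instead be read from the CSV file, for example with `read.csv("auto_claims_population.csv")`.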
(d) Explain how and why the distributions you estimated in part (c) are different from the distributions you estimated in part (b) above.
Part 2
You are once again chatting on the phone to your friend. Your friend enjoyed your previous conversation about data visualization so much that they asked you if you had learned anything new in your STA130 course.
You decided to tell them about the fancy new technique you just learned: bootstrapping! Be sure to include at least 3 vocabulary words from this week and explain them in simple terms for a lay audience.
Other things to consider:
- Try to not spend more than 20 minutes on the prompt.
- Aim for more than 200 but less than 500 words.
- Use full sentences.
- Grammar is not the main focus of this assessment, but it is important that you communicate in a clear and professional manner (i.e., no slang or emojis should appear).
Vocabulary from this week:
- Sampling distribution
- Random sampling
- Percentile (quantile)
- Confidence interval
- Confidence level