Stat 151: Introduction to Applied Statistics
Assignment 1 Total Marks: 100
Please read the following instructions carefully:
- Assignment 1 is composed of three parts: Part 1: Lecture Problems, Part 2: Lab Problems, and Part 3: Practice Problems (not for grades).
- For Part 1, please complete all of the assigned exercises by For Part 2, use MINITAB commands to obtain output, and either hand write or type up answers using procedures taught in lab and lecture.
- Please provide complete solutions to both Part 1 and Part 2; be sure that calculations and work are shown, and units are included where
- Please organize and clearly label your solutions to each problem and staple ALL of your solutions together and hand them in altogether as
Part 1: Lecture Problems
From the Course Textbook: 1.8, 1.18, 1.36, 2.10, 2.52, 2.104, 3.8, 3.16, 3.28, 3.30, 3.72, 3.174, 3.180
1.8 Professional Athlete Salaries. From the Statistical Abstract of the United States and the article “Average Salaries in the NBA, NFL, MLB and NHL” by J. Dorish, published on the Yahoo! Contributor Network, we obtained the following data on average professional athletes’ salaries for the years 2005 and 2011. Is this study descriptive or inferential? Explain your answer. (2 Marks)
Sport | Average salary, 2005 ($millions) | Average salary, 2011 ($millions) |
Baseball (MLB) | 2.48 | 3.31 |
Basketball (NBA) | 4.04 | 5.15 |
Football (NFL) | 1.40 | 1.90 |
1.18 Do Left-Handers Die Earlier? According to a study published in the Journal of the American Public Health Association, left-handed people do not die at an earlier age than right-handed people, contrary to the conclusion of a highly publicized report done 2 years earlier. The investigation involved a 6-year study of 3800 people in East Boston older than age 65. Researchers at Harvard University and the National Institute of Aging found that the “lefties” and “righties” died at exactly the same rate. “There was no difference, period,” said Dr. J. Guralnik, an epidemiologist at the institute and one of the coauthors of the report. Is this investigation an observational study or a designed experiment? Justify your answer. (3 Marks)
1.36 The members of a population are numbered 1–5. (6 Marks: 2 Marks For Each Part)
- List the 10 possible samples (without replacement) of size 3 from this population.
- If an SRS of size 3 is taken from the population, what are the chances of selecting 1, 3, and 5? Explain your answer.
- Use Table I in Appendix A to obtain an SRS of size 3 from the population. Start at the single-digit number in line number 5 and column number 20, read down the column, up the next, and so on.
TABLE I
Random numbers
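The listing step in the SRS problem above can be checked mechanically. The following is a minimal Python sketch (not MINITAB), assuming the population is numbered 1 to 5, which is the size that yields exactly 10 samples of size 3. The variable names are illustrative only.

```python
from itertools import combinations

# Assumption: the population members are labelled 1..5.
population = [1, 2, 3, 4, 5]

# Every subset of size 3, drawn without replacement and ignoring order.
samples = list(combinations(population, 3))

# Under simple random sampling, each possible sample is equally likely.
chance_of_135 = 1 / len(samples)

for sample in samples:
    print(sample)
print("number of samples:", len(samples))
print("chance of selecting 1, 3, and 5:", chance_of_135)
```

The count of samples equals the binomial coefficient "5 choose 3", so the chance of any one specific sample is one over that count.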
2.10 World University Rankings. The Times Higher Education “World University Rankings” presents global university performance tables to judge world class universities on the basis of teaching, research, knowledge transfer, and international outlook. The rankings employ 13 carefully calibrated performance indicators to provide comprehensive and balanced comparisons. The top 10 universities for 2012–2013 are as shown in the following table. Classify the data in each column as either qualitative or quantitative; if quantitative, further classify it as discrete or continuous. (3 Marks: 1 Mark For Each Column)
Rank | Institution | Overall score |
1 | California Institute of Technology | 95.5 |
2 | Stanford University | 93.7 |
3 | University of Oxford | 93.7 |
4 | Harvard University | 93.6 |
5 | Massachusetts Institute of Technology | 93.1 |
6 | Princeton University | 92.7 |
7 | University of Cambridge | 92.6 |
8 | Imperial College London | 90.6 |
9 | University of California, Berkeley | 90.5 |
10 | University of Chicago | 90.4 |
2.52 Suppose that you have a data set that contains a large number of observations. Which graphical display is generally preferable: a histogram or a stem-and-leaf diagram? Explain your answer. (2 Marks)
2.104 Hospital Beds. The number of hospital beds available in community hospitals is decreasing. Through advancement in care and technology, hospitals are getting more efficient. However, the aging and increasingly obese and diabetic population intensifies the need. The American Hospital Association Annual Survey, published by the American Hospital Association (AHA), gave the number of hospital beds available per 1000 people for each of the 50 states plus the District of Columbia for 2010. Following is a stem-and-leaf diagram of the data. The stems represent units and the leaves tenths (e.g., the first entry is 1.7). (4 Marks: 2 Marks For Each Part)
1 | 7 7 8 9 9 9
2 | 0 0 1 1 1 2 2 2 3 3 4 4 4 4 4 4 4
2 | 6 6 6 7 7 7 8 9
3 | 0 0 1 1 1 2 2 3 3 3 4
3 | 5 5 7 9
4 | 0 4
4 |
5 | 0 0
5 | 7
How many states (including the District of Columbia) had
- at least three but less than four hospital beds per 1000 people available?
- at least four and a half hospital beds per 1000 people available?
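Reading counts off a stem-and-leaf diagram amounts to rebuilding each observation from its stem and leaf and then filtering. A minimal Python sketch, assuming the rows are transcribed from the diagram above (the `rows` list and variable names are illustrative, not part of the assignment):

```python
# Each row is (stem, leaves); stems are units and leaves are tenths.
rows = [
    (1, "778999"),
    (2, "00111222334444444"),
    (2, "66677789"),
    (3, "00111223334"),
    (3, "5579"),
    (4, "04"),
    (4, ""),
    (5, "00"),
    (5, "7"),
]

# Store values in tenths (e.g. 1.7 -> 17) to avoid floating-point comparisons.
beds_tenths = [stem * 10 + int(leaf) for stem, leaves in rows for leaf in leaves]

three_to_four = sum(30 <= b < 40 for b in beds_tenths)   # at least 3.0, less than 4.0
at_least_4_5 = sum(b >= 45 for b in beds_tenths)         # at least 4.5

print(len(beds_tenths), three_to_four, at_least_4_5)
```

A quick sanity check is that the number of leaves equals 51, the 50 states plus the District of Columbia.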
3.8 Net Worth. The Board of Governors of the Federal Reserve System publishes information on family net worth in the Survey of Consumer Finances. In 2010, the mean net worth of families in the United States was $498.8 thousand and the median net worth was $77.3 thousand. Which measure of center do you think is more appropriate? Explain your answer. (2 Marks)
3.16 For a particular population, is the population mean a variable? (2 Marks)
3.28 Monthly Motorcycle Casualties. The Scottish Executive, Analytical Services Division Transport Statistics, compiles data on motorcycle casualties. During one year, monthly casualties from motorcycle accidents in Scotland for built-up roads and non–built-up roads were as follows. (8 Marks: 3 Marks For Each Statistic in a. and b., 2 Marks for c.)
Month | Built-up | Non–built-up |
January | 25 | 16 |
February | 38 | 9 |
March | 38 | 26 |
April | 56 | 48 |
May | 61 | 73 |
June | 52 | 72 |
July | 50 | 91 |
August | 90 | 69 |
September | 67 | 71 |
October | 51 | 28 |
November | 64 | 19 |
December | 40 | 12 |
- Find the mean, median, and mode of the number of motorcycle casualties for built-up roads.
- Find the mean, median, and mode of the number of motorcycle casualties for non–built-up roads.
- If you had a list of only the month of each casualty, what month would be the modal month for each type of road?
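The three measures of center requested above can be sketched in Python as a check on hand calculations. This is a minimal sketch with two labelled assumptions: a data set in which no value repeats is treated as having no mode, and the modal month is taken to be the month with the largest casualty count. The helper names `modes` and `modal_month` are hypothetical.

```python
from collections import Counter
from statistics import mean, median

months = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]
built_up = [25, 38, 38, 56, 61, 52, 50, 90, 67, 51, 64, 40]
non_built_up = [16, 9, 26, 48, 73, 72, 91, 69, 71, 28, 19, 12]

def modes(values):
    """Return the most frequent value(s); [] if no value occurs more than once."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:  # assumption: no repeated value means no mode
        return []
    return sorted(v for v, c in counts.items() if c == top)

def modal_month(casualties):
    """With one record per casualty, the modal month has the largest count."""
    return months[casualties.index(max(casualties))]

for name, data in [("built-up", built_up), ("non-built-up", non_built_up)]:
    print(name, round(mean(data), 1), median(data), modes(data), modal_month(data))
```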
3.30 Popular destinations for the newlyweds of today are the Caribbean and Hawaii. According to a recent American Wedding Study by the Conde Nast Bridal Group, a honeymoon, on average, lasts 9.4 days and costs $5111. A sample of 12 newlyweds reported the following lengths of stay of their honeymoons. (4 Marks: 1 Mark for a., 1 Mark for b., 2 Marks for c.)
5 14 7 10 6 8
12 9 10 9 7 11
- Find n.
- Compute Σxi.
- Determine the sample mean. Round your answer to one more decimal place than that used for the observations.
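The three quantities in this problem are linked by the definition of the sample mean, x̄ = Σxi / n. A minimal Python sketch of that arithmetic, using the twelve lengths of stay listed in the problem (variable names are illustrative):

```python
# Lengths of stay (days) for the 12 sampled honeymoons.
stays = [5, 14, 7, 10, 6, 8, 12, 9, 10, 9, 7, 11]

n = len(stays)        # number of observations
total = sum(stays)    # sum of the observations, i.e. Σxi

# Observations are whole numbers, so report the mean to one decimal place.
sample_mean = round(total / n, 1)

print(n, total, sample_mean)
```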
3.72 An article by D. Schaefer et al. (Journal of Tropical Ecology, Vol. 16, pp. 189–207) reported on a long-term study of the effects of hurricanes on tropical streams of the Luquillo Experimental Forest in Puerto Rico. The study showed that Hurricane Hugo had a significant impact on stream water chemistry. The following table shows a sample of 10 ammonia fluxes in the first year after Hugo. Data are in kilograms per hectare per year. (8 marks: 2 marks for part a. and 6 marks for part b.)
96 66 147 147 175 116 57 154 88 154
- Determine the range of the data.
- Determine the sample standard deviation of the data.
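For the ammonia-flux data, the range is the maximum minus the minimum, and the sample standard deviation uses the defining formula s = sqrt(Σ(xi − x̄)² / (n − 1)). The following minimal Python sketch spells out each step so it can be compared with hand work; the variable names are illustrative.

```python
from math import sqrt

# Ammonia fluxes (kg per hectare per year) from the sample of 10.
fluxes = [96, 66, 147, 147, 175, 116, 57, 154, 88, 154]

n = len(fluxes)
data_range = max(fluxes) - min(fluxes)

x_bar = sum(fluxes) / n                             # sample mean
sum_sq_dev = sum((x - x_bar) ** 2 for x in fluxes)  # Σ(xi - x̄)²
s = sqrt(sum_sq_dev / (n - 1))                      # divide by n - 1 for a sample

print(data_range, x_bar, sum_sq_dev, round(s, 1))
```

Dividing by n − 1 rather than n is what distinguishes the sample standard deviation s from the population standard deviation σ.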
3.174 Medieval Cremation Burials. In the article “Material Culture as Memory: Combs and Cremations in Early Medieval Britain” (Early Medieval Europe 12, Issue 2, pp. 89–128), H. Williams discussed the frequency of cremation burials found in 17 archaeological sites in eastern England. Here are the data. (15 Marks: 4 Marks for a., 1 Mark for b., 3 Marks for c., 3 Marks for d., 4 Marks for e.)
83 64 46 48 523 35 34 265 2484
46 385 21 86 429 51 258 119
- Obtain and interpret the quartiles.
- Determine and interpret the interquartile range.
- Find and interpret the five-number summary.
- Identify potential outliers, if any.
- Construct and interpret a boxplot.
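The chain from quartiles to potential outliers can be sketched as follows. This is a hedged Python sketch, not the required method: it assumes quartiles are defined as the medians of the lower and upper halves of the ordered data, with the overall median included in both halves when n is odd. Other conventions (including the one MINITAB uses) can give different quartiles, so follow the definition taught in lecture. The function name `five_number_summary` is hypothetical.

```python
from statistics import median

burials = [83, 64, 46, 48, 523, 35, 34, 265, 2484,
           46, 385, 21, 86, 429, 51, 258, 119]

def five_number_summary(values):
    """Min, Q1, median, Q3, max using the median-of-halves convention (assumed)."""
    data = sorted(values)
    n = len(data)
    half = (n + 1) // 2          # includes the middle value when n is odd
    q1 = median(data[:half])     # median of the lower half
    q3 = median(data[n - half:]) # median of the upper half
    return data[0], q1, median(data), q3, data[-1]

minimum, q1, med, q3, maximum = five_number_summary(burials)
iqr = q3 - q1

# Observations beyond 1.5 IQR from the quartiles are flagged as potential outliers.
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr
outliers = [x for x in burials if x < lower_limit or x > upper_limit]

print((minimum, q1, med, q3, maximum), iqr, (lower_limit, upper_limit), outliers)
```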
3.180 Cuckoo Care. Many species of cuckoos are brood parasites. The females lay their eggs in the nests of smaller bird species, who then raise the young cuckoos at the expense of their own young. Data on the lengths, in millimeters (mm), of cuckoo eggs found in the nests of three bird species–the Tree Pipit, Hedge Sparrow, and Pied Wagtail–were collected by the late O. M. Latter in 1902 and used by L. H. C. Tippett in his text The Methods of Statistics (New York: Wiley, 1952, p. 176). Use the following boxplots to compare the lengths of cuckoo eggs found in the nests of the three bird species, paying special attention to center and variation. (3 Marks)
Boxplots of cuckoo egg length (mm) by species: Pipit, Sparrow, Wagtail
Part 2: Lab Problems
Complete the following: LAB 2.1, 3.1, 3.2
If your version of MINITAB will not allow you to edit the titles and labels of your graphs, you may edit these by hand after pasting your MINITAB output to your assignment.
LAB 2.1 The file STATISTICSSTUDENTSSURVEY contains raw data about how students planned to vote in the last Alberta Election (ALBVOTE) and how they plan to vote in the upcoming Federal Election (FEDVOTE). Load this dataset into MINITAB. (18 Marks: 3 Marks for each of a.-d., 1 Mark each for each of e.-j.)
- Use MINITAB to create an appropriate visual representation to show the counts of the parties for student preferred platforms in the Alberta Election. Title and label appropriately.
- Use MINITAB to create an appropriate visual representation to show the counts of the parties for student voting choices in the Alberta Election. Title and label appropriately.
- Use MINITAB to create an appropriate visual representation to show the counts of the parties for student preferred platforms in the Federal Election. Title and label appropriately.
- Use MINITAB to create an appropriate visual representation to show the counts of the parties for student voting choices in the Federal Election. Title and label appropriately.
- Which party do students see as having the best platform in Alberta?
- Which party do students see as having the best platform federally?
- Give the mode for the Alberta voting plans.
- Give the mode for the Federal voting plans.
- A vote is viewed as “strategic” when people vote for a party that they think can get elected rather than the party that they think has the best platform. Which party in Alberta appears to benefit most (gathers more votes) from strategic voting?
- A vote is viewed as “strategic” when people vote for a party that they think can get elected rather than the party that they think has the best platform. Which federal party appears to benefit most (gathers more votes) from strategic voting?
LAB 3.1 Eleven individuals were randomly selected for a study observing workout routines. The dataset EXERCISEMINUTES contains the amount of time (in minutes) each of these randomly selected individuals spent working out on the day of observation. (5 Marks: 2 Marks for a., 2 Marks for b., 1 Mark for c.)
- Use MINITAB to obtain the minimum, Q1, median, Q3, maximum, mean, standard deviation and range for your data.
- Use MINITAB to create a boxplot for this data. Title and label it appropriately. Indicate any values that MINITAB identifies as potential outliers.
- Is the distribution symmetric? If not, classify it as either right skewed or left skewed.
LAB 3.2 The file PULSE contains a small population of data collected from a group of students.
- Pulse1 is the resting pulse rate (measured in beats per minute, or bpm) for each student.
- Pulse2 is their pulse rate in bpm after 5 minutes of moderately strenuous exercise.
Although only MINITAB output was asked for in this problem, and concluding sentences were not asked for in this problem, students should take a moment to note that all pulse rates calculated are in beats per minute (bpm). (15 Marks: 2 Marks for a., 3 Marks for b. and c., 1 Mark for d., 3 Marks for e., 1 Mark for f., 2 Marks for g.)
- Use MINITAB to calculate the mean and standard deviation of the resting pulse rate for our population data.
- Use MINITAB to create a boxplot of the resting pulse rate for our population data. Title and label appropriately.
- Use MINITAB to create a frequency histogram of the resting pulse rate for our population data.
- Based on the boxplot you constructed in part (b), are there any outlying values?
- Based on the boxplot you constructed in part (b), can you conclude whether or not the distribution of this data set has one, two, or more modes? Why or why not?
- Based on the frequency histogram you created in part (c), describe the shape of the distribution of this data with regard to both modality (peaks) and skewness.
- Now suppose the data in PULSE is a random sample of 92 students from a much larger population of students. Consider again the histogram you created in part (c). Can the histogram created from one sample provide irrefutable information about the shape of the population from which the sample was taken? Why or why not?
Part 3: Practice Problems Not for Marks
P1. In-Demand College Majors. In a June 2013 article, published online by The Street, B. O’Connell discussed the results of a survey on opportunities for graduating college students. In one aspect of the survey, the following percentage estimates were reported on which college majors were in demand among U.S. firms [SOURCE: “The Most In-Demand College Majors This Year.” Published by Career-Builder, LLC, © 2013.] Is this study descriptive or inferential? Explain your answer.
Major | Percentage of U.S. firms |
Business studies | 31% |
Computer sciences | 24% |
Engineering | 17% |
Health care studies | 10% |
Engineering technologies | 9% |
Math and statistics | 9% |
Communications | 7% |
Education | 7% |
Science technology | 6% |
Liberal arts | 6% |
P2. Identify two methods for obtaining a simple random sample.
P3. U.S. Regions. The U.S. Census Bureau divides the states in the United States into four regions: Northeast (NE), Midwest (MW), South (SO), and West (WE). The following table gives the region of each of the 50 states.
SO | WE | WE | MW | NE | WE | WE | SO | MW | SO |
WE | NE | WE | SO | MW | MW | NE | WE | SO | WE |
WE | SO | MW | SO | MW | WE | SO | NE | SO | SO |
SO | SO | MW | NE | SO | NE | MW | NE | WE | MW |
WE | SO | MW | SO | MW | NE | MW | SO | NE | WE |
- Determine a frequency distribution.
- Obtain a relative-frequency distribution.
- Draw a pie chart.
- Draw a bar chart.
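For qualitative data such as the regions in P3, a frequency distribution is a tally per category, and each relative frequency is that tally divided by the total number of observations. A minimal Python sketch, with the 50 region codes transcribed from the table above (variable names are illustrative):

```python
from collections import Counter

regions = """
SO WE WE MW NE WE WE SO MW SO
WE NE WE SO MW MW NE WE SO WE
WE SO MW SO MW WE SO NE SO SO
SO SO MW NE SO NE MW NE WE MW
WE SO MW SO MW NE MW SO NE WE
""".split()

freq = Counter(regions)                                   # frequency distribution
rel_freq = {r: c / len(regions) for r, c in freq.items()} # relative frequencies

for region in ["NE", "MW", "SO", "WE"]:
    print(region, freq[region], rel_freq[region])
```

The relative frequencies are the slice proportions for a pie chart and the bar heights for a relative-frequency bar chart.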
P4. Construct a stem-and-leaf diagram for the following data. Use one line per stem. Comment on the shape of the distribution with regard to modality and symmetry/skewness.
112 60 97 99 106 117 108 100 125 118
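A stem-and-leaf diagram with one line per stem can be built by splitting each observation into its leading digits (stem) and final digit (leaf). The following minimal Python sketch assumes the stems are the tens and hundreds digits and the leaves are the units digits; the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

data = [112, 60, 97, 99, 106, 117, 108, 100, 125, 118]

def stem_and_leaf(values):
    """Map each stem (value // 10) to its ordered leaves (value % 10)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    # Keep empty stems so that gaps in the data stay visible.
    return {stem: leaves.get(stem, []) for stem in range(min(leaves), max(leaves) + 1)}

diagram = stem_and_leaf(data)
for stem, stem_leaves in diagram.items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in stem_leaves)}")
```

Keeping the empty stems matters for the shape comment, since an isolated low value would otherwise appear adjacent to the rest of the data.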
For each data set in P5. and P6., use the specified grouping method to
- determine a frequency distribution.
- obtain a relative-frequency distribution.
- construct a frequency histogram based on your result from part (a).
- construct a relative-frequency histogram based on your result from part (b).
- identify the shape of the distribution with regard to modality.
- identify the shape of the distribution with regard to symmetry/skewness.
P5. Cottonmouth Litter Size. In the paper “The Eastern Cottonmouth (Agkistrodon piscivorus) at the Northern Edge of Its Range” (Journal of Herpetology, Vol. 29, No. 3, pp. 391–398), C. Blem and L. Blem examined the reproductive characteristics of the eastern cottonmouth, a once widely distributed snake whose numbers have decreased recently due to encroachment by humans. A simple random sample of 24 female cottonmouths in Florida yielded the following data on number of young per litter. Use single-value grouping.
8 6 7 7 4 3 1 7 5 6 6 5
6 8 5 5 7 4 6 6 5 5 5 4
P6. Residential Energy Consumption. The U.S. Energy Information Administration collects data on residential energy consumption and expenditures. Results are published in the document Residential Energy Consumption Survey: Consumption and Expenditures. The following table gives one year’s energy consumption for a sample of 50 households in the South. Data are in millions of BTUs. Use limit grouping with a first class of 40–49 and class width of 10.
130 | 55 | 45 | 64 | 155 | 66 | 60 | 80 | 102 | 62 |
58 | 101 | 75 | 111 | 151 | 139 | 81 | 55 | 66 | 90 |
97 | 77 | 51 | 67 | 125 | 50 | 136 | 55 | 83 | 91 |
54 | 86 | 100 | 78 | 93 | 113 | 111 | 104 | 96 | 113 |
96 | 87 | 129 | 109 | 69 | 94 | 99 | 97 | 83 | 97 |
P7. Tablet Computers. Tablet personal computers such as iPad and Kindle Fire are one-piece portable computers that typically offer a touchscreen. Tablets are available in different sizes but generally offer a screen that is greater than 7 inches diagonally. Consumer Reports reviews tablets of different sizes and Internet capabilities. The battery lives, in hours, for a sample of tablets with screen sizes between 9 and 12 inches are provided in the following table. Find the mean, median, and mode(s) of the data. For the mean and the median, round each answer to one more decimal place than that used for the observations.
8.4 5.6 10.9 10.8
11.5 8.0 8.5 12.9
P8. Robbery Locations. The Department of Justice and the Federal Bureau of Investigation publish a compilation on crime statistics for the United States in Crime in the United States. The following table provides a frequency distribution for robbery type during a one-year period.
Robbery type |
Frequency |
Street/highway | 127,403 |
Commercial house | 37,885 |
Gas or service station | 7,009 |
Convenience store | 14,863 |
Residence | 49,361 |
Bank | 5,777 |
Miscellaneous | 48,878 |
- Determine the mode of the data.
- Decide whether it would be appropriate to use either the mean or the median as a measure of center. Explain your answer.
P9. Explain the difference between the quantities Σxiyi and (Σxi)(Σyi). Provide an example to support your explanation.
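One way to build an example for P9 is to compute both quantities on a small paired data set. The sketch below assumes the two quantities are the sum of the products, Σxiyi, and the product of the sums, (Σxi)(Σyi); the data values are invented purely for illustration.

```python
# Hypothetical paired data, chosen only to illustrate the notation.
x = [1, 2, 3]
y = [4, 5, 6]

# Sum of products: multiply each pair first, then add the results.
sum_of_products = sum(xi * yi for xi, yi in zip(x, y))

# Product of sums: add each list first, then multiply the two totals.
product_of_sums = sum(x) * sum(y)

print(sum_of_products, product_of_sums)
```

The order of the operations (multiply then add, versus add then multiply) is what separates the two quantities, and in general they are not equal.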
P10. Why is the standard deviation preferable to the range as a measure of variation?
P11. Identify each quantity as a parameter or a statistic.
a. µ b. s c. x̄ d. σ