ECO520
Topic: COVID19 in Wisconsin by Census Tract (25 points total)
Wisconsin COVID-19 data by census tract boundary
Data is updated at 2:00PM CDT daily. All data are laboratory-confirmed cases of COVID-19 that we freeze once a day to verify and ensure that we are reporting accurate information. The number of people with positive/negative test results includes only Wisconsin residents who had their results reported electronically to DHS. Here are descriptions of the variables in the data.
Variable Name | Variable Description |
GEOID | Geographic ID |
State | State |
CENSUS_TRACT | Census Tract Number |
COUNTY | County Name |
DATE | Last Date of Report |
POSITIVE | Number of Positive COVID-19 Test Results |
NEGATIVE | Number of Negative COVID-19 Test Results |
DEATHS | Number of Deaths from COVID-19 |
HOSP_YES | Number of Cases Hospitalized for COVID-19 |
HOSP_NO | Number of Cases Not Hospitalized for COVID-19 |
HOSP_UNKNOWN | Number of Cases with Unknown Hospitalization Status |
AREA_LAND | Land Area Size |
AREA_WATER | Water Area Size |
POPULATION | Total Population |
POP_LT18 | Percent of Population that is Less Than 18 Years |
POP_65P | Percent of Population that is 65 Years and Over |
HOUS_NO_VEH | Percent of households with no vehicle available |
ADULT_LIMITED_ENGLISH | Percent of adults 18 years and over who have limited English ability |
ADULT_SPANISH_LENG | Percent of adults 18 years and over who speak Spanish and have limited English ability |
POP_BELOWPOV | Percent of Population whose income in the past 12 months is below poverty level |
POP_DISABILITY | Percent of Population with a Disability |
POP_MEDICAD | Percent of Population with Medicaid/Means-Tested Public Coverage |
POP_MEDICARE | Percent of Population with Medicare Coverage |
POP_HEALTHINS | Percent of Population with No Health Insurance Coverage |
HOUS_NOSMARTPHN | Percent of Households that Have No Smartphone |
HOUS_NOINTERNET | Percent of Households with No Internet Access |
Here is the SAS code to load the data into your SAS program.
filename webdat url "http://bigblue.depaul.edu/jlee141/econdata/eco520/COVID19_WI.csv";

proc import datafile=webdat out=COVID19 dbms=csv replace;
run;

/* Select 500 randomly selected census tracts in WI using YourDePaulID */
proc surveyselect data=COVID19 method=srs seed=YourDePaulID n=500 out=MYCOVID19;
run;

proc contents data=MYCOVID19;
run;
Use SAS code to answer the following questions using MYCOVID19 data
1. Descriptive Analytics Questions (9 points)
Use descriptive statistics and plots to answer the following questions.
(Confirmed cases means the test results show positive from the COVID19 test.)
1) Find the average number of confirmed cases, hospitalizations, and deaths per 1000 persons by Census tract.
2) Find the average number of confirmed cases, hospitalizations, and deaths per 1000 persons by County.
3) Find the five Census tracts and the five counties with the highest number of confirmed cases.
4) Find the five Census tracts with the highest probability of hospitalization among the confirmed cases.
5) Find the correlation coefficients between the number of confirmed cases and the demographic, social status, and health care service variables. Use scatter plots to look for any nonlinear relationships.
6) Suppose you are working as a consultant for the state of Wisconsin and must make recommendations. According to the descriptive statistics, which areas are the most vulnerable and most in need of government resources? Explain clearly.
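As a starting point for questions 1) to 5), the per-1000 rates can be derived in a DATA step before summarizing. The sketch below assumes the imported column names match the variable table exactly. The names rates, CASES_P1000, HOSP_P1000, DEATHS_P1000, and HOSP_RATE are illustrative and not part of the data. It averages tract-level rates within each county, which is only one reading of question 2); pooling the counts by county first would give a population-weighted rate instead.

```sas
/* Per-1000 rates for each census tract. Tracts with zero or missing
   population keep missing rates instead of dividing by zero. */
data rates;
    set MYCOVID19;
    if POPULATION > 0 then do;
        CASES_P1000  = 1000 * POSITIVE / POPULATION;
        HOSP_P1000   = 1000 * HOSP_YES / POPULATION;
        DEATHS_P1000 = 1000 * DEATHS   / POPULATION;
    end;
    /* Share of confirmed cases that were hospitalized (question 4) */
    if POSITIVE > 0 then HOSP_RATE = HOSP_YES / POSITIVE;
run;

/* Average of the tract-level rates over all sampled tracts */
proc means data=rates n mean maxdec=2;
    var CASES_P1000 HOSP_P1000 DEATHS_P1000;
run;

/* Same averages, broken out by county */
proc means data=rates n mean maxdec=2;
    class COUNTY;
    var CASES_P1000 HOSP_P1000 DEATHS_P1000;
run;

/* Pearson correlations between confirmed cases and a few Census variables */
proc corr data=rates;
    var POSITIVE;
    with POP_65P POP_BELOWPOV POP_DISABILITY POP_HEALTHINS HOUS_NOINTERNET;
run;

/* Scatter plot with a loess curve to check for a nonlinear pattern */
proc sgplot data=rates;
    loess x=POP_BELOWPOV y=CASES_P1000;
run;
```

The variables in the WITH statement are only a sample; any of the demographic, social status, or health care columns can be substituted.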
2. Hierarchical and Non-hierarchical Clustering Analysis (7 points)
1) Perform a hierarchical clustering analysis for the demographic clusters: POP_65P, POP_BELOWPOV, POP_DISABILITY
2) Perform a non-hierarchical clustering analysis for the demographic clusters: POP_65P, POP_BELOWPOV, POP_DISABILITY
3) Compare the results of 1) and 2), and define or name the clusters using descriptive statistics and plots.
4) Use an ANOVA test to determine whether the number of confirmed cases per 1000 persons differs significantly across the clusters.
5) Use an ANOVA test to determine whether the number of hospitalized per 1000 persons differs significantly across the clusters.
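One possible workflow for the clustering steps above is sketched here. Several choices are assumptions rather than requirements of the assignment: standardizing the three variables first, using Ward's method for the hierarchical step, using k-means (PROC FASTCLUS) for the non-hierarchical step, and fixing the number of clusters at 3. The data set names prep, std, tree, hclus, kclus, and both, and the variables CASES_P1000, HOSP_P1000, HCLUSTER, and KCLUSTER, are illustrative. The merge assumes GEOID uniquely identifies each tract.

```sas
/* Per-1000 outcome rates used later in the ANOVA tests */
data prep;
    set MYCOVID19;
    if POPULATION > 0 then do;
        CASES_P1000 = 1000 * POSITIVE / POPULATION;
        HOSP_P1000  = 1000 * HOSP_YES / POPULATION;
    end;
run;

/* Standardize the clustering variables to mean 0 and std 1 */
proc stdize data=prep out=std method=std;
    var POP_65P POP_BELOWPOV POP_DISABILITY;
run;

/* Hierarchical clustering with Ward's method */
proc cluster data=std method=ward outtree=tree noprint;
    var POP_65P POP_BELOWPOV POP_DISABILITY;
    id GEOID;
run;

/* Cut the tree at 3 clusters (an assumed value) */
proc tree data=tree nclusters=3 out=hclus noprint;
    id GEOID;
run;

/* Non-hierarchical (k-means) clustering with the same number of clusters */
proc fastclus data=std maxclusters=3 maxiter=100 out=kclus noprint;
    var POP_65P POP_BELOWPOV POP_DISABILITY;
run;

/* Put both cluster assignments side by side */
proc sort data=hclus; by GEOID; run;
proc sort data=kclus; by GEOID; run;

data both;
    merge hclus(keep=GEOID CLUSTER rename=(CLUSTER=HCLUSTER))
          kclus(rename=(CLUSTER=KCLUSTER));
    by GEOID;
run;

/* Cross-tabulate the two solutions to compare them */
proc freq data=both;
    tables HCLUSTER*KCLUSTER / norow nocol nopercent;
run;

/* Cluster profiles on the standardized variables, useful for naming clusters */
proc means data=both mean maxdec=2;
    class KCLUSTER;
    var POP_65P POP_BELOWPOV POP_DISABILITY;
run;

/* One-way ANOVA: do the outcome rates differ across clusters? */
proc glm data=both;
    class KCLUSTER;
    model CASES_P1000 HOSP_P1000 = KCLUSTER;
run;
quit;
```

Replacing KCLUSTER with HCLUSTER in the PROC GLM step runs the same tests on the hierarchical solution.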
3. Predictive Analytics using Regression Model (9 points)
Use only the TRAIN data (80%) to estimate the models, and use the TEST data (20%) to perform the out-of-sample prediction.
Suppose you are trying to predict the number of COVID19 confirmed cases using the demographic variables from Census data. (None of the COVID19-related variables can be used to predict the COVID19-related dependent variables.)
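The assignment does not say how to create the TRAIN and TEST data sets. A minimal sketch, assuming a simple random 80/20 split of MYCOVID19, is shown below; the seed value 12345 and the intermediate data set name split are placeholders.

```sas
/* Flag a random 80% of the tracts; OUTALL keeps every row and adds
   the indicator variable Selected (1 = sampled, 0 = not sampled). */
proc surveyselect data=MYCOVID19 method=srs samprate=0.8 seed=12345
                  outall out=split;
run;

/* Route flagged rows to TRAIN and the remaining rows to TEST */
data TRAIN TEST;
    set split;
    if Selected = 1 then output TRAIN;
    else output TEST;
run;
```

Fixing the seed makes the split reproducible, so every model is estimated and evaluated on the same observations.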
1) Find regression models that explain the variation in the number of confirmed cases per 1000 persons. You can use any variables, including nonlinear terms and cluster variables, to build the best models.
Model 1: Your own choice of variables 1
Model 2: Your own choice of variables 2
Model 3: Stepwise selection
Model 4: Adjusted R-square selection
2) Perform the out-of-sample prediction using the observations that were not used in the estimation. Find the following statistics and compare the results. Which model performs best in terms of these statistics?
a. MSE (mean square error)
b. RMSE (root mean square error)
c. MPE (mean percentage error)
d. MAE (mean absolute error)
3) Repeat the regression analysis with the number of hospitalized per 1000 persons as the dependent variable, following steps 1) and 2). Explain what you find.
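The model selection and out-of-sample evaluation can be sketched as follows. This assumes TRAIN and TEST already exist and still contain the raw columns. The names CASES_P1000, xvars, est, pred, errors, and the model labels STEP, ADJR, and PRED are illustrative, and the three regressors in the refitted model are placeholders for whatever the selection step suggests. MPE is computed here as the mean of 100 * (actual - predicted) / actual; other sign conventions exist.

```sas
/* Candidate Census predictors (no COVID19-related variables) */
%let xvars = POP_LT18 POP_65P HOUS_NO_VEH ADULT_LIMITED_ENGLISH
             POP_BELOWPOV POP_DISABILITY POP_HEALTHINS HOUS_NOINTERNET;

/* Dependent variable: confirmed cases per 1000 persons */
data TRAIN;
    set TRAIN;
    if POPULATION > 0 then CASES_P1000 = 1000 * POSITIVE / POPULATION;
run;
data TEST;
    set TEST;
    if POPULATION > 0 then CASES_P1000 = 1000 * POSITIVE / POPULATION;
run;

/* Models 3 and 4: stepwise and adjusted R-square selection on TRAIN */
proc reg data=TRAIN;
    STEP: model CASES_P1000 = &xvars / selection=stepwise slentry=0.05 slstay=0.05;
    ADJR: model CASES_P1000 = &xvars / selection=adjrsq best=5;
run;
quit;

/* Refit one chosen specification and save its coefficients.
   The label PRED becomes the name of the predicted-value variable. */
proc reg data=TRAIN outest=est noprint;
    PRED: model CASES_P1000 = POP_65P POP_BELOWPOV POP_DISABILITY;
run;
quit;

/* Apply the TRAIN coefficients to the held-out TEST observations */
proc score data=TEST score=est type=parms predict out=pred;
    var POP_65P POP_BELOWPOV POP_DISABILITY;
run;

/* Per-observation prediction errors */
data errors;
    set pred;
    ERR     = CASES_P1000 - PRED;
    SQ_ERR  = ERR**2;
    ABS_ERR = abs(ERR);
    /* Percentage error is undefined when the actual value is zero */
    if CASES_P1000 ne 0 then PCT_ERR = 100 * ERR / CASES_P1000;
run;

/* Out-of-sample accuracy statistics */
proc sql;
    select mean(SQ_ERR)       as MSE,
           sqrt(mean(SQ_ERR)) as RMSE,
           mean(PCT_ERR)      as MPE,
           mean(ABS_ERR)      as MAE
    from errors;
quit;
```

Repeating the last three steps for each of the four models gives one row of statistics per model, which can then be compared directly.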