ISE 529 Predictive Analytics
Homework 4 – MLR and Cross Validation
A real estate appraiser is interested in predicting residential home prices as a function of various features. Regression models are therefore to be constructed to predict house prices. The homes.csv data set (available from Blackboard) is a sample of 522 residential houses. Reduce the data set to houses with two to five bedrooms, style 1 to 7, and not close to a highway. Then remove the column highway. The reduced data set should have 485 rows. Remember that style should be treated as categorical for this homework.
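The reduction described above can be sketched with pandas as follows. This is only an illustration on toy data: the column names `bedrooms`, `style`, `highway` and `price`, and the coding of `highway` as 0/1, are assumptions and should be checked against the actual homes.csv file.

```python
import pandas as pd

def reduce_homes(homes: pd.DataFrame) -> pd.DataFrame:
    """Keep houses with 2-5 bedrooms, style 1-7, not near a highway.

    Assumes (hypothetically) columns named 'bedrooms', 'style', 'highway',
    with 'highway' coded 1 for houses close to a highway and 0 otherwise.
    """
    mask = (
        homes["bedrooms"].between(2, 5)      # inclusive on both ends
        & homes["style"].between(1, 7)
        & (homes["highway"] == 0)
    )
    reduced = homes.loc[mask].drop(columns="highway").copy()
    reduced["style"] = reduced["style"].astype("category")
    return reduced

# Toy data standing in for homes.csv (values are made up).
toy = pd.DataFrame({
    "price":    [200, 310, 150, 420, 275, 190],
    "bedrooms": [1,   3,   5,   6,   4,   2],
    "style":    [1,   7,   3,   2,   9,   1],
    "highway":  [0,   0,   1,   0,   0,   0],
})
reduced_toy = reduce_homes(toy)
print(reduced_toy)
```

On the real file, `len(reduced)` can be compared with the 485 rows stated above as a sanity check.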
1.(30 pts.) Fit a full multiple regression model using only the numerical variables as predictors.
a) Find the largest outlier (in absolute value).
b) Plot y vs. ŷ, labeling the largest outlier.
c) Find the predicted price when all predictors are equal to their median values.
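As a rough illustration of parts a) and c), the sketch below fits ordinary least squares with NumPy on synthetic data, locates the observation with the largest absolute residual, and predicts at the predictor medians. Reading "largest outlier" as the largest raw residual is an assumption (a studentized residual may be intended), and the data and helper names `fit_ols` and `predict` are made up for the example.

```python
import numpy as np

def fit_ols(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Return OLS coefficients [intercept, slopes...] via least squares."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta

def predict(beta: np.ndarray, X: np.ndarray) -> np.ndarray:
    """Predict for one row or many rows of predictors."""
    return beta[0] + np.atleast_2d(X) @ beta[1:]

# Synthetic stand-in for the numerical predictors and price.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=50)
y[7] += 5.0  # plant one obvious outlier at index 7

beta = fit_ols(X, y)
y_hat = predict(beta, X)
resid = y - y_hat

# a) observation with the largest residual in absolute value
idx = int(np.argmax(np.abs(resid)))

# c) prediction when every predictor equals its median
median_pred = float(predict(beta, np.median(X, axis=0))[0])
print(idx, resid[idx], median_pred)
```

For part b), a scatter of `y_hat` against `y` with the point at `idx` marked (for example with matplotlib's `Axes.annotate`) would give the requested plot.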
Now consider all variables (categorical and numerical) as predictors from the reduced data set (code for the following questions may take around 10 minutes of CPU time).
2.(20 pts.) Find the single best and single worst predictor.
3.(20 pts.) Report a dataframe showing the best AIC model for each number of features. This dataframe should include the feature names of each model. Report the row of the best AIC model.
4.(20 pts.) BIC is another information criterion given by
Report a dataframe showing the best BIC model for each number of features. This dataframe should include the feature names of each model. Report the row of the best BIC model.
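One possible shape for the requested dataframes is sketched below using exhaustive best-subset search on synthetic data. The criteria are computed with the common Gaussian forms AIC = n ln(RSS/n) + 2p and BIC = n ln(RSS/n) + p ln(n), where p counts the intercept plus the slopes; this is an assumption, and the course definitions should be used if they differ. The sketch also treats every column as a single numeric feature, so dummy-encoded categorical variables would need extra handling. The names `rss` and `best_subsets` are hypothetical helpers.

```python
from itertools import combinations

import numpy as np
import pandas as pd

def rss(X: np.ndarray, y: np.ndarray) -> float:
    """Residual sum of squares of an OLS fit with intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    r = y - X1 @ beta
    return float(r @ r)

def best_subsets(df: pd.DataFrame, target: str) -> pd.DataFrame:
    """For each model size, keep the lowest-RSS subset and score it."""
    y = df[target].to_numpy(dtype=float)
    features = [c for c in df.columns if c != target]
    n = len(y)
    rows = []
    for k in range(1, len(features) + 1):
        scored = [
            (rss(df[list(cols)].to_numpy(dtype=float), y), cols)
            for cols in combinations(features, k)
        ]
        best_rss, best_cols = min(scored)
        p = k + 1  # slopes plus intercept
        rows.append({
            "n_features": k,
            "features": list(best_cols),
            "rss": best_rss,
            "aic": n * np.log(best_rss / n) + 2 * p,        # assumed form
            "bic": n * np.log(best_rss / n) + p * np.log(n),  # assumed form
        })
    return pd.DataFrame(rows)

# Synthetic data: x1 and x2 matter, x3 and x4 are pure noise.
rng = np.random.default_rng(1)
data = pd.DataFrame(rng.normal(size=(200, 4)), columns=["x1", "x2", "x3", "x4"])
data["y"] = 3.0 * data["x1"] + 1.5 * data["x2"] + rng.normal(scale=0.5, size=200)

table = best_subsets(data, "y")
best_aic_row = table.loc[table["aic"].idxmin()]
best_bic_row = table.loc[table["bic"].idxmin()]
print(table)
```

Within a fixed number of features the penalty term is constant, so the lowest-RSS subset is also the best AIC and best BIC subset of that size; the two criteria only differ when comparing across sizes.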
5.(10 pts.) Report the predictions of the best AIC and best BIC models for the price of a high-quality, style 3 house with AC, a two-car garage, an area of 2100 square feet, built in 1992, a lot size of 24500 square feet, no pool, three bedrooms and three bathrooms.
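When predicting for a single new house, the categorical variable has to be encoded with the same dummy columns as the training data. The sketch below shows one way to do this in pandas by reusing the training categories. The column names and toy values are assumptions, not the real homes.csv layout.

```python
import pandas as pd

# Toy training data; 'style' is categorical as the assignment requires.
train = pd.DataFrame({
    "area":  [1500, 2100, 1800, 2400],
    "style": [1, 3, 7, 3],
    "pool":  [0, 1, 0, 0],
})
train["style"] = train["style"].astype("category")
X_train = pd.get_dummies(train, columns=["style"], drop_first=True, dtype=float)

# New observation: reuse the training categories so that a one-row frame
# still yields every dummy column (otherwise drop_first would remove the
# only level present).
new_house = pd.DataFrame({"area": [2100], "style": [3], "pool": [0]})
new_house["style"] = pd.Categorical(
    new_house["style"], categories=train["style"].cat.categories
)
X_new = pd.get_dummies(new_house, columns=["style"], drop_first=True, dtype=float)

# Align column order with the training design matrix.
X_new = X_new.reindex(columns=X_train.columns, fill_value=0.0)
print(X_new)
```

The resulting row `X_new` can then be passed to whichever fitted model is being used, restricted to the features selected by the best AIC or best BIC model.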
Submit your report as a single PDF file (convert your ipynb file into a PDF).