
Posted: 2021-06-07 17:20 (Monday)


ISE 529 Predictive Analytics

Midterm Exam

Submit on October 16, 2020 by 1 p.m.

 


1. (10 pts.)

In Lecture 5, Introductory Example 2, a pivot table showed that non-USA cars are on average more expensive than USA cars by around $2000. But after fitting the model

m1 = smf.ols(formula='Price ~ MPG_city + Origin', data=df).fit()

it was found that non-USA cars were on average more expensive than USA cars by around $5264. This looks like a contradiction. Which dollar amount is correct? Why?
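The two figures need not contradict each other, because they answer different questions: the pivot table compares raw group means, while the Origin coefficient in m1 compares cars at the same MPG_city. The sketch below uses made-up numbers (not the lecture's data; prices in thousands of dollars) to reproduce the pattern. In this toy set the non-USA cars have higher city MPG and higher MPG goes with lower price, so the raw gap is smaller than the adjusted gap. Whether the lecture data behaves the same way is an assumption to check, for instance by comparing MPG_city across origins.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data (NOT the lecture's data set); prices in $1000s.
# Price = 40 - MPG_city for USA cars and 45 - MPG_city for non-USA cars,
# so the Origin effect at any fixed MPG_city is exactly 5.
df = pd.DataFrame({
    "Origin":   ["USA"] * 4 + ["non-USA"] * 4,
    "MPG_city": [18, 20, 22, 24, 21, 23, 25, 27],
})
df["Price"] = 40.0 - df["MPG_city"] + 5.0 * (df["Origin"] == "non-USA")

# Pivot-table view: difference of raw group means (MPG_city is ignored).
means = df.pivot_table(values="Price", index="Origin", aggfunc="mean")["Price"]
raw_diff = means["non-USA"] - means["USA"]

# Regression view: difference between origins at the same MPG_city.
m1 = smf.ols(formula="Price ~ MPG_city + Origin", data=df).fit()
adj_diff = m1.params["Origin[T.non-USA]"]

print(f"raw difference:      {raw_diff:.2f}")
print(f"adjusted difference: {adj_diff:.2f}")
```

Here the raw difference is 2 and the adjusted difference is 5, and both are correct for the question each one answers.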

2. (20 pts.)

The file csv has monthly demand from 2000 to 2011. You are asked to predict it for 2012. Use the library statsmodels.formula.api to build linear regression models that predict the demand, using one-hot encoding and

a) year (numerical) and month (categorical) as predictors.

b) year (numerical), month (categorical), and their interaction as predictors.

For both models, plot the demand together with the predictions for all years. For (b), add 95% CIs for 2012 to the plot.
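As a rough sketch of how the two formulas could be written, the block below fits both models on synthetic demand. The column names year, month, and demand are assumptions, as is the generated data, which stands in for the exam's file. Wrapping month in C() asks the formula interface to one-hot encode it, and year * C(month) expands to both main effects plus their interaction.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic monthly demand for 2000-2011 (a stand-in for the exam's file).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    [(y, m) for y in range(2000, 2012) for m in range(1, 13)],
    columns=["year", "month"],
)
season = 10 * np.sin(2 * np.pi * df["month"] / 12)
df["demand"] = 100 + 3 * (df["year"] - 2000) + season + rng.normal(0, 2, len(df))

# (a) year numerical, month one-hot encoded through C().
model_a = smf.ols("demand ~ year + C(month)", data=df).fit()
# (b) same predictors plus the year-by-month interaction.
model_b = smf.ols("demand ~ year * C(month)", data=df).fit()

# Predictions and 95% intervals for the 12 months of 2012 from model (b).
new = pd.DataFrame({"year": 2012, "month": range(1, 13)})
pred_2012 = model_b.get_prediction(new).summary_frame(alpha=0.05)
print(pred_2012[["mean", "mean_ci_lower", "mean_ci_upper"]].round(1))
```

In the resulting frame, mean_ci_lower and mean_ci_upper bound the expected demand, while obs_ci_lower and obs_ci_upper are the wider prediction intervals for a single new observation. Which of the two the exam means by "95% CIs" is not stated. For the plots, model_a.fittedvalues and model_b.fittedvalues give the in-sample predictions.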

3. (20 pts.)

In Homework 4 you built regression models to predict house prices on the csv data set. You reduced the data set to houses with two to five bedrooms, style 1 to 7, and not close to a highway. You then removed the column highway, which left a reduced data set with 485 rows.

Now use 10-fold cross-validation, KFold(n_splits=10, random_state=2, shuffle=True), to find the MSPE of the best AIC model and the best BIC model (the argument shuffle requests that the rows be shuffled before they are split into folds). Find the square root of the MSPE values. Which model (best AIC or best BIC) predicts the house prices best?
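One possible shape for the cross-validation loop is sketched below. The helper cv_mspe, the column names, the synthetic 485-row data, and the two formulas are all hypothetical stand-ins, since the Homework 4 data and its best AIC and best BIC models are not reproduced here. MSPE is taken as the mean squared error pooled over all held-out rows, which is one common convention.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.model_selection import KFold

def cv_mspe(formula, data, response, kf):
    """Pooled mean squared prediction error over the held-out folds."""
    sq_errors = []
    for train_idx, test_idx in kf.split(data):
        train, test = data.iloc[train_idx], data.iloc[test_idx]
        fit = smf.ols(formula, data=train).fit()
        sq_errors.append((test[response] - fit.predict(test)) ** 2)
    return float(pd.concat(sq_errors).mean())

# Synthetic stand-in with 485 rows; the column names are hypothetical.
rng = np.random.default_rng(1)
n = 485
houses = pd.DataFrame({
    "sqft": rng.uniform(800, 3000, n),
    "bedrooms": rng.integers(2, 6, n),
    "age": rng.uniform(0, 60, n),
})
houses["price"] = (50 + 0.1 * houses["sqft"] + 5 * houses["bedrooms"]
                   - 0.3 * houses["age"] + rng.normal(0, 15, n))

kf = KFold(n_splits=10, random_state=2, shuffle=True)
# Placeholder formulas standing in for the best-AIC and best-BIC models.
candidates = {"best AIC": "price ~ sqft + bedrooms + age",
              "best BIC": "price ~ sqft + age"}
rmspe = {name: np.sqrt(cv_mspe(f, houses, "price", kf))
         for name, f in candidates.items()}
print(rmspe)
```

Because random_state is a fixed integer, every call to kf.split yields the same folds, so both candidate models are scored on identical train/test partitions.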


4. (50 pts.)

Download from org/datasets/movielens/1m/ the file ml-1m.zip. Extract the files and read the README file. The files can be opened with

pd.read_csv('users.dat', sep='::', engine='python')

There are 3 files: movie ratings, movie data (genres and year), and users (age, zip code, gender, id, and occupation). The column title in the file movies.dat holds the movie title and year. Split that column into two new columns using

movies['name'] = movies['title'].str.slice(start=0, stop=-7)

movies['year'] = movies['title'].str[-5:-1]

movies['year'] = movies['year'].astype(int)

del movies['title']

For some of the questions you may want to merge the three files into one by using

pd.merge(pd.merge(ratings, users), movies).
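The .dat files carry no header row, so column names have to be supplied when reading. The sketch below shows the whole load, split, and merge sequence on a few made-up rows read from in-memory strings. The helper read_dat and the snake_case column names are assumptions chosen for illustration, with the column order following the README. With the real files, the file path replaces the StringIO object, and an encoding argument such as encoding='latin-1' may be needed for movies.dat.

```python
import io
import pandas as pd

# A few made-up rows per file, in the '::'-separated layout.
ratings_txt = "1::10::5::978300760\n2::10::3::978302109\n2::20::4::978301968\n"
users_txt = "1::F::1::10::48067\n2::M::56::16::70072\n"
movies_txt = ("10::Toy Story (1995)::Animation|Children's|Comedy\n"
              "20::Heat (1995)::Action|Crime|Thriller\n")

def read_dat(source, names):
    # header=None keeps the first record as data instead of column names.
    return pd.read_csv(source, sep="::", engine="python", header=None, names=names)

ratings = read_dat(io.StringIO(ratings_txt), ["user_id", "movie_id", "rating", "timestamp"])
users = read_dat(io.StringIO(users_txt), ["user_id", "gender", "age", "occupation", "zip"])
movies = read_dat(io.StringIO(movies_txt), ["movie_id", "title", "genres"])

# Split "Toy Story (1995)" into name "Toy Story" and year 1995.
movies["name"] = movies["title"].str.slice(start=0, stop=-7)
movies["year"] = movies["title"].str[-5:-1].astype(int)
del movies["title"]

# With no "on" argument, pd.merge joins on the shared column names:
# user_id for ratings/users, then movie_id for the result and movies.
data = pd.merge(pd.merge(ratings, users), movies)
print(data)
```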

The resulting dataframe is of interest in developing recommendation systems. For that purpose, answer the following questions.

a) Report the name and year of the five movies with the largest number of ratings.

b) Find the names of the top-rated movies by females from 1995 to 2000.

c) For movies with at least 250 ratings, find the average rating. Show the names of the 5 movies with the largest average rating.

d) Consider user disagreement as the standard deviation of each movie's ratings. Find the five movies with the largest rating disagreement.

e) Consider gender disagreement as the absolute difference between the average rating by males and the average rating by females. Find the names of the 5 movies with the largest gender disagreement.
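Parts (a) to (e) all reduce to grouping the merged frame and sorting. The sketch below runs on a tiny made-up frame with assumed column names (name, year, gender, rating) and a lowered threshold min_ratings, so it illustrates the mechanics only. It also makes two interpretive assumptions that the exam does not settle: that "from 1995 to 2000" in (b) refers to the release year, and that "top-rated" means the highest mean rating.

```python
import pandas as pd

# Tiny made-up stand-in for the merged ratings/users/movies frame.
data = pd.DataFrame({
    "name":   ["A", "A", "A", "A", "B", "B", "B", "C", "C", "D"],
    "year":   [1996] * 4 + [1999] * 3 + [1990] * 2 + [1998],
    "gender": ["F", "F", "M", "M", "F", "M", "M", "F", "M", "F"],
    "rating": [5, 4, 2, 1, 3, 3, 4, 5, 5, 5],
})
min_ratings = 2  # the exam uses 250; lowered for the toy data

by_movie = data.groupby(["name", "year"])["rating"]

# (a) movies with the most ratings
most_rated = by_movie.size().sort_values(ascending=False).head(5)

# (b) mean rating by women, assuming 1995-2000 is the release year
women = data[(data["gender"] == "F") & data["year"].between(1995, 2000)]
top_female = women.groupby("name")["rating"].mean().sort_values(ascending=False)

# (c) best average rating among movies with enough ratings
stats = by_movie.agg(["size", "mean", "std"])
popular = stats[stats["size"] >= min_ratings]
best_avg = popular["mean"].sort_values(ascending=False).head(5)

# (d) largest standard deviation (NaN for single-rating movies sorts last)
most_disputed = stats["std"].sort_values(ascending=False).head(5)

# (e) absolute gap between the male and female mean ratings
by_gender = data.pivot_table(values="rating", index="name",
                             columns="gender", aggfunc="mean")
gender_gap = (by_gender["M"] - by_gender["F"]).abs().sort_values(ascending=False).head(5)

print(most_rated, top_female, best_avg, most_disputed, gender_gap, sep="\n\n")
```

On the full data set, a minimum-ratings filter may also be worth applying in (d) and (e), since movies with very few ratings can produce extreme values. The exam states that filter only for (c).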

Only one submission is allowed.
