
2022-05-09 11:30, Monday

ISE 529 Predictive Analytics Exam 2

1. Visit the Titanic page (https://www.kaggle.com/c/titanic) on the Kaggle site. Read the Description and Evaluation items, then use the Data tab to download the csv files. Read the Overview.

The objective is to predict whether a passenger survived, based on the feature data.

Visit the Titanic Data Science Solutions page:

https://www.kaggle.com/startupsci/titanic-data-science-solutions

Spend some time reading and running the Jupyter Notebook provided on that page.

 

a) (10 pts.)

Use the train set to answer true or false for each of the following statements:

  • More than 75% of passengers did not travel with parents or children
  • 30 to 33% of passengers had siblings and/or spouse aboard
  • Less than 1% of passengers paid a fare as high as 500 dollars
  • Less than 1% of passengers are 65+ years old
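Each statement above reduces to the share of rows that meet a condition, which in pandas is the mean of a boolean Series. The following is a minimal sketch, assuming the column names of the Kaggle train.csv (Parch, SibSp, Fare, Age). The small frame is made-up stand-in data so the block runs without the download, and `share` is a hypothetical helper, not part of any library.

```python
import pandas as pd

# Made-up stand-in rows; in practice: train = pd.read_csv("train.csv")
train = pd.DataFrame({
    "Parch": [0, 0, 1, 0, 2, 0, 0, 0],
    "SibSp": [1, 0, 0, 0, 3, 0, 1, 0],
    "Fare":  [7.25, 71.28, 7.92, 53.10, 512.33, 8.46, 51.86, 21.08],
    "Age":   [22.0, 38.0, 26.0, None, 66.0, 35.0, 54.0, 2.0],
})

def share(mask: pd.Series) -> float:
    """Fraction of all rows where the boolean mask is True."""
    return float(mask.mean())

no_parch = share(train["Parch"] == 0)     # no parents/children aboard
with_sibsp = share(train["SibSp"] > 0)    # at least one sibling/spouse aboard
high_fare = share(train["Fare"] >= 500)   # fare of 500 or more
senior = share(train["Age"] >= 65)        # missing ages compare as False

for label, value in [("Parch == 0", no_parch), ("SibSp > 0", with_sibsp),
                     ("Fare >= 500", high_fare), ("Age >= 65", senior)]:
    print(f"{label}: {value:.1%}")
```

Note that `senior` divides by all rows, including those with a missing Age. If the share among passengers with a known age is wanted instead, one option is to restrict the frame with `train["Age"].notna()` first. `train.describe()` with a custom `percentiles` argument is another way to check the same thresholds.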

 

b) (20 pts.)

Drop the unused columns and fill NA values as follows:

  • Drop columns PassengerId, Name, Ticket, Cabin.
  • Fill NA values in Embarked with the most common category
  • Fill NA values in Fare with the median value in that column
  • Fill NA values in Age with the median Age within each Pclass x Sex (gender) combination
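The steps above can be sketched with `drop`, `fillna`, and a grouped `transform`. This is a minimal sketch, assuming the Kaggle column names (PassengerId, Embarked, Sex). The frame is made-up stand-in data, and `fill_missing` is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

# Made-up stand-in rows; in practice: train = pd.read_csv("train.csv")
train = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4, 5, 6],
    "Name": ["A", "B", "C", "D", "E", "F"],
    "Ticket": ["t1", "t2", "t3", "t4", "t5", "t6"],
    "Cabin": [None, "C85", None, None, "E46", None],
    "Pclass": [3, 1, 3, 1, 3, 1],
    "Sex": ["male", "female", "male", "female", "male", "female"],
    "Age": [20.0, 38.0, np.nan, 30.0, 30.0, np.nan],
    "Fare": [7.25, 71.28, np.nan, 53.10, 8.05, 80.00],
    "Embarked": ["S", "C", "S", None, "S", "C"],
})

def fill_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Drop identifier-like columns and fill NAs in Embarked, Fare, and Age."""
    out = df.drop(columns=["PassengerId", "Name", "Ticket", "Cabin"])
    # Most common category; mode() ignores missing values
    out["Embarked"] = out["Embarked"].fillna(out["Embarked"].mode()[0])
    # Column-wide median
    out["Fare"] = out["Fare"].fillna(out["Fare"].median())
    # Median Age of each (Pclass, Sex) group, aligned row by row
    group_median = out.groupby(["Pclass", "Sex"])["Age"].transform("median")
    out["Age"] = out["Age"].fillna(group_median)
    return out

clean = fill_missing(train)
print(clean)
```

When the same steps are applied to test.csv, it may help to keep a copy of PassengerId before dropping it, since the Kaggle submission file is keyed on that column.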

 

c) (20 pts.)

Perform the following data cleaning and feature engineering steps for both the train and test files.

  • Split column Age into 5 intervals using the bin edges (0, 16, 32, 48, 64, 100); the column is now categorical
  • Split column Fare into 4 intervals using the bin edges (0, 7.9, 14.5, 31, 600); the column is now categorical
  • Create column Size by adding the values of columns SibSp and Parch, then drop those two columns and keep Size.
  • Create binary column Alone with value 1 if the passenger travels alone and 0 otherwise.

Use get_dummies to convert the categorical columns to binary columns.
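The feature engineering steps can be sketched with `pd.cut` and `pd.get_dummies`. This is a minimal sketch on made-up stand-in rows with NAs already filled. The helper name `engineer`, the constant names, and the choice of `include_lowest=True` are assumptions for illustration, and under the Size definition above a passenger is treated as alone when Size is 0.

```python
import pandas as pd

# Made-up stand-in rows with NAs already filled
df = pd.DataFrame({
    "Age":   [2.0, 22.0, 40.0, 50.0, 70.0],
    "Fare":  [7.25, 10.5, 26.0, 80.0, 512.33],
    "SibSp": [1, 0, 1, 0, 0],
    "Parch": [2, 0, 0, 0, 1],
    "Sex":   ["male", "female", "female", "male", "female"],
})

AGE_EDGES = [0, 16, 32, 48, 64, 100]
FARE_EDGES = [0, 7.9, 14.5, 31, 600]

def engineer(frame: pd.DataFrame) -> pd.DataFrame:
    """Bin Age and Fare, build Size and Alone, then one-hot encode."""
    out = frame.copy()
    # include_lowest=True keeps a value equal to the first edge (e.g. a fare
    # of 0) in the first bin instead of turning it into NaN
    out["Age"] = pd.cut(out["Age"], bins=AGE_EDGES, include_lowest=True)
    out["Fare"] = pd.cut(out["Fare"], bins=FARE_EDGES, include_lowest=True)
    # Size counts relatives aboard; the passenger is not included
    out["Size"] = out["SibSp"] + out["Parch"]
    out = out.drop(columns=["SibSp", "Parch"])
    out["Alone"] = (out["Size"] == 0).astype(int)
    return pd.get_dummies(out, columns=["Age", "Fare", "Sex"])

features = engineer(df)
print(features.columns.tolist())
```

Because `pd.cut` returns a categorical with every bin as a category, `get_dummies` emits one column per bin even when a bin is empty, which helps the train and test files end up with matching columns. Pclass and Embarked would be added to the `columns` list in the same way.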

 

 

d) (40 pts.)

The test.csv file from Kaggle does not include Survived, so split the train set into new train and test subsets. Use the train subset to fit the following models:

  • KNN
  • Support Vector Classifier
  • Logistic Regression
  • Random Forest
  • Gradient Boosting

Use GridSearchCV to find the best hyperparameter values. Report the test accuracy rate (using the test subset). Try to improve the test accuracy rate (include polynomial and/or interaction terms, use other machine learning or statistical methods, or use any other means).
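The split-search-score loop can be sketched as follows. This is a minimal sketch in which `make_classification` produces synthetic stand-in data in place of the engineered Titanic features, and the parameter grids are illustrative assumptions, not recommended ranges.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for the engineered features (X) and Survived (y)
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Illustrative grids only; the values are assumptions
searches = {
    "KNN": (KNeighborsClassifier(), {"n_neighbors": [3, 5, 9]}),
    "Support Vector Classifier": (
        SVC(random_state=0), {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}),
    "Logistic Regression": (
        LogisticRegression(max_iter=1000, random_state=0), {"C": [0.1, 1, 10]}),
    "Random Forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [25, 50], "max_depth": [3, None]}),
    "Gradient Boosting": (
        GradientBoostingClassifier(random_state=0),
        {"n_estimators": [25, 50], "learning_rate": [0.05, 0.1]}),
}

results = {}
for name, (model, grid) in searches.items():
    # Cross-validation runs on the train subset only
    search = GridSearchCV(model, grid, cv=3, scoring="accuracy")
    search.fit(X_train, y_train)
    # score() uses the refit best estimator on the held-out test subset
    results[name] = {
        "best_params": search.best_params_,
        "test_accuracy": search.score(X_test, y_test),
    }
    print(f"{name}: {results[name]}")
```

For the polynomial or interaction terms, one option is to place `sklearn.preprocessing.PolynomialFeatures` (with `interaction_only=True` for interactions alone) in a `Pipeline` ahead of the classifier, so the expansion is refit inside each cross-validation fold.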

 

e) (10 pts.)

Submit your best predictions to Kaggle. Report your Kaggle ID name, the date submitted, and the Score provided by Kaggle.
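A submission file can be assembled as in the sketch below, assuming the format described on the competition's Evaluation page: a header row followed by PassengerId and Survived columns. The ids and predictions here are made-up stand-ins, and the block writes to an in-memory buffer so it runs without touching the disk.

```python
import io

import pandas as pd

# Made-up stand-ins: ids saved from test.csv before dropping the column,
# and the 0/1 output of the chosen model's predict() on the test features
passenger_ids = pd.Series([892, 893, 894], name="PassengerId")
predictions = [0, 1, 0]

submission = pd.DataFrame({
    "PassengerId": passenger_ids,
    "Survived": predictions,
})
# Keep Survived as integers rather than floats or booleans
submission["Survived"] = submission["Survived"].astype(int)

# index=False keeps the row index out of the file
buffer = io.StringIO()
submission.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

In practice the last step would be `submission.to_csv("submission.csv", index=False)`, where the file name is arbitrary.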

Make sure your report includes your name and your Section (Tuesday or Friday).

The report should be clean and well formatted (do not truncate tables, plots, or Python commands; no screen captures). Use random_state = 0 wherever it is needed.

 
