Hands-on assignment #3, due Monday 3/4/2019 @ 8am

Credit: The data and ideas behind these exercises and homeworks are from the NIH LINCS DCIC Crowdsourcing Portal and Ma’ayan Lab @ Mt Sinai, New York. http://www.maayanlab.net/crowdsourcing/megatask1.php

The overarching goal is to predict adverse drug reactions (ADRs). This assignment builds on the in-class examples on ADR prediction and on hands-on HW2.

This is a group assignment. You can work in a group of 1 to 4 members. Each group will make one submission via Canvas. Please state the names of all your team members in your submission.

This assignment focuses on classification and feature selection methods, and will be graded out of 10 points.

Upload 3 files for this assignment:

A Jupyter notebook file named hw3.ipynb containing the R code and answers for the 5 questions.

Please use “#” (comment lines) and markdown cells in your notebook to indicate the question number and to extensively document your code.

A spreadsheet in tab-delimited text format representing the cross-validation results of your methods for each side effect.

Use the data files “gene_expression_n438x978.txt” and “ADRs_HLGT_n438x232.txt” to answer all questions in this assignment. You can assume both files are in your working directory.

In class, we discussed many techniques for classification and feature selection in the context of personalized medicine. We illustrated how to apply these methods to the breast cancer data in class.

Feature selection methods	Classification methods
none	k-nearest neighbor (k-NN)
t-test	Support vector machine (SVM)
Signal-to-noise (S2N)	Bayesian Model Averaging (BMA)
BSS/WSS	Decision trees
Correlation with the class vector	Boosting, bagging and other ensemble methods
	Golub’s method on the AML/ALL data
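
As a reminder of what a filter-style feature selection method does, here is a minimal sketch of signal-to-noise (S2N) ranking. It assumes the common definition (difference of class means divided by the sum of class standard deviations) and uses a small made-up matrix in place of the real expression data. The function names are illustrative, and the sketch is in plain Python for compactness; the same steps carry over to the R notebook.

```python
from statistics import mean, stdev

def s2n_scores(X, y):
    """Signal-to-noise score per feature: (mean1 - mean0) / (sd1 + sd0).

    Assumes each class has at least two samples so a spread can be estimated.
    """
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row in pos]
        b = [row[j] for row in neg]
        denom = stdev(a) + stdev(b)
        scores.append((mean(a) - mean(b)) / denom if denom > 0 else 0.0)
    return scores

def top_k_features(scores, k):
    """Indices of the k features with the largest absolute score."""
    order = sorted(range(len(scores)), key=lambda j: abs(scores[j]), reverse=True)
    return order[:k]

# Toy data (made up): 6 samples x 3 features; only feature 0 separates the classes.
X = [[5.0, 1.0, 2.0], [5.2, 0.9, 2.1], [4.8, 1.1, 1.9],
     [1.0, 1.0, 2.0], [1.2, 1.1, 2.2], [0.8, 0.9, 1.8]]
y = [1, 1, 1, 0, 0, 0]
scores = s2n_scores(X, y)
print(top_k_features(scores, 1))
```

The number of features to keep (k) is a tuning choice, so different values of k would count as different parameter settings.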

(8 points)

Experiment with combinations of the above feature selection and classification methods and apply them to the ADR data to predict side effects. Evaluate the performance using 10-fold cross-validation, repeated 3 times. Note that you need to perform feature selection in each fold and each run of your cross-validation. In other words, you will perform feature selection and classification a total of 30 times (10 folds × 3 repeats).
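
The requirement to redo feature selection inside every fold can be sketched as follows. This is only one possible arrangement, under stated assumptions: S2N ranking with a hypothetical top-n cutoff stands in for the feature selection step, a small hand-written k-NN stands in for the classifier, folds are not stratified, and the data are synthetic. It is written in plain Python for compactness; the notebook itself should implement the same loop in R.

```python
import random
from statistics import mean, stdev

def s2n(X, y, j):
    """Signal-to-noise score of feature j, computed on the given samples only."""
    a = [r[j] for r, c in zip(X, y) if c == 1]
    b = [r[j] for r, c in zip(X, y) if c == 0]
    if len(a) < 2 or len(b) < 2:
        return 0.0  # too few samples in one class to estimate a spread
    denom = stdev(a) + stdev(b)
    return (mean(a) - mean(b)) / denom if denom > 0 else 0.0

def select_features(X, y, n_keep):
    """Rank features by |S2N| and keep the top n_keep (hypothetical cutoff)."""
    ranked = sorted(range(len(X[0])), key=lambda j: abs(s2n(X, y, j)), reverse=True)
    return ranked[:n_keep]

def knn_predict(X_train, y_train, x, k):
    """Majority vote among the k nearest training samples (squared Euclidean)."""
    dists = sorted((sum((a - b) ** 2 for a, b in zip(row, x)), label)
                   for row, label in zip(X_train, y_train))
    votes = sum(label for _, label in dists[:k])
    return 1 if 2 * votes > k else 0

def repeated_cv_accuracy(X, y, n_folds=10, n_repeats=3, n_keep=2, k=3, seed=0):
    """Mean accuracy over n_folds x n_repeats train/test splits."""
    rng = random.Random(seed)
    fold_accuracies = []
    for _ in range(n_repeats):
        idx = list(range(len(X)))
        rng.shuffle(idx)  # a fresh random partition for every repeat
        folds = [idx[f::n_folds] for f in range(n_folds)]
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in idx if i not in held_out]
            X_tr = [X[i] for i in train_idx]
            y_tr = [y[i] for i in train_idx]
            # Feature selection sees the training fold only, never the test fold.
            keep = select_features(X_tr, y_tr, n_keep)
            X_tr_sel = [[row[j] for j in keep] for row in X_tr]
            correct = 0
            for i in test_idx:
                x_sel = [X[i][j] for j in keep]
                correct += knn_predict(X_tr_sel, y_tr, x_sel, k) == y[i]
            fold_accuracies.append(correct / len(test_idx))
    return mean(fold_accuracies), len(fold_accuracies)

# Synthetic stand-in for the expression matrix: 40 samples x 5 features,
# where only feature 0 carries the class signal.
data_rng = random.Random(1)
y = [i % 2 for i in range(40)]
X = [[data_rng.gauss(3.0 * label, 1.0)] + [data_rng.gauss(0, 1) for _ in range(4)]
     for label in y]
accuracy, n_fits = repeated_cv_accuracy(X, y)
print(n_fits)  # 10 folds x 3 repeats = 30 selection + classification runs
```

Selecting features once on the full data set before splitting would let information from the test fold leak into training, which is why the selection call sits inside the innermost fold loop.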

Each combination of feature selection + classification is worth 1 point. For example,

t-test with p-value < 0.01 as feature selection and k-NN with k=10 as classification method will earn you 1 point.
t-test with p-value < 0.001 as feature selection and k-NN with k=10 as classification method will earn you another 1 point.
No feature selection and k-NN with k=12 will earn you an additional 1 point.

So, your group can try 8 combinations to earn up to 8 points. Different input parameter settings count as different combinations.

Submit a spreadsheet in tab-delimited text format representing a table that consists of 232 rows and 8 columns. Each column represents a combination of the methods you tried. Each row represents a side effect. Each entry in this table is the average prediction accuracy from 10-fold cross-validation, repeated 3 times.
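
A minimal sketch of producing such a tab-delimited table is shown below. The side-effect labels, combination names, and accuracy values are placeholders for illustration only, and the header row and label column are an assumption added for readability rather than part of the stated 232 × 8 layout. The sketch is in plain Python; an R notebook would produce an equivalent file.

```python
import csv
import io

def write_results_table(out, side_effects, combos, accuracy):
    """Write one row per side effect and one column per method combination."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["side_effect"] + combos)  # header row (an assumption)
    for se in side_effects:
        writer.writerow([se] + [f"{accuracy[(se, c)]:.3f}" for c in combos])

# Placeholder labels and values, for illustration only.
side_effects = ["ADR_1", "ADR_2"]
combos = ["ttest_p0.01+knn_k10", "none+knn_k12"]
accuracy = {(se, c): 0.5 for se in side_effects for c in combos}

buffer = io.StringIO()  # swap for open("results.txt", "w", newline="") to write a file
write_results_table(buffer, side_effects, combos, accuracy)
print(buffer.getvalue())
```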

(2 points)

Compare the prediction accuracy of the methods you tried in your report. In particular, address the following questions:

Which side effects can you predict with the highest accuracy in each combination?
Which side effects do you predict with the lowest accuracy in each combination?
Some side effects have unbalanced class sizes. Did you do anything about that? Why? If so, is your method effective?
Which feature selection and/or classification method would you consider the “winner” in your empirical study? You can include the results from HW2 in your discussion.
Any interesting negative results?
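
For the question on unbalanced class sizes, the following sketch shows why plain accuracy can mislead when one class is rare. It compares plain accuracy with balanced accuracy (the mean of sensitivity and specificity) for a predictor that always answers "no side effect". The labels are made up, balanced accuracy is only one of several possible remedies, and the sketch is in plain Python rather than the R used in the notebook.

```python
def class_balance(labels):
    """Fraction of positive (1) labels for one side effect."""
    return sum(labels) / len(labels)

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity; assumes both classes are present."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)

# Toy labels (made up): 2 of 20 drugs show the side effect.
y_true = [1, 1] + [0] * 18
always_negative = [0] * 20
plain_accuracy = sum(t == p for t, p in zip(y_true, always_negative)) / len(y_true)
print(class_balance(y_true), plain_accuracy, balanced_accuracy(y_true, always_negative))
```

Here the constant predictor reaches a plain accuracy of 0.9 while its balanced accuracy is only 0.5, so a high entry in the accuracy table for a rare side effect is not by itself evidence of a useful classifier.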
