
# Data Science Algorithms: Exam Help and Assignment Writing

2022-03-13 09:18, Sunday. Category: Algorithm Writing

## Exam Paper

### 1. Given a classification problem and a dataset, where each record has several attributes and a class label, a learning algorithm can be applied to the data in order to determine a classification model.

The model is then used to classify previously unseen data (data without a class label), that is, to predict the class label.

(a) Hunt’s algorithm is the general approach to learning a classification model in the form of a decision tree. Provide its pseudocode. What are the three main design choices in any ‘specific’ decision tree induction algorithm? Provide the definition of the GINI index for a single node and the GINI index for a binary split. (8 marks)
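As a minimal sketch of the two impurity measures named in part (a), the Python below assumes the usual textbook definitions: for a node t, GINI(t) = 1 - Σ_j p(j|t)², and for a binary split the size-weighted average of the GINI values of the two children. The function names `gini` and `gini_split` are illustrative, and the example labels are the "play" values of Table Q1-1 grouped by the Windy attribute.

```python
from collections import Counter

def gini(labels):
    """GINI index of a single node: 1 minus the sum of squared class frequencies."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_split(left, right):
    """GINI index of a binary split: child GINI values weighted by child size."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# "play" labels from Table Q1-1, grouped by Windy (W)
windy_no = ["Yes", "No", "Yes", "No", "No", "Yes", "Yes", "Yes"]   # IDs 2, 5, 6, 8, 9, 11, 13, 15
windy_yes = ["No", "No", "No", "Yes", "No", "No", "No", "No"]      # IDs 1, 3, 4, 7, 10, 12, 14, 16

root_gini = gini(windy_no + windy_yes)        # 6 Yes and 10 No -> 0.46875
windy_gini = gini_split(windy_no, windy_yes)  # 0.34375
print(root_gini, windy_gini)
```

A split is useful when its weighted GINI is lower than the GINI of the parent node, as is the case for Windy here.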

(b) How do you measure the performance of a Decision Tree? What are the generalisation error and the resubstitution error? (3 marks)

(c) A golf player keeps a record of the weather conditions on the days on which they considered playing. Consider the set of records with four features (O, T, H, W) and a class (“play”) shown in Table Q1-1. What data type (nominal, ordinal, binary) are the attributes and the class?

Compare the two decision trees shown in Figure Q1-1 by computing the two estimates of the generalisation error based on the resubstitution error:

• the optimistic estimate, and
• the pessimistic estimate with a penalty term of 0.9. (5 marks)

(d) What is the meaning of the penalty term in estimating the generalisation error? For which values of the penalty term would the decision tree in Figure Q1-1.a have a smaller pessimistic estimate of the generalisation error than the one in Figure Q1-1.b? (4 marks)
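The two estimates in parts (c) and (d) can be sketched in Python as follows. This assumes the common textbook form, in which the optimistic estimate equals the resubstitution (training) error rate and the pessimistic estimate adds a fixed penalty per leaf node: (errors + penalty × leaves) / N. Figure Q1-1 is not reproduced here, so the error and leaf counts for the two trees below are hypothetical placeholders, not the values from the figure.

```python
def optimistic_error(train_errors, n_records):
    """Optimistic estimate: the resubstitution (training) error rate."""
    return train_errors / n_records

def pessimistic_error(train_errors, n_leaves, n_records, penalty=0.9):
    """Pessimistic estimate: training errors plus a penalty for every leaf."""
    return (train_errors + penalty * n_leaves) / n_records

def break_even_penalty(errors_a, leaves_a, errors_b, leaves_b):
    """Penalty at which both trees get the same pessimistic estimate.

    Only defined when the trees have different numbers of leaves.
    """
    return (errors_b - errors_a) / (leaves_a - leaves_b)

# Hypothetical trees on the 16 records (NOT read from Figure Q1-1)
N = 16
tree_a = {"errors": 4, "leaves": 3}   # smaller tree, more training errors
tree_b = {"errors": 2, "leaves": 6}   # larger tree, fewer training errors

opt_a = optimistic_error(tree_a["errors"], N)
opt_b = optimistic_error(tree_b["errors"], N)
pes_a = pessimistic_error(tree_a["errors"], tree_a["leaves"], N)
pes_b = pessimistic_error(tree_b["errors"], tree_b["leaves"], N)
threshold = break_even_penalty(tree_a["errors"], tree_a["leaves"],
                               tree_b["errors"], tree_b["leaves"])
print(opt_a, opt_b, pes_a, pes_b, threshold)
```

With these placeholder counts the larger tree looks better under the optimistic estimate, while the smaller tree wins under the pessimistic estimate once the penalty exceeds the break-even value, which illustrates how the penalty term trades training error against model complexity.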

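| ID | Outlook (O) | Temperature (T) | Humidity (H) | Windy (W) | play |
|----|-------------|-----------------|--------------|-----------|------|
| 1  | Overcast    | Cool            | High         | Yes       | No   |
| 2  | Overcast    | Cool            | Low          | No        | Yes  |
| 3  | Overcast    | Cool            | Low          | Yes       | No   |
| 4  | Overcast    | Cool            | Normal       | Yes       | No   |
| 5  | Overcast    | Hot             | High         | No        | No   |
| 6  | Overcast    | Hot             | Normal       | No        | Yes  |
| 7  | Overcast    | Mild            | Low          | Yes       | Yes  |
| 8  | Rainy       | Cool            | High         | No        | No   |
| 9  | Rainy       | Hot             | High         | No        | No   |
| 10 | Rainy       | Hot             | High         | Yes       | No   |
| 11 | Rainy       | Mild            | Normal       | No        | Yes  |
| 12 | Rainy       | Mild            | Normal       | Yes       | No   |
| 13 | Sunny       | Cool            | Normal       | No        | Yes  |
| 14 | Sunny       | Cool            | Normal       | Yes       | No   |
| 15 | Sunny       | Mild            | High         | No        | Yes  |
| 16 | Sunny       | Mild            | High         | Yes       | No   |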

Table Q1-1. Golf data

Figure Q1-1. Decision Trees

### 2. (a) Briefly discuss Cluster Analysis in general and, in particular, two types of clustering: partitional and hierarchical. (4 marks)

(b) Compare and contrast one algorithm for partitional clustering and one for hierarchical clustering in terms of advantages (at least two) and disadvantages (at least two), including their computational complexity. (6 marks)

(c) Consider the set of 6 data points in 2 dimensions (x, y) in the table in Figure Q2-1. Apply one iteration of the k-means algorithm (for k=2) to find the cluster allocation (C0 or C1) of each data point and the values of the centroids at the end of the iteration (it-1).

Which of the two alternative initialisations of the centroids (c0 and c1 at it-0) given below produced the best clustering according to the cost function (SSE) optimised by k-means?

Provide the results on the following page, together with a worked solution (formulas and your arithmetic calculations) for the values of the centroids at the end of the iteration (it-1), the cluster allocations, and the values of the cost function before and after the iteration (it-0 and it-1).

Figure Q2-1. The input data points
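One k-means iteration of the kind described in part (c) can be sketched in Python as below. The data points and initial centroids of Figure Q2-1 are not reproduced here, so `points` and `init` are hypothetical placeholders. The sketch also assumes that the cost at it-0 is the SSE of the nearest-centroid assignment under the initial centroids, and that the cost at it-1 uses the same assignment with the updated centroids.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign_points(points, centroids):
    """Assignment step: index of the nearest centroid for each point."""
    return [min(range(len(centroids)), key=lambda j: dist2(p, centroids[j]))
            for p in points]

def update_centroids(points, assignment, centroids):
    """Update step: each centroid becomes the mean of its cluster.

    An empty cluster keeps its previous centroid.
    """
    new = []
    for j, old in enumerate(centroids):
        members = [p for p, a in zip(points, assignment) if a == j]
        if members:
            new.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            new.append(old)
    return new

def sse(points, assignment, centroids):
    """Sum of squared errors: squared distance of each point to its centroid."""
    return sum(dist2(p, centroids[a]) for p, a in zip(points, assignment))

# Hypothetical data (NOT the points or centroids of Figure Q2-1)
points = [(1, 1), (1, 2), (2, 1), (5, 5), (6, 5), (5, 6)]
init = [(1, 1), (5, 5)]                      # c0 and c1 at it-0

assignment = assign_points(points, init)     # cluster index (0 or 1) per point
sse_it0 = sse(points, assignment, init)
centroids_it1 = update_centroids(points, assignment, init)
sse_it1 = sse(points, assignment, centroids_it1)
print(assignment, centroids_it1, sse_it0, sse_it1)
```

Running the same steps for each alternative initialisation and comparing the resulting SSE values is one way to decide which initialisation gives the better clustering; the lower SSE is preferred.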

### 3. An Association Rule is an implication expression of the form X → Y, where X and Y are disjoint itemsets.

(a) How many possible non-empty itemsets can be generated from a list of 12 unique items? How many non-redundant association rules can be generated from them? (5 marks)
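The counting behind part (a) can be checked with a short Python sketch. It assumes the standard closed forms for d items: 2^d - 1 non-empty itemsets, and 3^d - 2^(d+1) + 1 rules X → Y in which X and Y are non-empty and disjoint. Whether this rule count is exactly what the question means by "non-redundant" is an assumption. The function names are illustrative.

```python
from math import comb

def count_itemsets(d):
    """Number of non-empty itemsets over d items."""
    return 2 ** d - 1

def count_rules(d):
    """Closed form for the number of rules X -> Y with X, Y non-empty and disjoint."""
    return 3 ** d - 2 ** (d + 1) + 1

def count_rules_by_enumeration(d):
    """Same count, built up directly.

    Choose k items for the antecedent X, then any non-empty subset
    of the remaining d - k items for the consequent Y.
    """
    return sum(comb(d, k) * (2 ** (d - k) - 1) for k in range(1, d))

print(count_itemsets(12))   # 4095
print(count_rules(12))      # 523250
```

The enumeration function is included only as a cross-check that the closed form counts what the comment says it counts.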

(b) Is Association Rule Mining (ARM) a descriptive or predictive data mining approach? Explain ARM and the meaning of the data model it provides. (3 marks)

(c) What are the support and the confidence of an association rule? Describe these measures and provide their formulas. Given the transactions in Table Q3-1, what are the support and the confidence of the following four rules?

1. {steak} → {wine}