Final Examination, Semester 1, 2018
COMP30027 Machine Learning
Reading Time: 15 minutes. Writing Time: 2 hours.
This paper has 6 pages including this cover page.
Instructions to Invigilators:
Students should be provided with script books, and should answer all questions in the
provided script book. Students may not remove any part of the examination paper from the examination room.
Instructions to Students:
- There are 9 questions in the exam worth a total of 80 marks, making up 60% of the total assessment for the subject. Note that all questions should be answered, and questions are not of equal value.
- All questions should be interpreted as referring to the concepts as described in this subject, whether or not it is explicitly stated. Unless otherwise stated, you are required to show all working for numerical questions; please indicate your final answer clearly.
- Please answer all questions on the ruled pages in the script book provided, starting each numbered question on a new page.
- Please write your student ID in the space above and also on the front of each script book you use. When you are finished, place the exam paper inside the front cover of the script book.
- Your writing should be clear; illegible answers will not be marked.
Authorised Materials: No materials are authorised.
Calculators: Students are permitted to use calculators.
Library: This paper may be held by the Baillieu Library.
Examiners’ use only:
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Total |
21 | 9 | 5 | 9 | 7 | 5 | 7 | 4 | 13 | 80 |
Section A: Short Answer Questions [21 marks]
Answer each of the questions in this section as briefly as possible. Expect to answer each sub-question in a couple of lines.
Question 1: Short Answer Questions [21 marks]
- Explain the difference between “supervised” and “unsupervised” learning, and give an example of a typical method for each. [2 marks]
- What is the relationship between “instances” and “attributes” (also known as “features”)? [1 mark]
- Some Machine Learning methods rely on having “numerical” data. Give an example of how we can use such a learner with “categorical” data. [1 mark]
- When might the “softmax” function be used in a Machine Learning context? [1 mark]
- What might it look like, if we had a Machine Learning system with low “model bias”, but high “evaluation bias”? Why would such a situation be undesirable? [3 marks]
- Under what circumstances would a “neural network” be equivalent to a “logistic regression” model? [2 marks]
- How is “active learning” similar to “self-training”, and how are they different? [2 marks]
- How could a “linear regression” model be used for “classification”? Explain any important data transformations in such a context. [2 marks]
- We would like to evaluate a Machine Learning system on a development data set with 100 instances, where 80 instances are truly N and the rest are truly Y. Our system has labelled 20 of the truly N instances as Y, and 15 of the truly Y instances as N. Calculate the F-score, with respect to the Y class. [2 marks]
- For what kind of “structured classification” task would one typically use a “hidden Markov model”? What assumption(s) do we make about the accompanying data when using such a model? [3 marks]
- Use a diagram to show how “hierarchical clustering” is different to “partitional clustering”. Which does an “Expectation–Maximisation” method typically produce? [2 marks]
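The arithmetic behind the F-score sub-question above can be sketched as follows. This is only an illustration, assuming that “F-score” means the balanced F1 measure and that Y is treated as the positive class; the helper name `f_score` is hypothetical.

```python
def f_score(tp, fp, fn, beta=1.0):
    """F-beta score for the positive class, computed from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Counts read off the sub-question, with Y as the positive class:
# 100 - 80 = 20 instances are truly Y; 15 of them are labelled N,
# leaving 5 true positives and 15 false negatives.
# 20 truly N instances are labelled Y, giving 20 false positives.
tp, fp, fn = 5, 20, 15

score = f_score(tp, fp, fn)
print(round(score, 3))  # 0.222 (precision 0.2, recall 0.25)
```

Note that the 60 correctly labelled N instances do not enter the calculation, which is one reason F-score and accuracy can disagree on skewed data.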
Section B: Methodological Questions [23 marks]
In this section you are asked to demonstrate your conceptual understanding of a subset of the methods that we have studied in this subject.
Question 2: Decision Trees [9 marks]
- “Entropy” is a key concept when building a Decision Tree. Explain what it measures, and how it is used to build a tree. [3 marks]
- Explain how we use a (trained) Decision Tree to predict the class of a test instance. [2 marks]
- Naive Bayes and Nearest Prototype have a somewhat similar basis on which they predict the class of a test instance. Decision Trees can be quite different; explain. [2 marks]
- Decision Trees are the basis for a number of “ensemble methods”. Choose one, and explain: [2 marks]
(a) How it is different to a plain Decision Tree;
(b) What improvement(s) it would typically provide, over using a plain Decision Tree.
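To make the role of entropy concrete, here is a minimal sketch of entropy and information gain for a categorical attribute. It assumes the usual base-2 Shannon entropy; the toy attributes (`outlook`, `windy`) and labels are hypothetical and are not the exam's data set.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on values."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy data: 'outlook' separates the classes perfectly, 'windy' not at all.
outlook = ["sunny", "sunny", "rainy", "rainy"]
windy = ["yes", "no", "yes", "no"]
play = ["no", "no", "yes", "yes"]

gain_outlook = information_gain(outlook, play)
gain_windy = information_gain(windy, play)
print(gain_outlook, gain_windy)
```

A tree builder that chooses splits greedily by information gain would split on `outlook` first here, since it leaves pure child nodes.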
Question 3: Gradient Descent [5 marks]
- Give an example of where we might use the method of Gradient Descent, according to how it was described in this subject. [1 mark]
- Explain what Gradient Descent is used for, and what problem it is designed to solve, based on your answer to Question 3(1). [2 marks]
- The “learning rate” is a value that must be chosen; what is the risk of choosing a very small value for the learning rate? A very large value? [2 marks]
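The effect of the learning rate can be illustrated with a small sketch. It assumes a hypothetical one-dimensional error function E(w) = (w - 3)^2, which is far simpler than any model's real error surface; the function names and the specific rates are illustrative only.

```python
def gradient_descent(grad, w0, learning_rate, steps):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(steps):
        w = w - learning_rate * grad(w)
    return w

def grad(w):
    """Gradient of the toy error function E(w) = (w - 3)^2, minimised at w = 3."""
    return 2.0 * (w - 3.0)

w_small = gradient_descent(grad, 0.0, 0.001, 50)  # barely moves from 0 in 50 steps
w_good = gradient_descent(grad, 0.0, 0.1, 50)     # settles very close to 3
w_large = gradient_descent(grad, 0.0, 1.1, 50)    # overshoots further on every step

print(w_small, w_good, w_large)
```

With this toy function, each update multiplies the distance from the minimum by (1 - 2 × learning rate), so a rate above 1 makes the distance grow rather than shrink.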
Question 4: Support Vector Machines [9 marks]
- What is a “support vector”? How are support vectors used to predict the class of a test instance? [3 marks]
- For some datasets, a Support Vector Machine (SVM) can make the same predictions as Nearest Prototype (NP). Explain why, and explain under what circumstances an SVM would produce a superior model to NP. [3 marks]
- By referring to the “parameters” and/or “hyper-parameters” in a typical model, under what circumstances would an SVM be equivalent to Logistic Regression? (Assume that the data is non-trivial and not pathological.) [3 marks]
Section C: Numeric Questions [23 marks]
In this section you are asked to demonstrate your understanding of a subset of the methods that we have studied in this subject, in being able to perform numeric calculations. Questions 5 through 8 make use of the following training data set, with a single test instance labelled as ?:
ebc | ibu | label |
L | 1.00 | ale |
L | 0.10 | ale |
M | 0.15 | ale |
M | 0.45 | stout |
H | 0.30 | stout |
H | 0.45 | stout |
M | 0.11 | stout |
M | 0.80 | ? |
Question 5: Nearest Neighbour [7 marks]
- Choose a sensible distance metric based on the data above, and use it to predict the label of the test instance according to the method of 1-Nearest Neighbour. [4 marks]
- Using the results of the previous question, use the method of 5-Nearest Neighbour to predict the test instance, employing an inverse linear distance voting scheme (or you may use a different voting scheme for partial marks). [3 marks]
Question 6: Discretisation [5 marks]
Discretisation is used to transform a “numerical” attribute into a “categorical” attribute. According to the given training data, discretise ibu into three categories, using the following strategies:
- Equal-width [2 marks]
- k-means, with seeds 0.10, 0.30, and 0.45 [3 marks]
(You do not have to write all of the calculations for this question; a depiction of the process consistent with the correct calculations is sufficient.)
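The two strategies can be sketched in a few lines. This is an illustration only: it assumes the seven training values of ibu as they appear in the data set above, uses the lowest-index centroid to break ties, and the helper names `equal_width_edges` and `kmeans_1d` are hypothetical.

```python
# Training values of ibu as read from the data set (the test instance is excluded).
ibu = [1.00, 0.10, 0.15, 0.45, 0.30, 0.45, 0.11]

def equal_width_edges(values, k):
    """Interior bin boundaries that split [min, max] into k equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]

def kmeans_1d(values, seeds, max_iter=100):
    """One-dimensional k-means: assign to nearest centroid, then recompute means."""
    centroids = list(seeds)
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        new = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if new == centroids:  # assignments have stopped changing
            break
        centroids = new
    return centroids, clusters

edges = equal_width_edges(ibu, 3)
centroids, clusters = kmeans_1d(ibu, [0.10, 0.30, 0.45])
print(edges)
print(centroids, clusters)
```

The two strategies need not agree: equal-width boundaries depend only on the minimum and maximum, so a single extreme value stretches every bin, while k-means moves its centroids toward where the values actually concentrate.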
Question 7: Naive Bayes [7 marks]
- Using the data set before discretisation, predict the label of the test instance, using the method of Naive Bayes, consistent with this subject. (n.b. The use of “gaussians” is not) [4 marks]
- Using a discretised representation of the data, predict the test instance, using Naive Bayes with “Laplace smoothing”. [3 marks]
Question 8: Feature Selection [4 marks]
- Given this two-class problem, and a discretised representation of the data, construct the contingency matrices for ebc and ibu. [2 marks]
- Which one of these attributes would be preferred by a typical “feature filtering” method? (You do not have to show all of your working; a description consistent with the methods from this subject is sufficient.) [2 marks]
Section D: Design and Application Questions [13 marks]
In this section you are asked to demonstrate that you have gained a high-level understanding of the methods and algorithms covered in this subject, and can apply that understanding.
Expect to respond using about one-third of a page to one full page, for each of the three points below. These questions will require significantly more thought than those in Sections A–C and should be attempted only after having completed the earlier sections.
Question 9: Do customers like our services? [13 marks]
You have been contracted by a medium-sized business to help them to assess public sentiment regarding several of the services they provide. The business has noticed that many people mention their services in short messages, posted to a public micro-blogging platform.
You aren’t a text-processing expert (or are you?), but other data scientists on the team have provided you with a representation of the messages as an “embedding” (a dense, real-valued vector). At the start of your contract, a large number of messages have been collected, each of which is related to one of the corresponding services. However, only a handful of messages have been labelled as either positive, negative, or neutral in nature.
Each day, you have access to a few interns, who are capable of reading a few more messages and labelling the corresponding sentiment.
Your objective is to build a system that can allow the business to reliably assess current public opinion toward all of their services. You conjecture that a Machine Learning system can be constructed, which can label a large number of new messages related to each of the services, according to their sentiment as one of the three labels (positive, neutral, negative).
Detail the following points:
- the type of machine learner(s) that you will use, and why, making sure to state any required assumptions; [4 marks]
- how you will evaluate such a system, to demonstrate its effectiveness to date; [5 marks]
- how you will best make use of the interns (beyond having them purchase you coffee). [4 marks]
End of Exam