UNIVERSITY COLLEGE LONDON

Assignment Release Date: 2nd October 2019

Assignment Hand-in Date: 16th October 2019 at 11.55am

Format: Problems

Faculty of Engineering Sciences

Department of Computer Science

COMP0036: LSA – Assignment 2

Dr. Dariush Hosseini (dariush.hosseini@ucl.ac.uk)

Overview

  • Assignment Release Date: 2nd October 2019
  • Assignment Hand-in Date: 16th October 2019 at 11.55am
  • Format: Problems

Guidelines

  • You should answer all THREE questions.
  • Note that not all questions carry equal marks.
  • You should submit your final report as a pdf using Turnitin accessed via the module’s Moodle page.
  • Within your report you should begin each question on a new page.
  • You should preface your report with a single page containing, on two lines:
    • The module code: ‘COMP0036’
    • The assignment title: ‘LSA – Assignment 2’
  • Your report should be neat and legible.

You are strongly advised to use LaTeX to format the report. However, this is not a hard requirement, and you may submit a report formatted with the aid of another system, or even a handwritten report, provided that you have converted it to a pdf (see above).

  • Please attempt to express your answers as succinctly as possible.
  • Please note that if your answer to a question or sub-question is illegible or incomprehensible to the marker then you will receive no marks for that question or sub-question.
  • Please remember to detail your working, and state clearly any assumptions which you make.
  • Failure to adhere to any of the guidelines may result in question-specific deduction of marks. If warranted these deductions may be punitive.
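  1. Assume an unlabelled dataset, $\{\mathbf{x}^{(i)} \in \mathbb{R}^m\}_{i=1}^{n}$, with sample mean, $\bar{\mathbf{x}} = \frac{1}{n}\sum_{i=1}^{n} \mathbf{x}^{(i)}$, and an orthonormal basis set, $\{\mathbf{u}^{[j]} \in \mathbb{R}^m\}_{j=1}^{d}$, where d < m.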

We have investigated the ‘Projected Variance Maximisation’ approach to PCA, in which we are interested in finding the d-dimensional subspace spanned by $\{\mathbf{u}^{[j]}\}_{j=1}^{d}$ for which the sum of the sample variance of the data projected onto this subspace is maximised.

This leads to a formulation of the PCA problem as:

(1)

However there are several other approaches to PCA. One is the ‘Reconstruction Error Minimisation’ approach.

Here we are interested in finding the d-dimensional subspace spanned by $\{\mathbf{u}^{[j]}\}_{j=1}^{d}$ which minimises the reconstruction error, i.e.:

(2)

(a) [6 marks]

Show that problems (1) and (2) are equivalent.
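The claimed equivalence can be checked numerically before attempting a proof. The sketch below is illustrative only. It assumes the standard forms of the two objectives: the mean squared projection of the centred data onto the basis for problem (1), and the mean squared residual left after that projection for problem (2). It uses made-up 2-dimensional data with d = 1, and all function names are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def centre(points):
    """Subtract the sample mean from every point."""
    n = len(points)
    dim = len(points[0])
    mean = [sum(p[k] for p in points) / n for k in range(dim)]
    return [[p[k] - mean[k] for k in range(dim)] for p in points]

def projected_variance(centred, u):
    """Assumed objective (1): sample variance of the data projected onto u."""
    return sum(dot(u, x) ** 2 for x in centred) / len(centred)

def reconstruction_error(centred, u):
    """Assumed objective (2): mean squared residual after projecting onto u."""
    total = 0.0
    for x in centred:
        coeff = dot(u, x)
        total += sum((xk - coeff * uk) ** 2 for xk, uk in zip(x, u))
    return total / len(centred)

# Made-up correlated 2-D data with a non-zero mean.
random.seed(0)
raw = [(random.gauss(0, 3), random.gauss(0, 1)) for _ in range(200)]
data = [[a + 0.5 * b + 2.0, b - 1.0] for a, b in raw]

centred = centre(data)
total_variance = sum(dot(x, x) for x in centred) / len(centred)

for degrees in (0, 30, 60, 90):
    theta = math.radians(degrees)
    u = [math.cos(theta), math.sin(theta)]  # unit vector, so orthonormality holds trivially
    pv = projected_variance(centred, u)
    re = reconstruction_error(centred, u)
    print(degrees, round(pv, 4), round(re, 4), abs(pv + re - total_variance) < 1e-9)
```

Under these assumptions the two quantities sum to the total variance for every unit vector u, which does not depend on u. Maximising the first is therefore the same as minimising the second.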

(b) [4 marks]

In each analysis we were careful to centre our input data by effectively subtracting off the mean. Why is it important to centre the data in this way?

(c) [6 marks]

Show that we can re-write the objective of problem (1) as follows, and provide an expression for the matrix S:

(3)

(d) [4 marks]

If we were to replace S with the sample correlation matrix and then proceed to perform PCA with this objective, under what circumstances would this form of PCA differ from the covariance matrix version?

  2. Assume a set of unlabelled points, $\{\mathbf{x}^{(i)} \in \mathbb{R}^d\}_{i=1}^{n}$, which we wish to separate into k clusters.

We wish to use a ‘Spherical’ Gaussian Mixture Model (GMM) for this clustering task. Here we assume that each point has an unknown, latent, cluster assignment associated with it, z ∈ {1, …, k}, which is the outcome of a multinomial random variable, Z:

$Z \sim \text{Multinomial}(\pi^{[1]}, \dots, \pi^{[k]})$

Furthermore, contingent on the cluster assignment, z, we assume that each point, x, is the outcome of a Gaussian random variable, X:

$(X \mid z = j) \sim \mathcal{N}(\boldsymbol{\mu}^{[j]}, \Sigma^{[j]})$

Where:

µ[j] ∈ R^d is the mean associated with cluster j, and;

Σ[j] = sI is the covariance associated with each cluster j. Here s is a constant, and I denotes the d × d identity matrix.

(a) [6 marks]

Give an expression for the log likelihood of the dataset, $\{\mathbf{x}^{(i)}\}_{i=1}^{n}$, in terms of s and $\{\boldsymbol{\mu}^{[j]}, \pi^{[j]}\}_{j=1}^{k}$.

(b) [5 marks]

We wish to maximise this expression subject to the usual constraints using the EM algorithm.

Give an expression for the responsibilities, γi[j], in this case.

(c) [8 marks]

Explain why, as s → 0, the solution we generate from the spherical GMM EM clustering algorithm will tend towards the solution we would generate from the k-means clustering algorithm.
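The limiting behaviour can be observed numerically. The sketch below is a minimal illustration, assuming the usual E-step form in which the responsibility of cluster j for a point x is proportional to π[j] exp(−‖x − µ[j]‖² / (2s)). The shared Gaussian normalising factor cancels because every cluster has the same covariance sI. The function name, the two cluster means and the test point are all made up for illustration.

```python
import math

def responsibilities(x, means, weights, s):
    """Posterior cluster probabilities for one point under N(mu_j, s*I) components."""
    # Work in log space and subtract the maximum so that very small s does not underflow.
    logs = [
        math.log(w) - sum((a - b) ** 2 for a, b in zip(x, mu)) / (2.0 * s)
        for mu, w in zip(means, weights)
    ]
    top = max(logs)
    unnormalised = [math.exp(v - top) for v in logs]
    z = sum(unnormalised)
    return [u / z for u in unnormalised]

means = [[0.0, 0.0], [3.0, 0.0]]   # hypothetical cluster means
weights = [0.5, 0.5]               # hypothetical mixing proportions
x = [1.0, 0.5]                     # a point closer to the first mean

for s in (10.0, 1.0, 0.1, 0.01):
    print(s, [round(g, 4) for g in responsibilities(x, means, weights, s)])
```

As s shrinks, the responsibility vector approaches a one-hot assignment to the nearest mean, which is the hard assignment rule used by k-means.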

(d) [3 marks]

Describe and explain the form of the boundary which discriminates between clusters in this case.

(e) [3 marks]

Now consider the more usual GMM EM clustering algorithm, in which we assume that the covariances, Σ[j] > 0, are no longer constrained to take the isotropic form which we assumed earlier, and are in general distinct for each j.

Describe and explain the form of the boundary which discriminates between clusters in this case in general.

  3. (a) [4 marks]

Explain the importance of the Representer Theorem.

(b) [3 marks]

Consider the ℓ1-regularised Logistic regression optimisation problem:

Here:

$\{(\mathbf{x}^{(i)}, y^{(i)})\}_{i=1}^{n}$ represents a set of training data, where $\mathbf{x} \in \mathbb{R}^m$ are input attributes, while y ∈ {0, 1} is the output label;

$\mathbf{w} \in \mathbb{R}^m$ is the weight vector of the linear discriminant which we seek, and;

λ > 0 is some constant.

Is this problem susceptible to the kernel trick? Explain.

(c) [4 marks]

Consider the following clustering problem:

Here:

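$\{\mathbf{x}^{(i)}\}_{i=1}^{n}$ represents a set of unlabelled points, where $\mathbf{x} \in \mathbb{R}^m$ are input attributes;

$\{\boldsymbol{\mu}^{[j]} \in \mathbb{R}^m\}_{j=1}^{k}$ represent the k cluster centroids, and;

$\{\rho_i^{[j]} \in \{0, 1\}\}_{i,j=1}^{n,k}$ represent a set of assignment variables.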

Let us seek to perform this optimisation using the k-means algorithm.

Is the M-step of this procedure susceptible to the kernel trick? Explain.
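One way to probe this question is to separate what the centroid update produces from what the rest of the algorithm needs. The sketch below assumes the standard k-means objective (the sum of squared distances between each point and its assigned centroid) and shows that the squared feature-space distance from a point to a cluster mean can be expanded using kernel evaluations alone. A linear kernel is used so that the result can be compared against an explicitly computed centroid. All names and data are hypothetical.

```python
def linear_kernel(a, b):
    return sum(p * q for p, q in zip(a, b))

def kernel_distance_to_centroid(x, members, kernel):
    """Squared feature-space distance from x to the mean of `members`,
    computed from kernel evaluations only (no explicit centroid)."""
    n = len(members)
    self_term = kernel(x, x)
    cross_term = sum(kernel(x, m) for m in members) / n
    centroid_term = sum(kernel(a, b) for a in members for b in members) / (n * n)
    return self_term - 2.0 * cross_term + centroid_term

members = [[0.0, 0.0], [2.0, 0.0], [1.0, 3.0]]  # points currently assigned to one cluster
x = [4.0, 1.0]

# With a linear kernel the centroid is available explicitly, so the two routes can be compared.
centroid = [sum(m[k] for m in members) / len(members) for k in range(2)]
explicit = sum((a - b) ** 2 for a, b in zip(x, centroid))
implicit = kernel_distance_to_centroid(x, members, linear_kernel)
print(explicit, implicit)
```

With a non-linear kernel the explicit centroid may not be computable, but the kernel-only expansion still is. Whether that amounts to kernelising the M-step itself is the point the question asks you to argue.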

(d) [4 marks]

Assuming 2-dimensional input attributes, $\mathbf{x} = [x_1, x_2]^T$, what kernel function, κ, is associated with the following feature map, φ (express your answer in terms of vector dot products):

(e) [5 marks]

Assuming 1-dimensional input attributes, x, what feature map, φ : x ↦ φ(x), is associated with the following kernel, κ:

$\kappa(x^{(i)}, x^{(j)}) = \exp\left(\sigma^2 x^{(i)} x^{(j)}\right)$
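A candidate feature map can be sanity-checked numerically. The sketch below assumes the kernel reads κ(x, x′) = exp(σ² x x′) and tests one candidate suggested by the Taylor series of the exponential, with components (σx)^k / √(k!) for k = 0, 1, 2, …, truncated to a finite number of terms. The function names, the truncation length and the numerical values are illustrative assumptions.

```python
import math

def feature_map(x, sigma, terms):
    """First `terms` components of the candidate map: (sigma * x)^k / sqrt(k!)."""
    return [(sigma * x) ** k / math.sqrt(math.factorial(k)) for k in range(terms)]

def kernel(x, y, sigma):
    """Assumed kernel: exp(sigma^2 * x * y) for scalar inputs."""
    return math.exp(sigma ** 2 * x * y)

sigma, x, y = 0.8, 1.5, -0.7
phi_x = feature_map(x, sigma, 30)
phi_y = feature_map(y, sigma, 30)
approx = sum(a * b for a, b in zip(phi_x, phi_y))
exact = kernel(x, y, sigma)
print(approx, exact)
```

The inner product of the truncated maps is the partial sum of the series for exp(σ² x x′), so the two printed values should agree closely for moderate inputs.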
