COMP5434 (Fall 2019) Big Data Computing

Individual Assignment 2 Due Date: 10:00am, 2nd December, 2019

Please submit your assignment on Blackboard
and follow the requirements in Section 2.

1. Problem statement

A sample input file is given below. Each line corresponds to a point-of-interest (POI) and contains a keyword followed by the coordinate values x and y (separated by white space).

park 3 5

lake 2 3

mall 1 4

park 2 4

lake 9 8

mall 2 7

We measure the distance between two points $p_1 = (x_1, y_1)$ and $p_2 = (x_2, y_2)$ by:

$$\mathrm{dist}(p_1, p_2) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$
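
The distance formula above can be sketched in plain Java as follows. This is only an illustration: the class name PoiDistance and the method dist are hypothetical and not part of the assignment. Math.hypot from the standard library is one possible way to evaluate the square root of the sum of squares.

```java
// Hypothetical helper for illustration; the name PoiDistance is an assumption.
final class PoiDistance {
    // Euclidean distance between the points (x1, y1) and (x2, y2).
    static double dist(double x1, double y1, double x2, double y2) {
        // Math.hypot(a, b) returns sqrt(a*a + b*b) while avoiding
        // intermediate overflow or underflow.
        return Math.hypot(x1 - x2, y1 - y2);
    }

    public static void main(String[] args) {
        // Distance between the two "park" points of the sample input,
        // (3,5) and (2,4); the exact value is sqrt(2).
        System.out.println(dist(3, 5, 2, 4));
    }
}
```

For the two "park" points this evaluates to the square root of 2, roughly 1.414.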

Each keyword k is associated with a group G(k) of points.

[Example] The group of “park” contains two points: (3,5) and (2,4).
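
To picture how the groups G(k) arise, here is a minimal plain-Java sketch that parses input lines and collects the points per keyword. It is not Hadoop code; it only imitates what a MapReduce job would do if a mapper emitted the keyword as the key and the coordinates as the value. The class name PoiGrouping and the method group are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not part of the assignment's required structure.
final class PoiGrouping {
    // Parses lines of the form "<keyword> <x> <y>" and groups points by keyword.
    static Map<String, List<double[]>> group(List<String> lines) {
        // TreeMap keeps the keywords in sorted order.
        Map<String, List<double[]>> groups = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = trimmed.split("\\s+");
            double x = Double.parseDouble(parts[1]);
            double y = Double.parseDouble(parts[2]);
            groups.computeIfAbsent(parts[0], k -> new ArrayList<>())
                  .add(new double[] {x, y});
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "park 3 5", "lake 2 3", "mall 1 4",
            "park 2 4", "lake 9 8", "mall 2 7");
        for (Map.Entry<String, List<double[]>> e : group(sample).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue().size() + " points");
        }
    }
}
```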

There are 2 questions in this programming assignment.
You should write a MapReduce program to solve each of them.

Question Q1: Find the centroid (i.e., the mean position of points) of each group.

[Example]

Input: the sample input above

Output:

lake 5.5 5.5

mall 1.5 5.5

park 2.5 4.5
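
As an illustration of the arithmetic behind Q1, the following plain-Java sketch keeps a running sum of x, a running sum of y, and a count per keyword, then divides. It is not a MapReduce program and uses no Hadoop classes. In a single MapReduce round, one possible design (an assumption, not the required solution) is for the mapper to emit the keyword with its coordinates and for the reducer to do this summing and dividing. The class name CentroidSketch and the method centroids are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of the centroid computation; not Hadoop code.
final class CentroidSketch {
    // Returns keyword -> {meanX, meanY} for lines of the form "<keyword> <x> <y>".
    static Map<String, double[]> centroids(List<String> lines) {
        // keyword -> {sumX, sumY, count}
        Map<String, double[]> sums = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length < 3) {
                continue; // skip blank or malformed lines
            }
            double[] acc = sums.computeIfAbsent(parts[0], k -> new double[3]);
            acc[0] += Double.parseDouble(parts[1]);
            acc[1] += Double.parseDouble(parts[2]);
            acc[2] += 1;
        }
        Map<String, double[]> result = new TreeMap<>();
        for (Map.Entry<String, double[]> e : sums.entrySet()) {
            double[] acc = e.getValue();
            result.put(e.getKey(), new double[] {acc[0] / acc[2], acc[1] / acc[2]});
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "park 3 5", "lake 2 3", "mall 1 4",
            "park 2 4", "lake 9 8", "mall 2 7");
        for (Map.Entry<String, double[]> e : centroids(sample).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue()[0] + " " + e.getValue()[1]);
        }
    }
}
```

On the sample input this yields (5.5, 5.5) for lake, (1.5, 5.5) for mall, and (2.5, 4.5) for park, matching the example output.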

Question Q2: Find the diameter (i.e., the maximum distance between any two points inside a group) of each group.

[Example]

Input: the sample input above

Output:

lake 8.602

mall 3.162

park 1.414
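
The definition of the diameter can be illustrated with a brute-force check over all pairs of points in one group. The sketch below is plain Java with hypothetical names (DiameterSketch, diameter). It assumes that all points of a group fit in memory and takes quadratic time in the group size, so it shows the definition rather than a tuned solution.

```java
// Hypothetical illustration of the diameter definition; not Hadoop code.
final class DiameterSketch {
    // points[i] = {x, y}; returns the largest pairwise Euclidean distance.
    // A group with fewer than two points has diameter 0.
    static double diameter(double[][] points) {
        double best = 0.0;
        for (int i = 0; i < points.length; i++) {
            for (int j = i + 1; j < points.length; j++) {
                double d = Math.hypot(points[i][0] - points[j][0],
                                      points[i][1] - points[j][1]);
                best = Math.max(best, d);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // The "lake" group of the sample input: (2,3) and (9,8).
        double[][] lake = {{2, 3}, {9, 8}};
        // Printing the double directly keeps its full precision.
        System.out.println("lake " + diameter(lake));
    }
}
```

For the lake group the result is the square root of 74, about 8.602. The example output shows three decimals, whereas printing the double directly keeps all of its digits.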

2. Requirements

Though MapReduce supports multiple languages, in this assignment you should use Java (Java 8) for implementation.
Your submission should be organized as follows:

<YourStudentID> // your folder name, [Example] 19001234g

— Q1.java // source file for question 1

— Q1.jar // jar file for question 1, compiled and archived from Q1.java

— Q2.java // source file for question 2

— Q2.jar // jar file for question 2, compiled and archived from Q2.java

Archive the above structure as <YourStudentID>.zip and submit this .zip file on Blackboard. [Example] 19001234g.zip
Make sure that you can compile your source files and run them in the pseudo-distributed mode of the latest Hadoop version (i.e., Hadoop 3.2.1).
Your jar files should be directly runnable on a Linux platform with the following calls:

bin/hadoop jar Q1.jar Q1 <input path> <output path>

bin/hadoop jar Q2.jar Q2 <input path> <output path>

Your output result should preserve double precision.
You should only use one MapReduce round to solve each sub-question.
[Hint] You may use the Ubuntu image we provided for this assignment.

-Google drive:

https://drive.google.com/file/d/1lMqmTAj2sC2gVqkVWW-MDUR24vv-a3Si/view?usp=sharing

-The Y drive in COMP Lab: Y:\Subject\COMP5434
Note: These files will expire on November 7!

3. Grading criteria

20 marks will be given if your program can be compiled.

-For each .java file, 10 marks

80 marks will be given if your program is correct. We will test the correctness of your program by using 8 test cases (4 for each sub-question).

-For each test case, 10 marks

Note that this is an individual assignment. Plagiarism will result in 0 marks!
