
Big Data Computing Assignment (COMP5434)

Posted: 2021-07-29 17:26 (Thursday)


COMP5434 (Fall 2019) Big Data Computing

 


 

Individual Assignment 2       Due Date: 10:00am, 2nd December, 2019

Please submit your assignment on Blackboard and follow the requirements in Section 2.

1. Problem statement

A sample input file is given below. Each line corresponds to a point-of-interest (POI) and contains a keyword followed by the coordinate values x and y (separated by white space).

park 3 5

lake 2 3

mall 1 4

park 2 4

lake 9 8

mall 2 7

We measure the distance between two points p1=(x1,y1) and p2=(x2,y2) by:

$$\mathrm{dist}(p_1, p_2) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$
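As a worked instance of the distance measure, the following is a minimal Java sketch. The class name `PoiDistance` and the method name `dist` are hypothetical and chosen for illustration only; they are not prescribed by the assignment.

```java
// Hypothetical helper; the class and method names are illustrative assumptions.
class PoiDistance {
    /** Euclidean distance between p1 = (x1, y1) and p2 = (x2, y2). */
    static double dist(double x1, double y1, double x2, double y2) {
        double dx = x1 - x2;
        double dy = y1 - y2;
        return Math.sqrt(dx * dx + dy * dy);
    }

    public static void main(String[] args) {
        // The two "lake" points of the sample input: (2,3) and (9,8).
        // (2-9)^2 + (3-8)^2 = 49 + 25 = 74, so the distance is sqrt(74), about 8.602.
        System.out.println(dist(2, 3, 9, 8));
    }
}
```

For the two "lake" points this gives sqrt(74), which matches the value 8.602 shown for "lake" in the Q2 example.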

Each keyword k is associated with a group G(k) of points.

[Example] The group of “park” contains two points: (3,5) and (2,4).
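The grouping G(k) can be illustrated outside Hadoop with plain Java collections. The sketch below is only an illustration under assumptions: the class name `PoiGroups` is hypothetical, and silently skipping blank or malformed lines is a choice made here, not something the assignment specifies. In a MapReduce program, a comparable grouping would typically come from the shuffle phase when the mapper emits the keyword as the key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of G(k); not part of the assignment's required files.
class PoiGroups {
    /** Parses lines of the form "keyword x y" and groups the points by keyword. */
    static Map<String, List<double[]>> group(List<String> lines) {
        Map<String, List<double[]>> groups = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length != 3) {
                continue; // assumption: ignore blank or malformed lines
            }
            double x = Double.parseDouble(parts[1]);
            double y = Double.parseDouble(parts[2]);
            groups.computeIfAbsent(parts[0], k -> new ArrayList<>()).add(new double[] {x, y});
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "park 3 5", "lake 2 3", "mall 1 4", "park 2 4", "lake 9 8", "mall 2 7");
        for (Map.Entry<String, List<double[]>> e : group(sample).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue().size() + " point(s)");
        }
    }
}
```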

There are 2 questions in this programming assignment.
You should write a MapReduce program to solve each of them.

Question Q1: Find the centroid (i.e., the mean position of points) of each group.

[Example]

Input: the sample input above

Output:

lake  5.5  5.5

mall  1.5  5.5

park  2.5  4.5
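The centroid arithmetic in this example can be checked with a small plain-Java sketch that does not use Hadoop. It keeps a running sum of x, sum of y, and count per keyword, then divides. The class name `CentroidSketch` is hypothetical, and this is a sketch of the computation only, not the required MapReduce program; in a MapReduce setting the accumulation would roughly correspond to what a reducer does for each keyword.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Hadoop-free illustration of the Q1 arithmetic.
class CentroidSketch {
    /** Returns keyword -> {meanX, meanY}, with keywords in sorted order. */
    static Map<String, double[]> centroids(List<String> lines) {
        // keyword -> {sumX, sumY, count}
        Map<String, double[]> acc = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length != 3) {
                continue; // assumption: ignore blank or malformed lines
            }
            double[] a = acc.computeIfAbsent(parts[0], k -> new double[3]);
            a[0] += Double.parseDouble(parts[1]);
            a[1] += Double.parseDouble(parts[2]);
            a[2] += 1;
        }
        Map<String, double[]> result = new TreeMap<>();
        for (Map.Entry<String, double[]> e : acc.entrySet()) {
            double[] a = e.getValue();
            result.put(e.getKey(), new double[] {a[0] / a[2], a[1] / a[2]});
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "park 3 5", "lake 2 3", "mall 1 4", "park 2 4", "lake 9 8", "mall 2 7");
        for (Map.Entry<String, double[]> e : centroids(sample).entrySet()) {
            System.out.println(e.getKey() + "  " + e.getValue()[0] + "  " + e.getValue()[1]);
        }
    }
}
```

On the sample input, "lake" has the points (2,3) and (9,8), so its centroid is ((2+9)/2, (3+8)/2) = (5.5, 5.5).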


Question Q2: Find the diameter (i.e., the maximum distance between any two points inside a group) of each group.

[Example]

Input: the sample input above

Output:

lake  8.602

mall  3.162

park  1.414
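The diameter values in this example can be reproduced with a brute-force check over all pairs of points in a group. The sketch below is plain Java without Hadoop; the class name `DiameterSketch` is hypothetical, returning 0 for a group with a single point is an assumption, and the pairwise loop is quadratic in the group size, so it is meant as a correctness reference rather than a tuned solution.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical, Hadoop-free illustration of the Q2 arithmetic.
class DiameterSketch {
    /** Maximum Euclidean distance between any two points; 0 if fewer than two points. */
    static double diameter(List<double[]> points) {
        double best = 0.0;
        for (int i = 0; i < points.size(); i++) {
            for (int j = i + 1; j < points.size(); j++) {
                double dx = points.get(i)[0] - points.get(j)[0];
                double dy = points.get(i)[1] - points.get(j)[1];
                best = Math.max(best, Math.sqrt(dx * dx + dy * dy));
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<double[]> lake = Arrays.asList(new double[] {2, 3}, new double[] {9, 8});
        List<double[]> mall = Arrays.asList(new double[] {1, 4}, new double[] {2, 7});
        List<double[]> park = Arrays.asList(new double[] {3, 5}, new double[] {2, 4});
        // Rounded to three decimals only to compare against the sample output.
        System.out.println("lake  " + String.format(Locale.ROOT, "%.3f", diameter(lake)));
        System.out.println("mall  " + String.format(Locale.ROOT, "%.3f", diameter(mall)));
        System.out.println("park  " + String.format(Locale.ROOT, "%.3f", diameter(park)));
    }
}
```

The rounding in `main` only mirrors the three-decimal sample output; `diameter` itself returns the unrounded `double`.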

2. Requirements

  1. Though MapReduce supports multiple languages, in this assignment you should use Java (Java 8) for the implementation.
  2. Your submission should be organized as follows:

<YourStudentID> // your folder name, [Example] 19001234g

- Q1.java              // source file for question 1

- Q1.jar               // jar file for question 1, compiled and archived from Q1.java

- Q2.java              // source file for question 2

- Q2.jar               // jar file for question 2, compiled and archived from Q2.java

  3. Archive the above structure as <YourStudentID>.zip and submit this .zip file on Blackboard. [Example]zip
  4. Make sure that your source file compiles and runs in pseudo-distributed mode on the latest Hadoop version (i.e., Hadoop 3.2.1).
  5. Your jar file should be directly runnable on a Linux platform with the following calls:

bin/hadoop jar Q1.jar Q1 <input path> <output path>

bin/hadoop jar Q2.jar Q2 <input path> <output path>

  6. Your output should preserve double precision.
  7. You should use only one MapReduce round to solve each sub-question.
  8. [Hint] You may use the Ubuntu image we provide for this assignment.

- Google Drive:

https://drive.google.com/file/d/1lMqmTAj2sC2gVqkVWW-MDUR24vv-a3Si/view?usp=sharing

- The Y drive in the COMP Lab: Y:\Subject\COMP5434
       Note: These files will expire on November 7!

3. Grading criteria

20 marks will be given if your program can be compiled.

- For each .java file, 10 marks

80 marks will be given if your program is correct. We will test the correctness of your program using 8 test cases (4 for each sub-question).

- For each test case, 10 marks

Note that this is an individual assignment. Plagiarism will result in 0 marks!
