
ISIT312 Big Data Management

Posted: 2024-10-17 19:34:58

School of Computing & Information Technology

Session: 4, 2024


SIM S4 2024

Assignment 1

Scope

The objectives of Assignment 1 include implementation of HDFS applications, implementation of simple MapReduce applications, and describing an implementation of complex MapReduce applications.

This assignment is due on 20 October 2024 by 9:00 pm Singapore Time (SGT).

This assignment is worth 10% of the total evaluation in the subject.

The assignment consists of 3 tasks and the specification of each task starts from a new page.

Only electronic submission through Moodle at https://moodle.uowplatform.edu.au/login/index.php will be accepted. A submission procedure is explained at the end of the Assignment 1 specification.

A policy regarding late submissions is included in the subject outline. Only one submission of Assignment 1 is allowed and only one submission per student is accepted.

A late submission penalty (25% of the total mark) will be applied for every 24 hours late.

A submission that contains an incorrect file attached is treated as a correct submission with all consequences coming from the evaluation of the file attached.

All files left on Moodle in a state "Draft (not submitted)" will not be evaluated.

An implementation that does not compile or run correctly because of one or more syntactical and/or run-time errors scores no marks.

The first assignment is an individual assignment and it is expected that all its tasks will be solved individually without any cooperation with other students. However, it is allowed to declare in the submission comments that a particular component or task of this assignment has been implemented in cooperation with another student. In such a case, the evaluation of the task or component may be shared with that student. In all other cases plagiarism will result in a FAIL grade being recorded for the entire assignment. If you have any doubts or questions, please consult your lecturer or tutor during laboratory/tutorial classes or over e-mail.

Task 1 (3 marks)

Implementation of HDFS application

Implement an HDFS application that merges two files located in HDFS into one file also located in HDFS.

The application must have the following parameters.

(1) A path to, and a name of the first input file in HDFS.

(2) A path to, and a name of the second input file in HDFS.

(3) A path to, and a new name of an output file to be created in HDFS. The file is supposed to contain the contents of the first input file followed by the contents of the second input file.

Perform the following steps.

Implement the application and save its source code in a file solution1.java.

Upload two files to HDFS. The contents, the names, and the locations of the files in HDFS are up to you.

When ready, compile, create a jar file, and process your application. Display the results created by the application.

Use Hadoop to provide a piece of evidence that the two files uploaded into HDFS have been successfully merged into one file in HDFS.
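The merge itself amounts to copying the bytes of the first input and then the bytes of the second input into one output stream. The following is a minimal sketch of that logic in plain Java on the local file system, not an HDFS application: the class name MergeSketch and the methods copy and merge are hypothetical, and java.nio.file stands in for HDFS. In an HDFS version the streams would presumably be obtained from Hadoop's FileSystem.open(Path) and FileSystem.create(Path) instead; how the solution is actually structured is left to the implementer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch: concatenates two LOCAL files into a third one.
// An HDFS version would open its streams through Hadoop's FileSystem API instead.
class MergeSketch {

    // Copies every byte from 'in' to 'out' using a fixed-size buffer.
    static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
        }
    }

    // Writes the contents of 'first' followed by the contents of 'second' into 'output'.
    static void merge(Path first, Path second, Path output) throws IOException {
        try (OutputStream out = Files.newOutputStream(output)) {
            for (Path input : new Path[] {first, second}) {
                try (InputStream in = Files.newInputStream(input)) {
                    copy(in, out);
                }
            }
        }
    }

    // The three parameters mirror the ones listed in the task: two inputs, one output.
    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("Usage: MergeSketch <first> <second> <output>");
            System.exit(1);
        }
        merge(Paths.get(args[0]), Paths.get(args[1]), Paths.get(args[2]));
    }
}
```

The output stream is opened once and the inputs are read in order, so the second file's contents always follow the first file's contents, as parameter (3) requires.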

Deliverables

A file solution1.java with the source code of the application that merges two HDFS files.

A file solution1.pdf that contains the contents of the Terminal window with a report from compilation, creation of the jar file, uploading to HDFS two small files for testing, processing of the application, and evidence that the two files uploaded into HDFS have been successfully merged into one file in HDFS, with an explanation of how the statements work.

Task 2 (4 marks)

Implementation of MapReduce application

Assume that a speed camera records the speed of passing cars and saves the measurements in a text file. The speed of each car is measured in kilometres per hour. A single row in the file contains a car registration number, a location of the camera, a date when the speed has been measured, and the speed of a car with the recorded registration number. The values are always separated with a single blank.

For example, a sample file (SpeedCamera.txt) with the speed measurements contains the following lines:

PKR856 AYE 14-NOV-2021 80

UPS234 CTE 20-FEB-2022 110

PKR856 PIE 20-MAR-2020 90

PKR856 PIE 17-JUN-2021 100

UPS234 CTE 22-SEP-2022 100

UPS234 CTE 03-AUG-2020 90
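Each row above splits into four fields on the single blank. A minimal sketch of that parsing step follows; the class name SpeedRecord, its field names, and the method parse are hypothetical and chosen only for illustration.

```java
// Hypothetical holder for one speed-camera measurement.
class SpeedRecord {
    final String registration;
    final String location;
    final String date;  // kept as text, e.g. "14-NOV-2021"
    final int speed;    // kilometres per hour

    SpeedRecord(String registration, String location, String date, int speed) {
        this.registration = registration;
        this.location = location;
        this.date = date;
        this.speed = speed;
    }

    // Splits a row on single blanks and converts the last field to an int.
    static SpeedRecord parse(String line) {
        String[] fields = line.trim().split(" ");
        if (fields.length != 4) {
            throw new IllegalArgumentException("Expected 4 fields: " + line);
        }
        return new SpeedRecord(fields[0], fields[1], fields[2], Integer.parseInt(fields[3]));
    }

    public static void main(String[] args) {
        SpeedRecord r = parse("PKR856 AYE 14-NOV-2021 80");
        // prints "PKR856 at AYE: 80 km/h"
        System.out.println(r.registration + " at " + r.location + ": " + r.speed + " km/h");
    }
}
```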

Assume that the speed limit in the location of the speed camera is 90 kilometres per hour.

Your task is to implement a MapReduce application that finds the average speed of all cars that exceeded the speed limit in the location of the speed camera.

An input file with the speed measurements must include the lines listed above and it must contain at least 20 measurements. All additional measurements are up to you.

Save your solution in a file solution2.java.

When ready, compile, create a jar file, and process your application. Display the results created by the application. The result of your application includes (1) the content of your input file, (2) the car registration number, the location of the camera, and the average speed that exceeds the speed limit. When finished, copy and paste the messages from the Terminal screen into a file solution2.pdf.
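One way to read the task is as a filter in the map phase followed by an average in the reduce phase. The sketch below imitates both phases in memory in plain Java and is not a Hadoop job. It rests on two assumptions that the specification does not spell out: that "exceeded" means strictly greater than 90, and that the grouping key is the registration number together with the camera location. The class name SpeedAverageSketch and the methods mapPhase and reducePhase are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory imitation of the map and reduce phases; not a Hadoop job.
class SpeedAverageSketch {
    static final int SPEED_LIMIT = 90; // km/h, taken from the task statement

    // "Map" phase: keep only speeds above the limit (assumed strict) and
    // group them under the key "registration location" (assumed key).
    static Map<String, List<Integer>> mapPhase(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            String[] f = line.trim().split(" ");
            int speed = Integer.parseInt(f[3]);
            if (speed > SPEED_LIMIT) {
                grouped.computeIfAbsent(f[0] + " " + f[1], k -> new ArrayList<>()).add(speed);
            }
        }
        return grouped;
    }

    // "Reduce" phase: average the speeds collected for each key.
    static Map<String, Double> reducePhase(Map<String, List<Integer>> grouped) {
        Map<String, Double> averages = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            long sum = 0;
            for (int s : e.getValue()) {
                sum += s;
            }
            averages.put(e.getKey(), (double) sum / e.getValue().size());
        }
        return averages;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "PKR856 AYE 14-NOV-2021 80",
            "UPS234 CTE 20-FEB-2022 110",
            "PKR856 PIE 20-MAR-2020 90",
            "PKR856 PIE 17-JUN-2021 100",
            "UPS234 CTE 22-SEP-2022 100",
            "UPS234 CTE 03-AUG-2020 90");
        // prints "PKR856 PIE 100.0" and then "UPS234 CTE 105.0"
        reducePhase(mapPhase(lines)).forEach((k, v) -> System.out.println(k + " " + v));
    }
}
```

With the six sample rows, only three measurements are above 90: one for PKR856 at PIE (100) and two for UPS234 at CTE (110 and 100), which is why the averages come out as 100.0 and 105.0 under these assumptions.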

A sample output of the application is as follows:

Deliverables

A file solution2.java with the source code of the application that implements the functionality of the problem statement specified above. A file solution2.pdf with a report from the compilation of your code, the creation of the jar file, the processing of your application, the listing of your input file with the speed measurements, and the results of processing solution2.java.

Task 3 (3 marks)

Implementation of MapReduce application

An application MinMax described in Exercise 2 has the same functionality as the following SQL statement.

SELECT key, MIN(value), MAX(value)

FROM Sequence-of-key-value-pairs

GROUP BY key;

Extend the Java code of the application such that it implements the same functionality as the following SQL statement.

SELECT key, MAX(value), MIN(value), AVG(value), SUM(value)

FROM Sequence-of-key-value-pairs

GROUP BY key;
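The four aggregates in the statement above can all be maintained in a single pass over the values of a key, since AVG follows from SUM and a count. The sketch below shows that bookkeeping in memory in plain Java; it is not the MinMax code from Exercise 2 and not a Hadoop job. The class names AggregateSketch and Stats are hypothetical, the values are assumed to be whole numbers, and the input rows are made up because the format of sales.txt is not shown in this specification.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory sketch of per-key MAX, MIN, AVG and SUM; not the Exercise 2 code.
class AggregateSketch {

    // Running aggregates for one key.
    static class Stats {
        long max = Long.MIN_VALUE;
        long min = Long.MAX_VALUE;
        long sum = 0;
        long count = 0;

        void add(long value) {
            max = Math.max(max, value);
            min = Math.min(min, value);
            sum += value;
            count++;
        }

        // AVG is derived from SUM and the number of values seen.
        double avg() {
            return (double) sum / count;
        }
    }

    // Groups "key value" rows by key and folds each value into that key's Stats.
    static Map<String, Stats> aggregate(Iterable<String> lines) {
        Map<String, Stats> byKey = new TreeMap<>();
        for (String line : lines) {
            String[] f = line.trim().split("\\s+");
            byKey.computeIfAbsent(f[0], k -> new Stats()).add(Long.parseLong(f[1]));
        }
        return byKey;
    }

    public static void main(String[] args) {
        // Made-up rows for illustration only.
        List<String> lines = Arrays.asList("bolt 10", "bolt 30", "nut 5");
        // prints "bolt 30 10 20.0 40" and then "nut 5 5 5.0 5" (key, MAX, MIN, AVG, SUM)
        aggregate(lines).forEach((k, s) ->
            System.out.println(k + " " + s.max + " " + s.min + " " + s.avg() + " " + s.sum));
    }
}
```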

Save your solution in a file solution3.java.

When ready, compile, create the jar file, and process your application. To test your application, you can use a file sales.txt included in the zipped file of this specification.

Display the results created by the application. When finished, copy and paste the messages from the Terminal screen into a file solution3.pdf.

A sample output of the application is as follows:

Deliverables

A file solution3.java with the source code of the application that implements the functionality of the SELECT statement given above. A file solution3.pdf with a report from compilation, creation of the jar file, processing of your application, and screen captures of the results of processing solution3.java.

Submission of Assignment 1

Note that you have only one submission. So, make absolutely sure that you submit the correct files with the correct contents. Please submit an Academic Consideration in SOLS if an extension (of at most 1 week) is required.

Please combine the files solution1.pdf, solution2.pdf, and solution3.pdf into a single pdf (solutions.pdf) first, then zip the files solutions.pdf, solution1.java, solution2.java, and solution3.java into a single zipped file (A1-solutions.zip). Please submit the zipped file through Moodle in the following way:

(1) Access Moodle at http://moodle.uowplatform.edu.au/

(2) To log in, use a Login link located in the upper right corner of the Web page or in the middle of the bottom of the Web page

(3) When logged in, select a site ISIT312 (SP424) Big Data Management

(4) Scroll down to a section SUBMISSIONS

(5) Click at Assignment 1 link.

(6) Click at a button Add Submission

(7) Move the zipped file A1-solutions.zip into an area You can drag and drop files here to add them. You can also use a link Add…

(9) Click at a button Save changes

(10) Click at a button Submit assignment

(11) Click at the checkbox with a text attached: By checking this box, I confirm that this submission is my own work, … in order to confirm authorship of your submission.

(12) Click at a button Continue

End of specification

From: https://www.cnblogs.com/comp9313/p/18471830
