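Download chapter5-data1.txt from the "Datasets" section of the "Download Area" on this tutorial's official website. The dataset contains the grades of the computer science department of a university. The data format is as follows:
(1) How many students are there in the department;
(2) How many courses does the department offer;
(3) What is Tom's average score across all of his courses;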
(4) The number of courses each student is taking; the result is:
(Ford,3) (Enoch,3) (Kim,4) (Conrad,2) (Marvin,3) (Michael,5) (Ernest,5) (Marsh,4) (Duke,4) (Armand,3) (Lester,4) (Broderick,3) (Hayden,3) (Bertram,3) (Bart,5) (Duncann,5) (Colby,4) (Bowen,5) (Elmer,4) (Elvis,2) (Adair,3) (Roderick,4) (Walter,4) (Jonathan,4) (Jo,5) (Rod,4) (Scott,3) (Elliot,3) (Alvis,6) (Joseph,3) (Geoffrey,4) (Todd,3) (Wordsworth,4) (Wright,4) (Adam,3) (Sandy,1) (Ben,4) (Clyde,7) (Mark,7) (Dempsey,4) (Rock,6) (Ellis,4) (Edward,4) (Eugene,1) (Samuel,4) (Gerald,4) (Luthers,5) (Virgil,5) (Bradley,2) (Dick,3) (Bevis,4) (Merlin,5) (Armstrong,2) (Ron,6) (Archer,5) (Nick,5) (Hogan,4) (Len,5) (Benson,4) (Colbert,4) (John,6) (Saxon,7) (Marico,6) (Kevin,4) (Uriah,1) (Aldrich,3) (Jeffrey,4) (Brook,4) (Nicholas,5) (Elijah,4) (Bill,2) (Greg,4) (Payne,6) (Colin,5) (Gordon,4) (Tracy,3) (Alston,4) (George,4) (Griffith,4) (Andrew,4) (Egbert,4) (Bishop,2) (Beck,4) (Gilbert,3) (Phil,3) (Antony,5) (Nelson,5) (Christ,2) (Bruce,3) (Rodney,3) (Boris,6) (Marlon,4) (Don,2) (Aaron,4) (Sean,6) (Truman,3) (Solomon,5) (Blake,4) (Christopher,4) (Clare,4) (Milo,2) (Victor,2) (Nigel,3) (Jonas,4) (Jason,4) (Hilary,4) (Woodrow,3) (William,6) (Dennis,4) (Jeff,4) (Dominic,4) (Merle,3) (Elroy,5) (Harvey,7) (Clark,6) (Herman,3) (Bert,3) (Alger,5) (Hiram,6) (Leonard,2) (Kenneth,3) (Leopold,7) (Eric,4) (Basil,4) (Martin,3) (Clarence,7) (Bernie,3) (Vincent,5) (Christian,2) (Winfred,3) (Lionel,4) (Bob,3) (Bartholomew,5) (Lennon,4) (Joshua,4) (Tom,5) (Vic,3) (Eli,5) (Alva,5) (Brady,5) (Derrick,6) (Willie,4) (Bennett,6) (Boyce,2) (Elton,5) (Sidney,5) (Jay,6) (Meredith,4) (Harold,4) (Jim,4) (Adonis,5) (Max,3) (Abel,4) (Barton,1) (Peter,4) (Matthew,2) (Alexander,4) (Donald,4) (Raymondt,6) (Devin,4) (Kerwin,3) (Borg,4) (Roy,6) (Harry,4) (Abbott,3) (Miles,6) (Baron,6) (Francis,4) (Lewis,4) (Aries,2) (Glenn,6) (Cleveland,4) (Mick,4) (Will,3) (Henry,2) (Jesse,7) (Alvin,5) (Ivan,4) (Monroe,3) (Hobart,4) (Leo,5) (Louis,6) (Randolph,3) (Sid,3) (Blair,4) (Abraham,3) (Lucien,5) (Benedict,6) 
(Montague,3) (Giles,7) (Kerr,4) (Berg,4) (Simon,2) (Lou,2) (Ronald,3) (Pete,3) (Harlan,6) (Arlen,4) (Maxwell,4) (Kennedy,4) (Bernard,2) (Spencer,5) (Andy,3) (Jeremy,6) (Alan,5) (Bruno,5) (Jerry,3) (Donahue,5) (Barry,5) (Kent,4) (Frank,3) (Noah,4) (Mike,3) (Tony,3) (Webb,7) (Ken,3) (Philip,2) (Robin,4) (Amos,5) (Chapman,4) (Valentine,8) (Angelo,2) (Boyd,3) (Chad,6) (Benjamin,4) (Allen,4) (Evan,3) (Albert,3) (Alfred,2) (Newman,2) (Winston,4) (Rory,4) (Dean,7) (Claude,2) (Booth,6) (Channing,4) (Ward,4) (Chester,6) (Webster,2) (Marshall,4) (Cliff,5) (Emmanuel,3) (Jerome,3) (Upton,5) (Corey,4) (Perry,5) (Herbert,3) (Maurice,2) (Drew,5) (Brandon,5) (Adolph,4) (Levi,2) (Bing,6) (Antonio,3) (Stan,3) (Les,6) (Charles,3) (Clement,5) (Blithe,3) (Brian,6) (Matt,4) (Archibald,5) (Horace,5) (Sebastian,6) (Verne,3)
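(5) How many students in the department are taking the DataBase course;
(6) The average score of each course;
(7) Use an accumulator to count how many students are taking the DataBase course.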
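The seven questions above can be sketched in plain Python before moving them to Spark. This is only a minimal sketch under stated assumptions: the "name,course,score" line layout is assumed (the data format is not reproduced here), the sample rows and all function names are hypothetical, and the small Accumulator class is a stand-in that only mimics the add/value shape of a Spark accumulator. The RDD operations that would play the same role are named in the docstrings.

```python
from collections import defaultdict

# Hypothetical sample rows; the "name,course,score" layout is an assumption.
SAMPLE_LINES = [
    "Tom,DataBase,80",
    "Tom,Algorithm,60",
    "Jim,DataBase,90",
    "Jim,Python,70",
    "Kim,DataBase,100",
]


def parse(lines):
    """Turn each line into (name, course, score); RDD analogue: map."""
    rows = []
    for line in lines:
        name, course, score = line.strip().split(",")
        rows.append((name, course, int(score)))
    return rows


def count_students(rows):
    """Question 1: distinct student names (map, distinct, count)."""
    return len({name for name, _, _ in rows})


def count_courses(rows):
    """Question 2: distinct course names (map, distinct, count)."""
    return len({course for _, course, _ in rows})


def student_average(rows, student):
    """Question 3: mean score of one student (filter, then sum / count)."""
    scores = [score for name, _, score in rows if name == student]
    return sum(scores) / len(scores)


def courses_per_student(rows):
    """Question 4: (name, 1) pairs summed by key (map, reduceByKey)."""
    counts = defaultdict(int)
    for name, _, _ in rows:
        counts[name] += 1
    return dict(counts)


def course_enrollment(rows, course):
    """Question 5: rows for one course (filter, count)."""
    return sum(1 for _, c, _ in rows if c == course)


def course_averages(rows):
    """Question 6: (score, 1) pairs summed by course, then divided."""
    totals = defaultdict(lambda: [0, 0])
    for _, course, score in rows:
        totals[course][0] += score
        totals[course][1] += 1
    return {c: total / n for c, (total, n) in totals.items()}


class Accumulator:
    """Hypothetical stand-in with the add/value shape of a Spark accumulator."""

    def __init__(self, value=0):
        self.value = value

    def add(self, amount):
        self.value += amount


def course_enrollment_with_accumulator(rows, course):
    """Question 7: add 1 per matching row (filter, then foreach with add)."""
    acc = Accumulator(0)
    for _, c, _ in rows:
        if c == course:
            acc.add(1)
    return acc.value


rows = parse(SAMPLE_LINES)
print(count_students(rows), count_courses(rows), student_average(rows, "Tom"))
```

On the hypothetical sample rows this prints `3 3 70.0`. Questions 5 and 7 count rows, which equals the number of students only if each student appears at most once per course.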
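2. Write a standalone application to deduplicate data
Given two input files A and B, write a standalone Spark application that merges the two files and removes the duplicate content, producing a new file C. A sample of the input and output files is given below for reference.
Sample of input file A:
20170101 x
20170102 y
20170103 x
20170104 y
20170105 z
20170106 z
Sample of input file B:
20170101 y
20170102 y
20170103 x
20170104 z
20170105 y
Sample of the output file C obtained by merging input files A and B:
20170101 x
20170101 y
20170102 y
20170103 x
20170104 y
20170104 z
20170105 y
20170105 z
20170106 z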
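The merge-and-deduplicate step can be sketched in plain Python using the sample contents of A and B. This is a sketch under assumptions, not the standalone Spark application the exercise asks for. The function name merge_and_dedupe is hypothetical, a single space is assumed as the field separator, and the result is sorted only because sample file C appears in sorted order. In an RDD program the same roles would be played by union, distinct, and a sort.

```python
def merge_and_dedupe(lines_a, lines_b):
    """Merge two line collections, drop duplicates, and sort the result.

    RDD analogue: union, distinct, then a sort on the whole line.
    """
    combined = list(lines_a) + list(lines_b)
    # A set keeps one copy of each non-empty line.
    unique = {line.strip() for line in combined if line.strip()}
    return sorted(unique)


# Sample contents of files A and B (single-space separator is an assumption).
FILE_A = [
    "20170101 x",
    "20170102 y",
    "20170103 x",
    "20170104 y",
    "20170105 z",
    "20170106 z",
]
FILE_B = [
    "20170101 y",
    "20170102 y",
    "20170103 x",
    "20170104 z",
    "20170105 y",
]

for line in merge_and_dedupe(FILE_A, FILE_B):
    print(line)
```

Run on the two samples, this yields the nine lines of sample file C, in the same order.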
From: https://www.cnblogs.com/ruipengli/p/17979946