1. countByKey
- Definition: countByKey(): scala.collection.Map[K, Long] counts how many times each key occurs in a key-value RDD and returns the counts to the driver as a map
- Example:
val rdd:RDD[(String,Int)] = sc.makeRDD(Array(("zs",60),("zs",70),("zs",80),("ls",66),("ls",60),("ls",77)))
val countByKey: collection.Map[String, Long] = rdd.countByKey()
println(countByKey) // Map(zs -> 3, ls -> 3)
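To make the semantics concrete, here is a minimal sketch that reproduces the same counting on a plain Scala collection, without Spark. The object name CountByKeySketch and the helper countByKeyLocal are hypothetical names used only for illustration; this is not how Spark implements the operator internally.

```scala
object CountByKeySketch {
  // Local stand-in for countByKey: count how many pairs share each key.
  // The values are ignored; only the keys matter.
  def countByKeyLocal[K, V](pairs: Seq[(K, V)]): Map[K, Long] =
    pairs.groupBy(_._1).map { case (k, vs) => k -> vs.size.toLong }

  def main(args: Array[String]): Unit = {
    val scores = Seq(("zs", 60), ("zs", 70), ("zs", 80), ("ls", 66), ("ls", 60), ("ls", 77))
    val counts = countByKeyLocal(scores)
    println(counts("zs")) // 3
    println(counts("ls")) // 3
  }
}
```

Because countByKey loads the whole result map into the driver's memory, it is best suited to RDDs with a small number of distinct keys. For very large key sets, rdd.mapValues(_ => 1L).reduceByKey(_ + _) keeps the counts distributed as an RDD.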
2. saveAsSequenceFile
- Explanation: saveAsSequenceFile(path): saves a key-value RDD to HDFS as Hadoop SequenceFiles, writing one file per RDD partition
- Example:
val rdd:RDD[(String,Int)] = sc.makeRDD(Array(("zs",60),("zs",70),("zs",80),("ls",66),("ls",60),("ls",77)))
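// Plain-text output, for comparison: one part file per partition
rdd.saveAsTextFile("hdfs://node1:9000/a")
// SequenceFile output: also one part file per partition; the target directory must not already exist
rdd.saveAsSequenceFile("hdfs://node1:9000/b")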
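The one-file-per-partition behaviour can be checked with a small round trip. The sketch below is an illustration under several assumptions: it runs Spark in local mode and writes to a temporary directory on the local filesystem instead of hdfs://node1:9000, and the object name SequenceFileSketch and the helper roundTrip are hypothetical. It writes the RDD with an explicit partition count, counts the part-* files, and reads the data back with sc.sequenceFile.

```scala
import java.nio.file.Files
import org.apache.spark.{SparkConf, SparkContext}

object SequenceFileSketch {
  // Writes the sample data with the given number of partitions and returns
  // (number of part files written, number of records read back).
  def roundTrip(sc: SparkContext, numPartitions: Int): (Int, Long) = {
    // The output directory must not exist yet, so use a fresh child of a temp dir.
    val out = Files.createTempDirectory("seq-demo").resolve("b")

    val rdd = sc.makeRDD(
      Seq(("zs", 60), ("zs", 70), ("zs", 80), ("ls", 66), ("ls", 60), ("ls", 77)),
      numPartitions)
    rdd.saveAsSequenceFile(out.toUri.toString)

    // Count only the data files; _SUCCESS and hidden .crc files are skipped.
    val partFiles = out.toFile.listFiles().map(_.getName).filter(_.startsWith("part-"))

    // Read the SequenceFiles back as (String, Int) pairs.
    val restored = sc.sequenceFile[String, Int](out.toUri.toString)
    (partFiles.length, restored.count())
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("seq-file-sketch")
    val sc = new SparkContext(conf)
    try {
      val (files, records) = roundTrip(sc, 3)
      println(s"part files: $files, records: $records")
    } finally sc.stop()
  }
}
```

To control the number of output files on a real cluster, one option is to call rdd.repartition(n) or rdd.coalesce(n) before saving.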
From: https://www.cnblogs.com/jsqup/p/16621046.html