Implementing WordCount with Scala Collections
Approach
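The approach can be summarized in four steps: split each line into words while carrying the line's count along (flatMap), group the pairs by word (groupBy), sum the counts in each group (map), then sort in descending order and take the first few entries. The following is a minimal sketch of those steps on a two-line input. The object name `WordCountSteps` and the sample data are assumptions made for illustration and do not come from the original article.

```scala
object WordCountSteps {
  // Hypothetical input: each tuple is (line of text, count attached to that line).
  val lines: List[(String, Int)] = List(("Hello Scala", 2), ("Hello World", 1))

  // Step 1: flatMap splits each line and attaches the line's count to every word.
  val pairs: List[(String, Int)] =
    lines.flatMap { case (text, count) => text.split(" ").toList.map(word => (word, count)) }

  // Step 2: groupBy collects the pairs that share the same word.
  val grouped: Map[String, List[(String, Int)]] = pairs.groupBy(_._1)

  // Step 3: map sums the counts inside each group.
  val totals: Map[String, Int] =
    grouped.map { case (word, ps) => (word, ps.map(_._2).sum) }

  // Step 4: sort by total in descending order and keep the first entry.
  val top1: List[(String, Int)] = totals.toList.sortBy(-_._2).take(1)

  def main(args: Array[String]): Unit = {
    println(pairs)
    println(totals)
    println(top1)
  }
}
```

Each intermediate value is bound to its own name so that the shape of the data after every step can be printed and inspected.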
Code implementation
package com.zxl.chapter10

/**
 * WordCount implemented with Scala collections.
 */
object Scala09_WordCount {
  def main(args: Array[String]): Unit = {
    // Each tuple pairs a line of text with a count.
    val list: List[(String, Int)] = List(("Hello Scala World", 4), ("Hello World", 3), ("Hello Hadoop", 2), ("Hello HBase", 1))

    /*
     * Split each line into individual words with flatMap:
     * ("Hello Scala World", 4)
     * => [(Hello),(Scala),(World)]
     * => [(Hello,4),(Scala,4),(World,4)]
     */
    val flatMapList: List[(String, Int)] = list.flatMap(t => {
      val words: Array[String] = t._1.split(" ")
      words.map(w => (w, t._2))
    })
    println("Split into single words: " + flatMapList)

    // Group the pairs by word with groupBy.
    val groupByMap: Map[String, List[(String, Int)]] = flatMapList.groupBy(t => t._1)
    println("Grouped by word: " + groupByMap)

    // Restructure the grouped data with map: word -> sum of its counts.
    val wordToSumMap: Map[String, Int] = groupByMap.map(t => {
      val countList: List[Int] = t._2.map(tt => tt._2)
      (t._1, countList.sum)
    })
    println("Word totals after restructuring: " + wordToSumMap)

    // A shorter form that Scala provides (deprecated, not recommended):
    /*
    val wordToSumMap: MapView[String, Int] = groupByMap.mapValues(datas => datas.map(tt => tt._2).sum)
    println("Word totals after restructuring: " + wordToSumMap)
    */

    // Sort the totals in descending order.
    println("Map converted to list: " + wordToSumMap.toList)
    val sortList: List[(String, Int)] = wordToSumMap.toList.sortWith((left, right) => {
      left._2 > right._2
    })
    println("Sorted by count, descending: " + sortList)

    // Take the first 3 entries of the sorted list.
    val resultList: List[(String, Int)] = sortList.take(3)
    println("First 3 entries of the sorted list: " + resultList)
  }
}

Test results

Output of running com.zxl.chapter10.Scala09_WordCount from IntelliJ IDEA 2020.1 with JDK 8u101 and Scala 2.13.1:

Split into single words: List((Hello,4), (Scala,4), (World,4), (Hello,3), (World,3), (Hello,2), (Hadoop,2), (Hello,1), (HBase,1))
Grouped by word: HashMap(Scala -> List((Scala,4)), HBase -> List((HBase,1)), Hello -> List((Hello,4), (Hello,3), (Hello,2), (Hello,1)), Hadoop -> List((Hadoop,2)), World -> List((World,4), (World,3)))
Word totals after restructuring: HashMap(Scala -> 4, HBase -> 1, Hello -> 10, Hadoop -> 2, World -> 7)
Map converted to list: List((Scala,4), (HBase,1), (Hello,10), (Hadoop,2), (World,7))
Sorted by count, descending: List((Hello,10), (World,7), (Scala,4), (Hadoop,2), (HBase,1))
First 3 entries of the sorted list: List((Hello,10), (World,7), (Scala,4))
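The listing marks `mapValues` on a Map as deprecated, which is the case in Scala 2.13. As a hedged sketch of two alternatives available in that version: calling `mapValues` on a view and then `toMap` gives a strict Map, and `groupMapReduce` combines grouping and summing in a single call. The object name `WordCountAggregation` and the sample pairs below are assumptions made for illustration.

```scala
object WordCountAggregation {
  // Hypothetical (word, count) pairs, shaped like the result of the flatMap step.
  val pairs: List[(String, Int)] =
    List(("Hello", 4), ("Scala", 4), ("World", 4), ("Hello", 3), ("World", 3))

  // Alternative 1: group first, sum through a lazy view, then force a strict Map.
  val viaView: Map[String, Int] =
    pairs.groupBy(_._1).view.mapValues(_.map(_._2).sum).toMap

  // Alternative 2: groupMapReduce (Scala 2.13+) takes the key function,
  // the value function, and the function that combines two values.
  val viaGroupMapReduce: Map[String, Int] =
    pairs.groupMapReduce(_._1)(_._2)(_ + _)

  def main(args: Array[String]): Unit = {
    println(viaView)
    println(viaGroupMapReduce)
  }
}
```

Both values hold the same word totals. The second form avoids building the intermediate lists that `groupBy` produces.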