Hadoop Word Count Example
The core of Hadoop is still the MapReduce process and the Hadoop Distributed File System (HDFS).

Step 1: Define the Map process

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class MyMap extends Mapper<Object, Text, Text, IntWritable> {

        // Constant count of 1, emitted once per token.
        private static final IntWritable one = new IntWritable(1);
        // Reused for every token; context.write serializes it immediately.
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            // Default delimiters are whitespace, so punctuation stays attached to words.
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                context.write(word, one); // emit (word, 1)
            }
        }
    }

Step 2: Define the Reduce process

    import java.io.IOException;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class MyReduce extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Add up every count received for this word.
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum)); // emit (word, total)
        }
    }

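With both phases defined, the data flow between them can be sketched in plain Java, without any Hadoop dependency. This is only an in-memory illustration: the class `WordCountSketch` and its methods are hypothetical names, and the `shuffle` step is a simplified stand-in for the grouping that Hadoop performs between the Mapper and the Reducer.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class WordCountSketch {

    // Map phase: emit one (word, 1) pair per whitespace-separated token.
    public static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            pairs.add(new AbstractMap.SimpleEntry<>(tokenizer.nextToken(), 1));
        }
        return pairs;
    }

    // Shuffle phase (simplified): group all emitted values by key, in key order.
    public static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for one key.
    public static int reduce(Iterable<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Run map, shuffle, and reduce over all input lines.
    public static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(pairs).entrySet()) {
            counts.put(entry.getKey(), reduce(entry.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello hadoop", "hello world");
        System.out.println(wordCount(lines)); // prints {hadoop=1, hello=2, world=1}
    }
}
```

Because tokens are split on whitespace only, inputs such as "Hello," and "hello" would be counted as different words, both here and in the Mapper.
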
Write a Driver to run the MapReduce process

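    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

    public class MyDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Legacy user/group setting kept from the original post.
            conf.set("hadoop.job.ugi", "root,root123");

            // Deprecated in newer releases in favor of Job.getInstance(conf, name).
            Job job = new Job(conf, "Hello,hadoop! ^_^");

            job.setJarByClass(MyDriver.class);
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(IntWritable.class);
            // Declare the final output types as well, matching MyReduce.
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            job.setMapperClass(MyMap.class);
            job.setCombinerClass(MyReduce.class);
            job.setReducerClass(MyReduce.class);
            job.setInputFormatClass(TextInputFormat.class);
            job.setOutputFormatClass(TextOutputFormat.class);

            // args[0] is the input path, args[1] the output path.
            FileInputFormat.setInputPaths(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            // Exit with a non-zero status if the job fails.
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }
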
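The Driver registers MyReduce as both the combiner and the reducer. That is safe here because summing is associative and commutative, so pre-summing counts on each map task does not change the final total. The plain-Java sketch below illustrates the idea with made-up counts; `CombinerSketch` and its variable names are hypothetical and not part of the Hadoop API.

```java
import java.util.Arrays;
import java.util.List;

public class CombinerSketch {

    // Same logic as MyReduce.reduce: add up all counts for one key.
    public static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical counts for a single word, emitted by two separate map tasks.
        List<Integer> fromMapTask1 = Arrays.asList(1, 1, 1);
        List<Integer> fromMapTask2 = Arrays.asList(1, 1);

        // Without a combiner: the reducer receives every raw value.
        int direct = sum(Arrays.asList(1, 1, 1, 1, 1));

        // With a combiner: each map task pre-sums locally, then the reducer sums the partials.
        int combined = sum(Arrays.asList(sum(fromMapTask1), sum(fromMapTask2)));

        System.out.println(direct + " " + combined); // prints "5 5"
    }
}
```

Hadoop may run the combiner zero, one, or several times for a given key, so a reducer should only be reused as a combiner when, as with summation, the result does not depend on how the values are grouped.
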
Reposted from: https://blog.51cto.com/supercharles888/840723