The Tao of Hadoop: the MapReduce Hello World example, wordcount
Hadoop version: 1.1.2
IDE: Eclipse SDK 3.5.1
Original work; please credit the source when reposting: http://blog.csdn.net/yming0221/article/details/9013381
1. First, define a DFS Location in Eclipse (see the earlier posts for the full environment setup). If the location refuses to connect, the sketch below can help verify that HDFS is reachable at all.
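Behind the scenes, a DFS Location is just a NameNode host and port. As a quick sanity check outside the plugin, one can list the HDFS root with the Hadoop 1.x FileSystem API. This is a minimal sketch: the address hdfs://localhost:9000 is an assumption and must match fs.default.name in your core-site.xml (and in the DFS Location dialog).

[java]
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Minimal HDFS connectivity check (sketch, not part of the original post).
// "hdfs://localhost:9000" is an assumed address -- use the same value as
// fs.default.name in your core-site.xml / DFS Location settings.
public class HdfsCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.default.name", "hdfs://localhost:9000"); // assumption
    FileSystem fs = FileSystem.get(conf);
    // List the root directory; a connection failure throws here.
    for (FileStatus status : fs.listStatus(new Path("/"))) {
      System.out.println(status.getPath());
    }
    fs.close();
  }
}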
2. Below is the Hello World example, the classic wordcount job:
[java]
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class wordcount {

  // Mapper: split each input line into whitespace-separated tokens
  // and emit a (word, 1) pair for every token.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer (also reused as the combiner below): sum all counts for a word.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Strip generic Hadoop options (e.g. -D settings) before reading our own args.
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: wordcount <in> <out>");
      System.exit(2);
    }

    Job job = new Job(conf, "word count");
    job.setJarByClass(wordcount.class);
    job.setMapperClass(TokenizerMapper.class);
    // Summing is associative, so the reducer can pre-aggregate map output.
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
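How the pieces fit together: map() receives one line at a time and emits a (word, 1) pair per token; because addition is associative and commutative, IntSumReducer can safely double as the combiner (setCombinerClass), pre-summing counts on the map side before the shuffle. The stand-alone sketch below mimics the same logic in a single JVM; it needs no Hadoop, and the two sample lines are invented for illustration.

[java]
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Stand-alone illustration of the word-count logic above (no Hadoop needed).
// The sample lines are hypothetical; the real job reads input from HDFS.
public class WordCountSketch {
  public static void main(String[] args) {
    String[] lines = {"hello world", "hello hadoop"}; // invented input
    // TreeMap keeps keys sorted, mirroring the sorted keys a reducer sees.
    Map<String, Integer> counts = new TreeMap<String, Integer>();
    for (String line : lines) {                 // "map" phase: tokenize
      StringTokenizer itr = new StringTokenizer(line);
      while (itr.hasMoreTokens()) {
        String word = itr.nextToken();
        Integer sum = counts.get(word);         // "reduce" phase: sum counts
        counts.put(word, sum == null ? 1 : sum + 1);
      }
    }
    for (Map.Entry<String, Integer> e : counts.entrySet()) {
      System.out.println(e.getKey() + "\t" + e.getValue());
    }
  }
}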
3. Run results
[plain]
13/06/03 14:45:52 INFO input.FileInputFormat: Total input paths to process : 2
13/06/03 14:45:52 WARN snappy.LoadSnappy: Snappy native library not loaded
13/06/03 14:45:52 INFO mapred.JobClient: Running job: job_local_0001
13/06/03 14:45:52 INFO util.ProcessTree: setsid exited with exit code 0
13/06/03 14:45:52 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@2b96021e
13/06/03 14:45:52 INFO mapred.MapTask: io.sort.mb = 100
13/06/03 14:45:53 INFO mapred.MapTask: data buffer = 79691776/99614720
13/06/03 14:45:53 INFO mapred.MapTask: record buffer = 262144/327680
13/06/03 14:45:53 INFO mapred.MapTask: Starting flush of map output
13/06/03 14:45:53 INFO mapred.MapTask: Finished spill 0
13/06/03 14:45:53 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
13/06/03 14:45:53 INFO mapred.LocalJobRunner:
13/06/03 14:45:53 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
13/06/03 14:45:53 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@3621767f
13/06/03 14:45:53 INFO mapred.MapTask: io.sort.mb = 100
13/06/03 14:45:53 INFO mapred.MapTask: data buffer = 79691776/99614720
13/06/03 14:45:53 INFO mapred.MapTask: record buffer = 262144/327680
13/06/03 14:45:53 INFO mapred.MapTask: Starting flush of map output
13/06/03 14:45:53 INFO mapred.MapTask: Finished spill 0
13/06/03 14:45:53 INFO mapred.Task: Task:attempt_local_0001_m_000001_0 is done. And is in the process of commiting
13/06/03 14:45:53 INFO mapred.LocalJobRunner:
13/06/03 14:45:53 INFO mapred.Task: Task 'attempt_local_0001_m_000001_0' done.
13/06/03 14:45:53 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@76d6d675
13/06/03 14:45:53 INFO mapred.LocalJobRunner:
13/06/03 14:45:53 INFO mapred.Merger: Merging 2 sorted segments
13/06/03 14:45:53 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 53 bytes
13/06/03 14:45:53 INFO mapred.LocalJobRunner:
13/06/03 14:45:53 INFO mapred.JobClient:  map 100% reduce 0%
13/06/03 14:45:53 INFO mapred.Task: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
13/06/03 14:45:53 INFO mapred.LocalJobRunner:
13/06/03 14:45:53 INFO mapred.Task: Task attempt_local_0001_r_000000_0 is allowed to commit now
13/06/03 14:45:53 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to output
13/06/03 14:45:53 INFO mapred.LocalJobRunner: reduce > reduce
13/06/03 14:45:53 INFO mapred.Task: Task 'attempt_local_0001_r_000000_0' done.
13/06/03 14:45:54 INFO mapred.JobClient:  map 100% reduce 100%
13/06/03 14:45:54 INFO mapred.JobClient: Job complete: job_local_0001
13/06/03 14:45:54 INFO mapred.JobClient: Counters: 22
13/06/03 14:45:54 INFO mapred.JobClient:   File Output Format Counters
13/06/03 14:45:54 INFO mapred.JobClient:     Bytes Written=25
13/06/03 14:45:54 INFO mapred.JobClient:   FileSystemCounters
13/06/03 14:45:54 INFO mapred.JobClient:     FILE_BYTES_READ=18029
13/06/03 14:45:54 INFO mapred.JobClient:     HDFS_BYTES_READ=63
13/06/03 14:45:54 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=213880
13/06/03 14:45:54 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=25
13/06/03 14:45:54 INFO mapred.JobClient:   File Input Format Counters
13/06/03 14:45:54 INFO mapred.JobClient:     Bytes Read=25
13/06/03 14:45:54 INFO mapred.JobClient:   Map-Reduce Framework
13/06/03 14:45:54 INFO mapred.JobClient:     Reduce input groups=3
13/06/03 14:45:54 INFO mapred.JobClient:     Map output materialized bytes=61
13/06/03 14:45:54 INFO mapred.JobClient:     Combine output records=4
13/06/03 14:45:54 INFO mapred.JobClient:     Map input records=2
13/06/03 14:45:54 INFO mapred.JobClient:     Reduce shuffle bytes=0
13/06/03 14:45:54 INFO mapred.JobClient:     Physical memory (bytes) snapshot=0
13/06/03 14:45:54 INFO mapred.JobClient:     Reduce output records=3
13/06/03 14:45:54 INFO mapred.JobClient:     Spilled Records=8
13/06/03 14:45:54 INFO mapred.JobClient:     Map output bytes=41
13/06/03 14:45:54 INFO mapred.JobClient:     CPU time spent (ms)=0
13/06/03 14:45:54 INFO mapred.JobClient:     Total committed heap usage (bytes)=683409408
13/06/03 14:45:54 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=0
13/06/03 14:45:54 INFO mapred.JobClient:     Combine input records=4
13/06/03 14:45:54 INFO mapred.JobClient:     Map output records=4
13/06/03 14:45:54 INFO mapred.JobClient:     SPLIT_RAW_BYTES=226
13/06/03 14:45:54 INFO mapred.JobClient:     Reduce input records=4
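A few of these counters tell the whole story: "Total input paths to process : 2" with Map input records=2 says the job read two files of one line each; Map output records=4 says those lines held four words in total; and Reduce input groups=3 / Reduce output records=3 say three of those words were distinct. To inspect the result file itself, a sketch along the following lines works; the path output/part-r-00000 is an assumption that matches the "Saved output ... to output" line above, so adjust it to whatever you passed as the job's second argument.

[java]
import java.io.BufferedReader;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: print the reducer's result file. "output/part-r-00000" assumes the
// job was run with "output" as its second argument; adjust to your own path.
public class PrintResult {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path result = new Path("output/part-r-00000"); // assumed output path
    BufferedReader reader = new BufferedReader(new InputStreamReader(fs.open(result)));
    try {
      String line;
      while ((line = reader.readLine()) != null) {
        System.out.println(line); // each line: word <TAB> count
      }
    } finally {
      reader.close();
      fs.close();
    }
  }
}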