Hadoop MapReduce Program: SingletonTableJoin

Published: 2023/11/27

Requirement: a single-table join. From the child-parent relationships in a file, derive the grandchild-grandparent relationships.

Sample input: child-parent.txt

         xiaoming daxiong
         daxiong alice
         daxiong jack

Output: xiaoming alice
        xiaoming jack

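Analysis and design:

Mapper design:

1. <k1,v1>: k1 is the position of a line in the input file; v1 is the content of that line.

2. Left table: <k2,v2>: k2 is the parent's name; v2 is (1, child's name), where 1 marks a left-table record.

3. Right table: <k3,v3>: k3 is the child's name; v3 is (2, parent's name), where 2 marks a right-table record.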
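The tagging scheme in the mapper design can be tried out without a Hadoop cluster. The following is a minimal sketch in plain Java, assuming whitespace-separated "child parent" lines; the class name MapTaggingSketch and the method mapLine are hypothetical names introduced only for this illustration and are not part of the original program.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone illustration of the mapper's tagging scheme.
public class MapTaggingSketch {

    /** Returns the tagged (key, value) pairs emitted for one "child parent" line. */
    static List<String[]> mapLine(String line) {
        List<String[]> out = new ArrayList<>();
        String[] fields = line.trim().split("\\s+");
        // Skip blank or malformed lines and a header line that starts with "child".
        if (fields.length != 2 || fields[0].equals("child")) {
            return out;
        }
        String child = fields[0];
        String parent = fields[1];
        out.add(new String[] {parent, "1 " + child});  // left table: key = parent
        out.add(new String[] {child, "2 " + parent});  // right table: key = child
        return out;
    }

    public static void main(String[] args) {
        String[] sample = {"xiaoming daxiong", "daxiong alice", "daxiong jack"};
        for (String line : sample) {
            for (String[] kv : mapLine(line)) {
                System.out.println(kv[0] + "\t" + kv[1]);
            }
        }
    }
}
```

Every input line produces two records, so a name such as daxiong appears as a key both in its role as a parent (tag 1) and in its role as a child (tag 2). That shared key is what lets a single table be joined with itself.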

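Reducer design:

4. <k4,v4>: k4 is the key shared by the grouped records; v4 is the list<String> of values collected under that key.

5. Compute the Cartesian product <k5,v5>: k5 is the grandchild's name; v5 is the grandparent's name.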
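The reduce step can be sketched in plain Java as well. This is an illustration under stated assumptions rather than the job itself: the class name ReduceJoinSketch and the method reduceValues are hypothetical, and the input list stands in for the values that the shuffle would group under a single key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone illustration of the reduce-side Cartesian product.
public class ReduceJoinSketch {

    /** Joins the tagged values grouped under one key into "grandchild grandparent" pairs. */
    static List<String> reduceValues(List<String> values) {
        List<String> grandChildren = new ArrayList<>();
        List<String> grandParents = new ArrayList<>();
        for (String value : values) {
            String[] record = value.split(" ");
            if (record.length != 2) {
                continue; // ignore malformed values
            }
            if ("1".equals(record[0])) {
                grandChildren.add(record[1]);   // tag 1: a child of the key
            } else if ("2".equals(record[0])) {
                grandParents.add(record[1]);    // tag 2: a parent of the key
            }
        }
        // Cartesian product: every child of the key paired with every parent of the key.
        List<String> out = new ArrayList<>();
        for (String gc : grandChildren) {
            for (String gp : grandParents) {
                out.add(gc + " " + gp);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Values grouped under the key "daxiong" for the sample input.
        List<String> grouped = Arrays.asList("1 xiaoming", "2 alice", "2 jack");
        for (String pair : reduceValues(grouped)) {
            System.out.println(pair);
        }
    }
}
```

For the key daxiong this prints xiaoming alice and xiaoming jack, which matches the sample output. A key that has only tag-1 values or only tag-2 values yields nothing, because one side of the product is empty.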

Program:

SingletonTableJoinMapper class

package com.cn.singletonTableJoin;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SingletonTableJoinMapper extends Mapper<Object, Text, Text, Text> {

    @Override
    protected void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        // Each input line is expected to hold two tokens: child name, then parent name.
        StringTokenizer itr = new StringTokenizer(value.toString());
        if (itr.countTokens() < 2) {
            return; // skip blank or malformed lines instead of failing on them
        }
        String childName = itr.nextToken();
        String parentName = itr.nextToken();

        // Skip a header line whose first token is "child".
        if ("child".equals(childName)) {
            return;
        }

        // Left table: key = parent, value = "1 <child>".
        context.write(new Text(parentName), new Text("1 " + childName));
        // Right table: key = child, value = "2 <parent>".
        context.write(new Text(childName), new Text("2 " + parentName));
    }
}

SingletonTableJoinReduce class:

package com.cn.singletonTableJoin;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SingletonTableJoinReduce extends Reducer<Text, Text, Text, Text> {

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        List<String> grandChild = new ArrayList<String>();
        List<String> grandParent = new ArrayList<String>();

        // Split the grouped values by their table tag.
        for (Text value : values) {
            String[] record = value.toString().split(" ");
            if (record.length < 2) {
                continue; // ignore malformed values
            }
            if ("1".equals(record[0])) {
                grandChild.add(record[1]);   // left table: a child of this key
            } else if ("2".equals(record[0])) {
                grandParent.add(record[1]);  // right table: a parent of this key
            }
        }

        // Cartesian product; emits nothing when either list is empty.
        for (String grandchild : grandChild) {
            for (String grandparent : grandParent) {
                context.write(new Text(grandchild), new Text(grandparent));
            }
        }
    }
}

SingletonTableJoin class

package com.cn.singletonTableJoin;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

/**
 * Single-table join.
 * @author root
 */
public class SingletonTableJoin {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: SingletonTableJoin <in> <out>");
            System.exit(2);
        }

        // Create the job. Job.getInstance replaces the deprecated new Job(conf, name) constructor.
        Job job = Job.getInstance(conf, "SingletonTableJoin");
        job.setJarByClass(SingletonTableJoin.class);

        // Set the input and output paths.
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));

        // Set the mapper and reducer classes.
        job.setMapperClass(SingletonTableJoinMapper.class);
        job.setReducerClass(SingletonTableJoinReduce.class);

        // Set the output key and value types.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        // Submit the job and wait for it to finish.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Make summarizing a habit.

Reposted from: https://www.cnblogs.com/xubiao/p/5759422.html
