Renmin University Online Judge for Cloud Computing Programming: Solution Report 1004-1007

Published: 2025/6/15 | Programming Q&A | 13 | 豆豆


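Writing up all the problems in one go made the article too long, so I split it into two posts to make it easier to read.

Picking up from the previous post, let's continue our MapReduce programming journey.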

1004: Problem

Single Table Join

Description

The input file contains a child-parent table. Write a program that takes this file as input and outputs the grandchild-grandparent relationships contained in the child-parent table.

Input

A file containing the child-parent table.

Output

A file containing the grandchild-grandparent relationships, derived from the child-parent table.

Sample input

child parent
Tom Lucy
Tom Jack
Jone Lucy
Jone Jack
Lucy Mary
Lucy Ben
Jack Alice
Jack Jesse
Terry Alice
Terry Jesse
Philip Terry
Philip Alma
Mark Terry
Mark Alma

Sample output

grandchild  grandparent
Jone        Alice
Jone        Jesse
Tom         Alice
Tom         Jesse
Jone        Mary
Jone        Ben
Tom         Mary
Tom         Ben
Mark        Jesse
Mark        Alice
Philip      Jesse
Philip      Alice

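1004: Approach

A single-table join. This one is rather interesting, though it may just be that my skills are lacking, which is why my solution came out somewhat complicated.

First, I defined a custom data type, TextPair. I won't go into custom data types here; you can search online, or read Hadoop: The Definitive Guide, which explains them.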
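For readers who have not written a custom data type before, the pattern is small: the class writes its fields to a DataOutput in a fixed order and reads them back in the same order. The following is a minimal sketch using only java.io, with no Hadoop dependency. TaggedName is a hypothetical stand-in for TextPair, and writeUTF/writeInt are used here in place of Text and IntWritable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for TextPair: a name plus a 0/1 tag.
// It mirrors the write/readFields shape of a Hadoop Writable without depending on Hadoop.
class TaggedName {
    String name = "";
    int tag;

    TaggedName() { }

    TaggedName(String name, int tag) {
        this.name = name;
        this.tag = tag;
    }

    // Serialize the fields in a fixed order.
    void write(DataOutput out) throws IOException {
        out.writeUTF(name);
        out.writeInt(tag);
    }

    // Deserialize in exactly the same order as write().
    void readFields(DataInput in) throws IOException {
        name = in.readUTF();
        tag = in.readInt();
    }

    // Round trip: serialize to bytes, then read the bytes into a fresh object.
    static TaggedName roundTrip(TaggedName original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buffer));
            TaggedName copy = new TaggedName();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        TaggedName copy = roundTrip(new TaggedName("Lucy", 1));
        System.out.println(copy.name + " " + copy.tag); // prints "Lucy 1"
    }
}
```

The no-argument constructor matters: a framework that deserializes records typically creates an empty object first and then fills it through readFields.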

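Next: as the input shows, children and parents are listed in the same file, and what we want is the grandchild-grandparent relationship, so the same person can appear in the parent column of one row and in the child column of another. To tell the two roles apart, we rely on the custom data type.

I'll show the code first, with detailed comments in it: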

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MyMapre {

    public static class wordcountMapper extends Mapper<LongWritable, Text, TextPair, TextPair> {
        @Override
        public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            String key1 = "";
            String value1 = "";
            StringTokenizer itr = new StringTokenizer(value.toString());
            // Take the child and the parent from the input line.
            if (itr.hasMoreElements()) {
                key1 = itr.nextToken();
            }
            if (itr.hasMoreElements()) {
                value1 = itr.nextToken();
            }
            // Skip blank or incomplete lines so they cannot produce empty output pairs.
            if (key1.isEmpty() || value1.isEmpty()) {
                return;
            }
            // Use the custom data type for both key and value. Tag 0 = child, 1 = parent.
            // Each row is emitted twice, once keyed by the child and once keyed by the parent,
            // so the reducer for a given person sees both that person's parents and children.
            context.write(new TextPair(key1, 0), new TextPair(value1, 1));
            context.write(new TextPair(value1, 1), new TextPair(key1, 0));
        }
    }

    public static class wordcountReduce extends Reducer<TextPair, TextPair, Text, Text> {
        @Override
        public void reduce(TextPair key, Iterable<TextPair> values, Context context) throws IOException, InterruptedException {
            // Two lists: the children and the parents of the person named by the key.
            List<String> child = new ArrayList<String>();
            List<String> parent = new ArrayList<String>();
            for (TextPair str : values) {
                // The 0/1 tag tells directly whether the value is a child or a parent.
                // All values under one key relate to the same person: that person's children
                // and that person's parents, i.e. the grandchildren and the grandparents.
                if (str.second.get() == 0) {
                    child.add(str.first.toString());
                } else {
                    parent.add(str.first.toString());
                }
            }
            if (child.size() != 0 && parent.size() != 0) {
                // One child can have several grandparents, hence the nested loop, child outermost.
                for (int i = 0; i < child.size(); i++) {
                    for (int j = 0; j < parent.size(); j++) {
                        context.write(new Text(child.get(i)), new Text(parent.get(j)));
                    }
                }
            }
        }
    }

    // The custom data type: a name plus a 0/1 tag.
    public static class TextPair implements WritableComparable<TextPair> {
        private Text first;
        private IntWritable second;

        public TextPair() {
            set(new Text(), new IntWritable());
        }

        public TextPair(String first, int second) {
            set(new Text(first), new IntWritable(second));
        }

        public TextPair(Text first, IntWritable second) {
            set(first, second);
        }

        public void set(Text first, IntWritable second) {
            this.first = first;
            this.second = second;
        }

        public Text getFirst() {
            return first;
        }

        public IntWritable getSecond() {
            return second;
        }

        @Override
        public String toString() {
            return first.toString();
        }

        @Override
        public void write(DataOutput out) throws IOException {
            first.write(out);
            second.write(out);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            first.readFields(in);
            second.readFields(in);
        }

        @Override
        public int compareTo(TextPair tp) {
            // Note: only first is compared; the 0/1 tag is ignored,
            // so (name, 0) and (name, 1) land in the same reduce group.
            return first.compareTo(tp.first);
        }

        // hashCode and equals also use only first, so that the default hash partitioner
        // sends both tagged forms of a name to the same reducer.
        @Override
        public int hashCode() {
            return first.hashCode();
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof TextPair && first.equals(((TextPair) o).first);
        }
    }

    public static void main(String args[]) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "SingleJoin");
        job.setJarByClass(MyMapre.class);
        job.setMapOutputKeyClass(TextPair.class);
        job.setMapOutputValueClass(TextPair.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.setMapperClass(wordcountMapper.class);
        job.setReducerClass(wordcountReduce.class);
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.waitForCompletion(true);
    }
}
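To see why emitting each row twice works, the same join can be traced in memory without a cluster. The following is a minimal sketch in plain Java, not the submitted solution: the class name SingleTableJoinSketch and the two maps are illustrative assumptions, with the maps standing in for the shuffle that groups values by person.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SingleTableJoinSketch {
    // Joins the child-parent table with itself and returns "grandchild grandparent" lines.
    static List<String> join(String[][] childParentRows) {
        Map<String, List<String>> childrenOf = new TreeMap<>();
        Map<String, List<String>> parentsOf = new TreeMap<>();
        // "Map" step: file every row under both people it mentions.
        for (String[] row : childParentRows) {
            String child = row[0];
            String parent = row[1];
            parentsOf.computeIfAbsent(child, k -> new ArrayList<>()).add(parent);
            childrenOf.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
        }
        // "Reduce" step: a person with both children and parents is the middle generation,
        // so every (child, parent) combination around that person is a result row.
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : childrenOf.entrySet()) {
            List<String> grandparents = parentsOf.get(entry.getKey());
            if (grandparents == null) {
                continue; // this person has children but no recorded parents
            }
            for (String grandchild : entry.getValue()) {
                for (String grandparent : grandparents) {
                    result.add(grandchild + " " + grandparent);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[][] rows = {
            {"Tom", "Lucy"}, {"Tom", "Jack"}, {"Lucy", "Mary"}, {"Jack", "Alice"}
        };
        for (String line : join(rows)) {
            System.out.println(line);
        }
    }
}
```

With these four rows, Jack and Lucy are the only people who appear both as a parent and as a child, so the result pairs Tom with Alice and with Mary.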

1005: Problem

Multi-table Join

Description

There are two input files. One, named factory, contains a table of factory names and their address IDs. The other, named address, contains a table of address names and their IDs. Write a program that outputs each factory name together with the name of its address.

Input

Two files: the first lists factory names with their address IDs, the second lists address names with their IDs.

Output

A file containing each factory name and the name of its address.

Sample input

input:
factory:
factoryname addressID
Beijing Red Star 1
Shenzhen Thunder 3
Guangzhou Honda 2
Beijing Rising 1
Guangzhou Development Bank 2
Tencent 3
Bank of Beijing 1
address:
addressID addressname
1 Beijing
2 Guangzhou
3 Shenzhen
4 Xian

Sample output

output:
factoryname  addressname
Bank of Beijing Beijing
Beijing Red Star Beijing
Beijing Rising Beijing
Guangzhou Development Bank Guangzhou
Guangzhou Honda Guangzhou
Shenzhen Thunder Shenzhen
Tencent Shenzhen

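1005: Approach

This problem follows much the same idea as 1004; if you can solve 1004, 1005 will be no trouble.

We reuse the custom data type TextPair from 1004. The records we read fall into two kinds (factory rows and address rows), so TextPair is used to tell them apart.

Here's the code, with detailed comments in it: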

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MyMapre {

    public static class wordcountMapper extends Mapper<LongWritable, Text, Text, TextPair> {
        @Override
        public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            // Special case: a factory name can contain spaces, so splitting needs some care.
            String id = "";
            StringBuilder value1 = new StringBuilder();
            // Split the line into tokens.
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreElements()) {
                String str = itr.nextToken();
                // A token that is not a number is part of a (possibly multi-word) name.
                // "[0-9]+" also accepts IDs with more than one digit.
                if (!str.matches("[0-9]+")) {
                    if (value1.length() > 0) {
                        value1.append(' ');
                    }
                    value1.append(str);
                } else {
                    id = str;
                    // A number that follows a name is the trailing ID of a factory row,
                    // and the row is now fully parsed: factory = tag 1.
                    if (value1.length() > 0) {
                        context.write(new Text(id), new TextPair(value1.toString(), 1));
                        return;
                    }
                    // Otherwise the ID came first: an address row, whose name is still to be read.
                }
            }
            // No early return, so this is an address row: address = tag 0.
            // Header and blank lines carry no numeric ID and are skipped.
            if (!id.isEmpty()) {
                context.write(new Text(id), new TextPair(value1.toString(), 0));
            }
        }
    }

    public static class wordcountReduce extends Reducer<Text, TextPair, Text, Text> {
        @Override
        public void reduce(Text key, Iterable<TextPair> values, Context context) throws IOException, InterruptedException {
            // Again, two lists to hold the two kinds of record.
            List<String> factor = new ArrayList<String>();
            List<String> address = new ArrayList<String>();
            for (TextPair str : values) {
                if (str.second.get() == 1) {
                    // 1 = factory
                    factor.add(str.first.toString());
                } else {
                    // 0 = address
                    address.add(str.first.toString());
                }
            }
            // One address can have several factories, so address is the outer loop.
            if (factor.size() != 0 && address.size() != 0) {
                for (int i = 0; i < address.size(); i++) {
                    for (int j = 0; j < factor.size(); j++) {
                        context.write(new Text(factor.get(j)), new Text(address.get(i)));
                    }
                }
            }
        }
    }

    // The custom data type, same as in 1004: a name plus a 0/1 tag.
    public static class TextPair implements WritableComparable<TextPair> {
        private Text first;
        private IntWritable second;

        public TextPair() {
            set(new Text(), new IntWritable());
        }

        public TextPair(String first, int second) {
            set(new Text(first), new IntWritable(second));
        }

        public TextPair(Text first, IntWritable second) {
            set(first, second);
        }

        public void set(Text first, IntWritable second) {
            this.first = first;
            this.second = second;
        }

        public Text getFirst() {
            return first;
        }

        public IntWritable getSecond() {
            return second;
        }

        @Override
        public String toString() {
            return first.toString();
        }

        @Override
        public void write(DataOutput out) throws IOException {
            first.write(out);
            second.write(out);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            first.readFields(in);
            second.readFields(in);
        }

        @Override
        public int compareTo(TextPair tp) {
            return first.compareTo(tp.first);
        }
    }

    public static void main(String args[]) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "MultiTableJoin");
        job.setJarByClass(MyMapre.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(TextPair.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.setMapperClass(wordcountMapper.class);
        job.setReducerClass(wordcountReduce.class);
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.waitForCompletion(true);
    }
}
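The tricky part of the mapper is telling a factory row from an address row when names contain spaces. As a sketch of one way to check that logic outside Hadoop, the hypothetical parse helper below classifies a line by where its numeric ID sits: last token for a factory row, first token for an address row. This is a slight variation on the token-by-token scan in the mapper, not the submitted code, and the class name MultiTableJoinSketch is made up for the example.

```java
import java.util.Arrays;

class MultiTableJoinSketch {
    // Returns {id, name, tag} where tag is "1" for a factory row and "0" for an address row,
    // or null for blank lines and header rows, which contain no numeric ID.
    static String[] parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length < 2) {
            return null;
        }
        String first = tokens[0];
        String last = tokens[tokens.length - 1];
        if (first.matches("\\d+")) {
            // Address row: the ID comes first, the rest of the line is the address name.
            String name = String.join(" ", Arrays.copyOfRange(tokens, 1, tokens.length));
            return new String[] {first, name, "0"};
        }
        if (last.matches("\\d+")) {
            // Factory row: everything before the trailing ID is the factory name.
            String name = String.join(" ", Arrays.copyOfRange(tokens, 0, tokens.length - 1));
            return new String[] {last, name, "1"};
        }
        return null;
    }

    public static void main(String[] args) {
        String[] lines = {
            "factoryname addressID", "Beijing Red Star 1", "addressID addressname", "1 Beijing"
        };
        for (String line : lines) {
            String[] parsed = parse(line);
            System.out.println(line + " -> " + (parsed == null ? "skipped" : Arrays.toString(parsed)));
        }
    }
}
```

Both parsed rows above share the ID 1, which is the key under which a reducer would see them together and pair "Beijing Red Star" with "Beijing".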

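1006: Problem

Sum

Description

The input is a set of text files. Each file contains many lines, and each line is a string of digits representing a very large number. Note that the low-order digits are at the start of the string and the high-order digits at the end. Write a program that computes and outputs the sum of all the numbers in the input files.

Input

The input consists of many files. Each file has many lines, and each line is a digit string representing one number.

Output

The output is a single file. On its first line, the first number is the line label and the second number is the sum of all the numbers in the input files.

Sample input

input:
file1:
1235546665312
112344569882
326434546462
21346546846
file2:
3654354655
3215456463
21235465463
321265465
65465463
32
file3:
31654
654564564
3541231564
351646846
3164646
3163

Sample output

output:
1 8685932816082

Notes:
1 There is only one output file;
2 The first line of the output file consists of the line label "1" and the sum of all the numbers;
3 Every number is a positive integer or zero. Each number exceeds 50 digits, so ordinary data types cannot store it;
4 The low-order digits are on the left of the digit string and the high-order digits on the right. For example, the first line of the first sample input file represents the number 2135666455321.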

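1006: Approach

1006 comes down to two problems: first, big-number addition; second, bringing all the data together into one group.

The first is a standard technique, so I won't dwell on it. As for the second: because we need one overall result at the end, all keys have to fall into a single group. We could define a custom grouping (see Hadoop: The Definitive Guide; I'll write about it later if there is a need), but I used a simple workaround here. (If a simple approach works, use the simple approach.)

Let me explain alongside the code: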

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MyMapre {

    public static class wordcountMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
        @Override
        public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            // Note the key: this is the simple workaround. Every record gets the same key,
            // so in the reduce phase all the data falls into one group.
            context.write(new LongWritable(1), value);
        }
    }

    public static class wordcountReduce extends Reducer<LongWritable, Text, LongWritable, Text> {
        @Override
        public void reduce(LongWritable key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
            String tem = "0"; // the numbers are huge, so the running total is kept as a String
            for (Text str : values) {
                // Take each big number and add it with the big-number addition function Sum().
                tem = Sum(tem, str.toString().trim());
            }
            context.write(key, new Text(tem));
        }
    }

    // Big-number addition. Both inputs and the result are little-endian digit strings
    // (lowest digit first), so the strings are added left to right and the carry
    // moves into the next position, including past the end of the shorter number.
    public static String Sum(String a, String b) {
        StringBuilder c = new StringBuilder();
        int jin = 0; // carry
        int len = Math.max(a.length(), b.length());
        for (int i = 0; i < len; i++) {
            int a_first = i < a.length() ? a.charAt(i) - '0' : 0;
            int b_first = i < b.length() ? b.charAt(i) - '0' : 0;
            int temp = a_first + b_first + jin;
            jin = temp / 10;
            c.append(temp % 10);
        }
        if (jin != 0) {
            c.append(jin);
        }
        return c.toString();
    }

    public static void main(String args[]) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "Sum");
        job.setJarByClass(MyMapre.class);
        job.setMapOutputKeyClass(LongWritable.class);
        job.setMapOutputValueClass(Text.class);
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        job.setMapperClass(wordcountMapper.class);
        job.setReducerClass(wordcountReduce.class);
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.waitForCompletion(true);
    }
}
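A quick way to sanity-check any hand-written big-number addition is to compare it against java.math.BigInteger. The sketch below (class and method names are illustrative, not part of the submission) reverses each little-endian line, sums with BigInteger, and reverses the total back. It assumes the result is also written lowest digit first, which is what the sample output shows.

```java
import java.math.BigInteger;

class LittleEndianSumCheck {
    // Reverse the digit string so the most significant digit comes first, then parse it.
    static BigInteger parseLittleEndian(String digits) {
        return new BigInteger(new StringBuilder(digits).reverse().toString());
    }

    // Convert a number back to a digit string with the lowest digit first.
    static String toLittleEndian(BigInteger n) {
        return new StringBuilder(n.toString()).reverse().toString();
    }

    // Sum any number of little-endian digit strings.
    static String sum(String... lines) {
        BigInteger total = BigInteger.ZERO;
        for (String line : lines) {
            total = total.add(parseLittleEndian(line));
        }
        return toLittleEndian(total);
    }

    public static void main(String[] args) {
        // 99 + 1 = 100, which is "001" when written lowest digit first.
        System.out.println(sum("99", "1")); // prints "001"
    }
}
```

The "99" plus "1" case is worth keeping as a test for any implementation, because the carry has to travel through the digits that remain after the shorter number has run out.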

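1007: Problem

WordCount Plus

Description

The WordCount example reads text files and counts how many times each word occurs. Now there is a WordCount 2.0, in which you must handle input files containing characters such as "/.',"{}[]:;" and so on. When splitting words, "declare," should become "declare", "Hello!" should become "Hello", and "can't" should stay "can't".

Input

Text files containing many words.

Output

A text file in which each line contains a word and the number of times that word occurs across all input files. Words in the output file are sorted in dictionary order.

Sample input

input1:
hello world, bye world.
input2:
hello hadoop, bye hadoop!

Sample output

bye 2
hadoop 2
hello 2
world 2
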
1007: Approach

1007 is mainly about filtering characters, which can be done with a regular expression. Nothing difficult here.

Again, let's go through it alongside the code:

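import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MyMapre {

    public static class wordcountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        // Regular expression matching every character to filter out: anything that is not
        // a word character (\w: letters, digits, underscore) or an apostrophe.
        private String pattern = "[^\\w']";

        @Override
        public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            String line = value.toString().toLowerCase();
            // Replace the filtered characters with spaces.
            line = line.replaceAll(pattern, " ");
            // Split into words.
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreElements()) {
                context.write(new Text(itr.nextToken()), one);
            }
        }
    }

    public static class wordcountReduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            // This part is simple, the same as in WordCount: add up the counts for each word.
            int sum = 0;
            for (IntWritable str : values) {
                sum += str.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String args[]) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "Plus");
        job.setJarByClass(MyMapre.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setMapperClass(wordcountMapper.class);
        job.setReducerClass(wordcountReduce.class);
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.waitForCompletion(true);
    }
}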
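To check the filtering step against the problem's own examples, the tokenization can be run on its own. This is a small sketch in plain Java (the class name TokenizeSketch is hypothetical) that applies the same lowercase, replace, and split steps as the mapper, assuming the intended pattern is [^\w'], i.e. everything except word characters and the apostrophe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizeSketch {
    // Lowercase the line, blank out every character that is neither a word character
    // nor an apostrophe, then split on whitespace.
    static List<String> tokenize(String line) {
        String cleaned = line.toLowerCase().replaceAll("[^\\w']", " ");
        List<String> words = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(cleaned);
        while (itr.hasMoreTokens()) {
            words.add(itr.nextToken());
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("hello world, bye world.")); // prints [hello, world, bye, world]
        System.out.println(tokenize("Hello! can't declare,"));   // prints [hello, can't, declare]
    }
}
```

Note that lowercasing turns "Hello!" into "hello" rather than "Hello"; that matches what the mapper does, and whether the judge distinguishes case is not stated in the problem.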

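That's finally everything. Of course, what I've written here is just my own approach; if anyone has better ideas, please share them so we can all enjoy them. All of the programs above were accepted when I submitted them.

I can't rule out oversights or mistakes in my programs (ones that incomplete test data would have let through). If you can point any out, I'd be very grateful.

One last note: the programs were pulled straight from the site's submission archive, so the formatting isn't pretty. My apologies.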
