
How to Implement MapReduce in Java

Published: 2025-12-01. Author: 千家信息网 editors. Last updated: 2025-12-01.

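This article walks through a small Java MapReduce job on Hadoop. The job groups the lines of a tab-separated input file by their first field and tags each second field with the file offset of the line it came from.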

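Input file contents (the two fields on each line are separated by a tab, which is what the mapper splits on):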

a a1
b b2
c c3
d d4
a a1
b b2
c c3
d d4

Output:

a a1|0 a1|20
b b2|5 b2|25
c c3|10 c3|30
d d4|15 d4|35

Code:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

    // Emits (first field, second field + "|" + offset of the line in the file).
    // The type parameters are required: with a raw Mapper, map() below would
    // not override the framework's method and would never be called.
    public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, Text> {

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] oriSegs = value.toString().split("\t");
            String str = oriSegs[1] + "|" + key;
            context.write(new Text(oriSegs[0]), new Text(str));
        }
    }

    // Joins all values that share a key into one tab-separated string.
    public static class IntSumReducer extends Reducer<Text, Text, Text, Text> {

        @Override
        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            String out = "";
            for (Text val : values) {
                if (!out.equals("")) {
                    out += '\t';
                }
                out += val.toString();
            }
            context.write(key, new Text(out));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("mapred.job.queue.name", "platform");
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }

        Job job = new Job(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.setNumReduceTasks(1); // set reducer number
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
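The driver registers IntSumReducer as both the combiner and the reducer. Hadoop may apply a combiner zero or more times to partial groups of map output, so this is only safe if joining already-joined values gives the same result as joining everything at once. The following plain-Java check is a minimal sketch of that property, not part of the job itself: the class name CombinerCheck and the third value "a1|40" are made up for illustration, and it assumes the values are non-empty and arrive in the same order.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone check; it uses no Hadoop classes.
class CombinerCheck {

    // Same joining logic as the reducer: values separated by tabs.
    static String join(List<String> values) {
        StringBuilder out = new StringBuilder();
        for (String val : values) {
            if (out.length() > 0) {
                out.append('\t');
            }
            out.append(val);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Reducer sees all three values directly.
        String direct = join(Arrays.asList("a1|0", "a1|20", "a1|40"));

        // Combiner pre-joins two partial groups, then the reducer joins those.
        String partial1 = join(Arrays.asList("a1|0", "a1|20"));
        String partial2 = join(Arrays.asList("a1|40"));
        String combined = join(Arrays.asList(partial1, partial2));

        System.out.println(direct.equals(combined)); // prints true
    }
}
```

A reducer that computed something non-associative over its values, an average for example, could not be reused as a combiner this way.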

Compile: make.sh compiles the class and packages it into a jar file

javac -classpath /home/hadoop/hadoop-0.20.2-cdh4u0/hadoop-core-0.20.2-cdh4u0.jar:/home/hadoop/hadoop-0.20.2-cdh4u0/lib/commons-cli-1.2.jar -d wordcount_class WordCount.java
jar -cvf WordCount.jar -C wordcount_class/ .

Run the MapReduce job: exec.sh

IN=/user/zhumingliang/tanx_rtb_account/input
OUT=/user/zhumingliang/tanx_rtb_account/output/test
hadoop jar WordCount.jar WordCount $IN $OUT

Notes:

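For file input (the default TextInputFormat), the mapper's input key is the byte offset at which the line starts within the file, and the mapper's input value is the full text of that line.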
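To see where the numbers in the sample output come from, the offsets can be reproduced without Hadoop. This is a minimal sketch: LineOffsets is a hypothetical helper, and it assumes UTF-8 text with "\n" line endings and tab-separated fields.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: computes the offset at which
// each line of a text starts, i.e. the value the mapper receives as its key.
class LineOffsets {

    // Returns the starting byte offset of every line in content,
    // assuming UTF-8 encoding and '\n' line terminators.
    static List<Long> offsets(String content) {
        List<Long> result = new ArrayList<>();
        long offset = 0;
        for (String line : content.split("\n")) {
            result.add(offset);
            // +1 accounts for the '\n' that terminates the line.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "a\ta1\nb\tb2\nc\tc3\nd\td4\na\ta1\nb\tb2\nc\tc3\nd\td4\n";
        System.out.println(offsets(input));
        // prints [0, 5, 10, 15, 20, 25, 30, 35]
    }
}
```

Each sample line is 4 bytes plus a newline, so consecutive keys differ by 5, which matches the a1|0, b2|5, ... a1|20 values in the output listing.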

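The reducer's input key is a key emitted by the mapper. The reducer's input values are all the values the mapper emitted for that key, grouped by the framework and passed in as an Iterable.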
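The grouping step between map and reduce can be imitated in plain Java to make this concrete. The sketch below is an in-memory approximation under stated assumptions, not how Hadoop implements the shuffle: LocalShuffleDemo and its methods are hypothetical names, the input is UTF-8 with "\n" line endings, and values keep their input order, which a real job does not guarantee.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java imitation of map -> group by key -> reduce; no Hadoop classes.
class LocalShuffleDemo {

    // Map step plus grouping: for each line, emit
    // (field0, field1 + "|" + line offset) and collect values per key.
    static Map<String, List<String>> mapAndGroup(String content) {
        Map<String, List<String>> grouped = new TreeMap<>(); // keys sorted
        long offset = 0;
        for (String line : content.split("\n")) {
            String[] segs = line.split("\t");
            grouped.computeIfAbsent(segs[0], k -> new ArrayList<>())
                   .add(segs[1] + "|" + offset);
            // +1 accounts for the '\n' that terminates the line.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return grouped;
    }

    // Reduce step: one output line per key, values joined with tabs.
    static String reduce(String key, List<String> values) {
        return key + "\t" + String.join("\t", values);
    }

    public static void main(String[] args) {
        String input = "a\ta1\nb\tb2\nc\tc3\nd\td4\na\ta1\nb\tb2\nc\tc3\nd\td4\n";
        for (Map.Entry<String, List<String>> e : mapAndGroup(input).entrySet()) {
            System.out.println(reduce(e.getKey(), e.getValue()));
        }
    }
}
```

Run on the sample input, this prints the same four lines as the output listing, with the key and each value separated by tabs.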
