Amazon Web Services Developer Connection : Running Hadoop Map...

Bookmark History

Saved by 3 people (1 private), first by an anonymous user on 2007-09-06


Public Sticky notes

Hadoop actually comes with a library of stock maps and reducers, and in this case we could have used LongSumReducer, which does the same as our reducer.

Highlighted by http://www.diigo.com/profile/
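
To make the LongSumReducer note concrete, here is a minimal plain-Java sketch of what a summing reducer does for one key: it adds up every value it receives. It deliberately avoids the Hadoop classes so it runs on a bare JDK; the class name SumReducerSketch and the method reduce are hypothetical names for illustration, not Hadoop's API.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a summing reducer; not the Hadoop LongSumReducer class itself.
class SumReducerSketch {

    // Adds up all values that were grouped under a single key.
    static long reduce(Iterator<Long> values) {
        long sum = 0;
        while (values.hasNext()) {
            sum += values.next();
        }
        return sum;
    }

    public static void main(String[] args) {
        // Three occurrences of the same key, each emitted with the value 1.
        System.out.println(reduce(List.of(1L, 1L, 1L).iterator()));
    }
}
```

In a real job the framework supplies the key and the iterator of values; the sketch keeps only the summing logic.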

We didn't set the input types, since the defaults (org.apache.hadoop.io.LongWritable for the beginning of line character offsets, and org.apache.hadoop.io.Text for the lines) are what we need. Also, the input format and output format - how the input files are turned into key-value pairs, and how the output key-value pairs are turned into output files - are not specified since the default is to use text files (as opposed to using a more compact binary format).
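
As a rough illustration of the default text input described above, the following plain-Java sketch turns file contents into (offset, line) pairs, with a Long standing in for LongWritable and a String for Text. It assumes UTF-8 text with "\n" line endings and counts the offset in bytes; TextInputSketch and toRecords are hypothetical names, and the real record reader handles more cases than this.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of how text input becomes (offset, line) key-value pairs.
class TextInputSketch {

    // Key: offset of the start of the line; value: the line without its newline.
    static Map<Long, String> toRecords(String contents) {
        Map<Long, String> records = new LinkedHashMap<>();
        long offset = 0;
        int start = 0;
        while (start < contents.length()) {
            int end = contents.indexOf('\n', start);
            if (end < 0) {
                end = contents.length(); // last line has no trailing newline
            }
            String line = contents.substring(start, end);
            records.put(offset, line);
            // Advance past the line's bytes plus one byte for the newline.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
            start = end + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        toRecords("hello world\nfoo\n")
            .forEach((key, value) -> System.out.println(key + "\t" + value));
    }
}
```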

We also set the Combiner class. A Combiner is just a Reduce task that runs in the same process as the Map task after the Map task has finished.
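
The effect of a Combiner can be sketched without Hadoop: each map task's output is summed locally before it is passed on, and the final reduce then sums the partial totals. The example below assumes a word-count style job, which is an assumption made for illustration, and CombinerSketch, map, sumByKey and run are hypothetical names. Because addition is associative and commutative, the result is the same with or without the combiner; only the number of pairs passed to the reduce step changes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simulation of map -> (optional combine) -> reduce in plain Java.
class CombinerSketch {

    // Map step (assumed word count): emit (word, 1) for every word in the input.
    static List<Map.Entry<String, Long>> map(String input) {
        List<Map.Entry<String, Long>> out = new ArrayList<>();
        for (String word : input.split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(Map.entry(word, 1L));
            }
        }
        return out;
    }

    // Shared by the combiner and the reducer: sum the values for each key.
    static Map<String, Long> sumByKey(List<Map.Entry<String, Long>> pairs) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map.Entry<String, Long> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Long::sum);
        }
        return totals;
    }

    // Each element of mapTaskInputs plays the role of one map task's input.
    static Map<String, Long> run(List<String> mapTaskInputs, boolean useCombiner) {
        List<Map.Entry<String, Long>> shuffled = new ArrayList<>();
        for (String input : mapTaskInputs) {
            List<Map.Entry<String, Long>> mapOutput = map(input);
            if (useCombiner) {
                // Combine locally, so only one pair per distinct key leaves this task.
                shuffled.addAll(sumByKey(mapOutput).entrySet());
            } else {
                shuffled.addAll(mapOutput);
            }
        }
        return sumByKey(shuffled); // final reduce over everything that was passed on
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("a b a", "b a"), true));
    }
}
```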

When you run the main method of the job it will use a local job runner that runs Hadoop in the same JVM, which allows you to run a debugger, should you need to.

they may be input to a further MapReduce job.
