Amazon Web Services Developer Connection : Running Hadoop Map...
Hadoop actually comes with a library of stock mappers and reducers, and in this case we could have used LongSumReducer, which does the same thing as our reducer.
Highlighted by http://www.diigo.com/profile/
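To make the "does the same as our reducer" remark concrete, here is a minimal plain-Java sketch of what a summing reducer does: it receives every value emitted for one key and returns their total. The class name `SumReducerSketch` and the `reduce` signature are hypothetical and use no Hadoop classes; the real LongSumReducer works on Hadoop's LongWritable values instead of plain longs.

```java
import java.util.Arrays;

// Hypothetical stand-in for a summing reducer; not the Hadoop API.
final class SumReducerSketch {
    // Reduce step: collapse all counts emitted for one key into a single total.
    // The key is unused in the arithmetic; it only identifies the group.
    static long reduce(String key, Iterable<Long> values) {
        long sum = 0L;
        for (long v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Three map outputs for the same key are folded into one output record.
        long total = reduce("example-key", Arrays.asList(1L, 1L, 1L));
        System.out.println("example-key\t" + total);
    }
}
```

Because the logic is the same for any job that adds up long counts, it is a natural candidate for a stock library class.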
We didn't set the input types, since the defaults (org.apache.hadoop.io.LongWritable for the beginning of line character offsets, and org.apache.hadoop.io.Text for the lines) are what we need. Also, the input format and output format (how the input files are turned into key-value pairs, and how the output key-value pairs are turned into output files) are not specified, since the default is to use text files (as opposed to using a more compact binary format).
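As an illustration of those defaults, the sketch below turns the contents of a text file into (offset, line) pairs, roughly the shape of the records a map function would see. It is plain Java, not Hadoop code: `TextRecordsSketch` and `toRecords` are hypothetical names, a Java `Long` and `String` stand in for LongWritable and Text, and the sketch assumes `\n` line endings and counts offsets in UTF-8 bytes (which matches character offsets for ASCII input).

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of text input records; not the Hadoop API.
final class TextRecordsSketch {
    // Key: offset at which the line starts. Value: the line without its newline.
    static Map<Long, String> toRecords(String fileContents) {
        Map<Long, String> records = new LinkedHashMap<>();
        if (fileContents.isEmpty()) {
            return records; // an empty file yields no records
        }
        long offset = 0L;
        for (String line : fileContents.split("\n")) {
            records.put(offset, line);
            // Advance past the line's bytes plus the one-byte newline (assumed "\n").
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        for (Map.Entry<Long, String> r : toRecords("hello world\nfoo\n").entrySet()) {
            System.out.println(r.getKey() + "\t" + r.getValue());
        }
    }
}
```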
We also set the Combiner class. A Combiner is just a Reduce task that runs in the same process as the Map task after the Map task has finished.
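The effect of a combiner can be sketched in plain Java. The example assumes a word-count style job (an assumption chosen for illustration, not taken from the excerpt) and uses hypothetical names throughout: each input split is mapped to (word, 1) pairs, those pairs are summed locally per split, and only the partial sums are passed on to the final reduce. Reusing the summing logic for both steps is safe here because addition is associative and commutative.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical map -> combine -> reduce simulation; not the Hadoop API.
final class CombinerSketch {
    // Map step for one split: emit (word, 1) for every whitespace-separated word.
    static List<Map.Entry<String, Long>> map(String split) {
        List<Map.Entry<String, Long>> out = new ArrayList<>();
        for (String word : split.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(word, 1L));
            }
        }
        return out;
    }

    // Sum values per key; used as both the combiner and the reducer.
    static Map<String, Long> sumByKey(List<Map.Entry<String, Long>> pairs) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map.Entry<String, Long> p : pairs) {
            totals.merge(p.getKey(), p.getValue(), Long::sum);
        }
        return totals;
    }

    // Combine each split's map output locally, then reduce the partial sums.
    static Map<String, Long> runWithCombiner(List<String> splits) {
        List<Map.Entry<String, Long>> shuffled = new ArrayList<>();
        for (String split : splits) {
            shuffled.addAll(sumByKey(map(split)).entrySet());
        }
        return sumByKey(shuffled);
    }

    public static void main(String[] args) {
        System.out.println(runWithCombiner(Arrays.asList("a b a", "b a")));
    }
}
```

In this run the first split emits three pairs but forwards only two after combining, which is the saving a combiner is meant to provide; the final totals are unchanged.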
When you run the main method of the job it will use a local job runner that runs Hadoop in the same JVM, which allows you to run a debugger, should you need to.
they may be input to a further MapReduce job.

