Big Data Big Data Big Data Technologies Hadoop Tutorials

Hadoop Tutorial – MapReduce function for native data access


MapReduce is the elementary data access for Hadoop. MapReduce provides the fastest way in terms of performance, but maybe not in terms of time-to-market. Writing MapReduce queries might be trickier than Hive or Pig. Other projects, such as Hive or Pig translate the code you entered into native MapReduce queries and therefore often come with a tradeoff.

A typical MapReduce function follows the following process:

  • The Input Data is distributed on different Map-processes. The Map processes work on the provided Map-function.
  • The Map-processes are executed in parallel.
  • Each Map-process issues intermediate results. These results are stored, which is often called the shuffle-phase.
  • Once all intermediate results are available, the Map-function has finished and the reduce function starts.
  • The Reduce-function works on the intermediate results. The Reduce-function is also provided by the user (just like the Map-function).

A classical way to demonstrate MapReduce is via the Word-count example. The following listing will show this.

map(String name, String content):

for each word w in content:

EmitIntermediate(w, 1);

reduce(String word, Iterator intermediateList):

int result = 0;

for each v in intermediateList:

result++;

Emit(word, result);

I lead a team of Senior Experts in Data & Data Science as Head of Data & Analytics and AI at A1 Telekom Austria Group. I also teach this topic at various universities and frequently speak at various Conferences. In 2010 I wrote a book about Cloud Computing, which is often used at German & Austrian Universities. In my home country (Austria) I am part of several organisations on Big Data & Data Science.

0 comments on “Hadoop Tutorial – MapReduce function for native data access

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: