Apache YARN could almost be called “the answer to everything”. It takes care of most of the cluster-level work in Hadoop, and you will usually be using it without noticing. YARN is the central point of contact for running applications on a Hadoop cluster; among other things, it executes all MapReduce jobs. YARN takes care of:

  • Resource Management
  • Job Management
  • Job Tracking
  • Job Scheduling

YARN is built from three major components. The first is the ResourceManager, which distributes the cluster's resources among the individual applications. Next is the NodeManager, which runs on every worker node, launches the containers in which tasks run, and reports the node's resource usage. The third is the Application Master. Each application has its own Application Master, which requests resources from the ResourceManager and works with the NodeManagers to run and monitor the application's tasks. An Application Master typically manages one or more tasks.
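The division of work between the three components can be sketched with a small toy model. This is only an illustration: the classes below are hypothetical stand-ins and not the Hadoop YARN API, and the first-fit allocation is an assumption made for the example, not how YARN's real schedulers decide.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: these classes are hypothetical and are NOT the Hadoop YARN API.
public class YarnToyModel {

    /** Stands in for a NodeManager: tracks the free memory of one worker node. */
    static final class NodeManager {
        final String host;
        int freeMemoryMb;

        NodeManager(String host, int freeMemoryMb) {
            this.host = host;
            this.freeMemoryMb = freeMemoryMb;
        }
    }

    /** Stands in for the ResourceManager: hands out containers on registered nodes. */
    static final class ResourceManager {
        private final List<NodeManager> nodes = new ArrayList<>();

        void register(NodeManager node) {
            nodes.add(node);
        }

        /** First-fit allocation (an assumption); returns the host, or null if no node has room. */
        String allocate(int memoryMb) {
            for (NodeManager node : nodes) {
                if (node.freeMemoryMb >= memoryMb) {
                    node.freeMemoryMb -= memoryMb;
                    return node.host;
                }
            }
            return null;
        }
    }

    /** Stands in for an Application Master: requests one container per task. */
    static final class ApplicationMaster {
        Map<String, String> run(ResourceManager rm, List<String> tasks, int memoryMbPerTask) {
            Map<String, String> placement = new LinkedHashMap<>();
            for (String task : tasks) {
                String host = rm.allocate(memoryMbPerTask);
                // A task without a container stays pending until resources free up.
                placement.put(task, host == null ? "PENDING" : host);
            }
            return placement;
        }
    }

    /** Fixed demo: node-1 has 2048 MB, node-2 has 1024 MB, four tasks need 1024 MB each. */
    static Map<String, String> demo() {
        ResourceManager rm = new ResourceManager();
        rm.register(new NodeManager("node-1", 2048));
        rm.register(new NodeManager("node-2", 1024));
        List<String> tasks = List.of("map-0", "map-1", "map-2", "reduce-0");
        return new ApplicationMaster().run(rm, tasks, 1024);
    }

    public static void main(String[] args) {
        // prints {map-0=node-1, map-1=node-1, map-2=node-2, reduce-0=PENDING}
        System.out.println(demo());
    }
}
```

In the demo, the cluster has room for three 1024 MB containers, so the fourth task stays pending. A real Application Master would keep asking the ResourceManager until a container becomes available.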

YARN components

The following image displays a common workflow in YARN.

YARN architecture

YARN is used by many other projects in the Hadoop ecosystem, such as Hive and Pig. You can access YARN from Java applications or through a REST interface.
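As a minimal sketch of the REST route, the following Java program asks the ResourceManager for its running applications through the `/ws/v1/cluster/apps` endpoint. The host name `rm.example.com` is a placeholder, port 8088 is the ResourceManager's default web port and may differ on your cluster, and the class and method names are made up for this example. It assumes Java 11 or later and a cluster without authentication.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class YarnRestSketch {

    /** Builds the ResourceManager URL that lists applications in the given state. */
    static URI appsUri(String rmHost, int rmPort, String state) {
        return URI.create("http://" + rmHost + ":" + rmPort
                + "/ws/v1/cluster/apps?states=" + state);
    }

    /** Sends a GET request and returns the response body (JSON) as a string. */
    static String fetch(URI uri) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // rm.example.com is a placeholder; pass your ResourceManager host as the first argument.
        String host = args.length > 0 ? args[0] : "rm.example.com";
        URI uri = appsUri(host, 8088, "RUNNING");
        System.out.println("GET " + uri);
        if (args.length > 0) {
            // Only contact a cluster when a real host was given on the command line.
            System.out.println(fetch(uri));
        }
    }
}
```

Run without arguments, the program only prints the URL it would request. Run with a host name, it prints the JSON document returned by the ResourceManager.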
