Hadoop Tutorial – Accessing streaming data with Apache Storm


Apache Storm is responsible for analyzing streaming data in Hadoop. Storm is extremely powerful at analyzing streaming data and works in near real time. Storm was initially developed by Twitter to power their streaming API. At present, Storm can process about 1 million tuples per second per node, and the nice thing about Storm is that it scales linearly. The Storm architecture is similar to that of other Hadoop projects, but Storm introduces its own components. First, there is Nimbus: the controller for Storm, comparable to the JobTracker in Hadoop. Apache Storm also utilizes ZooKeeper for coordination. A Supervisor runs on each worker node and takes care of the tuples once they come in. The following figure shows this.

The major concepts in Apache Storm are four elements: streams, spouts, bolts and topologies. A stream is an unbounded sequence of tuples, a spout is a source of streams, a bolt processes input streams and creates new output streams, and a topology is a network of spouts and bolts that defines the overall computation.
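To make these four concepts concrete, here is a minimal sketch of a topology in Java: a spout that emits sentences, a bolt that splits them into words, and a TopologyBuilder wiring the two together. It assumes the pre-1.0 `backtype.storm` package names that were current at the time; in Storm 1.x and later the same classes live under `org.apache.storm`. The class and component names are illustrative, not from the original post.

```java
// Minimal Storm topology sketch: one spout emitting sentences, one bolt splitting them.
import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.spout.SpoutOutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.topology.base.BaseRichSpout;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

import java.util.Map;

public class SentenceTopology {

    // A spout is a source of streams: here it simply emits the same sentence repeatedly.
    public static class SentenceSpout extends BaseRichSpout {
        private SpoutOutputCollector collector;

        @Override
        public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
            this.collector = collector;
        }

        @Override
        public void nextTuple() {
            collector.emit(new Values("storm processes streams of tuples"));
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("sentence"));
        }
    }

    // A bolt consumes input streams and emits new streams: here it splits sentences into words.
    public static class SplitBolt extends BaseBasicBolt {
        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
            for (String word : input.getStringByField("sentence").split(" ")) {
                collector.emit(new Values(word));
            }
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("word"));
        }
    }

    public static void main(String[] args) throws Exception {
        // A topology wires spouts and bolts into a processing graph.
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("sentences", new SentenceSpout(), 1);
        builder.setBolt("split", new SplitBolt(), 2).shuffleGrouping("sentences");

        // Run locally for testing; on a real cluster Nimbus distributes the work
        // and a Supervisor on each node runs the spout and bolt tasks.
        LocalCluster cluster = new LocalCluster();
        cluster.submitTopology("sentence-topology", new Config(), builder.createTopology());
    }
}
```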


Apache Storm is now a Top Level Hadoop Project – but what is it?


The Apache Software Foundation announced that Apache Storm is now a top-level project. But what is Apache Storm about? Basically, Apache Storm is a project for analyzing data streams in near real time. Storm works with messages and analyzes what is going on as the data arrives. Storm originates from Twitter, which uses it to power its streaming API. Storm is about processing time-critical data, and it guarantees that your data gets processed; it is fault tolerant and scalable. Apache Storm is useful for fraud protection in gambling, banking and financial services, but not only there: Storm can be used wherever real-time or time-critical applications are necessary. At the moment, Storm can process about 1 million tuples per second per node. This is massive, given that Storm is all about scaling out. Imagine adding 100 nodes! Apache Storm works with tuples that come from spouts; a spout typically reads from a messaging system such as Apache Kafka.
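As a rough illustration of how a Kafka-backed spout feeds a topology, the following sketch uses the storm-kafka module from that era (ZkHosts, SpoutConfig, KafkaSpout). The ZooKeeper host, topic name and the commented-out FraudCheckBolt are placeholders of my own, not details from the original post.

```java
// Sketch: reading a stream of tuples from Kafka via the pre-1.0 storm-kafka module.
import backtype.storm.Config;
import backtype.storm.StormSubmitter;
import backtype.storm.spout.SchemeAsMultiScheme;
import backtype.storm.topology.TopologyBuilder;
import storm.kafka.BrokerHosts;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.StringScheme;
import storm.kafka.ZkHosts;

public class KafkaIngestTopology {
    public static void main(String[] args) throws Exception {
        // The spout discovers Kafka brokers through ZooKeeper (placeholder host below).
        BrokerHosts hosts = new ZkHosts("zookeeper-host:2181");
        SpoutConfig spoutConfig = new SpoutConfig(hosts, "events", "/kafka-spout", "event-reader");
        spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("kafka-events", new KafkaSpout(spoutConfig), 2);
        // Downstream bolts (fraud checks, aggregations, ...) would attach here, e.g.:
        // builder.setBolt("fraud-check", new FraudCheckBolt(), 4).shuffleGrouping("kafka-events");

        StormSubmitter.submitTopology("kafka-ingest", new Config(), builder.createTopology());
    }
}
```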
