Hadoop Tutorial – Apache Accumulo

Apache Accumulo is another NoSQL database in the Hadoop stack. It is based on Google’s Bigtable design and is a sorted, distributed key/value store.
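To illustrate what "sorted" buys us, here is a minimal sketch in plain Java. It is not Accumulo code: `SortedStoreSketch`, its methods and the sample keys are made up for illustration, and a `TreeMap` merely stands in for a sorted store on a single machine. The point is that when keys are kept in order, a range of related keys can be read as one contiguous scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy in-memory stand-in for a sorted key/value store (hypothetical, not Accumulo). */
class SortedStoreSketch {
    // TreeMap keeps its entries ordered by key at all times.
    private final TreeMap<String, String> entries = new TreeMap<>();

    void put(String key, String value) {
        entries.put(key, value);
    }

    /** Returns the values whose keys lie in [startInclusive, endExclusive), in key order. */
    List<String> scan(String startInclusive, String endExclusive) {
        return new ArrayList<>(entries.subMap(startInclusive, endExclusive).values());
    }

    public static void main(String[] args) {
        SortedStoreSketch store = new SortedStoreSketch();
        // Insertion order does not matter; the store sorts by key.
        store.put("user:003", "Carol");
        store.put("user:001", "Alice");
        store.put("order:17", "Book");
        store.put("user:002", "Bob");
        // ';' is the character right after ':' so this range covers every "user:" key.
        System.out.println(store.scan("user:", "user;")); // prints [Alice, Bob, Carol]
    }
}
```

In a distributed store the same idea applies across machines: because the keys are sorted, a range scan only has to touch the servers that hold that part of the key space.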

Key/value stores do not fundamentally operate on rows, but it is still possible to query them by row, often at a performance cost. Accumulo allows us to query large rows that typically would not fit into memory.

Accumulo is also built for high availability, scalability and fault tolerance. Of the ACID properties, Accumulo supports isolation. This basically means that data inserted after a query was sent does not show up in the results of that query.

Accumulo has a plugin-based architecture and provides a comprehensive API. With Accumulo, it is possible to run MapReduce jobs as well as bulk and batch operations.

The following figure outlines how a key/value pair is structured in Accumulo. The key consists of the row ID, a column specifier and a timestamp. The column contains information about the column family, the qualifier and the visibility.

[Figure: structure of a key/value pair in Apache Accumulo]
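The key layout described above can be sketched in plain Java. `KeySketch` is a hypothetical class invented for this illustration, not Accumulo's own `Key` class, and it compares strings where Accumulo compares raw bytes. It assumes the ordering Accumulo documents for its keys: row, column family, qualifier and visibility ascending, then timestamp descending, so the newest version of a cell comes first.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical model of an Accumulo-style key (illustration only). */
final class KeySketch implements Comparable<KeySketch> {
    final String row;
    final String family;
    final String qualifier;
    final String visibility;
    final long timestamp;

    KeySketch(String row, String family, String qualifier, String visibility, long timestamp) {
        this.row = row;
        this.family = family;
        this.qualifier = qualifier;
        this.visibility = visibility;
        this.timestamp = timestamp;
    }

    @Override
    public int compareTo(KeySketch other) {
        int c = row.compareTo(other.row);
        if (c == 0) c = family.compareTo(other.family);
        if (c == 0) c = qualifier.compareTo(other.qualifier);
        if (c == 0) c = visibility.compareTo(other.visibility);
        // Arguments swapped on purpose: larger (newer) timestamps sort first.
        if (c == 0) c = Long.compare(other.timestamp, timestamp);
        return c;
    }

    public static void main(String[] args) {
        List<KeySketch> keys = new ArrayList<>();
        keys.add(new KeySketch("row1", "columnFamily", "columnQualifier", "public", 100L));
        keys.add(new KeySketch("row1", "columnFamily", "columnQualifier", "public", 200L));
        keys.add(new KeySketch("row0", "columnFamily", "columnQualifier", "public", 50L));
        Collections.sort(keys);
        for (KeySketch k : keys) {
            System.out.println(k.row + "@" + k.timestamp);
        }
        // prints row0@50, then row1@200, then row1@100
    }
}
```

Sorting the timestamp in descending order means a reader that only wants the latest value of a cell can stop at the first matching key.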

The next sample shows what Accumulo code looks like. It builds a mutation that writes a text value into one cell of a row. To actually store the data, the mutation is then handed to a BatchWriter.

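import java.nio.charset.StandardCharsets;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.ColumnVisibility;
import org.apache.hadoop.io.Text;

// Row ID and column coordinates of the cell to write
Text uid = new Text("columid");
Text family = new Text("columnFamily");
Text qualifier = new Text("columnQualifier");
ColumnVisibility visibility = new ColumnVisibility("public");
long timestamp = System.currentTimeMillis();

// The value is stored as raw bytes
Value value = new Value("Here is my text".getBytes(StandardCharsets.UTF_8));

// A mutation collects the changes for a single row
Mutation mutation = new Mutation(uid);
mutation.put(family, qualifier, visibility, timestamp, value);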

I lead a team of Senior Experts in Data & Data Science as Head of Data & Analytics and AI at A1 Telekom Austria Group. I also teach this topic at various universities and frequently speak at various Conferences. In 2010 I wrote a book about Cloud Computing, which is often used at German & Austrian Universities. In my home country (Austria) I am part of several organisations on Big Data & Data Science.
