
Hadoop Tutorial – Apache HBase

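HBase is one of the most popular databases in the Hadoop and NoSQL ecosystem. It is a highly scalable database that, in terms of the CAP theorem, favours consistency and partition tolerance: reads and writes to a row are strongly consistent, while the data of a failed region server is briefly unavailable until its regions are reassigned. In case you aren't familiar with the CAP theorem: it describes three properties of a distributed data store, namely consistency, availability and partition tolerance. When a network partition occurs, a system cannot guarantee both consistency and availability, so in practice you get two of the three and trade off the third.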

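HBase stores data as key/value pairs: each value is addressed by a row key, a column (column family plus qualifier) and a timestamp. Only the column families of a table have to be defined up front; the columns inside a family are not fixed by a schema, which gives you much more flexibility than a traditional relational database. HBase takes care of failover and sharding of data for you.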
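To make the key/value model more concrete, here is a minimal sketch in plain Python. It is a toy in-memory model, not the HBase client API: the class `ToyTable`, its methods and the sample row keys are all made up for illustration. It assumes what HBase requires in practice, namely that column families are declared when the table is created, while the columns inside a family can differ from row to row.

```python
from collections import defaultdict


class ToyTable:
    """Toy in-memory model of an HBase-style table (hypothetical, not the HBase API)."""

    def __init__(self, column_families):
        # Column families are fixed when the table is created.
        self.column_families = set(column_families)
        # row key -> "family:qualifier" -> list of (timestamp, value), newest first
        self.rows = defaultdict(dict)

    def put(self, row_key, family, qualifier, value, timestamp):
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        column = f"{family}:{qualifier}"
        versions = self.rows[row_key].setdefault(column, [])
        versions.append((timestamp, value))
        # Keep the newest version at the front of the list.
        versions.sort(key=lambda version: version[0], reverse=True)

    def get(self, row_key, family, qualifier):
        """Return the newest value of a cell, or None if the cell is absent."""
        versions = self.rows.get(row_key, {}).get(f"{family}:{qualifier}")
        return versions[0][1] if versions else None

    def scan(self):
        """Yield (row key, column names) in sorted row-key order."""
        for row_key in sorted(self.rows):
            yield row_key, sorted(self.rows[row_key])


table = ToyTable(column_families=["info"])
# Rows in the same table may carry different columns.
table.put("user#002", "info", "email", "bob@example.com", timestamp=1)
table.put("user#001", "info", "name", "Ada", timestamp=1)
# A later write to the same cell adds a new version instead of a new column.
table.put("user#001", "info", "name", "Ada L.", timestamp=2)
```

Reading `table.get("user#001", "info", "name")` returns the newest version of that cell, and `table.scan()` walks the rows ordered by row key, which is why the choice of row key matters so much when designing an HBase table.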

HBase uses HDFS for storage and ZooKeeper for coordination. Several region servers, each serving a share of the table data, are controlled by a master server. This is displayed in the next image.

[Image: Apache HBase]
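The interplay of regions, region servers and the master can be sketched in a few lines of Python. This is only a simplified model under stated assumptions: the start keys, the server names and both functions are hypothetical, and a real HBase client finds regions through ZooKeeper and the meta table rather than a local list. The idea it illustrates is that a table is split into row-key ranges (regions), each region is hosted by one region server, and regions of a failed server are handed to another server.

```python
import bisect

# Hypothetical layout: region i covers row keys from its start key up to
# (but not including) the start key of region i + 1. The empty string is
# the start key of the first region.
REGION_START_KEYS = ["", "g", "p"]
REGION_SERVERS = ["regionserver-1", "regionserver-2", "regionserver-3"]


def locate_region_server(row_key):
    """Return the server hosting the region whose key range contains row_key."""
    index = bisect.bisect_right(REGION_START_KEYS, row_key) - 1
    return REGION_SERVERS[index]


def reassign_regions(failed_server, replacement):
    """Mimic the master handing a failed server's regions to another server."""
    for index, server in enumerate(REGION_SERVERS):
        if server == failed_server:
            REGION_SERVERS[index] = replacement
```

With this layout, `locate_region_server("hadoop")` resolves to the second region. After `reassign_regions("regionserver-2", "regionserver-3")` the same row key resolves to `regionserver-3`, which mirrors, in a very reduced form, what failover looks like from the client's point of view.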

I lead a team of senior experts in data and data science as Head of Data & Analytics and AI at A1 Telekom Austria Group. I also teach this topic at several universities and frequently speak at conferences. In 2010 I wrote a book about cloud computing, which is often used at German and Austrian universities. In my home country (Austria) I am part of several organisations on Big Data and Data Science.
