Apache HBase is an open-source, non-relational, distributed database system built on top of HDFS (the Hadoop Distributed File System). HBase is modeled after Google's Bigtable and is written in Java. It was developed as part of Apache's Hadoop project. It provides a fault-tolerant way of storing sparse data, that is, small numbers of non-empty items scattered among large numbers of empty ones. HBase is used when real-time read/write access to very large databases is required. The HBase project was started at the end of 2006 by Chad Walters and Jim Kellerman at Powerset. The main purpose of HBase is to process large amounts of data. Mike Cafarella wrote the initial code for the working system, and Jim Kellerman later carried it to the next stage. HBase was first released as part of Hadoop 0.15.0 in October 2007. The project's goal was to host very large tables, on the order of billions of rows by millions of columns. In May 2010, HBase graduated to become an Apache Top-Level Project. Several companies, such as Adobe, Twitter, Yahoo, and Trend Micro, use this database. Facebook, a social networking site, has implemented its messenger application using HBase. This document explains how HBase works and how it differs from other databases. It highlights current challenges in data security and proposes a couple of models for security and levels of data access to overcome those challenges. It also discusses workload challenges and techniques to overcome them. Finally, it gives an overview of how HBase has been implemented in a real-time application, the Facebook Messenger app.
Approved and recommended for acceptance as a project in partial fulfillment of the requirements for the degree of Master of Science in Computer and Information Science.