In the introduction to this book, we discussed some of the principles behind the design of distributed systems and the inherent trade-offs we have to choose between when setting out to build one.
How does HBase make those trade-offs? Which aspects of its architecture do these design choices affect, and what effect do they have on the set of use cases it is a good fit for?
At this point, we already know that HBase range-partitions the key space, dividing it into key ranges that are assigned to different regions. The META table records these range assignments. This differs from Cassandra, which uses consistent hashing and has no central state store that captures data placement.
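To make the range lookup concrete, here is a minimal sketch in Python of how a client could resolve a row key to a region from a sorted list of region start keys. The `META` contents, the server names, and the `locate_region` helper are all hypothetical and exist only for illustration; this is not HBase's actual client code.

```python
from bisect import bisect_right

# Hypothetical META contents: (region start key, hosting RegionServer),
# sorted by start key. The empty start key marks the first region.
META = [
    ("", "rs1.example.com"),
    ("g", "rs2.example.com"),
    ("p", "rs3.example.com"),
]
START_KEYS = [start for start, _ in META]


def locate_region(row_key):
    """Return (start_key, server) for the region whose range contains row_key."""
    # bisect_right finds the first start key strictly greater than row_key;
    # the region just before that position owns the key.
    index = bisect_right(START_KEYS, row_key) - 1
    return META[index]
```

Because the ranges are sorted, neighbouring keys such as `"apple"` and `"avocado"` resolve to the same region, which is what makes range scans cheap under this scheme. A consistent-hashing design would scatter them instead.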
We already know that HBase is an LSM database, converting random writes into a stream of append operations. This allows it to achieve higher write throughput than conventional databases, and it is also what makes layering on top of HDFS possible.
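The idea of turning random writes into appends can be sketched with a toy store. The `TinyLsmStore` class below is a hypothetical, heavily simplified model: every write is appended to a log and buffered in memory, and a full buffer is flushed as one sorted, immutable file. It leaves out compaction, deletes, and durability, and it is not how HBase itself is implemented.

```python
class TinyLsmStore:
    """Toy LSM-style store: append to a log, buffer in memory, flush sorted files."""

    def __init__(self, flush_threshold=3):
        self.log = []        # append-only write-ahead log
        self.memstore = {}   # in-memory buffer of recent writes
        self.files = []      # immutable sorted files, oldest first
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        # A write is a sequential append plus an in-memory update;
        # nothing already written to a file is modified in place.
        self.log.append((key, value))
        self.memstore[key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the whole buffer out as a single sorted, immutable file.
        self.files.append(tuple(sorted(self.memstore.items())))
        self.memstore = {}

    def get(self, key):
        # Newest data wins: check the buffer first, then files from newest to oldest.
        if key in self.memstore:
            return self.memstore[key]
        for sorted_file in reversed(self.files):
            for file_key, value in sorted_file:
                if file_key == key:
                    return value
        return None
```

Since files are only ever created whole and never updated, a design like this fits a file system that favours sequential, write-once files. The cost moves to the read path, which may have to consult several files.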
How does it handle other trade-offs, such as consistency versus availability?
HBase, like Bigtable, chooses consistency over availability. In HBase, a given key belongs to a single region, and a region is hosted on a single RegionServer. As a result, all operations on a key are processed by one RegionServer. This ensures strong consistency, since a single arbiter handles every update to a given record. It is also what makes operations such as increment and checkAndPut possible: exactly one RegionServer handles both the read and the write phase of a read-modify-write operation, so it can ensure that no update to the data happens outside its control.
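The single-arbiter argument can be illustrated with a small sketch. The `SingleOwnerRegion` class below is a hypothetical stand-in for the one server that owns a key range; because every read-modify-write passes through it, it can serialize them with a local lock. Using one lock for the whole region is a simplification made for brevity, and the method names only loosely mirror the HBase operations mentioned above.

```python
import threading


class SingleOwnerRegion:
    """Hypothetical stand-in for the single server that owns a key range."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def increment(self, row, amount=1):
        # The read and the write happen under the same lock, so no other
        # update to the row can slip in between them.
        with self._lock:
            new_value = self._rows.get(row, 0) + amount
            self._rows[row] = new_value
            return new_value

    def check_and_put(self, row, expected, new_value):
        # Write only if the current value still matches what the caller expects.
        with self._lock:
            if self._rows.get(row) != expected:
                return False
            self._rows[row] = new_value
            return True

    def get(self, row):
        with self._lock:
            return self._rows.get(row)


def hammer(region, row, times):
    """Issue many increments against one row, as a concurrent client would."""
    for _ in range(times):
        region.increment(row)
```

If several threads run `hammer` against the same row, no increment is lost, because there is only one place where the counter can change. With multiple independent owners of the same key, the same guarantee would require coordination between them.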
However, this stronger notion of consistency comes at the cost of availability. If the RegionServer goes down, there is a period of downtime before a new RegionServer can take over ownership of the regions. On the face of it, it might seem like this makes HBase unsuitable for mission-critical applications. However, there is some difference between theoretical and practical notions of availability, and we'll discuss techniques to improve the availability of HBase clusters in a later section.
Our second consideration was about the transaction model and the isolation levels that the database seeks to provide.
HBase provides ACID guarantees only at the row level: updates to multiple columns in the same row can be applied atomically, but updates spanning multiple rows cannot. This means a partially applied update to a row is never visible to other concurrent transactions operating on that row. However, once our transaction has modified row R1 and moves on to the next row, R2, the updated R1 is immediately visible to other transactions reading R1, even though the update to R2 has not yet been applied.
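The R1/R2 scenario can be sketched as follows. `RowAtomicTable`, `transfer`, and the `observer` callback are hypothetical names used only for this illustration: each row update replaces all of its columns in one step, but nothing ties the two row updates together, so a reader in between sees the new R1 next to the old R2.

```python
class RowAtomicTable:
    """Toy table where a multi-column update to one row is applied in one step."""

    def __init__(self):
        self._rows = {}

    def put_row(self, row, columns):
        # Build the new version of the row off to the side, then swap it in
        # with a single assignment so all columns change together.
        updated = dict(self._rows.get(row, {}))
        updated.update(columns)
        self._rows[row] = updated

    def get_row(self, row):
        return dict(self._rows.get(row, {}))


def transfer(table, observer):
    """Update two rows in sequence, letting another client look in between."""
    table.put_row("R1", {"balance": 50, "status": "debited"})
    observer(table)  # a concurrent reader arriving between the two row updates
    table.put_row("R2", {"balance": 150, "status": "credited"})
```

Here the observer always sees both columns of R1 change together, but it can also see a state in which R1 has been debited and R2 has not yet been credited. Applications that cannot tolerate such an intermediate state need to keep the related data in a single row or handle the inconsistency themselves.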
Finally, HBase is a type of row store. To be specific, it is what we'd call a column-family oriented store. We'll discuss what this means in our next section on the HBase data model.