In any enterprise looking to build mission-critical applications on top of HBase, the main questions on everybody's minds are Is the database reliable? What if it goes down? Under what conditions does it fail? How long will it take for the system to be functional again? Will there be lingering effects?
Let's try and understand each piece of this puzzle. As we've discussed, HBase favors strong consistency, and consequently makes a single RegionServer responsible for all reads and writes for a given key. When that RegionServer goes down, we lose access to the data stored within it and are unable to perform reads and writes with that data. However, since the underlying data is stored on HDFS, the loss of access is only temporary. Once the regions are reassigned to a different RegionServer, reads and writes can resume.
What exactly are the recovery steps involved when a RegionServer goes down? The first step in addressing a failure is to detect that a failure has occurred. Each RegionServer sends a heartbeat to Zookeeper periodically. When Zookeeper doesn't get a heartbeat over a certain period of time, it concludes that the RegionServer has gone offline and notifies the master to initiate corrective steps.
Region reassignment would be easy if all we had to do was recover the HFiles on HDFS. Since the HFiles are on HDFS, new RegionServers can access existing HFiles right away. The issue, however, is that the contents of the Memstore are lost upon RS failure. This has to be recovered before the regions can be brought online. The contents of the Memstore can be reconstructed by replaying edits in the commit log. Since the commit log is also stored in HDFS, it can be accessed remotely from any one of the three DataNodes that maintain a copy of it. The primary challenge is that the commit log has the edits interspersed for all of the regions that are hosted on the RegionServer. However, these regions might be reassigned to different RegionServers upon recovery to prevent undue load on any one RegionServer.
To achieve this, the commit log has to be split by region. Once split, per-region HFiles are generated from the commit log and placed back into HDFS. At this point, all the regions on the RegionServer have up-to-date HFiles. The regions can now be reassigned to new RegionServers (essentially a metadata operation), where they are opened and made available for reads and writes.