Seven Databases in Seven Weeks, 2nd Edition

It Starts with a Question

The central question of Seven Databases in Seven Weeks is this: what database or combination of databases best resolves your problem? If you walk away understanding how to make that choice, given your particular needs and resources at hand, we’re happy.

But to answer that question, you’ll need to understand your options. To that end, we’ll take you on a deep dive—one that is both conceptual and practical—into each of seven databases. We’ll uncover the good parts and point out the not so good. You’ll get your hands dirty with everything from basic CRUD operations to fine-grained schema design to running distributed systems in far-away datacenters, all in the name of finding answers to these questions:

What type of database is this? Databases come in a variety of genres, such as relational, key-value, columnar, document-oriented, and graph. Popular databases, including those covered in this book, can generally be grouped into one of these broad categories. You’ll learn about each type and the kinds of problems for which it’s best suited. We’ve specifically chosen databases to span these categories, including one relational database (Postgres), a key-value store (Redis), a column-oriented database (HBase), two document-oriented databases (MongoDB, CouchDB), a graph database (Neo4J), and a cloud-based database that’s a difficult-to-classify hybrid (DynamoDB).
What was the driving force? Databases are not created in a vacuum. They are designed to solve problems presented by real use cases. Relational databases arose in a world where query flexibility was more important than flexible schemas. Column-oriented databases, on the other hand, were built to store large amounts of data across several machines, while data relationships took a backseat. We’ll cover use cases for each database, along with related examples.
How do you talk to it? Databases often support a variety of connection options. Whenever a database has an interactive command-line interface, we’ll start with that before moving on to other means. Where programming is needed, we’ve stuck mostly to Ruby and JavaScript, though a few other languages sneak in from time to time—such as PL/pgSQL (Postgres) and Cypher (Neo4J). At a lower level, we’ll discuss protocols such as REST (CouchDB) and Thrift (HBase). In the final chapter, we present a more complex database setup tied together by a Node.js JavaScript implementation.
What makes it unique? Any database will support writing data and reading it back out again. What else it does varies greatly from one database to the next. Some allow querying on arbitrary fields. Some provide indexing for rapid lookup. Some support ad hoc queries, while others require queries to be planned in advance. Is the data schema a rigid framework enforced by the database or merely a set of guidelines to be renegotiated at will? Understanding capabilities and constraints will help you pick the right database for the job.
How does it perform? How does this database function and at what cost? Does it support sharding? How about replication? Does it distribute data evenly using consistent hashing, or does it keep like data together? Is this database tuned for reading, writing, or some other operation? How much control do you have over its tuning, if any?
How does it scale? Scalability is related to performance. Talking about scalability without the context of what you want to scale to is generally fruitless. This book will give you the background you need to ask the right questions to establish that context. While the discussion on how to scale each database will be intentionally light, in these pages you’ll find out whether each database is geared more for horizontal scaling (MongoDB, HBase, DynamoDB), traditional vertical scaling (Postgres, Neo4J, Redis), or something in between.
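
One of the questions above asks whether a database distributes data evenly using consistent hashing. As a rough illustration only, and not a description of how any of the seven databases implements it, here is a minimal Python sketch of a hash ring. The class name HashRing, the node names, the key format, and the choice of 100 virtual nodes per server are all hypothetical.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on the ring using MD5 (stable across runs)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring; names and vnode count are hypothetical."""

    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Place several virtual points per node to spread keys more evenly.
        for i in range(self._vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def node_for(self, key):
        # Walk clockwise to the first ring position past the key's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


keys = [f"user:{i}" for i in range(1000)]
ring = HashRing(["node-a", "node-b", "node-c"])
before = {k: ring.node_for(k) for k in keys}

ring.add("node-d")  # scale out by one server
after = {k: ring.node_for(k) for k in keys}

# Keys whose owner changed when node-d joined the ring.
moved = [k for k in keys if before[k] != after[k]]
```

In this sketch, the only keys that change owner are the ones the new node takes over; the rest stay where they were. A naive hash(key) % N scheme would instead reshuffle most keys whenever N changes, which is the cost that consistent hashing is meant to avoid.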

Our goal is not to guide a novice to mastery of any of these databases. A full treatment of any one of them could (and does) fill entire books. But by the end of this book, you should have a firm grasp of the strengths of each, as well as how they differ.
