Appendix 1
Database Overview Tables

This book contains a wealth of information about each of the seven databases we discuss: PostgreSQL, HBase, MongoDB, CouchDB, Neo4j, DynamoDB, and Redis. In the pages that follow, you’ll find tables that tally up these databases along a number of dimensions to present an overview of what’s covered in more detail elsewhere in the book. Although the tables are not a replacement for a true understanding, they should provide you with an at-a-glance sense of what each database is capable of, where it falls short, and how it fits into the modern database landscape.

	Genre	Version	Datatypes	Data Relations
PostgreSQL	Relational	9.1	Predefined and typed	Predefined
HBase	Columnar	1.4.1	Predefined and typed	None
MongoDB	Document	3.6	Typed	None
CouchDB	Document	2.1.1	Typed	None
Neo4j	Graph	3.1.4 Enterprise	Untyped	Ad hoc (Edges)
DynamoDB	Key-value (or “key-value plus,” for reasons explained in chapter 7)	API version 2012-08-10	Typed	Predefined tables (plus support for arbitrary fields)
Redis	Key-value	4.0	Semi-typed	None

	Standard Object	Implementation Language	Interface Protocol	HTTP/REST
PostgreSQL	Table	C	Custom over TCP	No
HBase	Columns	Java	Thrift, HTTP	Yes
MongoDB	JSON	C++	Custom over TCP	Simple
CouchDB	JSON	Erlang	HTTP	Yes
Neo4j	Hash	Java	HTTP	Yes
DynamoDB	Table	Unknown	JSON over HTTP	Yes
Redis	String	C	Simple text over TCP	No

	Ad Hoc Query	Mapreduce	Scalable	Durability
PostgreSQL	SQL	No	Cluster (via add-ons)	ACID
HBase	Weak	Hadoop	Datacenter	Write-ahead logging
MongoDB	Commands, mapreduce	JavaScript	Datacenter	Write-ahead journaling, Safe mode
CouchDB	Temporary views	JavaScript	Datacenter (via BigCouch)	Crash-only
Neo4j	Graph walking, Cypher, search	No (in the distributed sense)	Cluster (via HA)	ACID
DynamoDB	Limited range of SQL-style queries	No	Multi-datacenter	ACID
Redis	Commands	No	Cluster (via master-slave)	Append-only log

	Secondary Indexes	Versioning	Bulk Load	Very Large Files
PostgreSQL	Yes	No	COPY command	BLOBs
HBase	No	Yes	No	No
MongoDB	Yes	No	mongoimport	GridFS
CouchDB	Yes	Yes	Bulk Doc API	Attachments
Neo4j	Yes (via Lucene)	No	No	No
DynamoDB	Yes	Yes	No	No (400 KB item size limit)
Redis	No	No	No	No

	Requires Compaction	Replication	Sharding	Concurrency
PostgreSQL	No	Master-slave	Add-ons (e.g., PL/Proxy)	Table/row writer lock
HBase	No	Master-slave	Yes (via HDFS)	Consistent per row
MongoDB	No	Master-slave (via replica sets)	Yes	Write lock
CouchDB	File rewrite	Master-master	Yes (with filters in BigCouch)	Lock-free MVCC
Neo4j	No	Master-slave (in Enterprise Edition)	No	Write lock
DynamoDB	No	Peer-based, master-master	Yes	Vector clocks
Redis	Snapshot	Master-slave	Add-ons (e.g., client)	None

	Transactions	Triggers	Security	Multitenancy
PostgreSQL	ACID	Yes	Users/groups	Yes
HBase	Yes (when enabled)	No	Kerberos via Hadoop security	No
MongoDB	No	No	Users	Yes
CouchDB	No	Update validation or Changes API	Users	Yes
Neo4j	ACID	Transaction event handlers	None	No
DynamoDB	No	Streams with AWS Lambda	AWS IAM	No
Redis	Multi operation queues	No	Passwords	No

	Main Differentiator	Weaknesses
PostgreSQL	Best of OSS RDBMS model	Distributed availability
HBase	Very large-scale, Hadoop infrastructure	Flexible growth, query-ability
MongoDB	Easily query Big Data	Embed-ability
CouchDB	Durable and embeddable clusters	Query-ability
Neo4j	Flexible graph	BLOBs or terabyte scale
DynamoDB	Highly available	Query-ability
Redis	Very, very fast	Complex data

Table of Contents for Seven Databases in Seven Weeks, 2nd Edition