PostgreSQL has a number of options for sharing data with external
servers or data sources. The first is PostgreSQL's built-in replication, which lets you maintain a
copy of your server ready to run on another PostgreSQL server. The second
option is to use third-party add-ons, many of which are freely available and
time-tested. The third option is to use a foreign data wrapper (FDW). FDWs
give you the flexibility to query from a wide array of external data
sources. Since version 9.3, some FDWs also permit updating: these include postgres_fdw, hadoop_fdw, and ogr_fdw
(see “Querying Other Tabular Formats with ogr_fdw”).
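As a quick taste of the FDW approach, a minimal postgres_fdw setup might look like the following sketch; the server address, database, and table names here are hypothetical:

```sql
-- Install the wrapper (ships with PostgreSQL as a contrib extension)
CREATE EXTENSION postgres_fdw;

-- Register the remote server and map the local user to a remote account
CREATE SERVER remote_pg FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host '192.168.0.20', dbname 'salesdb');
CREATE USER MAPPING FOR CURRENT_USER SERVER remote_pg
    OPTIONS (user 'remote_user', password 'secret');

-- Expose a remote table locally; it can now be queried
-- (and, since 9.3, updated) like any local table
CREATE FOREIGN TABLE remote_orders (
    order_id int,
    total numeric
) SERVER remote_pg OPTIONS (schema_name 'public', table_name 'orders');

SELECT * FROM remote_orders WHERE total > 100;
```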
The reasons for replicating your databases distill down to two: availability and scalability. Availability is assured by providing a redundant server so that, if your main server goes down, you have another that can immediately assume its role. For small databases, you could just make sure you have another physical server ready and restore the database onto it. But for large databases (in the terabytes), the restore itself could take hours, if not days. To avoid downtime, you’ll need to replicate.
The other motivation for replication is scalability. Suppose you set up a database to breed fancy elephant shrews for profit. After a few years of breeding, you now have thousands of elephant shrews. People all over the world come to your site to gawk and purchase. You’re overwhelmed by the traffic, but replication comes to your aid. You arrange a read-only slave server to replicate with your main server. Then you direct the countless gawkers to the slave, and let only serious buyers onto the master server to finalize their purchases.
Before we get too carried away, we should introduce some common lingo in PostgreSQL replication:
The master server is the database server sourcing the data being replicated and where all updates take place. You’re allowed only one master when using the built-in server replication features of PostgreSQL. Plans are in place to support multimaster replication scenarios. Watch for it in future releases. You may also hear the term publisher used to mean the provider of the data. Publisher/subscriber terminology gains more traction in PostgreSQL 10 for built-in logical replication.
A slave server consumes the replicated data and provides a replica of the master. More aesthetically pleasing terms such as subscriber and agent have been bandied about, but slave is still the most apropos. PostgreSQL built-in replication supports only read-only slaves at this time.
The WAL (write-ahead log) is the log that keeps track of all transactions, often referred to as the transaction log in other database products. To stage replication, PostgreSQL simply makes the logs available to the slaves. Once slaves have pulled the logs, they just need to execute the transactions therein.
With synchronous replication, a transaction on the master is not considered complete until at least one synchronous slave listed in synchronous_standby_names updates and reports back. Prior to version 9.6, the transaction was complete as soon as any synchronous slave responded. In version 9.6 and higher, the number of standbys that must respond is configurable using the synchronous_standby_names postgresql.conf configuration variable. Version 10 introduced the keywords FIRST and ANY, which can be added to the synchronous_standby_names configuration variable to dictate which nodes need to report back. FIRST is the default behavior if not specified and matches the behavior of 9.6.
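For example, in version 10 the following hypothetical postgresql.conf setting would require acknowledgment from any two of the three listed standbys before a commit is considered complete:

```
synchronous_standby_names = 'ANY 2 (slave1, slave2, slave3)'
```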
With asynchronous replication, a transaction on the master commits even if no slave updates. This is expedient for distant servers, where you don’t want transactions to wait because of network latency, but the downside is that your dataset on the slave might lag behind. Should the lag be severe, the slave might need to be reinitialized if a transaction it needs to continue has already been removed from the WAL logs.
To minimize the risk of WALs being removed before all slaves have used them, version 9.4 introduced replication slots. A replication slot is a contract between a slave and its master whereby the master will not wipe out any WAL logs that are still needed by any replication slots. The hazard is that if a slave holding a replication slot fails or loses communication for a long time, the master will keep the WALS indefinitely and run out of disk space and shut down.
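A physical replication slot can be created on the master with a built-in function; the slot name below is arbitrary. Given the disk-space hazard just described, it's worth monitoring slots and dropping abandoned ones:

```sql
-- Create a slot on the master; a slave references it via
-- primary_slot_name in its recovery.conf
SELECT * FROM pg_create_physical_replication_slot('slave1_slot');

-- Check which slots exist and whether they are in active use
SELECT slot_name, active FROM pg_replication_slots;

-- Drop a slot once its slave is permanently gone, so WALs can be recycled
SELECT pg_drop_replication_slot('slave1_slot');
```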
Streaming replication does not require direct file access between the master and slaves. Instead, it relies on the PostgreSQL connection protocol to transmit the WALs.
With cascading replication, slaves can receive logs from nearby slaves instead of directly from the master. This allows a slave to behave like a master for replication purposes while remaining read-only. When a slave acts as both a receiver and a sender, it is called a cascading standby.
Logical replication is a new replication option in version 10 that allows the
replication of individual tables instead of requiring the whole
server cluster to be replicated. It relies on a feature called logical decoding, which
extracts changes to a database table from the WAL logs in an
easy-to-understand format without detailed knowledge of the
database’s internal state. Logical decoding has existed since 9.4
and has been used by some extensions for auditing and providing
replication. This feature comes with two new DDL commands, CREATE PUBLICATION and CREATE SUBSCRIPTION, for designating which tables to replicate and which servers and corresponding databases to send the data to.
To use this feature, you must set wal_level to
logical.
Refer to Logical Replication in PostgreSQL 10 for an example of its use.
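As a sketch, assuming a table named leads exists on both servers, the publication and subscription might be set up as follows (the connection details and object names are hypothetical):

```sql
-- On the publishing (source) server, which must have wal_level = logical:
CREATE PUBLICATION pub_leads FOR TABLE leads;

-- On the subscribing (target) server:
CREATE SUBSCRIPTION sub_leads
    CONNECTION 'host=192.168.0.1 port=5432 dbname=mydb user=pgrepuser'
    PUBLICATION pub_leads;
```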
Remastering promotes a slave to be the master. Version 9.3 introduced streaming-only remastering, which eliminates the need to consult a WAL archive: remastering can be done via streaming, and slaves no longer need to be recloned. As of version 9.4, though, a restart is still required. This may change in future releases.
PostgreSQL binary replication replicates only changes that are
transactional. Because any DDL command is transactional, the creation of tables,
views, and installation of extensions can be replicated as well. But
because unlogged table inserts and updates are not transactional, they
cannot be replicated. When installing extensions, make sure
all slaves have the binaries for the extension and the version of the extension
you are installing; otherwise, replication will fail when the CREATE EXTENSION command is executed on
the master.
PostgreSQL’s stock replication relies on WAL shipping. Streaming replication slaves must be running the same major version, OS, and bitness (32-bit/64-bit) as the master. It is also recommended, though not strictly required, that all servers run the same minor version (microversion) as the master. If the slave and master do run different minor versions, it’s preferable for the slave to be running the newer one.
Support for built-in replication improved over the following PostgreSQL releases:
Version 9.4 added replication slots. A replication slot is a contract between a master and a slave that requires the master to hold on to WALs until a slave is done processing them.
Version 9.5 added several functions for monitoring the progress of replication: refer to Replication Progress Tracking in the documentation.
Version 9.6 introduced multiple standby servers in synchronous replication for increased reliability.
Version 10 introduced built-in logical replication, which allows the replication of individual tables. The other benefit of logical replication is that a slave can have databases and tables of its own that are not part of replication and that can be updated on the slave. Version 10 also introduced temporary replication slots, which allow a process to create a replication slot on a one-time basis and have it disappear after the session is over. This is particularly useful for initializing a new copy of the server via pg_basebackup.
Although logical replication is built into PostgreSQL for the first time in version 10, you can use logical replication in PostgreSQL 9.4 through 9.6 via the open source PostgreSQL extension pglogical. If you need to replicate between version 10 and versions 9.4−9.6, you’ll need to have pglogical installed on both the version 10 server and the lower-versioned server. For logical replication between version 10 and future versions of PostgreSQL, you can use the built-in logical replication feature.
As alternatives to PostgreSQL’s built-in replication, common third-party options abound. Slony and Bucardo are two popular open source ones. Although PostgreSQL improves replication with each new release, Slony, Bucardo, and other third-party options still offer more flexibility. Both allow you to replicate individual databases or even tables instead of the entire server, neither requires that all masters and slaves run the same PostgreSQL version and OS, and both support multimaster scenarios. However, both rely on additional triggers, and sometimes additional columns on tables, to initiate the replication, and they often don’t replicate DDL commands for rarer actions such as creating new tables or installing extensions. Thus, they require more manual intervention.
We urge you to consult a comparison matrix of popular third-party options before deciding what to use.
Let’s go over the steps to replicate the whole server cluster. We’ll take advantage of streaming replication. Recall that streaming replication only requires connections at the PostgreSQL database level between the master and slaves.
The steps for setting up the master are:
Create an account dedicated to replication:
CREATE ROLE pgrepuser REPLICATION LOGIN PASSWORD 'woohoo';
Alter the following configuration settings in postgresql.auto.conf. This can be done
using ALTER SYSTEM SET variable = value followed by
SELECT pg_reload_conf(); without the need to touch the
physical config file:
listen_addresses = '*'
wal_level = hot_standby
archive_mode = on
max_wal_senders = 5
wal_keep_segments = 10
If you want to use logical replication to partially replicate only some tables, you’ll need to set wal_level = logical. The logical level logs more than hot_standby does, so it will also work for full server replication.
These settings are described in Server Configuration:
Replication. You may want to set wal_keep_segments higher if
your servers are far apart and your production server has a lot of
transactions. If you are running version 9.6 or above, you should
use replica instead of hot_standby for the
wal_level. hot_standby is still
accepted in 9.6 for backward compatibility, but will be read as
replica.
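As a sketch, on a 9.6 or later server these settings could be applied with ALTER SYSTEM as follows. Note that listen_addresses, wal_level, archive_mode, and max_wal_senders only take effect after the restart performed in a later step:

```sql
ALTER SYSTEM SET listen_addresses = '*';
ALTER SYSTEM SET wal_level = replica;      -- use hot_standby prior to 9.6
ALTER SYSTEM SET archive_mode = on;
ALTER SYSTEM SET max_wal_senders = 5;
ALTER SYSTEM SET wal_keep_segments = 10;

-- Reloadable settings apply immediately; the rest wait for a restart
SELECT pg_reload_conf();
```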
Add the archive_command
configuration directive to postgresql.auto.conf or use ALTER
SYSTEM to indicate where the WALs will be saved. With
streaming, you’re free to choose any directory. More details on this
setting can be found in the PostgreSQL PGStandby
documentation.
On Linux/Unix, your archive_command line should look
something like:
archive_command = 'cp %p ../archive/%f'
You can also use rsync instead of cp
if you want to store the WALs on a different server:
archive_command = 'rsync -av %p postgres@192.168.0.10:archive/%f'
On Windows, your archive_command would use copy instead:
archive_command = 'copy %p ..\\archive\\%f'
Add a rule to pg_hba.conf
allowing the slaves to replicate. As an example, the following
rule will allow a PostgreSQL account named pgrepuser on a server on your private network with an IP address
in the range 192.168.0.1 to 192.168.0.254 to replicate using an md5
password:
host replication pgrepuser 192.168.0.0/24 md5
Restart the PostgreSQL service for the settings to take effect.
Use the pg_basebackup utility, found in the bin
folder of your PostgreSQL installation, to create a cluster backup.
This will create a copy of the data cluster files in the specified
directory.
When using pg_basebackup, use the
--xlog-method=stream switch to also copy over the WAL
logs and the -R switch to automatically create a config
file. The --xlog-method=stream option will spawn
another database connection for copying the WALs.
In version 10 and above, the pg_xlog directory is pg_wal.
In the following example, we are on the slave server and performing a streaming basebackup from our master server (192.168.0.1):
pg_basebackup -D /target_dir -h 192.168.0.1 \
  --port=5432 --checkpoint=fast --xlog-method=stream -R
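In version 10 and above, the xlog terminology was renamed to wal, so the equivalent command would use --wal-method (for which -X remains the shorthand):

```
pg_basebackup -D /target_dir -h 192.168.0.1 \
  --port=5432 --checkpoint=fast --wal-method=stream -R
```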
If you are using pg_basebackup primarily for backup purposes, you
can use the tarred/compressed form, which will create a tar.gz file in the target_dir folder for each tablespace.
-X is shorthand for --xlog-method. The tarred/compressed format
does not support streaming logs, so with that format you have to resort to fetching the
logs:
pg_basebackup -Z9 -D /target_dir/ -h 192.168.0.1 -Ft -Xfetch
For backup, you will want to augment your backup with transaction log shipping using pg_receivexlog for versions prior to 10. In versions 10 and above, pg_receivexlog was renamed to pg_receivewal. You’ll want to keep this running as a cron job or service to continually make log backups.
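For example, on version 10 the log receiver might be run as follows; the host and archive directory here are hypothetical:

```
pg_receivewal -h 192.168.0.1 -p 5432 -U pgrepuser -D /archive_dir
```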
This part is not needed for logical replication. To minimize headaches, slaves should have the same configuration as the master, especially if you’ll be using them for failover. They must also have the binaries for the same set of PostgreSQL extensions installed; otherwise, when CREATE EXTENSION is played back, it will fail and halt the restore. In order for the server to be a slave, it must be able to play back the WAL transactions of the master. The steps for creating a slave are as follows:
Create a new instance of PostgreSQL with the same version (preferably even microversions) as your master server. For PostgreSQL, keeping servers identical for microversions is not a requirement, and you’re welcome to experiment and see how far you can deviate.
Shut down PostgreSQL on the new slave.
Overwrite the data folder files with those you generated with pg_basebackup.
Add the following configuration setting to the postgresql.auto.conf file:
hot_standby = on
max_connections = 20  # set equal to or higher than the master's
You don’t need to run the slaves on the same port as the
master, so you can optionally change the port either via postgresql.auto.conf, postgresql.conf, or via some other
OS-specific startup script that sets the PGPORT
environment variable before startup.
Create a new file in the data folder called recovery.conf with the following contents, but substitute the actual hostname, IP address, and port of your master on the second line. This file is automatically created if you used pg_basebackup. You will have to add the trigger_file line though.
The application_name is optional but useful if you want to track the replica in postgresql system views:
standby_mode = 'on'
primary_conninfo = 'host=192.168.0.1 port=5432 user=pgrepuser