How to Create a PostgreSQL Database Snapshot
Joe Scharf
Updated
5 min read
A point-in-time database backup, also known as a database snapshot, is a common database development practice for backing up and transferring databases. PostgreSQL includes a robust set of utilities for managing this task. The PostgreSQL snapshot and restore utilities come with a plethora of command-line flags, giving you a great deal of control over what data is included and how the backup is performed. For most use cases, there is a standard set of options that can be specified to quickly generate a flexible snapshot.
This article is intended to present these options as well as a summary of important things to know when creating a PostgreSQL database snapshot.
2024-01-05 Update: See our complimentary article: How to Restore a PostgreSQL database snapshot with
pg_restore
.
The pg_dump
Utility
The command-line utility used to create the PostgreSQL database snapshot is named pg_dump
. Some important things to know about pg_dump
:
pg_dump
makes consistent database snapshots, even if the database is being used concurrently. It does not block other users from reading or writing to the database.pg_dump
executesSELECT
statements to create the database snapshot. At a minimum, you'll needSELECT
privileges on the database you wish to snapshot.- Generating a snapshot in plain-text format can be restored using the
psql
utility. Snapshots created in archive format are restored withpg_restore
.
TL;DR - The pg_dump command with recommended flags
For those readers that want to get right to the answer, here is the pg_dump
command with recommended pg_dump
command flags.
pg_dump -d "<url>" -j 8 -Fd -f <snapshot_directory_name>
Description of pg_dump
flags:
-d "<url>"
Specifies the source database in URL format. This is a convenient way to specify all database connection information in a single parameter. Example:
pg_dump -d "postgres://user:pass@host:port/database?sslmode=[require | disable] ..."
-j 8
Indicates that we want to use parallelization to speed up the generation of the database snapshot. In this case, we are taking snapshots of eight (8) tables at the same time. This necessitates opening 8 + 1 database connections, so adjust the the server's max_connections
setting accordingly.
-Fd
Selects directory format as the snapshot operation's output. The database snapshot will be saved in this directory, with one file for each table included. This option supports parallel snapshots (as specified by the -j
option) and adds flexibility and selectivity when restoring a snapshot using the pg_restore
utility.
-f <snapshot_directory_name>
Specifies the name of the target directory where the snapshot output will be saved. The pg_dump
utility will create this directory and it must not already exist.
Compatibility Guide
The compatibility matrix below will help you determine whether different versions of pg_dump
, the database snapshot, and the PostgreSQL server will work together. The compatibility matrix answers the following questions:
- Will a version of
pg_dump
allow me to create a snapshot from a given version of PostgreSQL server? - Will I be able to load a database snapshot created by a given version of
pg_dump
onto a given version of PostgreSQL server?
...is older than PostgreSQL Server | ...is newer than PostgreSQL Server | |
---|---|---|
snapshot version | ✅ OK | ❌ NO |
pg_dump version | ⚠️ OK within 1 major version of server | ✅ OK for server versions >= 9.2 |
Output Mode and Parallelization
The pg_dump
utility supports a few different output formats for the database snapshot specified by the -F format
command-line flag. The default option -Fp
generates a plain-text SQL script file that can be used to rebuild the database. While the plain-text output format can potentially be used on other SQL database products (with some modifications), its speed and capabilities are limited.
Archive file formats provide more robust database snapshot and restore capabilities. The directory (-Fd
) and custom (-Fc
) archive file formats are compressed by default and offer selective restores of the data contained in the snapshot. The directory format also supports creating and restoring snapshots in parallel (specified by the -j command-line flag), which is why it is included in the recommended command above. Several command-line flags supported by pg_dump
are ignored when using archive formats but can be specified when restoring the snapshot with pg_restore
.
Selecting What to Snapshot
There are several command-line flags that allow you to specify what is included in the database snapshot. You can include -n
or exclude -N
schemas and use -t pattern
to specify which tables are included in the snapshot. However it is generally recommended to take a full snapshot of the database using the directory archive format and use the pg_restore
utility to selectively restore a subset of the snapshot if desired. This way, if a problem arises, you will always have a complete point-in-time backup of your original database.
DBSnapper simplifies database snapshot management
The DBSnapper app integrates with the PostgreSQL pg_dump
and pg_restore
commands and simplifies the process of creating, sanitizing, and restoring database snapshots. When combined with the DBSnapper Cloud, management of your snapshots is much easier with private cloud storage profiles and team sharing options. Install the DBSnapper Agent to get started, and Sign up for the DBSnapper Cloud today!
If you have any questions or feedback on this article, please get in touch on our contact page.