DBSnapper v2.0 Featuring Database Subsetting has been released! Learn More

How to Create a PostgreSQL Database Snapshot

Joe Scharf

Joe Scharf

Updated

5 min read

Polaroid camera taking a picture of the postgresql logo and the resulting slide.

A point-in-time database backup, also known as a database snapshot, is a common database development practice for backing up and transferring databases. PostgreSQL includes a robust set of utilities for managing this task. The PostgreSQL snapshot and restore utilities come with a plethora of command-line flags, giving you a great deal of control over what data is included and how the backup is performed. For most use cases, there is a standard set of options that can be specified to quickly generate a flexible snapshot.

This article is intended to present these options as well as a summary of important things to know when creating a PostgreSQL database snapshot.

2024-01-05 Update: See our complimentary article: How to Restore a PostgreSQL database snapshot with pg_restore.

The pg_dump Utility

The command-line utility used to create the PostgreSQL database snapshot is named pg_dump. Some important things to know about pg_dump:

  • pg_dump makes consistent database snapshots, even if the database is being used concurrently. It does not block other users from reading or writing to the database.
  • pg_dump executes SELECT statements to create the database snapshot. At a minimum, you'll need SELECT privileges on the database you wish to snapshot.
  • Generating a snapshot in plain-text format can be restored using the psql utility. Snapshots created in archive format are restored with pg_restore.

For those readers that want to get right to the answer, here is the pg_dump command with recommended pg_dump command flags.

pg_dump -d "<url>" -j 8 -Fd -f <snapshot_directory_name>

Description of pg_dump flags:

-d "<url>" Specifies the source database in URL format. This is a convenient way to specify all database connection information in a single parameter. Example:

pg_dump -d "postgres://user:pass@host:port/database?sslmode=[require | disable] ..."

-j 8 Indicates that we want to use parallelization to speed up the generation of the database snapshot. In this case, we are taking snapshots of eight (8) tables at the same time. This necessitates opening 8 + 1 database connections, so adjust the the server's max_connections setting accordingly.

-Fd Selects directory format as the snapshot operation's output. The database snapshot will be saved in this directory, with one file for each table included. This option supports parallel snapshots (as specified by the -j option) and adds flexibility and selectivity when restoring a snapshot using the pg_restore utility.

-f <snapshot_directory_name> Specifies the name of the target directory where the snapshot output will be saved. The pg_dump utility will create this directory and it must not already exist.

Compatibility Guide

The compatibility matrix below will help you determine whether different versions of pg_dump, the database snapshot, and the PostgreSQL server will work together. The compatibility matrix answers the following questions:

  1. Will a version of pg_dump allow me to create a snapshot from a given version of PostgreSQL server?
  2. Will I be able to load a database snapshot created by a given version of pg_dump onto a given version of PostgreSQL server?
...is older than PostgreSQL Server...is newer than PostgreSQL Server
snapshot version✅ OK❌ NO
pg_dump version⚠️ OK within 1 major version of server✅ OK for server versions >= 9.2

Output Mode and Parallelization

The pg_dump utility supports a few different output formats for the database snapshot specified by the -F format command-line flag. The default option -Fp generates a plain-text SQL script file that can be used to rebuild the database. While the plain-text output format can potentially be used on other SQL database products (with some modifications), its speed and capabilities are limited.

Archive file formats provide more robust database snapshot and restore capabilities. The directory (-Fd) and custom (-Fc) archive file formats are compressed by default and offer selective restores of the data contained in the snapshot. The directory format also supports creating and restoring snapshots in parallel (specified by the -j command-line flag), which is why it is included in the recommended command above. Several command-line flags supported by pg_dump are ignored when using archive formats but can be specified when restoring the snapshot with pg_restore.

Selecting What to Snapshot

There are several command-line flags that allow you to specify what is included in the database snapshot. You can include -n or exclude -N schemas and use -t pattern to specify which tables are included in the snapshot. However it is generally recommended to take a full snapshot of the database using the directory archive format and use the pg_restore utility to selectively restore a subset of the snapshot if desired. This way, if a problem arises, you will always have a complete point-in-time backup of your original database.

DBSnapper simplifies database snapshot management

The DBSnapper app integrates with the PostgreSQL pg_dump and pg_restore commands and simplifies the process of creating, sanitizing, and restoring database snapshots. When combined with the DBSnapper Cloud, management of your snapshots is much easier with private cloud storage profiles and team sharing options. Install the DBSnapper Agent to get started, and Sign up for the DBSnapper Cloud today!

If you have any questions or feedback on this article, please get in touch on our contact page.


References