BACKUP

On this page

Note:

BACKUP is an enterprise-only feature. For non-enterprise backups, see cockroach dump.

CockroachDB's BACKUP statement allows you to create full or incremental backups of your cluster's schema and data that are consistent as of a given timestamp.

New in v20.1: You can backup a full cluster, which includes:

All user tables
Relevant system tables
All databases
All tables (which automatically includes their indexes)
All views

You can also back up:

An individual database, which includes all of its tables and views
An individual table, which includes its indexes and views

Because CockroachDB is designed with high fault tolerance, these backups are designed primarily for disaster recovery (i.e., if your cluster loses a majority of its nodes) through RESTORE. Isolated issues (such as small-scale node outages) do not require any intervention.

Note:

BACKUP only offers table-level granularity; it does not support backing up subsets of a table.

Tip:

To view the contents of an enterprise backup created with the BACKUP statement, use SHOW BACKUP.

Required privileges

Only members of the admin role can run BACKUP. By default, the root user belongs to the admin role.
BACKUP requires full read and write (including delete and overwrite) permissions to its target destination.

Synopsis

Parameters

Parameter	Description
`table_pattern`	The table(s) or view(s) you want to back up.
`database_name`	The name of the database(s) you want to back up (i.e., create backups of all tables and views in the database).
`destination`	The URL where you want to store the backup. For information about this URL structure, see Backup File URLs.
`timestamp`	Back up data as it existed as of `timestamp`. The `timestamp` must be more recent than your cluster's last garbage collection (which defaults to occur every 25 hours, but is configurable per table).
`full_backup_location`	Create an incremental backup using the backup stored at the URL `full_backup_location` as its base. For information about this URL structure, see Backup File URLs. Note: After a full backup for an explicit list of tables and/or databases, it is not possible to create an incremental backup if one or more tables were created, dropped, or truncated. In these cases, you must create a new full backup before more incremental backups can be created. To avoid this, backup the cluster instead of the explicit list of tables/databases.
`incremental_backup_location`	Create an incremental backup that includes all backups listed at the provided URLs. Lists of incremental backups must be sorted from oldest to newest. The newest incremental backup's timestamp must be within the table's garbage collection period. For information about this URL structure, see Backup File URLs. For more information about garbage collection, see Configure Replication Zones.
`kv_option_list`	Control the backup behavior with a comma-separated list of these options.

Note:

The BACKUP statement is a blocking statement and cannot be used within a transaction.

Options

Option	Value	Description
`revision_history`	N/A	Create a backup with full revision history, which records every change made to the cluster within the garbage collection period leading up to and including the given timestamp.
`encryption_passphrase`	`STRING`	New in v20.1: The passphrase used to encrypt the files (`BACKUP` manifest and data files) that the `BACKUP` statement generates. This same passphrase is needed to decrypt the file when it is used to restore and to list the contents of the backup when using `SHOW BACKUP`. There is no practical limit on the length of the passphrase.

For more information about these options, see Back up and Restore Data - Advanced Options.

Backup file URLs

CockroachDB uses the URL provided to construct a secure API call to the service you specify. The path to each backup must be unique, and the URL for your backup's destination/locations must use the following format:

[scheme]://[host]/[path]?[parameters]

Location	Scheme	Host	Parameters
Amazon	`s3`	Bucket name	`AUTH` ¹ (optional; can be `implicit` or `specified`), `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`
Azure	`azure`	N/A (see Example file URLs	`AZURE_ACCOUNT_KEY`, `AZURE_ACCOUNT_NAME`
Google Cloud ²	`gs`	Bucket name	`AUTH` (optional; can be `default`, `implicit`, or `specified`), `CREDENTIALS`
HTTP ³	`http`	Remote host	N/A
NFS/Local ⁴	`nodelocal`	`nodeID` or `self` ⁵ (see Example file URLs)	N/A
S3-compatible services ⁶	`s3`	Bucket name	`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`, `AWS_REGION` ⁷ (optional), `AWS_ENDPOINT`

Note:

The location parameters often contain special characters that need to be URI-encoded. Use Javascript's encodeURIComponent function or Go language's url.QueryEscape function to URI-encode the parameters. Other languages provide similar functions to URI-encode special characters.

Note:

If your environment requires an HTTP or HTTPS proxy server for outgoing connections, you can set the standard HTTP_PROXY and HTTPS_PROXY environment variables when starting CockroachDB.

New in v20.1: If you cannot run a full proxy, you can disable external HTTP(S) access (as well as custom HTTP(S) endpoints) when performing bulk operations (e.g., BACKUP, RESTORE, etc.) by using the --external-io-disable-http flag. You can also disable the use of implicit credentials when accessing external cloud storage services for various bulk operations by using the --external-io-disable-implicit-credentials flag.

¹ If the AUTH parameter is not provided, AWS connections default to specified and the access keys must be provided in the URI parameters. If the AUTH parameter is implicit, the access keys can be omitted and the credentials will be loaded from the environment.
² If the AUTH parameter is not specified, the cloudstorage.gs.default.key cluster setting will be used if it is non-empty, otherwise the implicit behavior is used. If the AUTH parameter is implicit, all GCS connections use Google's default authentication strategy. If the AUTH parameter is default, the cloudstorage.gs.default.key cluster setting must be set to the contents of a service account file which will be used during authentication. If the AUTH parameter is specified, GCS connections are authenticated on a per-statement basis, which allows the JSON key object to be sent in the CREDENTIALS parameter. The JSON key object should be base64-encoded (using the standard encoding in RFC 4648).
³ You can create your own HTTP server with Caddy or nginx. A custom root CA can be appended to the system's default CAs by setting the cloudstorage.http.custom_ca cluster setting, which will be used when verifying certificates from HTTPS URLs.
⁴ The file system backup location on the NFS drive is relative to the path specified by the --external-io-dir flag set while starting the node. If the flag is set to disabled, then imports from local directories and NFS drives are disabled.
⁵ New in v20.1: Using a nodeID is required and the data files will be in the extern directory of the specified node. In most cases (including single-node clusters), using nodelocal://1/<path> is sufficient. Use self if you do not want to specify a nodeID, and the individual data files will be in the extern directories of arbitrary nodes; however, to work correctly, each node must have the --external-io-dir flag point to the same NFS mount or other network-backed, shared storage.
⁶ A custom root CA can be appended to the system's default CAs by setting the cloudstorage.http.custom_ca cluster setting, which will be used when verifying certificates from an S3-compatible service.
⁷ The AWS_REGION parameter is optional since it is not a required parameter for most S3-compatible services. Specify the parameter only if your S3-compatible service requires it.

Example file URLs

Location	Example
Amazon S3	`s3://acme-co/employees?AWS_ACCESS_KEY_ID=123&AWS_SECRET_ACCESS_KEY=456`
Azure	`azure://employees?AZURE_ACCOUNT_KEY=123&AZURE_ACCOUNT_NAME=acme-co`
Google Cloud	`gs://acme-co`
HTTP	`http://localhost:8080/employees`
NFS/Local	`nodelocal://1/path/employees`, `nodelocal://self/nfsmount/backups/employees` ⁵

Functional details

Object dependencies

Dependent objects must be backed up at the same time as the objects they depend on.

Object	Depends On
Table with foreign key constraints	The table it `REFERENCES`; however, this dependency can be removed during the restore.
Table with a sequence	The sequence it uses; however, this dependency can be removed during the restore.
Views	The tables used in the view's `SELECT` statement.
Interleaved tables	The parent table in the interleaved hierarchy.

Users and privileges

The system.users table stores your users and their passwords. To restore your users and privilege grants, do a cluster backup and restore the cluster to a fresh cluster with no user data. You can also backup the system.users table, and then use this procedure.

Backup types

CockroachDB offers two types of backups: full and incremental.

Full backups

Full backups contain an unreplicated copy of your data and can always be used to restore your cluster. These files are roughly the size of your data and require greater resources to produce than incremental backups. You can take full backups as of a given timestamp and (optionally) include the available revision history.

Incremental backups

Incremental backups are smaller and faster to produce than full backups because they contain only the data that has changed since a base set of backups you specify (which must include one full backup, and can include many incremental backups). You can take incremental backups either as of a given timestamp or with full revision history.

Warning:

Incremental backups can only be created within the garbage collection period of the base backup's most recent timestamp. This is because incremental backups are created by finding which data has been created or modified since the most recent timestamp in the base backup––that timestamp data, though, is deleted by the garbage collection process.

You can configure garbage collection periods using the ttlseconds replication zone setting.

For an example of an incremental backup, see the Create incremental backups example below.

Performance

The BACKUP process minimizes its impact to the cluster's performance by distributing work to all nodes. Each node backs up only a specific subset of the data it stores (those for which it serves writes; more details about this architectural concept forthcoming), with no two nodes backing up the same data.

For best performance, we also recommend always starting backups with a specific timestamp at least 10 seconds in the past. For example:

> BACKUP...AS OF SYSTEM TIME '-10s';

This improves performance by decreasing the likelihood that the BACKUP will be retried because it contends with other statements/transactions. However, because AS OF SYSTEM TIME returns historical data, your reads might be stale.

Viewing and controlling backups jobs

After CockroachDB successfully initiates a backup, it registers the backup as a job, and you can do the following:

Action	SQL Statement
View the backup status	`SHOW JOBS`
Pause the backup	`PAUSE JOB`
Resume the backup	`RESUME JOB`
Cancel the backup	`CANCEL JOB`

You can also visit the Jobs page of the Admin UI to view job details. The BACKUP statement will return when the backup is finished or if it encounters an error.

Note:

The presence of a BACKUP-CHECKPOINT file in the backup destination usually means the backup is not complete. This file is created when a backup is initiated, and is replaced with a BACKUP file once the backup is finished.

Examples

Per our guidance in the Performance section, we recommend starting backups from a time at least 10 seconds in the past using AS OF SYSTEM TIME. Each example below follows this guidance.

Backup a cluster

New in v20.1: To backup a full cluster:

> BACKUP TO \
'gs://acme-co-backup/test-cluster' \
AS OF SYSTEM TIME '-10s';

Backup a database

To backup a single database:

> BACKUP DATABASE bank \
TO 'gs://acme-co-backup/database-bank-2017-03-27-weekly' \
AS OF SYSTEM TIME '-10s';

To backup multiple databases:

> BACKUP DATABASE bank, employees \
TO 'gs://acme-co-backup/database-bank-2017-03-27-weekly' \
AS OF SYSTEM TIME '-10s';

Backup a table or view

To backup a single table or view:

> BACKUP bank.customers \
TO 'gs://acme-co-backup/bank-customers-2017-03-27-weekly' \
AS OF SYSTEM TIME '-10s';

To backup multiple tables:

> BACKUP bank.customers, bank.accounts \
TO 'gs://acme-co-backup/database-bank-2017-03-27-weekly' \
AS OF SYSTEM TIME '-10s';

Create incremental backups

New in v20.1: If you backup to a destination already containing a full backup, an incremental backup will be produced in a subdirectory with a date-based name (e.g., destination/day/time_1, destination/day/time_2):

> BACKUP TO \
'gs://acme-co-backup/test-cluster' \
AS OF SYSTEM TIME '-10s';

Note:

This incremental backup syntax does not work for backups using HTTP storage; you must explicitly control where your incremental backups go by using the INCREMENTAL FROM syntax.

Advanced examples

For examples of advanced BACKUP and RESTORE use cases, see Back up and Restore Data - Advanced Options. Advanced examples include:

Cockroach
University

Docs Hub

BACKUP

Required privileges

Synopsis

Parameters

Options

Backup file URLs

Example file URLs

Functional details

Object dependencies

Users and privileges

Backup types

Full backups

Incremental backups

Performance

Viewing and controlling backups jobs

Examples

Backup a cluster

Backup a database

Backup a table or view

Create incremental backups

Advanced examples

See also

Cockroach University

Docs Hub

Cockroach University

Docs Hub

BACKUP

Required privileges

Synopsis

Parameters

Options

Backup file URLs

Example file URLs

Functional details

Object dependencies

Users and privileges

Backup types

Full backups

Incremental backups

Performance

Viewing and controlling backups jobs

Examples

Backup a cluster

Backup a database

Backup a table or view

Create incremental backups

Advanced examples

See also

Cockroach
University

Cockroach
University