This quick guide presents several techniques and strategies for PostgreSQL backup with Bacula Enterprise.
The PostgreSQL plugin is designed to simplify and automate the backup and restore of a PostgreSQL database or cluster, eliminating the need for complex scripts. The plugin also backs up essential information such as configuration, users and tablespaces, and supports both the dump technique and Point In Time Recovery (PITR).
This plugin is available for several Linux 32/64-bit platforms, and officially supports PostgreSQL versions 8.4, 9.0.x, 9.1.x, 9.2.x and higher.
Installation
The PostgreSQL Plugin is exclusive to Bacula Enterprise and can be downloaded from your customer repository. Contact us to obtain it.
Install the plugin package on a machine that already has the Bacula client of the same version installed, for example:
rpm -ivh bacula-enterprise-postgresql-plugin-8.8.5-1.el7.x86_64.rpm
Restart the Bacula client after installing the plugin. The status client command should show that the client loaded the plugin successfully.
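For example, on a systemd-based system the restart and verification might look like this (the service and client names are examples and may differ on your installation):

# Restart the File Daemon so the new plugin is loaded
systemctl restart bacula-fd
# In bconsole, the plugin should then be listed in the client status output:
# * status client=laptop1-fd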
The plugin can be installed on the PostgreSQL host itself or on any other machine with network access to the PostgreSQL service; it also requires the “postgresql-client” package, or equivalent. Tools such as pg_dump and psql must be available, and the plugin uses the postgres user to access the databases without a password.
This configuration (which is the default) can be achieved with the following entry in your pg_hba.conf. The entry should be first in the list.
local all postgres ident
If you are accessing a remote database or cannot disable the password authentication method, you can provide the password through a .pgpass file.
Configuration
The PostgreSQL plugin supports two methods: custom Bacula Dump Streams and PITR. In the first, databases and granular components such as the schema can be selected for restore directly and automatically by Bacula, and specific databases can be filtered from the backup, but differential and incremental backups rely on Bacula’s block-level deduplication. In PITR, the PostgreSQL archive log configuration is required, giving the DBA the option of restoring database transactions at different times of the day, not just at the moment the backup was performed. Differential and incremental backups are also native to the PITR technique.
The backup method is selected through the FileSet plugin options, graphically, as shown in Figure 1.
Figure 1. Editing Postgresql Plugin Option in the Bweb FileSet Configuration.
Or in text form:
FileSet {
  Name = FS_postgresql
  Include {
    Options {
      Signature = MD5
    }
    Plugin = "postgresql: mode=pitr"   # or mode=dump
  }
}
Dumps Stream Method
Without specifying mode=, the PostgreSQL plugin uses Bacula’s special Dumps technique, which provides greater granularity of database elements at restore time, such as roles, PostgreSQL configuration, pg_hba, pg_ident, tablespaces, database creation scripts, schema and data. If this partitioning is not desired, simply use mode=dump.
PostgreSQL dumps are stored in an unstructured way, which can yield little gain from global deduplication. Creating sort routines on the database (see the CLUSTER procedure below) can help produce more identical blocks and a better deduplication ratio.
For this method, the plugin configuration options are available in Table 1.
| Option | Default | Description | Example |
|---|---|---|---|
| dump_opt | -c -b | Additional pg_dump options | dump_opt="-c" |
| user | postgres | Operating system user that runs the PostgreSQL commands | user=heitor |
| service | | pg_service to use | service=main |
| use_sudo | | Use sudo to execute the PostgreSQL commands (when not running as root) | use_sudo |
| compress | 0 | Enables dump compression by PostgreSQL (0-9). 0 is off, which is ideal when using deduplication | compress=5 |
| database | | Copies only the databases that match this string. For more than one string, an additional plugin configuration line can be specified | database=heitor* |
| bin_dir | | PostgreSQL binary location | bin_dir=/opt/pg9.1/bin |
Table 1. Postgresql Bacula Enterprise Plugin Dump Stream Options
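As an illustration, several of these options can be combined on a single plugin line in the FileSet; the database name and binary path below reuse values from this guide and are only illustrative:

Plugin = "postgresql: dump_opt=-c compress=0 database=mybacula bin_dir=/opt/pg9.1/bin"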
Windows PostgreSQL Dump Stream
You can also use Bacula’s PostgreSQL plugin on Linux to back up databases installed on Windows. Any Linux machine with the Bacula Client and Plugin installed, plus a PostgreSQL client of a version similar to the Windows server’s, can be used.
To do this, you must enable remote access in the PostgreSQL server configuration. In the postgresql.conf configuration file, specify the addresses for external connection.
# Replace:
# listen_addresses = 'localhost'
# with:
listen_addresses = '*'
Also, modify the pg_hba.conf file on the server to accept external connections from the Linux Client IP, either trusted or authenticated with a password (md5). Trust is usually easier when setting up Bacula later.
host all all 192.168.0.50/32 trust
# or
host all all 192.168.0.50/32 md5
Restart the PostgreSQL service on Windows to apply the changes (changing listen_addresses requires a full restart; pg_hba.conf changes alone only need a reload).
On the Linux machine, install the Bacula Enterprise Client and Plugin, in addition to the PostgreSQL client (e.g. postgresql-client-9.6) if you have not done so already. Use psql -h windows_server_ip to test the connection.
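For example, using the IP of the remote1 server defined below:

psql -h 192.168.1.163 -U postgres -c "SELECT version();"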
Create a /etc/pg_service.conf file with the connection settings for each remote base:
[remote1]
host=192.168.1.163
port=5432
user=postgres
password=XXXXXX

[remote2]
host=192.168.1.164
port=5432
user=postgres
password=XXXXXX
Register the new Bacula Client with the Director and set up a FileSet (Edit Plugins) as in the example:
postgresql: dump_opt=-a database=mybacula service=remote1
If you chose password authentication (md5) in pg_hba.conf, you will need to create a .pgpass file in the home directory of the user that runs the Bacula Client (usually root) to provide the password and DB access parameters.
# /root/.pgpass syntax
hostname:port:database:username:password
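For instance, an entry matching the remote1 server and the mybacula database used in this example could be (values are illustrative):

192.168.1.163:5432:mybacula:postgres:XXXXXX

Note that libpq ignores the .pgpass file unless its permissions are restrictive, so run chmod 600 /root/.pgpass after creating it.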
Create a new Backup Job using this FileSet, reload the Bacula Director to apply the changes, and run a test Job.
Improving Dumps Deduplication
To improve deduplication of PostgreSQL dump backups, you can configure a data CLUSTER process for all database tables, or at least for the largest ones.
For each database table, configure the CLUSTER according to the primary key or any desired index:
select * from pg_indexes where tablename='table';
CLUSTER table USING table_pkey;
In Bweb, add a Run Before Job script to the PostgreSQL job, in order to run the CLUSTER command for each database:
su - postgres -c "psql -d database1 -c 'cluster verbose'"
su - postgres -c "psql -d database2 -c 'cluster verbose'"
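If there are many databases, a small shell loop can run CLUSTER on all of them. The following is only a sketch, assuming passwordless local access for the postgres user as configured earlier and database names without spaces:

# Run CLUSTER on every non-template database (sketch)
for db in $(su - postgres -c "psql -At -c 'SELECT datname FROM pg_database WHERE NOT datistemplate;'"); do
    su - postgres -c "psql -d $db -c 'CLUSTER VERBOSE'"
done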
Dumps Stream Restore
As shown in Figure 2, restoring a database or an individual component is done directly in the restore file browser of the Bacula interfaces.
Figure 2. Dumps Components Selection for Restoration
To restore a database to a different PostgreSQL instance than the original, you must first restore the schema and then the database creation and data dumps.
Point-in-Time-Recovery Method
To enable Postgresql archiving mode from version 9.x, you must configure the archive_command, wal_level, and archive_mode directives in your Postgresql configuration file (typically postgresql.conf).
# on 9.0 - 9.x
wal_level = archive
archive_mode = on
archive_command = 'test ! -f /mnt/waldir/%f && cp %p /mnt/waldir/%f'
Create the directory mentioned in the archive_command:
mkdir /mnt/waldir
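The directory must be writable by the operating system user that runs the PostgreSQL server, assumed here to be postgres:

chown postgres: /mnt/waldir
chmod 700 /mnt/waldir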
Optionally, you can compress the archived PostgreSQL logs with the following alternative archive_command:
archive_command = 'test ! -f /mnt/waldir/%f.gz && gzip -c %p > /mnt/waldir/%f.gz'
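If the compressed form is used, the recovery configuration (recovery.conf, see the restore procedure below) will later need a matching restore_command that decompresses the segments; a minimal sketch:

restore_command = 'gunzip -c /mnt/waldir/%f.gz > %p'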
NOTE: You can also back up from the master server using the hot_standby wal_level option. Backing up from the slave server is usually not possible because it must remain in permanent recovery mode.
Restart the Postgresql service to apply the changes.
This method requires that the pg_start_backup and pg_stop_backup functions are working. You can test them with the following psql commands:
select pg_start_backup('test');
select pg_stop_backup();
The contents of the /mnt/waldir directory should be removed periodically once your backups are successful, respecting some retention period. A Bacula ClientRunAfterJob script can do this for files older than 14 days:
rm -f $(find /mnt/waldir -type f -mtime +14)
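One way to schedule this cleanup is via the Job's ClientRunAfterJob directive, wrapped in a shell so the command substitution is evaluated; the retention period and path below are the examples from this guide:

ClientRunAfterJob = "/bin/sh -c 'rm -f $(find /mnt/waldir -type f -mtime +14)'"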
The plugin configuration must contain the archive_dir option, which must match the directory where the WAL logs are being written.
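For example, extending the earlier FileSet plugin line with the archive directory used in this guide:

Plugin = "postgresql: mode=pitr archive_dir=/mnt/waldir"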
For the operation of the PITR method, the Bacula Job Accurate option must also be enabled:
Job {
  Name = "Postgresql-PITR"
  Client = laptop1-fd
  FileSet = FS_postgresql
  Accurate = yes
  ...
}
The configuration of the plugin, as well as the connection to the database, can be tested with Bacula’s estimate command:
* estimate listing job=pg-test
For this method the following options are available in Table 2.
| Option | Default | Description | Example |
|---|---|---|---|
| mode=pitr | | Enables the plugin PITR backup | |
| archive_dir | pg_xlog | Should point to where the WAL files are being written by the archive_command | |
| user | postgres | PostgreSQL operating system user | |
| service | | PostgreSQL connection information | service=main |
| pgpass | | Path to the PostgreSQL password file, if necessary | pgpass=/etc/pgpass |
| bin_dir | | PostgreSQL binary location | bin_dir=/opt/pg9.1/bin |
Table 2. PITR Options of the Postgresql Bacula Enterprise Plugin
For restoration:
- Stop the Postgresql service, if it is running.
- If you have room for this, copy the entire cluster data directory and any tablespaces to a temporary location, should you need them later. Note that this precaution will require that you have enough free space on the system to store two copies of the existing database. If you do not have enough space, you need to at least copy the contents of the pg_xlog subdirectory from the cluster’s data directory, as it may contain logs that were not archived before the system became inactive.
- Clean up any existing files and subdirectories in the cluster data directory and under the root directories of all the tablespaces you are using.
- Restore the database files from your backup. If you are using tablespaces, verify that the symbolic links in pg_tblspc/ have been restored correctly. If necessary, check the PrefixLinks restore option.
- Remove all files present in pg_xlog/; these came from the backup and are therefore probably obsolete rather than current. This directory should normally be empty. If you have unarchived WAL segment files that you saved in step 2, copy them into pg_xlog/. (It is better to copy them rather than move them, so you still have unmodified files if a problem occurs and you have to start again.)
- Edit the recovery command file recovery.conf.sample in the cluster data directory and rename it to recovery.conf (a minimal example is sketched after this list). You may also want to temporarily modify pg_hba.conf to prevent ordinary users from connecting until you are sure that the recovery has worked.
- Start the server. The server will enter recovery mode and proceed to read through the archived WAL files it needs. If recovery is interrupted by an external error, the server can simply be restarted and recovery will continue. Upon completion of the recovery process, the server will rename recovery.conf to recovery.done (to prevent accidentally re-entering recovery mode after a later crash), and then begin normal database operations.
su postgres
cd /path/to/your/data/directory
mv recovery.conf.sample recovery.conf
vi recovery.conf
pg_ctl -D $PWD start
- Inspect the contents of the database to ensure that you have recovered to the point you wanted. If not, go back to step 1. If all is well, let users connect again by restoring the normal pg_hba.conf.
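As referenced above, a minimal recovery.conf only needs a restore_command pointing at the archive directory used in this guide (use the gunzip variant shown earlier if the logs were compressed):

restore_command = 'cp /mnt/waldir/%f %p'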
References
- PostgreSQL Backup Using Bacula Enterprise Edition – Bacula Systems. http://baculasystems.com