os: centos 7.4
postgresql: 9.6.9
pg_rman: REL9_6_STABLE
pg_rman 是一款优秀的postgresql 在线备份和恢复的工具,在github上可以找到该软件。
下面是pg_rman主页面的描述:
pg_rman is an online backup and restore tool for PostgreSQL.
The goal of the pg_rman project is to provide a method for online backup and PITR that is as easy as pg_dump. Also, it maintains a backup catalog per database cluster. Users can maintain old backups including archive logs with one command.
下载
https://github.com/ossc-db/pg_rman
由于本次的postgresql 版本为 9.6.9,所以需要下载相应的pg_rman REL9_6_STABLE : branch for PostgreSQL 9.6
# su - postgres
$ cd /tmp
$ git clone https://github.com/ossc-db/pg_rman.git
$ cd pg_rman/
$ git branch -a
* master
remotes/origin/HEAD -> origin/master
remotes/origin/REL9_2_STABLE
remotes/origin/REL9_3_STABLE
remotes/origin/REL9_4_STABLE
remotes/origin/REL9_5_STABLE
remotes/origin/REL9_6_STABLE
remotes/origin/REL_10_STABLE
remotes/origin/master
remotes/origin/pre-9.2
$ git checkout REL9_6_STABLE
Already on 'REL9_6_STABLE'
$ git status
$ On branch REL9_6_STABLE
nothing to commit, working directory clean
安装
前提是postgresl 9.6 已经安装好了。
# su - postgres
$ make
$ make installcheck
$ exit
# cd /tmp/pg_rman
# make install
/usr/bin/mkdir -p '/usr/pgsql-9.6/bin'
/usr/bin/install -c pg_rman '/usr/pgsql-9.6/bin'
# ls -l /usr/pgsql-9.6/bin | grep -i rman
-rwxr-xr-x. 1 root root 633680 Jun 11 23:08 pg_rman
pg_rman --help
$ which pg_rman
/usr/pgsql-9.6/bin/pg_rman
$ pg_rman --help
pg_rman manage backup/recovery of PostgreSQL database.
Usage:
pg_rman OPTION init
pg_rman OPTION backup
pg_rman OPTION restore
pg_rman OPTION show [DATE]
pg_rman OPTION show detail [DATE]
pg_rman OPTION validate [DATE]
pg_rman OPTION delete DATE
pg_rman OPTION purge
Common Options:
-D, --pgdata=PATH location of the database storage area
-A, --arclog-path=PATH location of archive WAL storage area
-S, --srvlog-path=PATH location of server log storage area
-B, --backup-path=PATH location of the backup storage area
-c, --check show what would have been done
-v, --verbose show what detail messages
-P, --progress show progress of processed files
Backup options:
-b, --backup-mode=MODE full, incremental, or archive
-s, --with-serverlog also backup server log files
-Z, --compress-data compress data backup with zlib
-C, --smooth-checkpoint do smooth checkpoint before backup
-F, --full-backup-on-error switch to full backup mode
if pg_rman cannot find validate full backup
on current timeline
NOTE: this option is only used in --backup-mode=incremental or archive.
--keep-data-generations=NUM keep NUM generations of full data backup
--keep-data-days=NUM keep enough data backup to recover to N days ago
--keep-arclog-files=NUM keep NUM of archived WAL
--keep-arclog-days=DAY keep archived WAL modified in DAY days
--keep-srvlog-files=NUM keep NUM of serverlogs
--keep-srvlog-days=DAY keep serverlog modified in DAY days
--standby-host=HOSTNAME standby host when taking backup from standby
--standby-port=PORT standby port when taking backup from standby
Restore options:
--recovery-target-time time stamp up to which recovery will proceed
--recovery-target-xid transaction ID up to which recovery will proceed
--recovery-target-inclusive whether we stop just after the recovery target
--recovery-target-timeline recovering into a particular timeline
--hard-copy copying archivelog not symbolic link
Catalog options:
-a, --show-all show deleted backup too
Delete options:
-f, --force forcibly delete backup older than given DATE
Connection options:
-d, --dbname=DBNAME database to connect
-h, --host=HOSTNAME database server host or socket directory
-p, --port=PORT database server port
-U, --username=USERNAME user name to connect as
-w, --no-password never prompt for password
-W, --password force password prompt
Generic options:
-q, --quiet don't show any INFO or DEBUG messages
--debug show DEBUG messages
--help show this help, then exit
--version output version information, then exit
Read the website for details. <http://github.com/ossc-db/pg_rman>
Report bugs to <http://github.com/ossc-db/pg_rman/issues>.
pg_rman init
pg_rman 需要一个备份目录
# mkdir -p /mnt/walbackup
# mkdir -p /mnt/pg_rman_backupset
# chown -R postgres:postgres /mnt
$ ls -l
total 0
drwxr-xr-x. 2 postgres postgres 6 Jun 11 23:24 pg_rman_backupset
drwxr-xr-x. 2 postgres postgres 6 Jun 11 23:05 walbackup
$ vi ~/.bash_profile
export BACKUP_PATH=/mnt/pg_rman_backupset
pg_rman init 初始化
$ pg_rman init
INFO: ARCLOG_PATH is set to '/mnt/walbackup'
INFO: SRVLOG_PATH is set to '/var/lib/pgsql/9.6/data/pg_log'
pg_rman backup
pg_rman 全量备份
$ pg_rman backup --backup-mode=full --with-serverlog --progress
INFO: copying database files
Processed 1166 of 1166 files, skipped 0
INFO: copying archived WAL files
Processed 3 of 3 files, skipped 0
INFO: copying server log files
Processed 1 of 1 files, skipped 0
INFO: backup complete
INFO: Please execute 'pg_rman validate' to verify the files are correctly copied.
pg_rman 校验备份集
pg_rman 的备份必须都是经过验证过的,否则不能进行恢复和增量备份
$ pg_rman validate
INFO: validate: "2018-06-11 23:30:47" backup, archive log files and server log files by CRC
INFO: backup "2018-06-11 23:30:47" is valid
pg_rman 列出备份集
$ pg_rman show
=====================================================================
StartTime EndTime Mode Size TLI Status
=====================================================================
2018-06-11 23:30:47 2018-06-11 23:30:49 FULL 58MB 1 OK
$ ls -l /mnt/pg_rman_backupset/
total 8
drwx------. 3 postgres postgres 20 Jun 11 23:30 20180611
drwx------. 4 postgres postgres 35 Jun 11 23:27 backup
-rw-r--r--. 1 postgres postgres 75 Jun 11 23:27 pg_rman.ini
-rw-r--r--. 1 postgres postgres 40 Jun 11 23:27 system_identifier
drwx------. 2 postgres postgres 6 Jun 11 23:27 timeline_history
pg_rman 增量备份
增量备份是基于文件系统的update time时间线
增量备份必须有个对应的全库备份
$ pg_rman backup --backup-mode incremental --progress --compress-data
INFO: copying database files
Processed 1435 of 1435 files, skipped 1135
INFO: copying archived WAL files
Processed 6 of 6 files, skipped 3
INFO: backup complete
INFO: Please execute 'pg_rman validate' to verify the files are correctly copied.
$ pg_rman validate
INFO: validate: "2018-06-11 23:40:57" backup and archive log files by CRC
INFO: backup "2018-06-11 23:40:57" is valid
$ pg_rman show
=====================================================================
StartTime EndTime Mode Size TLI Status
=====================================================================
2018-06-11 23:40:57 2018-06-11 23:40:59 INCR 907kB 1 OK
2018-06-11 23:30:47 2018-06-11 23:30:49 FULL 58MB 1 OK
pg_rman 删除备份集
若果提示不能删除,请执行查看输出信息。如果实在手贱,可以指定 -f 参数。
$ pg_rman delete '2018-06-11 23:30:47'
WARNING: cannot delete backup with start time "2018-06-11 23:30:47"
DETAIL: This is the latest full backup necessary for successful recovery.
$ pg_rman delete -f '2018-06-11 23:30:47'
INFO: delete the backup with start time: "2018-06-11 23:30:47"
$ pg_rman show
=====================================================================
StartTime EndTime Mode Size TLI Status
=====================================================================
2018-06-11 23:40:57 2018-06-11 23:40:59 INCR 907kB 1 OK
pg_rman restore
以下操作为模拟目录误删除。
# systemctl stop postgresql-9.6.service
# ps -ef|grep -i postmaster |grep -v grep
# su - postgres
$ cd $PGDATA/..
$ ls -l
total 8
drwx------. 2 postgres postgres 6 May 10 03:36 backups
drwx------. 20 postgres postgres 4096 Jun 12 02:51 data
-rw-------. 1 postgres postgres 878 Jun 11 21:47 initdb.log
$ mv ./data ./data.bak
$ ls -l
total 8
drwx------. 2 postgres postgres 6 May 10 03:36 backups
drwx------. 20 postgres postgres 4096 Jun 12 02:51 data.bak
-rw-------. 1 postgres postgres 878 Jun 11 21:47 initdb.log
$ mkdir data
$ chmod 700 ./data
$ ls -l
使用 pg_rman restore 还原
$ pg_rman restore
WARNING: pg_controldata file "/var/lib/pgsql/9.6/data/global/pg_control" does not exist
INFO: the recovery target timeline ID is not given
INFO: use timeline ID of latest full backup as recovery target: 1
INFO: calculating timeline branches to be used to recovery target point
INFO: searching latest full backup which can be used as restore start point
INFO: found the full backup can be used as base in recovery: "2018-06-12 02:36:40"
INFO: copying online WAL files and server log files
INFO: clearing restore destination
INFO: validate: "2018-06-12 02:36:40" backup, archive log files and server log files by SIZE
INFO: backup "2018-06-12 02:36:40" is valid
INFO: restoring database files from the full mode backup "2018-06-12 02:36:40"
INFO: searching incremental backup to be restored
INFO: searching backup which contained archived WAL files to be restored
INFO: backup "2018-06-12 02:36:40" is valid
INFO: restoring WAL files from backup "2018-06-12 02:36:40"
INFO: restoring online WAL files and server log files
INFO: generating recovery.conf
INFO: restore complete
HINT: Recovery will start automatically when the PostgreSQL server is started.
$ cd $PGDATA/
$ ls -l
total 60
-rw-r--r--. 1 postgres postgres 213 Jun 12 02:57 backup_label
drwx------. 7 postgres postgres 67 Jun 12 02:57 base
drwx------. 2 postgres postgres 4096 Jun 12 02:57 global
drwx------. 2 postgres postgres 18 Jun 12 02:57 pg_clog
drwx------. 2 postgres postgres 6 Jun 12 02:57 pg_commit_ts
drwx------. 2 postgres postgres 6 Jun 12 02:57 pg_dynshmem
-rw-------. 1 postgres postgres 4224 Jun 12 02:57 pg_hba.conf
-rw-------. 1 postgres postgres 1636 Jun 12 02:57 pg_ident.conf
drwx------. 2 postgres postgres 6 Jun 12 02:57 pg_log
drwx------. 4 postgres postgres 39 Jun 12 02:57 pg_logical
drwx------. 4 postgres postgres 36 Jun 12 02:57 pg_multixact
drwx------. 2 postgres postgres 18 Jun 12 02:57 pg_notify
drwx------. 2 postgres postgres 6 Jun 12 02:57 pg_replslot
drwx------. 2 postgres postgres 6 Jun 12 02:57 pg_serial
drwx------. 2 postgres postgres 6 Jun 12 02:57 pg_snapshots
drwx------. 2 postgres postgres 6 Jun 12 02:57 pg_stat
drwx------. 2 postgres postgres 6 Jun 12 02:57 pg_stat_tmp
drwx------. 2 postgres postgres 18 Jun 12 02:57 pg_subtrans
drwx------. 2 postgres postgres 6 Jun 12 02:57 pg_tblspc
drwx------. 2 postgres postgres 6 Jun 12 02:57 pg_twophase
-rw-------. 1 postgres postgres 4 Jun 12 02:57 PG_VERSION
drwx------. 3 postgres postgres 28 Jun 12 02:57 pg_xlog
-rw-------. 1 postgres postgres 88 Jun 12 02:57 postgresql.auto.conf
-rw-------. 1 postgres postgres 22304 Jun 12 02:57 postgresql.conf
-rw-------. 1 postgres postgres 60 Jun 12 02:57 postmaster.opts
-rw-r--r--. 1 postgres postgres 118 Jun 12 02:57 recovery.conf
$ cat recovery.conf
# recovery.conf generated by pg_rman 1.3.6
restore_command = 'cp /mnt/walbackup/%f %p'
recovery_target_timeline = '1'
启动postgresql
# systemctl start postgresql-9.6.service
# ps -ef|grep -i postgres
postgres 29509 1 0 02:59 ? 00:00:00 /usr/pgsql-9.6/bin/postmaster -D /var/lib/pgsql/9.6/data/
postgres 29512 29509 0 02:59 ? 00:00:00 postgres: logger process
postgres 29515 29509 0 02:59 ? 00:00:00 postgres: checkpointer process
postgres 29516 29509 0 02:59 ? 00:00:00 postgres: writer process
postgres 29518 29509 0 02:59 ? 00:00:00 postgres: stats collector process
postgres 29525 29509 0 02:59 ? 00:00:00 postgres: wal writer process
postgres 29526 29509 0 02:59 ? 00:00:00 postgres: autovacuum launcher process
postgres 29527 29509 0 02:59 ? 00:00:00 postgres: archiver process last was 00000002.history
postgres 29559 16928 0 03:00 pts/1 00:00:00 ps -ef
postgres 29560 16928 0 03:00 pts/1 00:00:00 grep --color=auto -i postgres
有时候restore后启动会碰到如下错误:
invalid primary checkpoint record
invalid secondary checkpoint record
could not locate a valid checkpoint record
此时只能重置xlog,并取消恢复模式
$ pg_resetxlog -f $PGDATA
$ mv $PGDATA/recovery.conf $PGDATA/recovery.done
这里有个问题需要煮一下,使用pg_rman备份时对wal的归档会是通过软链接来实现。建议添加 --hard-copy
参考:
https://github.com/ossc-db/pg_rman/tree/master
http://ossc-db.github.io/pg_rman/index.html