千家信息网

pg_rman备份恢复测试

发表于:2025-11-09 作者:千家信息网编辑
千家信息网最后更新 2025年11月09日,环境描述1.OSCentOS Linux release 7.2.1511 (Core) X642.PostgreSQLPostgreSQL 9.6.13.pg_rmanpg_rman-1.3.3-p
千家信息网最后更新 2025年11月09日pg_rman备份恢复测试

环境描述

1.OS

CentOS Linux release 7.2.1511 (Core) X64

2.PostgreSQL

PostgreSQL 9.6.1

3.pg_rman

pg_rman-1.3.3-pg96.tar.gz v1.3.3

注意:请下载版本对应的源码包。

https://github.com/ossc-db/pg_rman/releases/download/v1.3.3/pg_rman-1.3.3-pg96.tar.gz

pg_rman-1.3.3.tar.gz(此源码编译过程中报错)

系统包

zlib-devel


二、pg_rman安装

1.安装pg_rman

root用户登录

export PATH=/opt/pgsql/9.6.1/bin:$PATH

export LD_LIBRARY_PATH=/opt/pgsql/9.6.1/lib

export MANPATH=/opt/pgsql/9.6.1/share/man:$MANPATH


# tar zxvf pg_rman-9_6_STABLE.tar.gz

# cd pg_rman-9_6_STABLE/

# make

......

......

gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 backup.o catalog.o data.o delete.o dir.o init.o parray.o pg_rman.o restore.o show.o util.o validate.o xlog.o pgsql_src/pg_ctl.o pgut/pgut.o pgut/pgut-port.o -L/opt/pgsql/9.6.1/lib -lpgcommon -lpgport -L/opt/pgsql/9.6.1/lib -lpq -L/opt/pgsql/9.6.1/lib -Wl,--as-needed -Wl,-rpath,'/opt/pgsql/9.6.1/lib',--enable-new-dtags -lpgcommon -lpgport -lz -lreadline -lrt -lcrypt -ldl -lm -o pg_rman

# make install

/usr/bin/mkdir -p '/opt/pgsql/9.6.1/bin'

/usr/bin/install -c pg_rman '/opt/pgsql/9.6.1/bin'

#

2.安装验证

su - postgres

$ pg_rman --version

pg_rman 1.3.3


3.配置数据库参数

wal_level = replica

archive_mode = on

archive_command = 'test ! -f /pg_arclog/%f && cp %p /pg_arclog/%f'

--- root user

mkdir /backup_pg_rman /pg_arclog

chown -R postgres:postgres /backup_pg_rman

chown -R postgres:postgres /pg_arclog

--- postgresql

# pg_rman init -B $backup_dir


三、备份恢复测试

1.备份数据(full<0> + incremental<1>)

# full

export PGDATA=/pgdata96

export BACKUP_PATH=/backup_pg_rman


$ echo $PGDATA

/pgdata96

$ echo $BACKUP_PATH

/backup_pg_rman

$

--- init backup dir: pg_rman init -B $backup_dir -D $PGDATA(当不配置环境变量时,手工指定,注意路径末尾不添加'/'结束符)

$ pg_rman init

INFO: ARCLOG_PATH is set to '/pg_arclog'

INFO: SRVLOG_PATH is set to '/pgdata96/pg_log'

$


$ cat $BACKUP_PATH/pg_rman.ini

ARCLOG_PATH='/pg_arclog'

SRVLOG_PATH='/pgdata96/pg_log'


--- full backup

$ pg_rman backup --backup-mode=full --with-serverlog --progress

INFO: copying database files

Processed 1172 of 1172 files, skipped 0

INFO: copying archived WAL files

Processed 3 of 3 files, skipped 0

INFO: copying server log files

Processed 4 of 4 files, skipped 0

INFO: backup complete

INFO: Please execute 'pg_rman validate' to verify the files are correctly copied.


--- validate backup

$ pg_rman validate, status: done

INFO: validate: "2017-03-06 16:43:39" backup, archive log files and server log files by CRC

INFO: backup "2017-03-06 16:43:39" is valid


--- show backup, status: ok

$ pg_rman show

==========================================================

StartTime Mode Duration Size TLI Status

==========================================================

2017-03-06 16:43:39 FULL 0m 58MB 1 OK

$

--- incremental

$ pg_rman backup --backup-mode=incremental --with-serverlog --progress

INFO: copying database files

Processed 1172 of 1172 files, skipped 1115

INFO: copying archived WAL files

Processed 48 of 48 files, skipped 3

INFO: copying server log files

Processed 4 of 4 files, skipped 3

INFO: backup complete

INFO: Please execute 'pg_rman validate' to verify the files are correctly copied.

$

--- validate backup

$ pg_rman validate

INFO: validate: "2017-03-06 17:04:45" backup, archive log files and server log files by CRC

INFO: backup "2017-03-06 17:04:45" is valid

$

--- show, status: ok

$ pg_rman show detail

============================================================================================================

StartTime Mode Duration Data ArcLog SrvLog Total Compressed CurTLI ParentTLI Status

============================================================================================================

2017-03-06 17:04:45 INCR 0m 401MB 738MB 27kB 1136MB false 1 0 OK

2017-03-06 16:43:39 FULL 0m 30MB 33MB 206kB 58MB false 1 0 OK

$


2.模拟灾难恢复


1).删除PGDATA 目录下所有文件

安全停止数据库,删除文件

$ pg_ctl stop -m immediate -D /pgdata96/

$ cd /pgdata96

$ rm -rf *.*


2).恢复备份

--- postgres user

$ export PGDATA=/pgdata96

$ export BACKUP_PATH=/backup_pg_rman

$ pg_rman restore

WARNING: pg_controldata file "/pgdata96/global/pg_control" does not exist

WARNING: pg_controldata file "/pgdata96/global/pg_control" does not exist

INFO: the recovery target timeline ID is not given

INFO: use timeline ID of latest full backup as recovery target: 1

INFO: calculating timeline branches to be used to recovery target point

INFO: searching latest full backup which can be used as restore start point

INFO: found the full backup can be used as base in recovery: "2017-03-06 16:43:39"

INFO: copying online WAL files and server log files

INFO: clearing restore destination

INFO: validate: "2017-03-06 16:43:39" backup, archive log files and server log files by SIZE

INFO: backup "2017-03-06 16:43:39" is valid

INFO: restoring database files from the full mode backup "2017-03-06 16:43:39"

INFO: searching incremental backup to be restored

INFO: validate: "2017-03-06 17:04:45" backup, archive log files and server log files by SIZE

INFO: backup "2017-03-06 17:04:45" is valid

INFO: restoring database files from the incremental mode backup "2017-03-06 17:04:45"

INFO: searching backup which contained archived WAL files to be restored

INFO: backup "2017-03-06 17:04:45" is valid

INFO: restoring WAL files from backup "2017-03-06 17:04:45"

INFO: restoring online WAL files and server log files

INFO: generating recovery.conf

INFO: restore complete

HINT: Recovery will start automatically when the PostgreSQL server is started.

$


3).启动数据库验证数据

# /etc/init.d/postgresql start

Starting PostgreSQL: ok

#

切换至postgres用户,然后验证数据



基于时间点恢复

建立测试数据

testdb=# create table tbl(id int primary key, first varchar(20),second varchar(20));

CREATE TABLE

testdb=# INSERT INTO tbl VALUES(generate_series(1,1000000), 'first'||(random()*(10^3))::integer, 'second'||(random()*(10^3))::integer);

INSERT 0 1000000

testdb=#


建立全备份

--- postgres user

$ pg_rman backup --backup-mode=full --with-serverlog --progress

INFO: copying database files

Processed 1172 of 1172 files, skipped 0

INFO: copying archived WAL files

Processed 27 of 27 files, skipped 0

INFO: copying server log files

Processed 1 of 1 files, skipped 0

INFO: backup complete

INFO: Please execute 'pg_rman validate' to verify the files are correctly copied.

$ pg_rman show

==========================================================

StartTime Mode Duration Size TLI Status

==========================================================

2017-03-07 16:57:33 FULL 0m 433MB 4 DONE

$ pg_rman validate

INFO: validate: "2017-03-07 16:57:33" backup, archive log files and server log files by CRC

INFO: backup "2017-03-07 16:57:33" is valid

[postgres@localhost ~]$ pg_rman show

==========================================================

StartTime Mode Duration Size TLI Status

==========================================================

2017-03-07 16:57:33 FULL 0m 433MB 4 OK

$


drop 表

testdb=# drop table tbl;

DROP TABLE

testdb=# \q


停止数据库

--- root user

# /etc/init.d/postgresql stop


恢复数据库到指定时间

$ pg_rman restore --recovery-target-time '2017-03-07 16:58:33'

INFO: the recovery target timeline ID is not given

INFO: use timeline ID of current database cluster as recovery target: 4

INFO: calculating timeline branches to be used to recovery target point

INFO: searching latest full backup which can be used as restore start point

INFO: found the full backup can be used as base in recovery: "2017-03-07 16:57:33"

INFO: copying online WAL files and server log files

INFO: clearing restore destination

INFO: validate: "2017-03-07 16:57:33" backup, archive log files and server log files by SIZE

INFO: backup "2017-03-07 16:57:33" is valid

INFO: restoring database files from the full mode backup "2017-03-07 16:57:33"

INFO: searching incremental backup to be restored

INFO: searching backup which contained archived WAL files to be restored

INFO: backup "2017-03-07 16:57:33" is valid

INFO: restoring WAL files from backup "2017-03-07 16:57:33"

INFO: restoring online WAL files and server log files

INFO: generating recovery.conf

INFO: restore complete

HINT: Recovery will start automatically when the PostgreSQL server is started.

$


启动数据库

--- root user

# /etc/init.d/postgresql start


验证数据

--- postgres user

$ psql testdb

psql (9.6.1)

Type "help" for help.


testdb=# \dt

List of relations

Schema | Name | Type | Owner

--------+------+-------+----------

public | tbl | table | postgres

(1 row)


testdb=# select count(*) from tbl;

count

---------

1000000

(1 row)


testdb=# \q




异常停止数据恢复

描述:当数据库没有成功执行检查点完成,恢复时可能会丢失数据,错误排查

现象:启动数据库失败时

$ more postgresql-Mon.log

2017-03-06 17:20:47 CST [3240]: [1-1] user=,db= LOG: database system was interrupted; last known up at 2017-03-06 17:04:51 CST

2017-03-06 17:20:47 CST [3240]: [2-1] user=,db= LOG: starting archive recovery

2017-03-06 17:20:47 CST [3240]: [3-1] user=,db= LOG: invalid primary checkpoint record

2017-03-06 17:20:47 CST [3240]: [4-1] user=,db= LOG: invalid secondary checkpoint record

2017-03-06 17:20:47 CST [3240]: [5-1] user=,db= PANIC: could not locate a valid checkpoint record

2017-03-06 17:20:47 CST [3238]: [3-1] user=,db= LOG: startup process (PID 3240) was terminated by signal 6: Aborted

2017-03-06 17:20:47 CST [3238]: [4-1] user=,db= LOG: aborting startup due to startup process failure

2017-03-06 17:20:47 CST [3238]: [5-1] user=,db= LOG: database system is shut down

2017-03-06 17:21:23 CST [3269]: [1-1] user=,db= LOG: database system was interrupted; last known up at 2017-03-06 17:04:51 CST

2017-03-06 17:21:23 CST [3269]: [2-1] user=,db= LOG: starting archive recovery

2017-03-06 17:21:23 CST [3269]: [3-1] user=,db= LOG: invalid primary checkpoint record

2017-03-06 17:21:23 CST [3269]: [4-1] user=,db= LOG: invalid secondary checkpoint record

2017-03-06 17:21:23 CST [3269]: [5-1] user=,db= PANIC: could not locate a valid checkpoint record

2017-03-06 17:21:23 CST [3267]: [3-1] user=,db= LOG: startup process (PID 3269) was terminated by signal 6: Aborted

2017-03-06 17:21:23 CST [3267]: [4-1] user=,db= LOG: aborting startup due to startup process failure

2017-03-06 17:21:23 CST [3267]: [5-1] user=,db= LOG: database system is shut down

$


处理步骤说明:

重置事务日志

仅保留备份时数据

$ pg_resetxlog -f /pgdata96

Transaction log reset

$

然后启动数据库,验证部分数据


数据 数据库 备份 验证 测试 文件 时间 源码 环境 用户 配置 安全 成功 中报 事务 参数 变量 手工 数据恢复 日志 数据库的安全要保护哪些东西 数据库安全各自的含义是什么 生产安全数据库录入 数据库的安全性及管理 数据库安全策略包含哪些 海淀数据库安全审计系统 建立农村房屋安全信息数据库 易用的数据库客户端支持安全管理 连接数据库失败ssl安全错误 数据库的锁怎样保障安全 机架服务器接笔记本 csgo杀几个队友被服务器封禁 未成年人网络安全宣讲 春苏网络技术工作室 多台服务器搭建docker 河北服务器机柜系列云空间 如何登录局域网内的共享服务器 数据库集群复制方式 甘肃网络安全保护制度 建数据库模型 筛数据 vba获取数据库的表和字段 图书馆管理系统数据库表格 如何进入服务器国际版 服务器步骤 一梦江湖不同服务器能一起玩吗 数据库系统的五个部分中核心部分 阿里云服务器 带数据库 代理服务器手机版 数据库灾害体编号 大疆软件开发工程师工资 服务器为什么关机就会丢失数据 泰安联想服务器代理电话 软件开发的技术支持怎么写 丰都市网络安全宣传 计算机网络技术能报考一建么 杭州慕协网络技术有限公司 oracle授权数据库 psm服务器和踏台服务器的区别 天津市网信办网络安全预警系统 魔兽8区有哪些服务器
0