Greenplum Full Backups with gpcrondump

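Date: 2022-02-01 04:16:01
gpcrondump is a wrapper around gp_dump that can be invoked directly or from crontab. It can also back up objects other than databases and their data, such as database roles and server configuration files.
Commonly used gpcrondump options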
**********************
Return Codes
**********************

The following is a list of the codes that gpcrondump returns.
   0 - Dump completed with no problems
   1 - Dump completed, but one or more warnings were generated
   2 - Dump failed with a fatal error
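
A wrapper script can branch on these return codes. The following is a minimal sketch that assumes only the three codes listed above; the function name describe_dump_rc is hypothetical and is not part of Greenplum.

```shell
#!/bin/sh
# Hypothetical helper: translate a gpcrondump return code into a message.
# Only the three documented codes (0, 1, 2) are assumed to be meaningful.
describe_dump_rc() {
    case "$1" in
        0) echo "OK: dump completed with no problems" ;;
        1) echo "WARN: dump completed with warnings" ;;
        2) echo "FAIL: dump failed with a fatal error" ;;
        *) echo "UNKNOWN: unexpected return code $1" ;;
    esac
}

# Usage (left commented out because it needs a running Greenplum cluster):
#   gpcrondump -a -x lottu
#   describe_dump_rc "$?"
```

Treating code 1 as a warning rather than a failure lets a scheduled job keep the backup while still flagging it for review.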

-a (do not prompt) 

 Do not prompt the user for confirmation. 

-d <master_data_directory> 

 The master host data directory. If not specified, the value set for
 $MASTER_DATA_DIRECTORY will be used. 

--dump-stats

 Dump optimizer statistics from pg_statistic. Statistics are dumped in the
 master data directory to db_dumps/YYYYMMDD/gp_statistics_1_1_<timestamp>.

-g (copy config files) 

 Secure a copy of the master and segment configuration files
 postgresql.conf, pg_ident.conf, and pg_hba.conf. These configuration
 files are dumped in the master or segment data directory to
 db_dumps/YYYYMMDD/config_files_<timestamp>.tar. 

 If --ddboost is specified, the backup is located on the default storage
 unit in the directory specified by --ddboost-backupdir when the Data
 Domain Boost credentials were set.

-G (dump global objects) 

 Use pg_dumpall to dump global objects such as roles and tablespaces.
 Global objects are dumped in the master data directory to
 db_dumps/YYYYMMDD/gp_global_1_1_<timestamp>. 

-h (record dump details) 

 Record details of database dump in database table
 public.gpcrondump_history in database supplied via -x option. Utility
 will create table if it does not currently exist. 

--incremental (backup changes to append-optimized tables)

 Adds an incremental backup to a backup set. When performing an
 incremental backup, the complete backup set created prior to the
 incremental backup must be available. The complete backup set includes
 the following backup files: 

 * The last full backup before the current incremental backup 

 * All incremental backups created between the time of the full backup
   and the current incremental backup

 An incremental backup is similar to a full backup except for
 append-optimized tables, including column-oriented tables. An
 append-optimized table is backed up only if at least one of the
 following operations was performed on the table after the last backup.
   ALTER TABLE
   INSERT
   UPDATE
   DELETE
   TRUNCATE
   DROP and then re-create the table

 For partitioned append-optimized tables, only the changed table
 partitions are backed up. 

 The -u option must be used consistently within a backup set that
 includes a full backup and incremental backups. If you use the -u option with a
 full backup, you must use the -u option when you create incremental
 backups that are part of the backup set that includes the full backup. 

 You can create an incremental backup for a full backup of a set of
 database tables. When you create the full backup, specify the --prefix
 option to identify the backup. To include a set of tables in the full
 backup, use either the -t option or --table-file option. To exclude a
 set of tables, use either the -T option or the --exclude-table-file
 option. See the description of the option for more information on its
 use. 

 To create an incremental backup based on the full backup of the set of
 tables, specify the option --incremental and the --prefix option with
 the string specified when creating the full backup. The incremental
 backup is limited to only the tables in the full backup. 

 WARNING: gpcrondump does not check for available disk space prior to
 performing an incremental backup.

 IMPORTANT: An incremental backup set (a full backup and its associated
 incremental backups) must be on a single device. For example, the
 backups in a backup set must all be on a file system or must all be on a
 Data Domain system.
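
The consistency rules above (same --prefix and same -u for the full backup and every incremental backup in the set) can be sketched as a dry-run helper. This is an illustration only: backup_cmd is a hypothetical function that prints a command line without running it, and it uses only options described in this section.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the gpcrondump command for one member
# of a backup set. Nothing is executed.
# $1 = "full" or "incremental", $2 = database, $3 = prefix, $4 = backup dir.
backup_cmd() {
    mode=$1; db=$2; prefix=$3; dir=$4
    # The prefix and -u directory are identical for every backup in the set.
    cmd="gpcrondump -a -x $db --prefix $prefix -u $dir"
    if [ "$mode" = "incremental" ]; then
        cmd="$cmd --incremental"
    fi
    echo "$cmd"
}

backup_cmd full lottu lottu /home/gpadmin/backup
backup_cmd incremental lottu lottu /home/gpadmin/backup
```

Building both command lines from one function makes it harder to let the prefix or the backup directory drift between the full and incremental runs.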

--prefix <prefix_string> [--list-filter-tables ]

 Prepends <prefix_string> followed by an underscore character (_) to the
 names of all the backup files created during a backup. 

-r (rollback on failure) 

 Rollback the dump files (delete a partial dump) if a failure is
 detected. The default is to not rollback. 

-u <backup_directory> 

 Specifies the absolute path where the backup files will be placed on
 each host. If the path does not exist, it will be created, if possible.
 If not specified, defaults to the data directory of each instance to be
 backed up. Using this option may be desirable if each segment host has
 multiple segment instances as it will create the dump files in a
 centralized location rather than the segment data directories. 

 Note: This option is not supported if --ddboost is specified. 

--use-set-session-authorization 

 Use SET SESSION AUTHORIZATION commands instead of ALTER OWNER commands
 to set object ownership. 

-x <database_name> 

 Required. The name of the Greenplum database to dump. Specify multiple times for
 multiple databases.

Hands-on: backing up the database lottu (-x lottu specifies the database to dump)

[gpadmin@mdw ~]$ gpcrondump -a -C --dump-stats -g -G -h -r --use-set-session-authorization -x lottu -u /home/gpadmin/backup --prefix lottu -l /home/gpadmin/backup
20160713:16:02:36:044710 gpcrondump:mdw:gpadmin-[INFO]:-Starting gpcrondump with args: -a -C --dump-stats -g -G -h -r --use-set-session-authorization -x lottu -u /home/gpadmin/backup --prefix lottu -l /home/gpadmin/backup
20160713:16:02:52:044710 gpcrondump:mdw:gpadmin-[INFO]:-Directory /home/gpadmin/backup/db_dumps/20160713 not found, will try to create
20160713:16:02:52:044710 gpcrondump:mdw:gpadmin-[INFO]:-Created /home/gpadmin/backup/db_dumps/20160713
20160713:16:02:52:044710 gpcrondump:mdw:gpadmin-[INFO]:-Checked /home/gpadmin/backup on master
20160713:16:03:14:044710 gpcrondump:mdw:gpadmin-[INFO]:-Configuring for single database dump
20160713:16:03:14:044710 gpcrondump:mdw:gpadmin-[INFO]:-Validating disk space
20160713:16:03:19:044710 gpcrondump:mdw:gpadmin-[INFO]:-Creating filter file: /home/gpadmin/backup/db_dumps/20160713/lottu_gp_dump_20160713160238_filter
20160713:16:03:19:044710 gpcrondump:mdw:gpadmin-[INFO]:-Adding compression parameter
20160713:16:03:19:044710 gpcrondump:mdw:gpadmin-[INFO]:-Adding --prefix
20160713:16:03:19:044710 gpcrondump:mdw:gpadmin-[INFO]:-Adding --no-expand-children
20160713:16:03:19:044710 gpcrondump:mdw:gpadmin-[INFO]:-Dump process command line gp_dump -p 1921 -U gpadmin --gp-d=/home/gpadmin/backup/db_dumps/20160713 --gp-r=/home/gpadmin/backup/db_dumps/20160713 --gp-s=p --gp-k=20160713160238 --no-lock -c --gp-c --prefix=lottu_ --no-expand-children "lottu" --use-set-session-authorization
20160713:16:03:19:044710 gpcrondump:mdw:gpadmin-[INFO]:-Starting Dump process
20160713:16:03:26:044710 gpcrondump:mdw:gpadmin-[INFO]:-Dump process returned exit code 0
20160713:16:03:26:044710 gpcrondump:mdw:gpadmin-[INFO]:-Timestamp key = 20160713160238
20160713:16:03:26:044710 gpcrondump:mdw:gpadmin-[INFO]:-Checked master status file and master dump file.
20160713:16:03:26:044710 gpcrondump:mdw:gpadmin-[INFO]:-Releasing pg_class lock
20160713:16:03:29:044710 gpcrondump:mdw:gpadmin-[INFO]:-Commencing pg_statistic dump
20160713:16:03:31:044710 gpcrondump:mdw:gpadmin-[INFO]:-Created public.gpcrondump_history in lottu database
20160713:16:03:33:044710 gpcrondump:mdw:gpadmin-[INFO]:-Inserted dump record into public.gpcrondump_history in lottu database
20160713:16:03:33:044710 gpcrondump:mdw:gpadmin-[INFO]:-Commencing pg_catalog dump
20160713:16:03:34:044710 gpcrondump:mdw:gpadmin-[INFO]:-Dump status report
20160713:16:03:34:044710 gpcrondump:mdw:gpadmin-[INFO]:----------------------------------------------------
20160713:16:03:34:044710 gpcrondump:mdw:gpadmin-[INFO]:-Target database                          = lottu
20160713:16:03:34:044710 gpcrondump:mdw:gpadmin-[INFO]:-Dump subdirectory                        = 20160713
20160713:16:03:34:044710 gpcrondump:mdw:gpadmin-[INFO]:-Dump type                                = Full database
20160713:16:03:34:044710 gpcrondump:mdw:gpadmin-[INFO]:-Clear old dump directories               = Off
20160713:16:03:34:044710 gpcrondump:mdw:gpadmin-[INFO]:-Dump start time                          = 16:02:38
20160713:16:03:34:044710 gpcrondump:mdw:gpadmin-[INFO]:-Dump end time                            = 16:03:26
20160713:16:03:34:044710 gpcrondump:mdw:gpadmin-[INFO]:-Status                                   = COMPLETED
20160713:16:03:34:044710 gpcrondump:mdw:gpadmin-[INFO]:-Dump key                                 = 20160713160238
20160713:16:03:34:044710 gpcrondump:mdw:gpadmin-[INFO]:-Dump file compression                    = On
20160713:16:03:34:044710 gpcrondump:mdw:gpadmin-[INFO]:-Vacuum mode type                         = Off
20160713:16:03:34:044710 gpcrondump:mdw:gpadmin-[INFO]:-Exit code zero, no warnings generated
20160713:16:03:34:044710 gpcrondump:mdw:gpadmin-[INFO]:----------------------------------------------------
20160713:16:03:34:044710 gpcrondump:mdw:gpadmin-[INFO]:-Dumping master config files
20160713:16:03:34:044710 gpcrondump:mdw:gpadmin-[INFO]:-Dumping segment config files
20160713:16:03:36:044710 gpcrondump:mdw:gpadmin-[WARNING]:-Found neither /usr/local/greenplum-db438/bin/mail_contacts nor /home/gpadmin/mail_contacts
20160713:16:03:36:044710 gpcrondump:mdw:gpadmin-[WARNING]:-Unable to send dump email notification
20160713:16:03:36:044710 gpcrondump:mdw:gpadmin-[INFO]:-To enable email notification, create /usr/local/greenplum-db438/bin/mail_contacts or /home/gpadmin/mail_contacts containing required email addresses
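The run produced two [WARNING] messages. As the final [INFO] line indicates, they appear because no email addresses have been configured for dump notifications.
To fix this, create a file named mail_contacts in the Greenplum superuser's home directory and list one address per line, for example:
$ vi /home/gpadmin/mail_contacts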
lottu_zhu@staff.easou.com
yee_yi@staff.easou.com
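
The file can also be created non-interactively, which is convenient on hosts that are provisioned by script. The sketch below assumes the one-address-per-line layout shown above; write_mail_contacts is a hypothetical helper and the example.com addresses are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: write one email address per line into a contacts file.
# $1 = target file, remaining arguments = email addresses.
write_mail_contacts() {
    target=$1
    shift
    # Refuse to write an empty contacts file.
    [ "$#" -gt 0 ] || return 1
    printf '%s\n' "$@" > "$target"
}

# Usage:
#   write_mail_contacts /home/gpadmin/mail_contacts dba@example.com oncall@example.com
```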

The backup files are written to an automatically created subdirectory, /home/gpadmin/backup/db_dumps/YYYYMMDD:

[gpadmin@mdw 20160713]$ ll
total 60
-rw-------. 1 gpadmin gpadmin   136 Jul 13 16:03 lottu_gp_cdatabase_1_1_20160713160238
-rw-------. 1 gpadmin gpadmin   758 Jul 13 16:03 lottu_gp_dump_1_1_20160713160238.gz
-rw-------. 1 gpadmin gpadmin   374 Jul 13 16:03 lottu_gp_dump_1_1_20160713160238_post_data.gz
-rw-rw-r--. 1 gpadmin gpadmin     0 Jul 13 16:03 lottu_gp_dump_20160713160238_ao_state_file
-rw-rw-r--. 1 gpadmin gpadmin     0 Jul 13 16:03 lottu_gp_dump_20160713160238_co_state_file
-rw-rw-r--. 1 gpadmin gpadmin     0 Jul 13 16:03 lottu_gp_dump_20160713160238_last_operation
-rw-rw-r--. 1 gpadmin gpadmin  1129 Jul 13 16:03 lottu_gp_dump_20160713160238.rpt
-rw-------. 1 gpadmin gpadmin  2403 Jul 13 16:03 lottu_gp_dump_status_1_1_20160713160238
-rw-rw-r--. 1 gpadmin gpadmin  1041 Jul 13 16:03 lottu_gp_global_1_1_20160713160238
-rw-rw-r--. 1 gpadmin gpadmin 30720 Jul 13 16:03 lottu_gp_master_config_files_20160713160238.tar
-rw-rw-r--. 1 gpadmin gpadmin  1500 Jul 13 16:03 lottu_gp_statistics_1_1_20160713160238
[gpadmin@mdw 20160713]$ pwd
/home/gpadmin/backup/db_dumps/20160713
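
Every file in the listing carries the same 14-digit timestamp key, so a script can recover the key from a file name. The following is a minimal sketch based only on the .rpt name pattern visible above; dump_key_of is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: extract the 14-digit timestamp key from a report file
# name such as lottu_gp_dump_20160713160238.rpt. Prints nothing if the name
# does not match that pattern.
dump_key_of() {
    basename "$1" .rpt | sed -n 's/^.*gp_dump_\([0-9]\{14\}\)$/\1/p'
}

dump_key_of /home/gpadmin/backup/db_dumps/20160713/lottu_gp_dump_20160713160238.rpt
```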

Each database keeps its own backup history in the table public.gpcrondump_history, which the -h option creates and populates.

lottu=# \d gpcrondump*
              Table "public.gpcrondump_history"
       Column       |            Type             | Modifiers
--------------------+-----------------------------+-----------
 rec_date           | timestamp without time zone |
 start_time         | character(8)                |
 end_time           | character(8)                |
 options            | text                        |
 dump_key           | character varying(20)       |
 dump_exit_status   | smallint                    |
 script_exit_status | smallint                    |
 exit_text          | character varying(10)       |
Distributed by: (rec_date)

lottu=# \x
Expanded display is on.
lottu=# select * from gpcrondump_history;
-[ RECORD 1 ]------+---------------------------------------------------------------------------------------------------------------------------------------
rec_date           | 2016-07-13 16:03:32.079202
start_time         | 16:02:38
end_time           | 16:03:26
options            | -a -C --dump-stats -g -G -h -r --use-set-session-authorization -x lottu -u /home/gpadmin/backup --prefix lottu -l /home/gpadmin/backup
dump_key           | 20160713160238
dump_exit_status   | 0
script_exit_status | 0
exit_text          | COMPLETED
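
The history table makes it easy to look for dumps that did not finish cleanly. The sketch below only builds the query text from the columns shown above; failed_dumps_sql is a hypothetical helper, and treating any non-zero status as a problem is an assumption based on the return codes listed earlier.

```shell
#!/bin/sh
# Hypothetical helper: print a query that lists the most recent N history
# rows whose dump or script exit status is non-zero (default N = 10).
failed_dumps_sql() {
    limit=${1:-10}
    printf '%s\n' "SELECT rec_date, dump_key, dump_exit_status, script_exit_status, exit_text FROM public.gpcrondump_history WHERE dump_exit_status <> 0 OR script_exit_status <> 0 ORDER BY rec_date DESC LIMIT $limit;"
}

# Usage (left commented out because it needs a running cluster):
#   psql -d lottu -A -t -c "$(failed_dumps_sql 5)"
```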

Finally, here is a backup script. Add it to crontab to run the backup every day.

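#!/bin/sh
# Back up every database except template0 with gpcrondump.

backupdir="/home/gpadmin/backup"
logdir="$backupdir"
dbid="lottu"    # prefix prepended to every backup file name
# -d needs the master data directory; abort if the environment does not set it.
masterdir="${MASTER_DATA_DIRECTORY:?MASTER_DATA_DIRECTORY is not set}"

# Connect to template1 explicitly so psql does not depend on a default database.
for dbname in $(psql -d template1 -A -q -t -c "select datname from pg_database where datname <> 'template0'")
do
    # One timestamp key per database dump.
    now=$(date +%Y%m%d%H%M%S)
    gpcrondump -a -C --dump-stats -g -G -h -r --use-set-session-authorization \
        -x "$dbname" -u "$backupdir" --prefix "$dbid" -l "$logdir" -d "$masterdir" -K "$now"
done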
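
To schedule the script daily, a crontab entry is needed. The sketch below prints one. daily_cron_line is a hypothetical helper, and the script path /home/gpadmin/bin/gp_backup.sh and the log file name are assumptions, not paths from this article. Because cron runs with a minimal environment, the script would probably also need to load the Greenplum environment itself (for example by sourcing greenplum_path.sh from the installation directory) before calling gpcrondump.

```shell
#!/bin/sh
# Hypothetical helper: print a crontab entry that runs a script once a day.
# $1 = hour (0-23), $2 = minute (0-59), $3 = script path, $4 = log file.
daily_cron_line() {
    printf '%s %s * * * %s >> %s 2>&1\n' "$2" "$1" "$3" "$4"
}

# Example: run the backup at 02:30 every day.
daily_cron_line 2 30 /home/gpadmin/bin/gp_backup.sh /home/gpadmin/backup/gp_backup.log

# To install the entry for the current user, append it to the existing crontab:
#   (crontab -l 2>/dev/null; daily_cron_line 2 30 /home/gpadmin/bin/gp_backup.sh /home/gpadmin/backup/gp_backup.log) | crontab -
```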

-- Reference: 德哥, https://yq.aliyun.com/articles/30330?spm=5176.8067842.tagmain.19.etfAn9