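Our MySQL backup system has hit a serious bug.
It comes from a bug in the open-source tool xtrabackup:
https://bugs.launchpad.net/percona-xtrabackup/+bug/722638
None of our earlier large-scale deployments ran into this problem.
After we moved the counters to MySQL, we deployed the backup system and backups failed again and again, so we decided to get to the bottom of it.
After a series of tests, we found that a backup cannot survive the counters' data-load operation running while the backup is in progress.
The backup system reports: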
[code]
[01] Copying ./cnt_it/cnt_referrer_channel_2011.ibd
to /usr/local/mysql/crontab/cnt_it/backup/innodb/full/2011-06-10_18-18-25/cnt_it/cnt_referrer_channel_2011.ibd
[01] ...done
[01] Copying ./cnt_it/cnt_goals_abandon_201109.ibd
to /usr/local/mysql/crontab/cnt_it/backup/innodb/full/2011-06-10_18-18-25/cnt_it/cnt_goals_abandon_201109.ibd
[01] ...done
[01] Copying ./cnt_it/cnt_referrer_search_keyword_201107.ibd InnoDB: Error: tablespace id is 43167 in the data dictionary
InnoDB: but in file ./cnt_it/cnt_referrer_summary_work.ibd it is 43178!
110610 18:37:57 InnoDB: Assertion failure in thread 1201920320 in file /home/buildbot/slaves/percona-server-51-12/TGZ_CentOS_5_x86_64/work/xtrabackup-1.6/Percona-Server-5.5/storage/innobase/fil/fil0fil.c line 780
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
to /usr/local/mysql/crontab/cnt_it/backup/innodb/full/2011-06-10_18-18-25/cnt_it/cnt_referrer_search_keyword_201107.ibd
[01] ...done
[01] Copying ./cnt_it/cnt_goals_referrer_201205.ibd
to /usr/local/mysql/crontab/cnt_it/backup/innodb/full/2011-06-10_18-18-25/cnt_it/cnt_goals_referrer_201205.ibd
[01] ...done
./backup.sh: line 109: 24002 backup failed xtrabackup
--defaults-file=$CNF --backup --target-dir=$BACKUP/$ENGINE/full/$day
--datadir=$DATADIR
+ return 1
+ critical
+ df -h
[/code]
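So what is going on here?
While a backup is running, tables in the database must not be rebuilt, for example by truncate table, or by drop table followed by re-creating the table.
Judging from the error message, xtrabackup seems to have anticipated this case but did not handle it at the time. Instead, an assertion was added at the relevant place in the code,
and when the check fails there, the process exits.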
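To tell this failure apart from other backup errors, the xtrabackup output can be scanned for the two InnoDB lines shown in the log above. The following is a minimal sketch, not part of our backup script: the function name `find_tablespace_mismatches` is hypothetical, and the pattern is built only from the wording of this one log, so it may need adjusting for other versions.

```python
import re

# Pattern derived from the two InnoDB error lines in the log above.
# Assumption: the wording is the same in other xtrabackup builds.
MISMATCH_RE = re.compile(
    r"InnoDB: Error: tablespace id is (\d+) in the data dictionary\s*\n"
    r"\s*InnoDB: but in file (\S+) it is (\d+)!"
)


def find_tablespace_mismatches(log_text):
    """Return (dictionary_id, ibd_path, file_id) for each mismatch found."""
    return [
        (int(dict_id), path, int(file_id))
        for dict_id, path, file_id in MISMATCH_RE.findall(log_text)
    ]


# Sample taken from the log above (stdout and stderr are interleaved there).
sample = (
    "[01] Copying ./cnt_it/cnt_referrer_search_keyword_201107.ibd "
    "InnoDB: Error: tablespace id is 43167 in the data dictionary\n"
    "InnoDB: but in file ./cnt_it/cnt_referrer_summary_work.ibd it is 43178!\n"
)
print(find_tablespace_mismatches(sample))
```

A non-empty result suggests that a table was rebuilt while the backup was running, rather than, say, a disk or permission problem.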
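The bug is present in versions 1.5, 1.5.1 and 1.6, and is not expected to be fixed until 1.7.
We'll just have to wait!
For now, our workaround is to take backups from a replica (slave) instead.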
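The decision can be written down as a small helper. This is only a sketch under stated assumptions: `KNOWN_AFFECTED` lists just the three versions named in this post, the function name `choose_backup_source` and the labels "replica" and "primary" are hypothetical, and nothing is claimed about releases not listed here.

```python
# Versions reported as affected in this post; other releases are unverified.
KNOWN_AFFECTED = {"1.5", "1.5.1", "1.6"}


def choose_backup_source(xtrabackup_version):
    """Pick where to run the backup, given the installed xtrabackup version.

    Returns "replica" for versions known to hit the assertion when a table
    is rebuilt during backup, and "primary" otherwise (an assumption, since
    unlisted versions have not been checked).
    """
    if xtrabackup_version.strip() in KNOWN_AFFECTED:
        return "replica"
    return "primary"


print(choose_backup_source("1.6"))
```

Backing up from the replica helps here only as long as the table-rebuilding load does not run against it during the backup window.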