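Errors Encountered While Setting Up MGR and How to Fix Them
Some failures encountered while setting up MGR
In practice I deployed three MGR (MySQL Group Replication) environments: a single-host, multi-instance setup; a multi-host setup on one network segment; and a multi-host setup spanning different network segments. The deployment steps are largely the same, but there are some differences. The failures I ran into along the way are listed here for reference; if they help you solve a problem in your own deployment, all the better.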
01
Common failure 1
[ERROR] Plugin group_replication reported: 'This member has more executed transactions than those present in the group. Local transactions: bb874065-c485-11e8-8b52-000c2934472e:1 > Group transactions: 3db33b36-0e51-409f-a61d-c99756e90155:1-11'
[ERROR] Plugin group_replication reported: 'The member contains transactions not present in the group. The member will now exit the group.'
[Note] Plugin group_replication reported: 'To force this member into the group you can use the group_replication_allow_local_disjoint_gtids_join option'
Solution:
As the log message suggests, enable the option: set global group_replication_allow_local_disjoint_gtids_join=ON;
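Before forcing the join, it can help to see which local transactions the group does not have. The following is a minimal sketch, assuming a MySQL 5.7 member where this option still exists (it was deprecated in later 5.7 releases and removed in MySQL 8.0). The order of the statements is my assumption, not something taken from the log.

```sql
-- Show the GTID set executed locally; compare it with the
-- "Group transactions" set printed in the error log.
SELECT @@GLOBAL.gtid_executed;

-- Allow the member to join despite the extra local transactions
-- (this is the option named in the log message).
SET GLOBAL group_replication_allow_local_disjoint_gtids_join = ON;

-- Retry joining the group.
START GROUP_REPLICATION;
```

Forcing the join means the member keeps transactions the rest of the group never saw, so its data may diverge from the group if those transactions changed real data.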
02
Common failure 2
[ERROR] Plugin group_replication reported: 'This member has more executed transactions than those present in the group. Local transactions: bb874065-c485-11e8-8b52-000c2934472e:1 > Group transactions: 3db33b36-0e51-409f-a61d-c99756e90155:1-15'
[Warning] Plugin group_replication reported: 'The member contains transactions not present in the group. It is only allowed to join due to group_replication_allow_local_disjoint_gtids_join option'
[Note] Plugin group_replication reported: 'This server is working as secondary member with primary member address localhost.localdomaion:3306.'
Solution:
This differs from failure 1 in that group_replication_allow_local_disjoint_gtids_join was already set to ON when the problem appeared. The fix is to run reset master and then set up the recovery channel again on both the primary and the secondary nodes, that is:
CHANGE MASTER TO MASTER_USER='rpl_user', MASTER_PASSWORD='rpl_pass' FOR CHANNEL 'group_replication_recovery';
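Put together, the fix might look like the sketch below, run on the member that reported the error. It assumes the extra local transactions are safe to discard, because RESET MASTER deletes all binary logs and empties gtid_executed on that server. The STOP and START statements around it are my addition.

```sql
-- Leave the group before touching the local GTID state.
STOP GROUP_REPLICATION;

-- Deletes all local binary logs and clears gtid_executed on this server.
RESET MASTER;

-- Re-create the credentials for the recovery channel
-- (same user and password as in the statement above).
CHANGE MASTER TO MASTER_USER='rpl_user', MASTER_PASSWORD='rpl_pass'
  FOR CHANNEL 'group_replication_recovery';

-- Rejoin; the member then catches up through distributed recovery.
START GROUP_REPLICATION;
```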
03
Common failure 3
During local testing, I ran into the following problem:
[Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
[ERROR] Slave I/O for channel 'group_replication_recovery': error connecting to master 'rpl_user@localhost.localdomaion:' - retry-time: 60 retries: 1, Error_code: 2005
[ERROR] Plugin group_replication reported: 'There was an error when connecting to the donor server. Please check that group_replication_recovery channel credentials and all MEMBER_HOST column values of performance_schema.replication_group_members table are correct and DNS resolvable.'
[ERROR] Plugin group_replication reported: 'For details please check performance_schema.replication_connection_status table and error log messages of Slave I/O for channel group_replication_recovery.'
[Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt /'
Solution:
The cause was that all three hosts in the test environment had been given the same hostname. After giving each host a distinct hostname, the problem went away.
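A quick way to spot this kind of clash is to compare what each member calls itself with what the group advertises to joiners. This is only a sketch of the check; it uses the performance_schema table named in the error message and two standard server variables.

```sql
-- Hostname this server believes it has, and the name it reports
-- to other servers (NULL when report_host is not set).
SELECT @@hostname, @@report_host;

-- Host and port that each member advertises for recovery connections.
-- Identical or unresolvable MEMBER_HOST values point to the problem above.
SELECT MEMBER_ID, MEMBER_HOST, MEMBER_PORT, MEMBER_STATE
FROM performance_schema.replication_group_members;
```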
04
Common failure 4
# When operating on the live production environment, the following error appeared:
mysql--root@localhost:(none) ::>>START GROUP_REPLICATION;
ERROR (HY000): The server is not configured properly to be an active member of the group. Please see more details on error log.
# Checking the log file showed only a single warning:
2019-02-20T07::30.233937Z [Warning] Plugin group_replication reported: 'Group Replication requires slave-preserve-commit-order to be set to ON when using more than 1 applier threads.'
Solution:
mysql--root@localhost:(none) ::>>show variables like "%preserve%";
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| slave_preserve_commit_order | OFF   |
+-----------------------------+-------+
row in set (0.01 sec)
mysql--root@localhost:(none) ::>>set global slave_preserve_commit_order=1;
Query OK, rows affected (0.00 sec)
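The warning only fires when more than one applier thread is configured, so it can be worth looking at the related settings together. The sketch below assumes MySQL 5.7 variable names. Note that SET GLOBAL does not survive a restart, so the same setting would also need to go into my.cnf.

```sql
-- More than one applier thread (slave_parallel_workers > 1) is what
-- triggers the requirement mentioned in the warning.
SELECT @@GLOBAL.slave_parallel_workers,
       @@GLOBAL.slave_parallel_type,
       @@GLOBAL.slave_preserve_commit_order;

-- After turning the option on (as in the session above), retry the join.
START GROUP_REPLICATION;
```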
05
Common failure 5
2019-02-20T08::31.088437Z [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 192.168.9.208 refused. Address is not in the IP whitelist.'
2019-02-20T08::32.088676Z [Warning] Plugin group_replication reported: '[GCS] Connection attempt from IP address 192.168.9.208 refused. Address is not in the IP whitelist.'
Solution:
Configure the group_replication_ip_whitelist parameter in my.cnf so that it covers the refused address (here 192.168.9.208).
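The same parameter can also be set at runtime. The sketch below is an assumption-laden example: the /24 range is a guess based on the refused address in the log, and in MySQL 5.7, as far as I know, the whitelist can only be changed while Group Replication is stopped on that member.

```sql
-- Stop Group Replication on this member before changing the whitelist.
STOP GROUP_REPLICATION;

-- Hypothetical value: a /24 that contains 192.168.9.208, plus loopback.
SET GLOBAL group_replication_ip_whitelist = '192.168.9.0/24,127.0.0.1/8';

-- Rejoin with the new whitelist in effect.
START GROUP_REPLICATION;
```

A runtime change is lost on restart, so the my.cnf entry is still needed for a permanent fix.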
06
Common failure 6
2019-02-20T08::44.087492Z [Warning] Plugin group_replication reported: 'read failed'
2019-02-20T08::44.096171Z [ERROR] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 24801'
2019-02-20T08::14.065775Z [ERROR] Plugin group_replication reported: 'Timeout on wait for view after joining group'
Solution:
In my.cnf, set group_replication_group_seeds so that it lists only the IP addresses and internal communication ports of the other group members, excluding the node itself. Listing the addresses of all members, including the local one, triggers this error. This differs somewhat from deploying MGR on a single network segment.
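As an illustration, the seed list on one member of a three-node group might be set as below. The IP addresses are hypothetical; only the port 24801 comes from the log above.

```sql
-- This member's own group communication address; it should NOT appear
-- in the seed list, according to the fix described above.
SELECT @@GLOBAL.group_replication_local_address;

-- Hypothetical addresses of the two other members only.
SET GLOBAL group_replication_group_seeds =
  '192.168.9.207:24801,192.168.10.208:24801';

-- Retry joining the group with the new seed list.
START GROUP_REPLICATION;
```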
07
Common failure 7
[ERROR] Plugin group_replication reported: '[GCS] Error on opening a connection to oceanbase07: on local port: '.'
[ERROR] Plugin group_replication reported: '[GCS] Error on opening a connection to oceanbase08: on local port: '.'
[ERROR] Plugin group_replication reported: '[GCS] Error on opening a connection to oceanbase07: on local port: '.'
Solution:
The required ports had not been opened on the firewall. Once they were opened, the problem was resolved.
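To find out which ports need to be reachable, the group communication addresses can be read back from the server. This is only a sketch of the lookup; opening the ports themselves is done in the firewall, outside MySQL.

```sql
-- host:port this member listens on for group communication;
-- the other members must be able to reach this port.
SELECT @@GLOBAL.group_replication_local_address;

-- host:port pairs this member contacts when it tries to join.
SELECT @@GLOBAL.group_replication_group_seeds;
```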
08
Common failure 8
[Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
[ERROR] Slave I/O for channel 'group_replication_recovery': Master command COM_REGISTER_SLAVE failed: Access denied for user 'rpl_user'@'%' (using password: YES) (Errno: 1045), Error_code: 1597
[ERROR] Slave I/O thread couldn't register on master
[Note] Slave I/O thread exiting for channel 'group_replication_recovery', read up to log 'FIRST', position
Solution:
The replication user was missing on one of the nodes. To be safe, run the following on every node in the group:
CREATE USER rpl_user@'%' IDENTIFIED BY 'rpl_pass';
GRANT REPLICATION SLAVE ON *.* TO rpl_user@'%';
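To confirm which node is missing the account, a check along these lines can be run on each member. It is a sketch that assumes the account name rpl_user@'%' used above.

```sql
-- Does the account exist on this member, and for which hosts?
SELECT user, host FROM mysql.user WHERE user = 'rpl_user';

-- The output should include REPLICATION SLAVE ON *.*
SHOW GRANTS FOR 'rpl_user'@'%';
```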
09
Common failure 9
[ERROR] Failed to open the relay log './localhost-relay-bin.000011' (relay_log_pos ).
[ERROR] Could not find target log file mentioned in relay log info in the index file './work_NAT_1-relay-bin.index' during relay log initialization.
[ERROR] Slave: Failed to initialize the master info structure for channel ''; its record may still be present in 'mysql.slave_master_info' table, consider deleting it.
[ERROR] Failed to open the relay log './localhost-relay-bin-group_replication_recovery.000001' (relay_log_pos ).
[ERROR] Could not find target log file mentioned in relay log info in the index file './work_NAT_1-relay-bin-group_replication_recovery.index' during relay log initialization.
[ERROR] Slave: Failed to initialize the master info structure for channel 'group_replication_recovery'; its record may still be present in 'mysql.slave_master_info' table, consider deleting it.
[ERROR] Failed to create or recover replication info repositories.
[ERROR] Slave SQL for channel '': Slave failed to initialize relay log info structure from the repository, Error_code:
[ERROR] /usr/local/mysql/bin/mysqld: Slave failed to initialize relay log info structure from the repository
[ERROR] Failed to start slave threads for channel ''
Solution:
This error occurs when the slave node, for whatever reason, can no longer locate its relay logs. The fix is to run reset slave so that the relay log state is rebuilt.
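A possible sequence for the reset is sketched below. It assumes the member is not currently part of a running group and reuses the rpl_user credentials from earlier in this article. Re-running CHANGE MASTER afterwards is my addition, in case the channel credentials were lost along with the relay log state.

```sql
-- Make sure no replication threads are running.
STOP SLAVE;

-- Deletes the relay log files and resets the stored relay log positions.
RESET SLAVE;

-- Re-create the recovery channel credentials, then rejoin the group.
CHANGE MASTER TO MASTER_USER='rpl_user', MASTER_PASSWORD='rpl_pass'
  FOR CHANNEL 'group_replication_recovery';
START GROUP_REPLICATION;
```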