1. Environment preparation:
DB: mysql-5.7.16
MHA: mha4mysql-manager-0.56.tar.gz, mha4mysql-node-0.56.tar.gz, daemontools-0.76.tar.gz
role      ip
monitor   10.99.121.206
master    10.99.121.209
slave     10.99.121.210
slave     10.99.121.213
Master-slave replication:
On the master (10.99.121.209), create the replication user and the monitoring user:
GRANT REPLICATION SLAVE ON *.* TO 'repluser'@'%' IDENTIFIED BY '123qwe';
grant all privileges on *.* to 'monitor'@'%' identified by '123qwe';
On the slaves (10.99.121.210, 10.99.121.213), configure replication:
change master to master_host='10.99.121.209',master_port=3306,master_user='repluser',master_password='123qwe',master_log_file='binlog.000002',master_log_pos=1009;
set global read_only=on;
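Before moving on, it is worth confirming that both replication threads are running on each slave. The following is a minimal sketch, assuming replication has already been started with START SLAVE and that the output of `SHOW SLAVE STATUS\G` is piped in; the helper name `replication_ok` is hypothetical and not part of MySQL or MHA.

```shell
#!/bin/sh
# Hypothetical helper: reads `SHOW SLAVE STATUS\G` output on stdin and
# succeeds only if both Slave_IO_Running and Slave_SQL_Running are "Yes".
replication_ok() {
    awk -F': ' '
        /^ *Slave_IO_Running:/  { io  = $2 }
        /^ *Slave_SQL_Running:/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Example usage on a slave (assumes a login that may run the statement):
#   mysql -uroot -p -e 'SHOW SLAVE STATUS\G' | replication_ok && echo "replication healthy"
```

The check is deliberately strict: missing fields (for example, empty output on a server that is not a slave) count as a failure.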
2. Install the Perl dependencies and mha-node on all nodes
On the master and the slaves, install EPEL and perl-DBD-MySQL:
rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
yum install perl-DBD-MySQL -y
yum install -y perl-devel
yum install -y perl-CPAN
On the master and the slaves, install MHA Node:
tar xf mha4mysql-node-0.56.tar.gz
cd mha4mysql-node-0.56
perl Makefile.PL
make && make install
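After installation, the following scripts are generated under /usr/local/bin:
apply_diff_relay_logs //identifies differential relay log events and applies them to the other slaves
filter_mysqlbinlog //strips unnecessary ROLLBACK events (MHA no longer uses this tool)
purge_relay_logs //purges relay logs (without blocking the SQL thread)
save_binary_logs //saves and copies the master's binary logs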
3. Install MHA Manager. The MHA Manager host also needs MHA Node installed, and MHA Manager depends on several Perl modules as well.
rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
yum install perl-DBD-MySQL -y
yum install -y perl-devel perl-CPAN
yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes -y
tar xf mha4mysql-node-0.56.tar.gz
cd mha4mysql-node-0.56
perl Makefile.PL
make && make install
cd ..
tar xf mha4mysql-manager-0.56.tar.gz
cd mha4mysql-manager-0.56
perl Makefile.PL
make && make install
cd /apps/mha4mysql-manager-0.56/samples/scripts
cp * /usr/local/bin/
4. Configure passwordless SSH login
Generate a key pair on every server:
ssh-keygen -t rsa
Log in to 10.99.121.206:
ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.99.121.209
ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.99.121.210
ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.99.121.213
Log in to 10.99.121.209:
ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.99.121.210
ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.99.121.213
Log in to 10.99.121.210:
ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.99.121.209
ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.99.121.213
Log in to 10.99.121.213:
ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.99.121.209
ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.99.121.210
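The per-host lists above follow one rule: every server copies its key to each database node other than itself. As a rough sketch of that rule, the hypothetical helper `copy_id_commands` below only prints the commands for a given host so they can be reviewed before running; the `NODES` variable is an assumption taken from the role table.

```shell
#!/bin/sh
# Database nodes (master and slaves) from the role table; adjust as needed.
NODES="10.99.121.209 10.99.121.210 10.99.121.213"

# Hypothetical helper: print the ssh-copy-id commands to run on host $1,
# one per database node other than $1. Nothing is executed.
copy_id_commands() {
    self=$1
    for target in $NODES; do
        [ "$target" = "$self" ] && continue
        echo "ssh-copy-id -i /root/.ssh/id_rsa.pub root@$target"
    done
}

# Commands to run while logged in to the master:
copy_id_commands 10.99.121.209
```

Passing the monitor address (10.99.121.206) prints all three commands, since the monitor is not itself a database node.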
5. Configure MHA (log in to 10.99.121.206)
(1) Create the MHA working directory and the configuration file (a sample configuration file ships in the unpacked source directory)
mkdir -p /etc/masterha
cp mha4mysql-manager-0.56/samples/conf/app1.cnf /etc/masterha/
Edit app1.cnf; the modified file looks like this:
[server default]
manager_workdir=/var/log/masterha/app1
manager_log=/var/log/masterha/app1/manager.log
master_binlog_dir=/data/mysql/mysql_data
master_ip_failover_script= /usr/local/bin/master_ip_failover
master_ip_online_change_script= /usr/local/bin/master_ip_online_change
password=123qwe
user=monitor
ping_interval=1
remote_workdir=/tmp
repl_password=123qwe
repl_user=repluser
report_script=/usr/local/bin/send_report
secondary_check_script= /usr/local/bin/masterha_secondary_check -s lenovo2 -s lenovo1 --user=repluser --master_host=lenovo1 --master_ip=10.99.121.209 --master_port=3306
shutdown_script=""
ssh_user=root

[server1]
hostname=10.99.121.209
port=3306

[server2]
hostname=10.99.121.210
port=3306
candidate_master=1
check_repl_delay=0

[server3]
hostname=10.99.121.213
port=3306
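A quick way to double-check which hosts the configuration file actually declares is to list the hostname of every [serverN] block. This is only a sketch; `list_mha_servers` is a hypothetical helper, not an MHA tool, and it assumes the simple `key=value` layout shown above with no spaces around `=` on the hostname lines.

```shell
#!/bin/sh
# Hypothetical helper: print "section hostname" for every [serverN] block
# of an MHA application config read from stdin.
list_mha_servers() {
    awk -F'=' '
        /^\[server[0-9]+\]/ { section = substr($0, 2, index($0, "]") - 2); next }
        /^\[/               { section = ""; next }
        section != "" && $1 == "hostname" { print section, $2 }
    '
}

# Example usage:
#   list_mha_servers < /etc/masterha/app1.cnf
```

The [server default] block is skipped on purpose, because it holds shared settings rather than a host.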
(2) Set how relay logs are purged (run on every slave node)
mysql -uroot -pLenovo123#@! -e "set global relay_log_purge=0"
6. Check the SSH configuration (run on the monitor node, 10.99.121.206), as follows
[root@lenovo16 masterha]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
Thu May 25 16:55:19 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu May 25 16:55:19 2017 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu May 25 16:55:19 2017 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu May 25 16:55:19 2017 - [info] Starting SSH connection tests..
Thu May 25 16:55:20 2017 - [debug]
Thu May 25 16:55:19 2017 - [debug] Connecting via SSH from root@10.99.121.210(10.99.121.210:22) to root@10.99.121.209(10.99.121.209:22)..
Thu May 25 16:55:20 2017 - [debug] ok.
Thu May 25 16:55:20 2017 - [debug] Connecting via SSH from root@10.99.121.210(10.99.121.210:22) to root@10.99.121.213(10.99.121.213:22)..
Thu May 25 16:55:20 2017 - [debug] ok.
Thu May 25 16:55:20 2017 - [debug]
Thu May 25 16:55:19 2017 - [debug] Connecting via SSH from root@10.99.121.209(10.99.121.209:22) to root@10.99.121.210(10.99.121.210:22)..
Thu May 25 16:55:19 2017 - [debug] ok.
Thu May 25 16:55:19 2017 - [debug] Connecting via SSH from root@10.99.121.209(10.99.121.209:22) to root@10.99.121.213(10.99.121.213:22)..
Thu May 25 16:55:20 2017 - [debug] ok.
Thu May 25 16:55:21 2017 - [debug]
Thu May 25 16:55:20 2017 - [debug] Connecting via SSH from root@10.99.121.213(10.99.121.213:22) to root@10.99.121.209(10.99.121.209:22)..
Thu May 25 16:55:20 2017 - [debug] ok.
Thu May 25 16:55:20 2017 - [debug] Connecting via SSH from root@10.99.121.213(10.99.121.213:22) to root@10.99.121.210(10.99.121.210:22)..
Thu May 25 16:55:21 2017 - [debug] ok.
Thu May 25 16:55:21 2017 - [info] All SSH connection tests passed successfully.
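When this check is scripted rather than read by eye, the final line of the output is the part that matters. A minimal sketch, assuming the success message is exactly the one shown in the log above; `ssh_check_passed` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the masterha_check_ssh output read from
# stdin contains the success line shown in the log above.
ssh_check_passed() {
    grep -q 'All SSH connection tests passed successfully'
}

# Example usage (2>&1 so the message is seen whichever stream carries it):
#   masterha_check_ssh --conf=/etc/masterha/app1.cnf 2>&1 | ssh_check_passed \
#       && echo "SSH OK" || echo "SSH check failed"
```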
7. Check the state of the whole replication environment (run on the monitor node, 10.99.121.206), as follows
masterha_check_repl --conf=/etc/masterha/app1.cnf
If you hit the following error:
Thu May 25 16:58:30 2017 - [info] Connecting to root@10.99.121.210(10.99.121.210:22)..
Can't exec "mysqlbinlog": No such file or directory at /usr/local/share/perl5/MHA/BinlogManager.pm line 106.
mysqlbinlog version command failed with rc 1:0, please verify PATH, LD_LIBRARY_PATH, and client options at /usr/local/bin/apply_diff_relay_logs line 493
then run the following on every MySQL server:
ln -s /usr/local/mysql/bin/mysqlbinlog /usr/local/bin/mysqlbinlog
ln -s /usr/local/mysql/bin/mysql /usr/local/bin/mysql
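The error above means the MHA scripts could not find the MySQL client tools on PATH. A small sketch for checking this ahead of time follows; `missing_tools` is a hypothetical helper that prints whichever of the named commands cannot be resolved in the current shell.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Prints nothing when both client tools are reachable.
missing_tools mysqlbinlog mysql
```

Because MHA reaches the nodes over non-interactive SSH, the PATH it sees may differ from an interactive login; running the same check as `ssh root@10.99.121.210 'command -v mysqlbinlog'` from the monitor node is probably closer to what MHA experiences.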
8. Start MHA Manager monitoring (run on the monitor node, 10.99.121.206), as follows
[root@lenovo16 apps]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 is stopped(2:NOT_RUNNING).
[root@lenovo16 apps]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
[1] 18587
[root@lenovo16 apps]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 monitoring program is now on initialization phase(10:INITIALIZING_MONITOR). Wait for a while and try checking again.
[root@lenovo16 apps]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 (pid:18587) is running(0:PING_OK), master:10.99.121.209
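As the session above shows, the manager passes through an initialization phase before it reports PING_OK. A sketch of pulling the state name out of a status line, so that a script can wait for it, might look like this; `mha_state` is a hypothetical helper and relies only on the `(code:STATE)` pattern visible in the three status lines above.

```shell
#!/bin/sh
# Hypothetical helper: extract the state name (for example PING_OK) from
# masterha_check_status output read from stdin.
mha_state() {
    sed -n 's/.*([0-9][0-9]*:\([A-Z_]*\)).*/\1/p'
}

# Example usage: poll until the manager reports PING_OK.
#   until [ "$(masterha_check_status --conf=/etc/masterha/app1.cnf | mha_state)" = "PING_OK" ]; do
#       sleep 2
#   done
```

The pattern requires digits right after the opening parenthesis, so the `(pid:18587)` part of the running line is not mistaken for the state.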
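To run MHA Manager as a supervised daemon, install daemontools:
wget http://cr.yp.to/daemontools/daemontools-0.76.tar.gz
tar -zxvf daemontools-0.76.tar.gz
cd admin/daemontools-0.76
Append -include /usr/include/errno.h to the end of the first line (the gcc command) in src/conf-cc, then run the installer:
[root@lenovo16 apps]# package/install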
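The installer creates a command directory under admin/daemontools-0.76 and places the daemontools commands there. They can be listed like this:
[root@lenovo16 apps]# ls /apps/admin/daemontools-0.76/command
envdir fghack pgrphack setlock softlimit svc svscan svstat tai64nlocal
envuidgid multilog readproctitle setuidgid supervise svok svscanboot tai64n
Symbolic links to these commands are also created under /usr/local/bin so they are easy to run.
In addition, the installer creates the monitored /service directory and adds the following entry to /etc/inittab:
SV:123456:respawn:/command/svscanboot
In other words, daemontools relies on init to keep itself running.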
[root@lenovo16 apps]# mkdir -p /service/masterha_app1
[root@lenovo16 apps]# vim /service/masterha_app1/run
#!/bin/bash
exec masterha_manager --conf=/etc/masterha/app1.cnf --wait_on_monitor_error=60 --wait_on_failover_error=60 >> /var/log/masterha/app1/app1.log 2>&1
[root@lenovo16 apps]# chmod 755 /service/masterha_app1/run
## start monitoring
svc -u /service/masterha_app1
## stop monitoring
svc -d /service/masterha_app1
For now, we will not start the manager this way.