1. 搭建主从复制(一主两从)
1.1 准备环境
1 主库:10.0.0.51/db01 2 从库:10.0.0.52/db02,10.0.0.53/db03
1.2 清理环境
1 systemctl stop mysqld 2 rm -rf /data/mysql_3306/* 3 rm -rf /binlog/ 4 mkdir /binlog/ 5 6 创建相关目录与授权 7 主库操作:开启binlog 和GTID 8 mkdir -p /binlog/ 9 mkdir -p /var/log/mysql/ 10 touch /var/log/mysql/mysql.err 11 chown -R mysql.mysql /var/log/mysql/ 12 chown -R mysql.mysql /binlog/ 13 14 从库操作:从库不需要开启binlog 15 mkdir -p /var/log/mysql/ 16 touch /var/log/mysql/mysql.err 17 chown -R mysql.mysql /var/log/mysql/
1.3 准备配置⽂件
1 a. 主库 (开启binlog和GTID) 2 cat > /etc/my.cnf <<EOF 3 [mysqld] 4 user=mysql 5 datadir=/data/mysql_3306 6 basedir=/opt/mysql/ 7 socket=/tmp/mysql.sock 8 port=3306 9 log_error=/var/log/mysql/mysql.err 10 server_id=51 11 log_bin=/binlog/mysql-bin 12 autocommit=0 13 binlog_format=row 14 gtid-mode=on 15 enforce-gtid-consistency=true 16 log-slave-updates=1 17 [mysql] 18 socket=/tmp/mysql.sock 19 [client] 20 socket=/tmp/mysql.sock 21 EOF 22 23 b.从库 (开启GTID,不用开启binlog) 24 cat > /etc/my.cnf <<EOF 25 [mysqld] 26 user=mysql 27 datadir=/data/mysql_3306 28 basedir=/opt/mysql/ 29 socket=/tmp/mysql.sock 30 port=3306 31 log_error=/var/log/mysql/mysql.err 32 server_id=52 33 autocommit=0 34 gtid-mode=on 35 enforce-gtid-consistency=true 36 log-slave-updates=1 37 [mysql] 38 socket=/tmp/mysql.sock 39 [client] 40 socket=/tmp/mysql.sock 41 EOF 42 43 c.从库 (开启GTID,不用开启binlog) 44 cat > /etc/my.cnf <<EOF 45 [mysqld] 46 user=mysql 47 datadir=/data/mysql_3306 48 basedir=/opt/mysql/ 49 socket=/tmp/mysql.sock 50 port=3306 51 log_error=/var/log/mysql/mysql.err 52 server_id=53 53 autocommit=0 54 gtid-mode=on 55 enforce-gtid-consistency=true 56 log-slave-updates=1 57 [mysql] 58 socket=/tmp/mysql.sock 59 [client] 60 socket=/tmp/mysql.sock 61 EOF
1.4 初始化数据
1 mysqld --initialize-insecure --user=mysql --basedir=/opt/mysql --datadir=/data/mysql_3306/
1.5 启动数据库
1 systemctl start mysql 2 查看是否启动 3 netstat -lntup |grep 3306 4 5 如果是刚装的mysql,启动mysql需要以下方法才可以启动 6 方法一:cp /opt/mysql/support-files/mysql.server /etc/init.d/mysqld 7 chkconfig --add mysqld 8 systemctl start mysqld 9 10 方法二: /etc/init.d/mysqld start
1.6 设置数据库密码
1 mysqladmin password
1.7 创建远程复制⽤户 (db01操作)
1 mysql> grant replication slave on *.* to repl@'10.0.0.%' identified by '123456abcd'; 2 Query OK, 0 rows affected, 1 warning (0.01 sec) 3 4 mysql> select user,host from mysql.user; 5 +---------------+-----------+ 6 | user | host | 7 +---------------+-----------+ 8 | repl | 10.0.0.% | 9 | mysql.session | localhost | 10 | mysql.sys | localhost | 11 | root | localhost | 12 +---------------+-----------+ 13 4 rows in set (0.00 sec)
1.8 构建主从
1 查看主库GTID 位置 2 mysql> show master status; 3 +------------------+----------+--------------+------------------+------------------------------------------+ 4 | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | 5 +------------------+----------+--------------+------------------+------------------------------------------+ 6 | mysql-bin.000004 | 704 | | | 86ef6764-6e12-11eb-ac19-000c29ee4d81:1-2 | 7 +------------------+----------+--------------+------------------+------------------------------------------+ 8 1 row in set (0.00 sec) 9 10 从库执行以下命令: 11 change master to 12 master_host='10.0.0.52', 13 master_user='repl', 14 master_password='123456abcd', 15 MASTER_AUTO_POSITION=1; 16 17 开启主从复制 18 start slave; 19 20 确认启动是否OK 21 mysql> start slave; 22 Query OK, 0 rows affected (0.00 sec) 23 24 mysql> show slave status\G 25 *************************** 1. row *************************** 26 Slave_IO_State: Waiting for master to send event 27 Master_Host: 10.0.0.51 28 Master_User: repl 29 Master_Port: 3306 30 Connect_Retry: 60 31 Master_Log_File: mysql-bin.000004 32 Read_Master_Log_Pos: 704 33 Relay_Log_File: db02-relay-bin.000002 34 Relay_Log_Pos: 917 35 Relay_Master_Log_File: mysql-bin.000004 36 Slave_IO_Running: Yes 37 Slave_SQL_Running: Yes 38 Replicate_Do_DB: 39 Replicate_Ignore_DB: 40 Replicate_Do_Table: 41 Replicate_Ignore_Table: 42 Replicate_Wild_Do_Table: 43 Replicate_Wild_Ignore_Table: 44 Last_Errno: 0 45 Last_Error: 46 Skip_Counter: 0 47 Exec_Master_Log_Pos: 704 48 Relay_Log_Space: 1123 49 Until_Condition: None 50 Until_Log_File: 51 Until_Log_Pos: 0 52 Master_SSL_Allowed: No 53 Master_SSL_CA_File: 54 Master_SSL_CA_Path: 55 Master_SSL_Cert: 56 Master_SSL_Cipher: 57 Master_SSL_Key: 58 Seconds_Behind_Master: 0 59 Master_SSL_Verify_Server_Cert: No 60 Last_IO_Errno: 0 61 Last_IO_Error: 62 Last_SQL_Errno: 0 63 Last_SQL_Error: 64 Replicate_Ignore_Server_Ids: 65 Master_Server_Id: 51 66 Master_UUID: 86ef6764-6e12-11eb-ac19-000c29ee4d81 67 Master_Info_File: /data/mysql_3306/master.info 68 SQL_Delay: 0 69 SQL_Remaining_Delay: NULL 70 Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates 71 Master_Retry_Count: 86400 72 Master_Bind: 73 Last_IO_Error_Timestamp: 74 Last_SQL_Error_Timestamp: 75 Master_SSL_Crl: 76 Master_SSL_Crlpath: 77 Retrieved_Gtid_Set: 86ef6764-6e12-11eb-ac19-000c29ee4d81:1-2 78 Executed_Gtid_Set: 86ef6764-6e12-11eb-ac19-000c29ee4d81:1-2 79 Auto_Position: 1 80 Replicate_Rewrite_DB: 81 Channel_Name: 82 Master_TLS_Version: 83 1 row in set (0.00 sec)
1.9 解除从库
1 stop slave; 2 reset slave all;
2.MHA环境准备
主库:10.0.0.51 从库:10.0.0.52,10.0.0.53
2.0 MHA 介绍
1 1.Master HA,对主节点进行监控,可实现自动故障转 移至其它从节点;通过提升某一从节点为新的主节点,基于主从复制实现,还需要客户端配合实现,目前MHA主要支持一主多从的架构,要搭建MHA,要求一个复制集群中必须最少有 三台数据库服务器,一主二从,即一台充当master,一台充当备用master,另外一台充当从库,如果财大气粗,也可以用一台专门的服务器来当MHA监控管理服务器 2 3 2.MHA工作原理 4 1 从宕机崩溃的master保存二进制日志事件(binlog events) 5 2 识别含有最新更新的slave 6 3 应用差异的中继日志(relay log)到其他的slave 7 4 应用从master保存的二进制日志事件(binlog events) 8 5 提升一个slave为新的master 9 6 使其他的slave连接新的master进行复制 10 11 注意:MHA需要基于ssh,key验证登入方法 12 13 14 3.MHA软件由两部分组成,Manager工具包和Node工具包,具体的说明如下。 15 1).Manager工具包主要包括以下几个工具: 16 masterha_check_ssh 检查MHA的SSH配置状况 17 masterha_check_repl 检查MySQL复制状况 18 masterha_manger 启动MHA 19 masterha_check_status 检测当前MHA运行状态 20 masterha_master_monitor 检测master是否宕机 21 masterha_master_switch 控制故障转移(自动或者手动) 22 masterha_conf_host 添加或删除配置的server信息 23 24 2).Node工具包(这些工具通常由MHA Manager的脚本触发,无需人为操作)主要包括以下几个工具: 25 save_binary_logs 保存和复制master的二进制日志 26 apply_diff_relay_logs 识别差异的中继日志事件并将其差异的事件应用于其他的slave 27 filter_mysqlbinlog 去除不必要的ROLLBACK事件(MHA已不再使用这个工具) 28 purge_relay_logs 清除中继日志(不会阻塞SQL线程) 29 30 自定义扩展: 31 secondary_check_script:通过多条网络路由检测master的可用性; 32 master_ip_failover_script:更新application使用的masterip; 33 report_script:发送报告; 34 init_conf_load_script:加载初始配置参数; 35 master_ip_online_change_script;更新master节点ip地址。
2.1 搭建关键的软链接(51,52,53都需要设置)
1 ln -s /opt/mysql-5.7.28/bin/mysqlbinlog /usr/bin/mysqlbinlog 2 ln -s /opt/mysql-5.7.28/bin/mysql /usr/bin/mysql
2.2 配置互信(不用秘钥可以登录ssh)
1 db01: 2 rm -rf /root/.ssh 3 ssh-keygen 4 cd /root/.ssh 5 mv id_rsa.pub authorized_keys 6 scp -r /root/.ssh 10.0.0.52:/root 7 scp -r /root/.ssh 10.0.0.53:/root 8 9 各节点验证:(ssh无秘钥登录) 10 db01: 11 ssh 10.0.0.51 date 登录并查看时间 12 ssh 10.0.0.52 date 13 ssh 10.0.0.53 date 14 db02: 15 ssh 10.0.0.51 date 16 ssh 10.0.0.52 date 17 ssh 10.0.0.53 date 18 db03: 19 ssh 10.0.0.51 date 20 ssh 10.0.0.52 date 21 ssh 10.0.0.53 date
2.3 各节点安装 node软件和依赖包
1 yum install perl-DBD-MySQL -y 2 rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm 3 4 [root@db01 ~00:50:55]# rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm 5 Preparing... ################################# [100%] 6 Updating / installing... 7 1:mha4mysql-node-0.56-0.el6 ################################# [100%]
2.4 在主库db01中创建mha需要的用户
1 grant all privileges on *.* to mha@'10.0.0.%' identified by '123456abcd'; 2 3 select user,host from mysql.user;
2.5 Manager软件安装(db03)
1 yum install -y perl-Config-Tiny epel-release perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes 2 rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm 3 4 [root@db03 ~00:53:36]# rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm 5 Preparing... ################################# [100%] 6 Updating / installing... 7 1:mha4mysql-manager-0.56-0.el6 ################################# [100%]
2.6 配置文件准备(db03)
1 # 创建配置文件目录 2 mkdir -p /etc/mha 3 # 创建日志目录 4 mkdir -p /var/log/mha/app1 5 # 编辑mha配置文件 6 cat > /etc/mha/app1.cnf <<EOF 7 [server default] 8 manager_log=/var/log/mha/app1/manager 9 manager_workdir=/var/log/mha/app1 10 master_binlog_dir=/binlog #主库binlog位置 11 user=mha #mha用户 12 password=123456abcd 13 ping_interval=2 #每隔2秒扫描一次 14 repl_password=123456abcd 15 repl_user=repl #主从复制用户 16 ssh_user=root #无秘钥使用用户 17 [server1] 18 hostname=10.0.0.51 19 port=3306 20 [server2] 21 hostname=10.0.0.52 22 port=3306 23 [server3] 24 hostname=10.0.0.53 25 port=3306 26 EOF
2.7 状态检查(db03)
1 1. masterha_check_ssh --conf=/etc/mha/app1.cnf 2 3 [root@db03 ~00:59:38]# masterha_check_ssh --conf=/etc/mha/app1.cnf 4 Sun Feb 14 00:59:41 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. 5 Sun Feb 14 00:59:41 2021 - [info] Reading application default configuration from /etc/mha/app1.cnf.. 6 Sun Feb 14 00:59:41 2021 - [info] Reading server configuration from /etc/mha/app1.cnf.. 7 Sun Feb 14 00:59:41 2021 - [info] Starting SSH connection tests.. 8 Sun Feb 14 00:59:42 2021 - [debug] 9 Sun Feb 14 00:59:41 2021 - [debug] Connecting via SSH from root@10.0.0.51(10.0.0.51:22) to root@10.0.0.52(10.0.0.52:22).. 10 Sun Feb 14 00:59:41 2021 - [debug] ok. 11 Sun Feb 14 00:59:41 2021 - [debug] Connecting via SSH from root@10.0.0.51(10.0.0.51:22) to root@10.0.0.53(10.0.0.53:22).. 12 Sun Feb 14 00:59:41 2021 - [debug] ok. 13 Sun Feb 14 00:59:42 2021 - [debug] 14 Sun Feb 14 00:59:41 2021 - [debug] Connecting via SSH from root@10.0.0.52(10.0.0.52:22) to root@10.0.0.51(10.0.0.51:22).. 15 Sun Feb 14 00:59:41 2021 - [debug] ok. 16 Sun Feb 14 00:59:41 2021 - [debug] Connecting via SSH from root@10.0.0.52(10.0.0.52:22) to root@10.0.0.53(10.0.0.53:22).. 17 Sun Feb 14 00:59:42 2021 - [debug] ok. 18 Sun Feb 14 00:59:43 2021 - [debug] 19 Sun Feb 14 00:59:42 2021 - [debug] Connecting via SSH from root@10.0.0.53(10.0.0.53:22) to root@10.0.0.51(10.0.0.51:22).. 20 Sun Feb 14 00:59:42 2021 - [debug] ok. 21 Sun Feb 14 00:59:42 2021 - [debug] Connecting via SSH from root@10.0.0.53(10.0.0.53:22) to root@10.0.0.52(10.0.0.52:22).. 22 Sun Feb 14 00:59:42 2021 - [debug] ok. 23 Sun Feb 14 00:59:43 2021 - [info] All SSH connection tests passed successfully. 24 25 26 2. masterha_check_repl --conf=/etc/mha/app1.cnf 27 Sun Feb 14 14:16:52 2021 - [info] Checking replication health on 10.0.0.52.. 28 Sun Feb 14 14:16:52 2021 - [info] ok. 29 Sun Feb 14 14:16:52 2021 - [info] Checking replication health on 10.0.0.53.. 30 Sun Feb 14 14:16:52 2021 - [info] ok. 31 Sun Feb 14 14:16:52 2021 - [warning] master_ip_failover_script is not defined. 32 Sun Feb 14 14:16:52 2021 - [warning] shutdown_script is not defined. 33 Sun Feb 14 14:16:52 2021 - [info] Got exit code 0 (Not master dead). 34 MySQL Replication Health is OK. 35 36 37 报错: 38 [root@db03 ~00:59:43]# masterha_check_repl --conf=/etc/mha/app1.cnf 39 Sun Feb 14 01:00:24 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. 40 Sun Feb 14 01:00:24 2021 - [info] Reading application default configuration from /etc/mha/app1.cnf.. 41 Sun Feb 14 01:00:24 2021 - [info] Reading server configuration from /etc/mha/app1.cnf.. 42 Sun Feb 14 01:00:24 2021 - [info] MHA::MasterMonitor version 0.56. 43 Sun Feb 14 01:00:25 2021 - [info] GTID failover mode = 1 44 Sun Feb 14 01:00:25 2021 - [info] Dead Servers: 45 Sun Feb 14 01:00:25 2021 - [info] Alive Servers: 46 Sun Feb 14 01:00:25 2021 - [info] 10.0.0.51(10.0.0.51:3306) 47 Sun Feb 14 01:00:25 2021 - [info] 10.0.0.52(10.0.0.52:3306) 48 Sun Feb 14 01:00:25 2021 - [info] 10.0.0.53(10.0.0.53:3306) 49 Sun Feb 14 01:00:25 2021 - [info] Alive Slaves: 50 Sun Feb 14 01:00:25 2021 - [info] 10.0.0.52(10.0.0.52:3306) Version=5.7.28 (oldest major version between slaves) log-bin:disabled 51 Sun Feb 14 01:00:25 2021 - [info] GTID ON 52 Sun Feb 14 01:00:25 2021 - [info] Replicating from 10.0.0.51(10.0.0.51:3306) 53 Sun Feb 14 01:00:25 2021 - [info] 10.0.0.53(10.0.0.53:3306) Version=5.7.28 (oldest major version between slaves) log-bin:disabled 54 Sun Feb 14 01:00:25 2021 - [info] GTID ON 55 Sun Feb 14 01:00:25 2021 - [info] Replicating from 10.0.0.51(10.0.0.51:3306) 56 Sun Feb 14 01:00:25 2021 - [info] Current Alive Master: 10.0.0.51(10.0.0.51:3306) 57 Sun Feb 14 01:00:25 2021 - [info] Checking slave configurations.. 58 Sun Feb 14 01:00:25 2021 - [info] read_only=1 is not set on slave 10.0.0.52(10.0.0.52:3306). 59 Sun Feb 14 01:00:25 2021 - [warning] log-bin is not set on slave 10.0.0.52(10.0.0.52:3306). This host cannot be a master. 60 Sun Feb 14 01:00:25 2021 - [info] read_only=1 is not set on slave 10.0.0.53(10.0.0.53:3306). 61 Sun Feb 14 01:00:25 2021 - [warning] log-bin is not set on slave 10.0.0.53(10.0.0.53:3306). This host cannot be a master. 62 Sun Feb 14 01:00:25 2021 - [info] Checking replication filtering settings.. 63 Sun Feb 14 01:00:25 2021 - [info] binlog_do_db= , binlog_ignore_db= 64 Sun Feb 14 01:00:25 2021 - [info] Replication filtering check ok. 65 Sun Feb 14 01:00:25 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln361] None of slaves can be master. Check failover configuration file or log-bin settings in my.cnf 66 Sun Feb 14 01:00:25 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations. at /usr/bin/masterha_check_repl line 48. 67 Sun Feb 14 01:00:25 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers. 68 Sun Feb 14 01:00:25 2021 - [info] Got exit code 1 (Not master dead). 69 70 SQL Replication Health is NOT OK! 71 72 73 解决问题: 74 [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln361] None of slaves can be master. Check failover configuration file or log-bin settings in my.cnf 75 Sun Feb 14 01:00:25 2021 - [info] 10.0.0.52(10.0.0.52:3306) Version=5.7.28 (oldest major version between slaves) log-bin:disabled 76 Sun Feb 14 01:00:25 2021 - [info] 10.0.0.53(10.0.0.53:3306) Version=5.7.28 (oldest major version between slaves) log-bin:disabled 77 78 以上说明log-bin处于关闭的状态,同时show variables like 'log_bin';可以看到log_bin处于OFF 79 mysql> show variables like 'log_bin'; 80 +---------------+-------+ 81 | Variable_name | Value | 82 +---------------+-------+ 83 | log_bin | OFF | 84 +---------------+-------+ 85 1 row in set (0.00 sec) 86 87 解决方法: 88 在db02,db03的mysql 配置档中写入log_bin=master-bin,必须写在server-id=[] 后面. 89 [root@db03 ~14:16:52]# cat /etc/my.cnf 90 [mysqld] 91 user=mysql 92 datadir=/data/mysql_3306 93 basedir=/opt/mysql/ 94 socket=/tmp/mysql.sock 95 port=3306 96 log_error=/var/log/mysql/mysql.err 97 server_id=53 98 log_bin=master-bin 99 autocommit=0 100 gti--mode=on 101 enforce-gtid-consistency=true 102 log-slave-updates=1 103 [mysql] 104 socket=/tmp/mysql.sock 105 [client] 106 socket=/tmp/mysql.sock 107 重新再测试下,问题已解决 108 [root@db03 ~14:16:52]# cat /etc/my.cnf 109 [mysqld] 110 user=mysql 111 datadir=/data/mysql_3306 112 basedir=/opt/mysql/ 113 socket=/tmp/mysql.sock 114 port=3306 115 log_error=/var/log/mysql/mysql.err 116 server_id=53 117 log_bin=master-bin 118 autocommit=0 119 gti--mode=on 120 enforce-gtid-consistency=true 121 log-slave-updates=1 122 [mysql] 123 socket=/tmp/mysql.sock 124 [client] 125 socket=/tmp/mysql.sock
2.8 开启MHA(db03)
1 nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null> /var/log/mha/app1/manager.log 2>&1 &
2.9 查看MHA状态(db03)
1 masterha_check_status --conf=/etc/mha/app1.cnf 2 3 [root@db03 ~14:30:54]# masterha_check_status --conf=/etc/mha/app1.cnf 4 app1 (pid:7808) is running(0:PING_OK), master:10.0.0.51
3.MHA 故障测试(主库宕机会自动选出新主库)
1 MHA Failover 过程原理: 2 高可用最擅长的是为我们解决物理损坏,就是说主库坏了可以很快从备份从库中筛选出新的主库,使数据正常运行. 3 1.启动Manager 4 调用masterha_manager脚本启动Manager程序 5 6 2.监控: 7 通过:masterha_master_monitor心跳检测脚本,数据库节点,主要监控主库.默认探测4次,每隔(ping_interval=2)秒,如果主库还没有心跳,认为主库宕机,进入failover过程. 8 9 3. 选主: 10 [server2] 11 hostname=172.25.83.2 12 port=3306 13 candidate_master=1 #设置为候选master,如果设置该参数以后,发生主从切换以后将会将此从库提升为主库,即使这个主库不是集群中事件最新的slave 14 check_repl_delay=0 #默认情况下如果一个slave落后master 100M的relay logs的话,MHA将不会选择该slave作为一个新的master,因为对于这个slave的恢复需要花费很长时间,通过设置check_repl_delay=0,MHA触发切换在选择一个新的master的时候将会忽略复制延时,这个参数对于设置了candidate_master=1的主机非常有用,因为这个候选主在切换的过程中一定是新的master 15 a. 优先级(主观),如果在节点配置时,加入了candidate_master=1 参数.如果备选主,日志量落后mater太多(后master 100M 的ralay_1ogs的话,也不会被选择成为新主master),也可以通过check_repl_delay=0,不检查日志落后的情景. 16 b. 日志量最接近主库 17 c. 日志量一样,按配置文件顺序来选择主库master. 18 19 4. 日志补偿: 20 a. ssh 能连接上,通过save_binary_logs立即保存缺失部分日志到从库(/var/tmp目录下)并恢复 21 b. ssh 连接不了,两个从库进行relay_log日志diff(apply_diff_relay_logs)差异补偿. 22 23 5. 主从身份切换,所有从库取消和原有主库的复制关系(stop slave;reset slave all),新主库与剩下从库构建主从复制. 24 25 6. 故障库自动被剔除集群(通过master_conf_host 从配置信息中去掉) 26 27 7. MHA是一次性的高可用,Failover 后,Manager 自动退出(需要手动再次加入集群) 28 以上是MHA的基础环境所有具备的功能. 29 不足的地方: 30 1.应该透明(IP地址自动切换) 31 2.数据补偿(数据不能丢失) 32 3.自动提醒 33 4.自愈功能(待开发) 34 思路:MHA + K8S +Operator 官方 , 8.0 MGR + Mysqlsh
4.MHA应用透明VIP功能
1 说明:只能同机房使用,无法跨机房跨网络 2 如果想用跨机房跨网络 可以用keepalived,需要用candidate_master=1和check_repl_delay=0来配合,防止VIP和主库不在一个节点上.
4.1配置参数(db03)
1 将master_ip_failover_script= /usr/local/bin/master_ip_failover写入到mha配置档中 2 3 [root@db03 ~00:27:23]# cat /etc/mha/app1.cnf 4 [server default] 5 master_ip_failover_script= /usr/local/bin/master_ip_failover 6 manager_log=/var/log/mha/app1/manager 7 manager_workdir=/var/log/mha/app1 8 master_binlog_dir=/binlog 9 password=123456abcd 10 ping_interval=2 11 repl_password=123456abcd 12 repl_user=repl 13 ssh_user=root 14 user=mha 15 16 [server1] 17 hostname=10.0.0.51 18 port=3306 19 20 [server2] 21 hostname=10.0.0.52 22 port=3306 23 24 [server3] 25 hostname=10.0.0.53 26 port=3306
4.2 修改脚本
1 vim /usr/local/bin/master_ip_failover 2 my $vip = '10.0.0.55/24'; 3 my $key = "1"; 4 my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip"; 5 my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down"; 6 7 a. 需要转换下中文字符 8 dos2unix /usr/local/bin/master_ip_failover 9 10 [root@db03 ~00:51:52]# dos2unix /usr/local/bin/master_ip_failover 11 dos2unix: converting file /usr/local/bin/master_ip_failover to Unix format ... 12 b. 赋予x权限 13 chmod +x /usr/local/bin/master_ip_failover
4.3 手动在主库master 中新建一个VIP.网卡名称一定要与配置文档中的一样
1 db01:10.0.0.51 操作 2 ifconfig eth0:1 10.0.0.55/24 3 4 [root@db01 ~01:02:14]# ifconfig eth0:1 10.0.0.55/24 5 [root@db01 ~01:02:38]# ip a 6 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 7 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 8 inet 127.0.0.1/8 scope host lo 9 valid_lft forever preferred_lft forever 10 inet6 ::1/128 scope host 11 valid_lft forever preferred_lft forever 12 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 13 link/ether 00:0c:29:ee:4d:81 brd ff:ff:ff:ff:ff:ff 14 inet 10.0.0.51/24 brd 10.0.0.255 scope global noprefixroute eth0 15 valid_lft forever preferred_lft forever 16 inet 10.0.0.55/24 brd 10.0.0.255 scope global secondary eth0:1 17 valid_lft forever preferred_lft forever 18 inet6 fe80::f2eb:f691:47ea:8c4e/64 scope link tentative noprefixroute dadfailed 19 valid_lft forever preferred_lft forever 20 inet6 fe80::967b:20ed:e702:f7c2/64 scope link tentative noprefixroute dadfailed 21 valid_lft forever preferred_lft forever 22 inet6 fe80::fda8:a693:1736:211/64 scope link noprefixroute 23 valid_lft forever preferred_lft forever
4.4 重启MHA(db03操作)
a. 先停止 masterha_stop --conf=/etc/mha/app1.cnf b. 再启动 nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null> /var/log/mha/app1/manager.log 2>&1 & c.查看下VIP是否还是在主库51上或者查看master是否为51 [root@db03 ~14:30:54]# masterha_check_status --conf=/etc/mha/app1.cnf app1 (pid:7808) is running(0:PING_OK), master:10.0.0.51
4.5 测试VIP漂移
1 1.将主库master mysql 停掉 2 systemctl stop mysqld 3 4 2.查看mha 日志 5 tail -f /var/log/mha/app1/manager 6 7 3.查看VIP IP已经漂移到52,说明已经将52做为新主库master. 8 9 4.主库51上查看主从复制,VIP 都已要没有了,同时53库上的/etc/mha/app1.cnf [server 1] 配置已经清除. 10 11 5.将51库的Mysql启动来,再做主从复制,主库为52.同时/etc/mha/app1.cnf [server 1] 增加进去. 12 change master to 13 master_host='10.0.0.52', 14 master_user='repl', 15 master_password='123456abcd', 16 MASTER_AUTO_POSITION=1; 17 再开启start slave; 18 查看show slave status\G 19 20 6.在主库52中执行show slave hosts;可以查看到有几台从库 21 mysql> show slave hosts; 22 +-----------+------+------+-----------+--------------------------------------+ 23 | Server_id | Host | Port | Master_id | Slave_UUID | 24 +-----------+------+------+-----------+--------------------------------------+ 25 | 53 | | 3306 | 51 | b03f4593-6e18-11eb-a543-000c291a88d9 | 26 | 52 | | 3306 | 51 | 6a0cfd41-6e18-11eb-a1b0-000c29d2d3c1 | 27 +-----------+------+------+-----------+--------------------------------------+ 28 2 rows in set (0.00 sec)
master_ip_failover(perl)脚本
1 cat /usr/local/bin/master_ip_failover 2 #!/usr/bin/env perl 3 use strict; 4 use warnings FATAL =>'all'; 5 6 use Getopt::Long; 7 8 my ( 9 $command, $ssh_user, $orig_master_host, $orig_master_ip, 10 $orig_master_port, $new_master_host, $new_master_ip, $new_master_port 11 ); 12 13 my $vip = '10.0.0.55/24'; # Virtual IP 14 my $key = "1"; 15 my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip"; 16 my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down"; 17 my $exit_code = 0; 18 19 GetOptions( 20 'command=s' => \$command, 21 'ssh_user=s' => \$ssh_user, 22 'orig_master_host=s' => \$orig_master_host, 23 'orig_master_ip=s' => \$orig_master_ip, 24 'orig_master_port=i' => \$orig_master_port, 25 'new_master_host=s' => \$new_master_host, 26 'new_master_ip=s' => \$new_master_ip, 27 'new_master_port=i' => \$new_master_port, 28 ); 29 30 exit &main(); 31 32 sub main { 33 34 #print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n"; 35 36 if ( $command eq "stop" || $command eq "stopssh" ) { 37 38 # $orig_master_host, $orig_master_ip, $orig_master_port are passed. 39 # If you manage master ip address at global catalog database, 40 # invalidate orig_master_ip here. 41 my $exit_code = 1; 42 eval { 43 print "\n\n\n***************************************************************\n"; 44 print "Disabling the VIP - $vip on old master: $orig_master_host\n"; 45 print "***************************************************************\n\n\n\n"; 46 &stop_vip(); 47 $exit_code = 0; 48 }; 49 if ($@) { 50 warn "Got Error: $@\n"; 51 exit $exit_code; 52 } 53 exit $exit_code; 54 } 55 elsif ( $command eq "start" ) { 56 57 # all arguments are passed. 58 # If you manage master ip address at global catalog database, 59 # activate new_master_ip here. 60 # You can also grant write access (create user, set read_only=0, etc) here. 61 my $exit_code = 10; 62 eval { 63 print "\n\n\n***************************************************************\n"; 64 print "Enabling the VIP - $vip on new master: $new_master_host \n"; 65 print "***************************************************************\n\n\n\n"; 66 &start_vip(); 67 $exit_code = 0; 68 }; 69 if ($@) { 70 warn $@; 71 exit $exit_code; 72 } 73 exit $exit_code; 74 } 75 elsif ( $command eq "status" ) { 76 print "Checking the Status of the script.. OK \n"; 77 `ssh $ssh_user\@$orig_master_host \" $ssh_start_vip \"`; 78 exit 0; 79 } 80 else { 81 &usage(); 82 exit 1; 83 } 84 } 85 86 # A simple system call that enable the VIP on the new master 87 sub start_vip() { 88 `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`; 89 } 90 # A simple system call that disable the VIP on the old_master 91 sub stop_vip() { 92 `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`; 93 } 94 95 sub usage { 96 print 97 "Usage: master_ip_failover –command=start|stop|stopssh|status –orig_master_host=host –orig_master_ip=ip –orig_master_port=po 98 rt –new_master_host=host –new_master_ip=ip –new_master_port=port\n"; 99 }
5.binlog server (数据补偿)
1 db03 主机操作 2 数据补偿,尽可能不丢失数据 3 有专门一台binlog server时时拉取Master 的Mysql-binlog,如果主master 宕机了,从库直接从binlog server读取数据. 4 缺点:可能会拉低master 性能,所以尽能将binlog server装在ssd上.
5.1 参数设置
1 vim /etc/mha/app1.cnf 2 [binlog1] 3 no_master=1 #不参与选主 4 hostname=10.0.0.53 #binlog server 主机Ip 5 master_binlog_dir=/data/mysql/binlog #这个目录需要与master_binlog_dir=/binlog 目录不一样.
5.2 创建对应目录
1 mkdir -p /data/mysql/binlog 2 chown -R mysql.mysql /data/*
5.3 拉取主库日志
1 必须先进入到binlog目录中 2 cd /data/mysql/binlog 3 mysqlbinlog -R --host=10.0.0.51 --user=mha --password=123456abcd --raw --stop-never mysql-bin.000001 & 4 mysql-bin.000001: 5 这个日志是为了测试用,如果生产环境中,因为有很多日志也有可能没有这个日志,所以需要用show master status;来查看下当前日志,或者flush logs;重新刷新一个新的日志起点. 6 mysql> show master status; 7 +------------------+----------+--------------+------------------+------------------------------------------+ 8 | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | 9 +------------------+----------+--------------+------------------+------------------------------------------+ 10 | mysql-bin.000006 | 194 | | | 86ef6764-6e12-11eb-ac19-000c29ee4d81:1-9 | 11 +------------------+----------+--------------+------------------+------------------------------------------+ 12 1 row in set (0.00 sec) 13 14 [root@db03 /data/mysql/binlog00:26:46]# ps -ef |grep mysqlbinlog 15 root 13131 7587 0 00:24 pts/0 00:00:00 mysqlbinlog -R --host=10.0.0.51 --user=mha --password=x xxxxxxxx --raw --stop-never mysql-bin.000001 16 root 13318 7587 0 00:29 pts/0 00:00:00 grep --color=auto mysqlbinlog 17 18 [root@db03 /data/mysql/binlog00:20:15]# ll 19 total 24 20 -rw-r----- 1 root root 177 Feb 16 00:19 mysql-bin.000001 21 -rw-r----- 1 root root 177 Feb 16 00:19 mysql-bin.000002 22 -rw-r----- 1 root root 177 Feb 16 00:19 mysql-bin.000003 23 -rw-r----- 1 root root 1940 Feb 16 00:19 mysql-bin.000004 24 -rw-r----- 1 root root 217 Feb 16 00:19 mysql-bin.000005 25 -rw-r----- 1 root root 194 Feb 16 00:19 mysql-bin.000006 26 27 [root@db01 /binlog00:20:39]# ll 28 total 28 29 -rw-r----- 1 mysql mysql 177 Feb 13 23:41 mysql-bin.000001 30 -rw-r----- 1 mysql mysql 177 Feb 13 23:45 mysql-bin.000002 31 -rw-r----- 1 mysql mysql 177 Feb 13 23:46 mysql-bin.000003 32 -rw-r----- 1 mysql mysql 1940 Feb 14 01:41 mysql-bin.000004 33 -rw-r----- 1 mysql mysql 217 Feb 14 14:13 mysql-bin.000005 34 -rw-r----- 1 mysql mysql 194 Feb 15 21:47 mysql-bin.000006 35 -rw-r----- 1 mysql mysql 150 Feb 15 21:47 mysql-bin.index 36 37 在主库中先查看日志 38 show master status; 39 再刷新下log 40 flush logs; 41 查看从库53就会有一个新的binlog日志
5.4 重启MHA
1 a. 先停止 2 masterha_stop --conf=/etc/mha/app1.cnf 3 4 b. 再启动 5 nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null> /var/log/mha/app1/manager.log 2>&1 & 6 7 c.查看下VIP是否还是在主库51上或者查看master是否为51 8 [root@db03 ~14:30:54]# masterha_check_status --conf=/etc/mha/app1.cnf 9 app1 (pid:7808) is running(0:PING_OK), master:10.0.0.51
6.发送邮件提醒
6.1 参数设置(db03)
1 report_script=/usr/local/bin/sendEmail-v1.56/sendemail.sh 2 写入配置文档中db03 3 [root@db03 ~20:54:53]# cat /etc/mha/app1.cnf 4 [server default] 5 manager_log=/var/log/mha/app1/manager 6 manager_workdir=/var/log/mha/app1 7 master_binlog_dir=/binlog 8 master_ip_failover_script=/usr/local/bin/master_ip_failover 9 report_script=/usr/local/bin/sendEmail-v1.56/sendemail.sh 10 password=123456abcd 11 ping_interval=2 12 repl_password=123456abcd 13 repl_user=repl 14 ssh_user=root 15 user=mha 16 17 [server1] 18 hostname=10.0.0.51 19 port=3306 20 21 [server2] 22 hostname=10.0.0.52 23 port=3306 24 25 [server3] 26 hostname=10.0.0.53 27 port=3306 28 29 [binlog1] 30 no_master=1 31 hostname=10.0.0.53 32 master_binlog_dir=/data/mysql/binlog
6.2 解压sendEmail 软件
1 解压软件到指定位置 2 tar xf sendEmail-v1.56.tar.gz -C /usr/local/bin/ 3 4 [root@db03 /usr/local/bin/sendEmail-v1.5621:30:31]# ll 5 -rw-r--r-- 1 root root 13329 Sep 30 2009 CHANGELOG 6 -rw-r--r-- 1 root root 7949 Sep 30 2009 README 7 -rw-r--r-- 1 root root 4701 Sep 30 2009 README-BR.txt 8 -rwxr-xr-x 1 root root 80213 Sep 30 2009 sendEmail 9 lrwxrwxrwx 1 root root 9 Sep 30 2009 sendEmail.pl -> sendEmail 10 -rwxr-xr-x 1 root root 280 Feb 26 19:38 sendemail.sh 11 -rw-r--r-- 1 root root 1229 Sep 30 2009 TODO 12 13 report_script=/usr/local/bin/sendEmail-v1.56/sendemail.sh 14 解释:MHA 最后会读取report_script里的脚本,其实就是调用sendemail.sh脚本,而sendemail.sh 脚本又是调用sendEmail 命令来发送邮件.
6.3 发邮件脚本测试
1 #1.下载一个sendEmail-v1.56.tar.gz 2 3 #2. 编写一个测试脚本./sendemail.sh 看看测试邮件是否可以发出.确认没有问题. 4 vim sendemail.sh 5 #!/bin/bash 6 /usr/local/bin/sendEmail-v1.56/sendEmail -f "friend_lss@163.com" -t "282474670@qq.com" -s "smtp.163.com" -u "myTest" -o tls=no -o message-content-type=html -o message-charset=utf-8 -xu "friend_lss@163.com" -xp "OTUWLQOUFMHVKFRL" -m "MHA 出现问题,请及时处理" 7 8 #3. 直接用命令方式 9 sendEmail -f "friend_lss@163.com" -t "282474670@qq.com" -s "smtp.163.com" -u "myTest" -o tls=no -o message-content-type=html -o message-charset=utf-8 -xu "friend_lss@163.com" -xp "OTUWLQOUFMHVKFRL" -m "MHA 出现问题,请及时处理" 10 11 #4. sendEmail 参数解释 12 -f linuxyw@163.com是发件邮箱,必须是真实的,要不发不出去的,与sendmail不一样 13 -t 63780668@qq.com linuxyw@163.com 这二个邮件是接收邮件的邮箱地址,可以是一个,也可以是多个,用空格隔开就行,实现邮件群发 14 -s smtp.163.com 这是163邮箱的smtp地址,如果用其它企业邮箱,必须指定smtp地址,要不发不出邮件 15 -u hello hello是邮件主题 16 -xu linuxyw 发件邮件的登陆用户名,必须是真实的,否则系统登陆不上去,发送不了邮件 17 -xp 123456pass 发件邮箱的登陆用户密码,必须是真实的,否则系统登陆不上去,发送不了邮件 18 -m www.linuxyw.com 这里的[www.linuxyw.com](http://www.linuxyw.com/)是邮件的内容,你可以输入你所要写的邮件内容 19 20 #5. 运行脚本报错处理 21 ================================================================================== 22 运行脚本报错: 23 [root@db03 /usr/local/bin/sendEmail-v1.5623:20:19]# ./sendemail.sh 24 25 ******************************************************************* 26 27 Using the default of SSL_verify_mode of SSL_VERIFY_NONE for client 28 is deprecated! Please set SSL_verify_mode to SSL_VERIFY_PEER 29 possibly with SSL_ca_file|SSL_ca_path for verification. 30 If you really don't want to verify the certificate and keep the 31 connection open to Man-In-The-Middle attacks please set 32 SSL_verify_mode explicitly to SSL_VERIFY_NONE in your application. 33 34 ******************************************************************* 35 36 at /usr/local/bin/sendEmail-v1.56/sendEmail line 1906. 37 invalid SSL_version specified at /usr/share/perl5/vendor_perl/IO/Socket/SSL.pm line 444. 38 ================================================================================================ 39 #处理方法: 40 在centos7中,perl版本是5.16,要降级到5.10.但是降级这个perl,依赖关系很多。经过反复搜索,发现有如下解决办法: 41 a、卸载现有的perl版本,安装老版本(不推荐) 42 b、增加参数-o tls=no 选项 43 [root@zabbix02 sendEmail-v1.56]# sendEmail -f username@163.com -t username@qq.com -s smtp.163.com -u "I am zabbix" -o tls=no -o message-content-type=html -o message-charset=utf8 -xu username@163.com -xp password -m "hello zabbix" 44 Feb 16 15:43:04 zabbix02 sendEmail[18761]: Email was sent successfully!
6.4 测试邮件是否OK
1 a. 主库Master 执行systemctl stop mysql, 使主库宕机 2 3 b. 查看是否有邮件提醒 4 或者看邮件日志 5 6 c. 查看VIP 是否转移 7 8 ============================================================== 9 ----- Failover Report ----- 10 11 app1: MySQL Master failover 10.0.0.51(10.0.0.51:3306) to 10.0.0.52(10.0.0.52:3306) succeeded 12 13 Master 10.0.0.51(10.0.0.51:3306) is down! 14 15 Check MHA Manager logs at db03:/var/log/mha/app1/manager for details. 16 17 Started automated(non-interactive) failover. 18 Invalidated master IP address on 10.0.0.51(10.0.0.51:3306) 19 Selected 10.0.0.52(10.0.0.52:3306) as a new master. 20 10.0.0.52(10.0.0.52:3306): OK: Applying all logs succeeded. 21 10.0.0.52(10.0.0.52:3306): OK: Activated master IP address. 22 10.0.0.53(10.0.0.53:3306): OK: Slave started, replicating from 10.0.0.52(10.0.0.52:3306) 23 10.0.0.52(10.0.0.52:3306): Resetting slave info succeeded. 24 Master failover to 10.0.0.52(10.0.0.52:3306) completed successfully. 25 Fri Feb 26 19:49:45 2021 - [info] Sending mail.. 26 Feb 26 19:49:47 db03 sendEmail[7962]: Email was sent successfully!
7. MHA故障排除思路
1 如果主库宕机了,思路如下 2 1. 查看进程 3 ps -ef |grep manager 4 masterha_check_status --conf=/etc/mha/app1.cnf 5 6 2. 检查配置文档节点 7 vim /etc/mha/app1.conf 8 如果节点已经被移除了,说明切换过程已经大部分成功 9 如果世点还在,说明切换过程卡在中间 10 11 3. 看日志 12 tail -f /var/log/mha/app1/manager 13 14 4. 修复故障库,把节点修复好 15 systemctl start mysqld 16 /etc/init.d/mysqld start 17 5. 修复主从 18 将故障库修好后手工加入已有的主从中,做为从库 19 change master to 20 master_host='10.0.0.52', 21 master_user='repl', 22 master_password='123456abcd', 23 MASTER_AUTO_POSITION=1; 24 start slave; 25 26 6. 修复配置文档(配置文档如果还在,则省略) 27 将被移除节点配置重新写入配置中 28 [server1] 29 hostname=10.0.0.51 30 port=3306 31 32 7. 检查SSH 互信和Repl 主从关系 33 masterha_check_ssh --conf=/etc/mha/app1.cnf 34 masterha_check_repl --conf=/etc/mha/app1.cnf 35 如有报错,解决报错后再检查 36 37 8. 修复binlogserver (主库宕机了,binlogserver 也会停掉) 38 [root@db03 ~14:24:23]# cd /data/mysql/binlog/ 39 [root@db03 /data/mysql/binlog14:24:31]# rm -rf ./* #删除已有的binlog日志 40 拉取新主库binlog信息 (新主库IP地址写正确) 41 cd /data/mysql/binlog 42 mysqlbinlog -R --host=10.0.0.52 --user=mha --password=123456abcd --raw --stop-never mysql-bin.000001 & 43 44 9. 检查节点VIP 的状态 45 如果不在,再的手工生成 46 47 10. 启动MHA 48 nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null> /var/log/mha/app1/manager.log 2>&1 & 49 50 11. 查看MHA 状态 51 [root@db03 ~14:30:54]# masterha_check_status --conf=/etc/mha/app1.cnf 52 app1 (pid:7808) is running(0:PING_OK), master:10.0.0.52
8. 读写分离 Atlas
8.1 Atlas 安装(db03)
1 rpm -ivh Atlas-2.2.1.el6.x86_64.rpm
8.2 配置Atlas
1 cd /usr/local/mysql-proxy/conf 2 同有一个test.cnf 配置文档,先做个备份一份 3 [root@db03 /usr/local/mysql-proxy/conf20:59:31]# cat test.cnf 4 [mysql-proxy] 5 #带#号的为非必需的配置项目 6 #管理接口的用户名 7 admin-username = user 8 #管理接口的密码 9 admin-password = pwd 10 #Atlas后端连接的MySQL主库的IP和端口,可设置多项,用逗号分隔 11 proxy-backend-addresses =10.0.0.55:3306 12 #Atlas后端连接的MySQL从库的IP和端口,@后面的数字代表权重,用来作负载均衡,若省略则默认为1,可设置多项,用逗号分隔 13 proxy-read-only-backend-addresses = 10.0.0.52:3306,10.0.0.53:3306 14 #用户名与其对应的加密过的MySQL密码,密码使用PREFIX/bin目录下的加密程序encrypt加密,下行的user1和user2为示例,将其替换为你的MySQL的用户名和加密密码! 15 #[root@db03 /usr/local/mysql-proxy/bin22:25:52]# ./encrypt repl:123456abcd 16 #rdCkrCmLhbFMY6YuvYplDg== 17 #[root@db03 /usr/local/mysql-proxy/bin22:26:06]# ./encrypt 123456abcd 18 #tyElBOCykN3FCr2uMcG5AQ== 19 #[root@db03 /usr/local/mysql-proxy/bin22:26:28]# ./encrypt mha 20 #O2jBXONX098= 21 pwds = repl:tyElBOCykN3FCr2uMcG5AQ==, mha:tyElBOCykN3FCr2uMcG5AQ== 22 #设置Atlas的运行方式,设为true时为守护进程方式,设为false时为前台方式,一般开发调试时设为false,线上运行时设为true,true后面不能有空格。 23 daemon = true 24 #设置Atlas的运行方式,设为true时Atlas会启动两个进程,一个为monitor,一个为worker,monitor在worker意外退出后会自动将其重启,设为false时只有worker,没有monitor,一般开发调试时设为false,线上运行时设为true,true后面不能有空格。 25 keepalive = true 26 #工作线程数,对Atlas的性能有很大影响,可根据情况适当设置 27 event-threads = 8 28 #日志级别,分为message、warning、critical、error、debug五个级别 29 log-level = message 30 #日志存放的路径 31 log-path = /usr/local/mysql-proxy/log 32 #SQL日志的开关,可设置为OFF、ON、REALTIME,OFF代表不记录SQL日志,ON代表记录SQL日志,REALTIME代表记录SQL日志且实时写入磁盘,默认为OFF 33 sql-log = ON 34 #Atlas监听的工作接口IP和端口 35 proxy-address = 0.0.0.0:33060 36 #Atlas监听的管理接口IP和端口 37 admin-address = 0.0.0.0:2345 38 #默认字符集,设置该项后客户端不再需要执行SET NAMES语句 39 charset = utf8
8.3. 启动Atlas
1 启动: 2 /usr/local/mysql-proxy/bin/mysql-proxyd test start 3 重起: 4 /usr/local/mysql-proxy/bin/mysql-proxyd test restart 5 [root@db03 /usr/local/mysql-proxy/conf21:44:42]# /usr/local/mysql-proxy/bin/mysql-proxyd test restart 6 OK: MySQL-Proxy of test is stopped 7 OK: MySQL-Proxy of test is started 8 9 [root@db03 /usr/local/mysql-proxy/conf21:03:27]# /usr/local/mysql-proxy/bin/mysql-proxyd test start 10 OK: MySQL-Proxy of test is started 11 12 确认是否启动 13 ps -ef | grep proxy 14 15 [root@db03 /usr/local/mysql-proxy/conf21:05:02]# ps -ef | grep proxy 16 root 6281 1 0 12:54 ? 00:00:00 /usr/sbin/gssproxy -D 17 root 20514 1 0 21:05 ? 00:00:00 /usr/local/mysql-proxy/bin/mysql-proxy --defaults-file=/usr/local/mysql-proxy/conf/test.cnf 18 root 20515 20514 0 21:05 ? 00:00:00 /usr/local/mysql-proxy/bin/mysql-proxy --defaults-file=/usr/local/mysql-proxy/conf/test.cnf 19 root 20560 7774 0 21:06 pts/1 00:00:00 grep --color=auto proxy 20 21 查看端口 22 [root@db03 /usr/local/mysql-proxy/conf21:45:03]# netstat -lntup |grep proxy 23 tcp 0 0 0.0.0.0:2345 0.0.0.0:* LISTEN 21790/mysql-proxy 24 tcp 0 0 0.0.0.0:33060 0.0.0.0:* LISTEN 21790/mysql-proxy
8.4 Atlas 配置说明
1 Atlas运行需要依赖一个配置文件(test.cnf)。在运行Atlas之前,需要对该文件进行配置。Atlas的安装目录是/usr/local/mysql-proxy,进入安装目录下的conf目录,可以看到已经有一个名为test.cnf的默认配置文件,我们只需要修改里面的某些配置项,不需要从头写一个配置文件。 2 3 配置范例及说明如下: 4 [mysql-proxy] 5 6 (必备,默认值即可)管理接口的用户名 7 8 admin-username = user 9 10 (必备,默认值即可)管理接口的密码 11 12 admin-password = pwd 13 14 (必备,根据实际情况配置)主库的IP和端口 15 16 proxy-backend-addresses = 192.168.0.12:3306 17 18 (非必备,根据实际情况配置)从库的IP和端口,@后面的数字代表权重,用来作负载均衡,若省略则默认为1,可设置多项,用逗号分隔。如果想让主库也能分担读请求的话,只需要将主库信息加入到下面的配置项中。 19 20 proxy-read-only-backend-addresses = 192.168.0.13:3306,192.168.0.14:3306 21 22 (必备,根据实际情况配置)用户名与其对应的加密过的MySQL密码,密码使用PREFIX/bin目录下的加密程序encrypt加密,用户名与密码之间用冒号分隔。主从数据库上需要先创建该用户并设置密码(用户名和密码在主从数据库上要一致)。比如用户名为myuser,密码为mypwd,执行./encrypt mypwd结果为HJBoxfRsjeI=。如果有多个用户用逗号分隔即可。则设置如下行所示: 23 24 pwds = repl: HJBoxfRsjeI=,mha:HJBoxfRsjeI= 25 26 (必备,默认值即可)Atlas的运行方式,设为true时为守护进程方式,设为false时为前台方式,一般开发调试时设为false,线上运行时设为true 27 28 daemon = true 29 30 (必备,默认值即可)设置Atlas的运行方式,设为true时Atlas会启动两个进程,一个为monitor,一个为worker,monitor在worker意外退出后会自动将其重启,设为false时只有worker,没有monitor,一般开发调试时设为false,线上运行时设为true 31 32 keepalive = true 33 34 (必备,根据实际情况配置)工作线程数,推荐设置成系统的CPU核数的2至4倍 35 36 event-threads = 4 37 38 (必备,默认值即可)日志级别,分为message、warning、critical、error、debug五个级别 39 40 log-level = message 41 42 (必备,默认值即可)日志存放的路径 43 44 log-path = /usr/local/mysql-proxy/log 45 46 (必备,根据实际情况配置)SQL日志的开关,可设置为OFF、ON、REALTIME,OFF代表不记录SQL日志,ON代表记录SQL日志,该模式下日志刷新是基于缓冲区的,当日志填满缓冲区后,才将日志信息刷到磁盘。REALTIME用于调试,代表记录SQL日志且实时写入磁盘,默认为OFF 47 48 sql-log = OFF 49 50 (可选项,可不设置)慢日志输出设置。当设置了该参数时,则日志只输出执行时间超过sql-log-slow(单位:ms)的日志记录。不设置该参数则输出全部日志。 51 52 sql-log-slow = 10 53 54 (可选项,可不设置)关闭不活跃的客户端连接设置。当设置了该参数时,Atlas会主动关闭经过'wait-timeout'时间后一直未活跃的连接。单位:秒 55 56 wait-timeout = 10 57 58 (必备,默认值即可)Atlas监听的工作接口IP和端口 59 60 proxy-address = 0.0.0.0:1234 61 62 (必备,默认值即可)Atlas监听的管理接口IP和端口 admin-address = 0.0.0.0:2345 63 64 (可选项,可不设置)分表设置,此例中person为库名,mt为表名,id为分表字段,3为子表数量,可设置多项,以逗号分隔,若不分表则不需要设置该项,子表需要事先建好,子表名称为表名_数字,数字范围为[0,子表数-1],如本例里,子表名称为mt_0、mt_1、mt_2 65 66 tables = person.mt.id.3 67 68 (可选项,可不设置)默认字符集,若不设置该项,则默认字符集为latin1 69 70 charset = utf8 71 72 (可选项,可不设置)允许连接Atlas的客户端的IP,可以是精确IP,也可以是IP段,以逗号分隔,若不设置该项则允许所有IP连接,否则只允许列表中的IP连接 73 74 client-ips = 127.0.0.1, 192.168.1 75 76 (可选项,极少需要)Atlas前面挂接的LVS的物理网卡的IP(注意不是虚IP),若有LVS且设置了client-ips则此项必须设置,否则可以不设置 77 78 lvs-ips = 192.168.1.1 79 80 2. 重要配置说明 81 以下几项配置参数对性能和正常运行起到重要作用,需要正确设置。 82 83 (1)线程数 84 85 event-threads项设置,过小无法充分发挥多核CPU的性能,过大造成不必要的线程切换开销,推荐设置为CPU的核数。 86 87 (2)最小空闲连接数(2.x以上版本不需要该项,1.x版本需要) 88 89 min-idle-connections项设置,过小则在高并发下会有报错,过大虽然不报错但在测试时不容易看出读写分离效果,推荐设置为比客户端的并发峰值稍大,详见《配置参数详解》。上面的配置范例是针对Atlas 2.X版本,没有该选项。对于Atlas 1.X版本的配置文件,需要加入该配置选项。 90 91 3. 可选配置说明 92 以下几项可以设置,也可以使用默认值,区别不大。 93 94 (1)Atlas的工作端口 95 96 proxy-address项配置,例如proxy-address = 0.0.0.0:1234代表客户端应该使用1234这个端口连接Atlas来发送SQL请求。 97 98 (2)Atlas的管理端口 99 100 admin-address项配置,例如admin-address = 0.0.0.0:2345代表DBA应该使用2345这个端口连接Atlas来执行运维管理操作。 101 102 (3)管理接口的用户名和密码 103 104 admin-username项和admin-password项设置,这两项是用来进入Atlas的管理界面的,与后端连接的MySQL没有关系,所以可以任意设置,不需要MySQL在配置上做任何改动。 105 106 (4)日志级别 107 108 以log-level项配置,分为message、warning、critical、error、debug五个级别。 109 110 (5)日志路径 111 112 以log-path项配置,如log-path = /usr/local/mysql-proxy/log。 113 114 [mysql-proxy] 115 116 管理接口的用户名 117 admin-username = user 118 管理接口的密码 119 admin-password = pwd 120 Atlas后端连接的MySQL主库的IP和端口,可设置多项,用逗号分隔 121 proxy-backend-addresses = 192.168.1.1:3306 122 从库 123 proxy-read-only-backend-addresses = 192.168.1.2:3306@1 124 用户名和密码配置项,需要和主从复制配置的用户名和密码配置一样 125 r1:+jKsgB3YAG8=, user2:GS+tr4TPgqc= 126 后台运行 127 daemon = true keepalive = false 128 工作线程数,对Atlas的性能有很大影响,可根据情况适当设置 129 event-threads = 4 130 日志级别,分为message、warning、critical、error、debug五个级别 131 log-level = error 132 日志存放的路径 133 log-path = ./log 134 SQL日志的开关,可设置为OFF、ON、REALTIME,OFF代表不记录SQL日志,ON代表记录SQL日志,REALTIME代表记录SQL日>志且实时写入磁盘,默认为OFF 135 sql-log = OFF 136 慢日志输出设置。当设置了该参数时,则日志只输出执行时间超过sql-log-slow(单位:ms)的日志>记录。不设置该参数则输出全部日志。 137 sql-log-slow = 1000 138 实例名称,用于同一台机器上多个Atlas实例间的区分 139 instance = web 140 Atlas监听的工作接口IP和端口 141 proxy-address = 0.0.0.0:13470 142 Atlas监听的管理接口IP和端口 143 admin-address = 0.0.0.0:23470 144 分表设置,此例中person为库名,mt为表名,id为分表字段,3为子表数量,可设置多项,以逗号分>隔,若不分表则不需要设置该项 145 tables = person.mt.id.3 146 默认字符集,设置该项后客户端不再需要执行SET NAMES语句 147 charset = utf8 148 允许连接Atlas的客户端的IP,可以是精确IP,也可以是IP段,以逗号分隔,若不设置该项则允许所>有IP连接,否则只允许列表中的IP连接 149 client-ips = 127.0.0.1, 192.168.1 150 Atlas前面挂接的LVS的物理网卡的IP(注意不是虚IP),若有LVS且设置了client-ips则此项必须设置>,否则可以不设置 151 lvs-ips = 192.168.1.1
8.5 测试 读写分离功能(那台机器装了Atlas就用哪台)
1 db03 操作:用mha 用户登录 2 3 1. mysql -umha -p123456abcd -h 10.0.0.53 -P33060 4 5 [root@db03 /usr/local/mysql-proxy/conf22:42:49]# mysql -umha -p123456abcd -h 10.0.0.53 -P33060 6 mysql: [Warning] Using a password on the command line interface can be insecure. 7 Welcome to the MySQL monitor. Commands end with ; or \g. 8 Your MySQL connection id is 1 9 Server version: 5.0.81-log MySQL Community Server (GPL) 10 11 Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved. 12 13 Oracle is a registered trademark of Oracle Corporation and/or its 14 affiliates. Other names may be trademarks of their respective 15 owners. 16 17 Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. 18 19 报错: 20 登录密码有问题 21 22 解决问题: 23 Atlas 配置文档中的 pwds = repl:tyElBOCykN3FCr2uMcG5AQ==, mha:tyElBOCykN3FCr2uMcG5AQ==,这个密码必须是加密过的,要不然是无法识别.加密方式为/usr/local/mysql-proxy/bin/下的encrypt 文件加密 24 #用户名与其对应的加密过的MySQL密码,密码使用PREFIX/bin目录下的加密程序encrypt加密,下行的user1和user2为示例,将其替换为你的MySQL的用户名和加密密码! 25 #[root@db03 /usr/local/mysql-proxy/bin22:26:06]# ./encrypt 123456abcd 26 #tyElBOCykN3FCr2uMcG5AQ== 27 #[root@db03 /usr/local/mysql-proxy/bin22:26:28]# ./encrypt mha 28 #O2jBXONX098= 29 pwds = repl:tyElBOCykN3FCr2uMcG5AQ==, mha:tyElBOCykN3FCr2uMcG5AQ== 30 31 2. 登录后测试读写分离:设置51,53为从库,52为主库 32 读操作: 33 select @@server_id; 可以看到51,53一直轮循. 34 mysql> select @@server_id; 35 +-------------+ 36 | @@server_id | 37 +-------------+ 38 | 53 | 39 +-------------+ 40 1 row in set (0.00 sec) 41 42 mysql> select @@server_id; 43 +-------------+ 44 | @@server_id | 45 +-------------+ 46 | 51 | 47 +-------------+ 48 1 row in set (0.00 sec) 49 50 mysql> select @@server_id; 51 +-------------+ 52 | @@server_id | 53 +-------------+ 54 | 53 | 55 +-------------+ 56 1 row in set (0.00 sec) 57 58 mysql> select @@server_id; 59 +-------------+ 60 | @@server_id | 61 +-------------+ 62 | 51 | 63 +-------------+ 64 1 row in set (0.00 sec) 65 66 写操作: 只会写在db02 67 begin;select @@server_id;commit; 68 69 mysql> begin;select @@server_id;commit; 70 Query OK, 0 rows affected (0.01 sec) 71 72 +-------------+ 73 | @@server_id | 74 +-------------+ 75 | 52 | 76 +-------------+ 77 1 row in set (0.00 sec) 78 79 Query OK, 0 rows affected (0.00 sec) 80 81 mysql> begin;select @@server_id;commit; 82 Query OK, 0 rows affected (0.00 sec) 83 84 +-------------+ 85 | @@server_id | 86 +-------------+ 87 | 52 | 88 +-------------+ 89 1 row in set (0.00 sec) 90 91 Query OK, 0 rows affected (0.01 sec)
8.6 Atlas 管理(db03)
a. 登录Atlas 登录Atlas 界面,用户名: user 密码:pwd 端口号:2345 mysql -uuser -ppwd -h 10.0.0.53 -P2345 [root@db03 /usr/local/mysql-proxy/conf23:01:54]# mysql -uuser -ppwd -h 10.0.0.53 -P2345 mysql: [Warning] Using a password on the command line interface can be insecure. Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 1 Server version: 5.0.99-agent-admin Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. b. 查看Atlas 所有管理命令 mysql> select * from help; +----------------------------+---------------------------------------------------------+ | command | description | +----------------------------+---------------------------------------------------------+ | SELECT * FROM help | shows this help | | SELECT * FROM backends | lists the backends and their state | | SET OFFLINE $backend_id | offline backend server, $backend_id is backend_ndx's id | | SET ONLINE $backend_id | online backend server, ... | | ADD MASTER $backend | example: "add master 127.0.0.1:3306", ... | | ADD SLAVE $backend | example: "add slave 127.0.0.1:3306", ... | | REMOVE BACKEND $backend_id | example: "remove backend 1", ... | | SELECT * FROM clients | lists the clients | | ADD CLIENT $client | example: "add client 192.168.1.2", ... | | REMOVE CLIENT $client | example: "remove client 192.168.1.2", ... | | SELECT * FROM pwds | lists the pwds | | ADD PWD $pwd | example: "add pwd user:raw_password", ... | | ADD ENPWD $pwd | example: "add enpwd user:encrypted_password", ... | | REMOVE PWD $pwd | example: "remove pwd user", ... | | SAVE CONFIG | save the backends to config file | | SELECT VERSION | display the version of Atlas | +----------------------------+---------------------------------------------------------+ c. 管理命令 查看所有帮助 SELECT * FROM help 查看后端节点状态 SELECT * FROM backends 查看数据库节点状态 SELECT * FROM backends mysql> SELECT * FROM backends; +-------------+----------------+-------+------+ | backend_ndx | address | state | type | +-------------+----------------+-------+------+ | 1 | 10.0.0.55:3306 | up | rw | | 2 | 10.0.0.51:3306 | up | ro | | 3 | 10.0.0.53:3306 | up | ro | +-------------+----------------+-------+------+ 3 rows in set (0.00 sec) 上线或下线节点(对某节点需要做升级或者数据分析等操作时) SET OFFLINE $backend_id SET ONLINE $backend_id mysql> SET OFFLINE 2; +-------------+----------------+---------+------+ | backend_ndx | address | state | type | +-------------+----------------+---------+------+ | 2 | 10.0.0.51:3306 | offline | ro | +-------------+----------------+---------+------+ 1 row in set (0.00 sec) mysql> SET ONLINE 2; +-------------+----------------+---------+------+ | backend_ndx | address | state | type | +-------------+----------------+---------+------+ | 2 | 10.0.0.51:3306 | unknown | ro | +-------------+----------------+---------+------+ 1 row in set (0.00 sec) mysql> SELECT * FROM backends; +-------------+----------------+-------+------+ | backend_ndx | address | state | type | +-------------+----------------+-------+------+ | 1 | 10.0.0.55:3306 | up | rw | | 2 | 10.0.0.51:3306 | up | ro | | 3 | 10.0.0.53:3306 | up | ro | +-------------+----------------+-------+------+ 3 rows in set (0.00 sec) 添加或移除节点 ADD MASTER $backend example: "add master 127.0.0.1:3306" ADD SLAVE $backend example: "add slave 127.0.0.1:3306" REMOVE BACKEND $backend_id mysql> REMOVE BACKEND 2; Empty set (0.00 sec) mysql> SELECT * FROM backends; +-------------+----------------+-------+------+ | backend_ndx | address | state | type | +-------------+----------------+-------+------+ | 1 | 10.0.0.55:3306 | up | rw | | 2 | 10.0.0.53:3306 | up | ro | +-------------+----------------+-------+------+ 2 rows in set (0.00 sec) mysql> ADD SLAVE 10.0.0.51:3306; Empty set (0.00 sec) mysql> SELECT * FROM backends; +-------------+----------------+-------+------+ | backend_ndx | address | state | type | +-------------+----------------+-------+------+ | 1 | 10.0.0.55:3306 | up | rw | | 2 | 10.0.0.53:3306 | up | ro | | 3 | 10.0.0.51:3306 | up | ro | +-------------+----------------+-------+------+ 3 rows in set (0.00 sec)
1 | SELECT * FROM pwds | lists the pwds | 2 | ADD PWD $pwd | example: "add pwd user:raw_password", ... | 3 | ADD ENPWD $pwd | example: "add enpwd user:encrypted_password", ... | 4 | REMOVE PWD $pwd 5 6 mysql> SELECT * FROM pwds; 7 +----------+--------------------------+ 8 | username | password | 9 +----------+--------------------------+ 10 | repl | tyElBOCykN3FCr2uMcG5AQ== | 11 | mha | tyElBOCykN3FCr2uMcG5AQ== | 12 +----------+--------------------------+ 13 2 rows in set (0.00 sec) 14 15 16 企业应用案列: 17 开发人员申请一个应用用户 app( select update insert) 密码123456abcd,要通过10网段登录 18 1. 在主库中,创建用户 19 grant select ,update,insert on *.* to app@'10.0.0.%' identified by '123456abcd'; 20 2. 在atlas中添加生产用户 21 方法一: 明文 22 ADD PWD $pwd ---需要添加的用户名称,例如:ADD PWD app:123456abcd; 会自动加,并加密 23 24 方法二: 密文 25 /usr/local/mysql-proxy/bin/encrypt 123456abcd 26 ADD ENPWD $pwd 例: ADD ENPWD app:tyElBOCykN3FCr2uMcG5AQ== 27 方法三: 28 /usr/local/mysql-proxy/bin/encrypt 123456abcd ---->制作加密密码 29 vim test.cnf 30 pwds = repl:3yb5jEku5h4=,mha:O2jBXONX098=,app:/iZxz+0GRoA= 31 /usr/local/mysql-proxy/bin/mysql-proxyd test restart 32 [root@db03 conf]# mysql -uapp -p123456abcd -h 10.0.0.53 -P 33060
8.7 配置永久生效(以上改的任何配置机器重启都会丢掉,相当于暂存在内存里)
SAVE CONFIG;标签:10.0,00,14,mysql,MHA,master,Mysql,Atlas,root From: https://www.cnblogs.com/hanease/p/18671263