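To speed up data retrieval, we sometimes put a cache layer in front of the database, and Redis is one of the most common caching components. Its name stands for REmote DIctionary Server. It is a key-value store written by Salvatore Sanfilippo, and it is a cross-platform, non-relational database.
Redis is open source, written in ANSI C, and released under the BSD license. It is a networked key-value database that can run in memory, can be distributed, offers optional persistence, and has client APIs for many languages. A value can be a string, a hash, a list, a set, a sorted set, and other types.
In real deployments we rarely run a single standalone Redis; we usually choose master-slave replication instead. But if the master fails and can no longer serve requests, someone has to step in manually, promote a slave to master, and then update the client configuration as well.
For this reason, Redis has shipped Sentinel mode since version 2.8. Redis Sentinel is the high-availability solution for Redis: it detects failures, performs failover, and notifies clients.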
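Sentinel is built on top of the master-slave structure by adding a few Sentinel nodes. A Sentinel stores no data, but running several of them makes failure detection fair (no single node decides alone) and keeps the Sentinel layer itself highly available: if one Sentinel node dies, the mechanism keeps working. Clients do not rely on a fixed Redis address; they ask Sentinel which node is the current master. Sentinel monitors every master and slave. When several Sentinels find that the master is down, they elect one of themselves as leader. The elected leader picks a slave as the new master, tells the other slaves to replicate from it, and announces the new master to clients. If the old master comes back, it becomes a slave replicating from the new master. One set of Sentinels can monitor several master-slave groups, which saves resources; each group is identified by a master-name in the configuration.
In one sentence: the Sentinel architecture removes the manual intervention that plain Redis master-slave replication requires.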
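The two decisions described above, agreeing that the master is down and electing a leader, can be sketched as simple threshold checks. This is a simplified illustration, not Sentinel's implementation: the function names are made up for this example, and it ignores timing, epochs and networking. It assumes three Sentinels and a quorum of 2 as example values.

```python
# Simplified sketch of Sentinel's two thresholds (hypothetical helper names).

def is_objectively_down(down_reports: int, quorum: int) -> bool:
    """The master counts as objectively down once at least `quorum`
    Sentinels report that they cannot reach it."""
    return down_reports >= quorum


def wins_leader_election(votes: int, total_sentinels: int, quorum: int) -> bool:
    """A candidate leads the failover only with a majority of all
    Sentinels and at least `quorum` votes."""
    majority = total_sentinels // 2 + 1
    return votes >= max(majority, quorum)


if __name__ == "__main__":
    # Example values: three Sentinels, quorum of 2.
    print(is_objectively_down(down_reports=1, quorum=2))
    print(is_objectively_down(down_reports=2, quorum=2))
    print(wins_leader_election(votes=2, total_sentinels=3, quorum=2))
    print(wins_leader_election(votes=1, total_sentinels=3, quorum=2))
```

With three Sentinels, one of them can fail and the remaining two still reach both thresholds, which is why an odd number of at least three is a common choice.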
Let's install Redis 3.0.3, together with Sentinel, on Linux 7.
First, download redis-3.0.3.tar.gz and extract it. The resulting files are:
[bisal@bisal redis-3.0.3]$ tar zxvf redis-3.0.3.tar.gz
[bisal@bisal redis-3.0.3]$ ls -rlht
total 136K
drwxrwxr-x. 5 bisal bisal 4.0K Jul 17 2015 utils
drwxrwxr-x. 10 bisal bisal 167 Jul 17 2015 tests
drwxrwxr-x. 2 bisal bisal 4.0K Jul 17 2015 src
-rw-rw-r--. 1 bisal bisal 7.0K Jul 17 2015 sentinel.conf
-rwxrwxr-x. 1 bisal bisal 281 Jul 17 2015 runtest-sentinel
-rwxrwxr-x. 1 bisal bisal 280 Jul 17 2015 runtest-cluster
-rwxrwxr-x. 1 bisal bisal 271 Jul 17 2015 runtest
-rw-rw-r--. 1 bisal bisal 41K Jul 17 2015 redis.conf
-rw-rw-r--. 1 bisal bisal 5.1K Jul 17 2015 README
-rw-rw-r--. 1 bisal bisal 4.2K Jul 17 2015 MANIFESTO
-rw-rw-r--. 1 bisal bisal 151 Jul 17 2015 Makefile
-rw-rw-r--. 1 bisal bisal 11 Jul 17 2015 INSTALL
drwxrwxr-x. 6 bisal bisal 107 Jul 17 2015 deps
-rw-rw-r--. 1 bisal bisal 1.5K Jul 17 2015 COPYING
-rw-rw-r--. 1 bisal bisal 1.5K Jul 17 2015 CONTRIBUTING
-rw-rw-r--. 1 bisal bisal 53 Jul 17 2015 BUGS
-rw-rw-r--. 1 bisal bisal 28K Jul 17 2015 00-RELEASENOTES
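Installing Redis is just a matter of running make, which builds the executables (redis-server, redis-sentinel, redis-cli and so on) in the src folder under the Redis directory: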
[bisal@bisal redis-3.0.3]$ make
cd src && make all
make[1]: Entering directory `/opt/software/redis-3.0.3/src'
rm -rf redis-server redis-sentinel redis-cli redis-benchmark redis-check-dump redis-check-aof *.o *.gcda *.gcno *.gcov redis.info lcov-html
(cd ../deps && make distclean)
...
Hint: It's a good idea to run 'make test' ;)
make[1]: Leaving directory `/opt/software/redis-3.0.3/src'
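And with that, Redis is installed.
The next step is to start the server with a configuration file. To see what Sentinel does, we build a Redis master-slave setup. Because machines are limited, we use one machine with three ports to simulate three Redis instances: one master and two slaves.
First, create three directories for the Redis installations and three for their logs, using ports 6379, 6380 and 6381: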
/opt/app/redis_6379
/opt/app/redis_6380
/opt/app/redis_6381
/opt/applog/redis_6380
/opt/applog/redis_6379
/opt/applog/redis_6381
Next, edit the configuration file of each of the three Redis instances:
/opt/app/redis_6379/redis.conf
/opt/app/redis_6380/redis.conf
/opt/app/redis_6381/redis.conf
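Change the following settings. The authentication password is TestRedis; the port, pid file, log file, dir and rdb file name are set per instance. Because we are configuring master-slave replication, masterauth must also be set to the password; otherwise the nodes cannot connect to each other after a master-slave switch.
P.S. The three configuration files use 6379, 6380 and 6381 respectively in these values.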
requirepass "TestRedis"
pidfile "/opt/app/redis_6379/redis_6379.pid"
port 6379
bind 192.168.15.130
tcp-keepalive 60
logfile "/opt/applog/redis_6379/redis_6379.log"
dir "/opt/app/redis_6379"
dbfilename dump_6379.rdb
masterauth "TestRedis"
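Since the three files differ only in the port number, they can be generated from one template. The sketch below is one possible way to do that; the helper name render_redis_conf is an invention for this example, and it only produces the text of the overrides listed above rather than a complete redis.conf.

```python
# Template for the per-instance overrides shown above; {port}, {ip} and
# {password} are the only values that vary.
REDIS_CONF_TEMPLATE = """\
requirepass "{password}"
pidfile "/opt/app/redis_{port}/redis_{port}.pid"
port {port}
bind {ip}
tcp-keepalive 60
logfile "/opt/applog/redis_{port}/redis_{port}.log"
dir "/opt/app/redis_{port}"
dbfilename dump_{port}.rdb
masterauth "{password}"
"""


def render_redis_conf(port: int,
                      ip: str = "192.168.15.130",
                      password: str = "TestRedis") -> str:
    """Return the override block for one Redis instance (hypothetical helper)."""
    return REDIS_CONF_TEMPLATE.format(port=port, ip=ip, password=password)


if __name__ == "__main__":
    for port in (6379, 6380, 6381):
        print(f"# ---- overrides for port {port} ----")
        print(render_redis_conf(port))
```

The output still has to be merged into each instance's redis.conf by hand or by a script of your own.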
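In the src directory of each of the three Redis installations, run redis-server with the redis.conf just edited as its argument, using & to run it in the background. Redis is now started: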
./redis-server ../redis.conf &
The startup messages appear in the corresponding redis.log:
15312:M 02 Feb 14:09:11.656 * Increased maximum number of open files to 4096 (it was originally set to 1024).
_._
_.-``__ ''-._
_.-`` `. `_. ''-._ Redis 3.0.3 (00000000/0) 64 bit
.-`` .-```. ```\/ _.,_ ''-._
( ' , .-` | `, ) Running in standalone mode
|`-._`-...-` __...-.``-._|'` _.-'| Port: 6380
| `-._ `._ / _.-' | PID: 15312
`-._ `-._ `-./ _.-' _.-'
|`-._`-._ `-.__.-' _.-'_.-'|
| `-._`-._ _.-'_.-' | http://redis.io
`-._ `-._`-.__.-'_.-' _.-'
|`-._`-._ `-.__.-' _.-'_.-'|
| `-._`-._ _.-'_.-' |
`-._ `-._`-.__.-'_.-' _.-'
`-._ `-.__.-' _.-'
`-._ _.-'
`-.__.-'
15312:M 02 Feb 14:09:11.658 # Server started, Redis version 3.0.3
15312:M 02 Feb 14:09:11.658 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
15312:M 02 Feb 14:09:11.658 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
15312:M 02 Feb 14:09:11.658 * DB loaded from disk: 0.000 seconds
15312:M 02 Feb 14:09:11.658 * The server is now ready to accept connections on port 6379
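The Redis instances started so far are each working alone, so now we configure replication.
We initially make 6379 the master and 6380 and 6381 the slaves. Log in to each of the two slaves with redis-cli and run "slaveof 192.168.15.130 6379", which makes the instance a slave of 6379, so 6379 is the master. "config rewrite" then writes the change back to the configuration file so that it persists: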
[bisal@bisal src]$ ./redis-cli -h 192.168.15.130 -p 6380
192.168.15.130:6380> auth "TestRedis"
OK
192.168.15.130:6380> slaveof 192.168.15.130 6379
OK
192.168.15.130:6380> config rewrite
OK
Now connect to 6379, 6380 and 6381 and look at the info replication section. The role of 6379 is master, and the role of 6380 and 6381 is slave:
[bisal@bisal src]$ ./redis-cli -h 192.168.15.130 -p 6379 -a TestRedis info replication
# Replication
role:master
connected_slaves:1
slave0:ip=192.168.15.130,port=6381,state=online,offset=7788,lag=1
master_repl_offset:7931
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:7930
[bisal@bisal src]$ ./redis-cli -h 192.168.15.130 -p 6380 -a TestRedis info replication
# Replication
role:slave
master_host:192.168.15.130
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:17605
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
[bisal@bisal src]$ ./redis-cli -h 192.168.15.130 -p 6381 -a TestRedis info replication
# Replication
role:slave
master_host:192.168.15.130
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:18463
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
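Checking these fields by eye works for three nodes, but the same check is easy to script. The following sketch parses the key:value text that info replication prints and tests whether a node is a connected slave of the expected master. The helper names and the shortened sample text are assumptions made for this example.

```python
def parse_info(text: str) -> dict:
    """Turn INFO output (key:value lines, '#' section headers) into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def is_connected_slave_of(info: dict, master_port: int) -> bool:
    """True if the node reports itself as a slave with a live link to master_port."""
    return (info.get("role") == "slave"
            and info.get("master_port") == str(master_port)
            and info.get("master_link_status") == "up")


# Shortened copy of the 6380 output shown above.
SAMPLE_INFO = """\
# Replication
role:slave
master_host:192.168.15.130
master_port:6379
master_link_status:up
slave_read_only:1
"""

if __name__ == "__main__":
    info = parse_info(SAMPLE_INFO)
    print(info["role"], info["master_port"])
    print(is_connected_slave_of(info, 6379))
```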
We can try writing and reading data. Note that writes are only accepted on the master; a write sent to a slave returns an error:
[bisal@bisal src]$ ./redis-cli -h 192.168.15.130 -p 6380 -a TestRedis
192.168.15.130:6380> sadd name bisal
(error) READONLY You can't write against a read only slave.
Log in to the master, use sadd to add bisal to the set stored under the key name, and use smembers name to read back the members of that key:
[bisal@bisal src]$ ./redis-cli -h 192.168.15.130 -p 6379 -a TestRedis
192.168.15.130:6379> sadd name bisal
(integer) 1
192.168.15.130:6379> smembers name
1) "bisal"
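In each of the three Redis directories we can dump the dump_xxxx.rdb file in hexadecimal with od. All of them contain name and bisal, which shows that the master and the slaves have all synchronized the data just inserted: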
[bisal@bisal redis_6380]$ od -A x -t x1c -v dump_6380.rdb
000000 52 45 44 49 53 30 30 30 36 fe 00 02 04 6e 61 6d
R E D I S 0 0 0 6 376 \0 002 004 n a m
000010 65 01 05 62 69 73 61 6c ff 98 c8 57 eb b8 b8 5b
e 001 005 b i s a l 377 230 310 W 353 270 270 [
000020 9b
233
000021
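The same check can be done programmatically. An RDB file starts with the five bytes REDIS followed by a four-digit format version, and in this dump the 0xff end-of-file marker is followed by an 8-byte checksum. The sketch below rebuilds the 33 bytes from the od output above and inspects them; it is not a general RDB parser, and the function name is made up for this example.

```python
# The 33 bytes from the od output above, copied as hex.
RDB_BYTES = bytes.fromhex(
    "52 45 44 49 53 30 30 30 36 fe 00 02 04 6e 61 6d "
    "65 01 05 62 69 73 61 6c ff 98 c8 57 eb b8 b8 5b "
    "9b"
)


def parse_rdb_header(data: bytes) -> int:
    """Return the RDB format version from the 9-byte header ('REDIS' + 4 digits)."""
    if len(data) < 9 or data[:5] != b"REDIS":
        raise ValueError("not an RDB file: missing REDIS magic")
    return int(data[5:9].decode("ascii"))


if __name__ == "__main__":
    print("RDB version:", parse_rdb_header(RDB_BYTES))
    print("contains key 'name':", b"name" in RDB_BYTES)
    print("contains member 'bisal':", b"bisal" in RDB_BYTES)
    # In this dump, 0xff (end of file) is followed by 8 checksum bytes.
    print("EOF marker present:", RDB_BYTES[-9] == 0xFF)
```

To run it against a real file, read the file in binary mode and pass the bytes to parse_rdb_header.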
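Now Sentinel takes the stage. First edit sentinel.conf. We simulate three Sentinels on ports 26379, 26380 and 26381. Unlike the Redis instances, these three Sentinels are peers; there is no master-slave relationship among them.
The main changes in sentinel.conf are shown below. The line "sentinel monitor mymaster 192.168.15.130 6379 2" is identical in all three files: it tells each Sentinel to monitor the Redis master at 6379, mymaster is the name we give this master, and the trailing 2 is the quorum, the number of Sentinels that must agree the master is down. auth-pass must be set to the Redis password configured earlier, or Sentinel will not work. down-after-milliseconds defaults to 30 seconds; it is how long the master must be unreachable before a Sentinel considers it down, after which a new master can be chosen. For testing it can be lowered, and here it is set to 5 seconds. The port, logfile and dir are set per Sentinel: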
port 26379
daemonize yes
bind 192.168.15.130
logfile "/opt/applog/redis_6379/sentinel_26379.log"
dir "/opt/app/redis_6379"
sentinel monitor mymaster 192.168.15.130 6379 2
sentinel auth-pass mymaster TestRedis
sentinel down-after-milliseconds mymaster 5000
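The positional arguments of sentinel monitor are easy to mix up, so it can help to see them as named fields. The sketch below splits the directive into master name, host, port and quorum. The MonitorDirective type and parse_monitor function are hypothetical names introduced for this example; they are not part of Redis.

```python
from typing import NamedTuple


class MonitorDirective(NamedTuple):
    name: str    # label for the master-slave group, e.g. mymaster
    host: str    # address of the current master
    port: int    # port of the current master
    quorum: int  # Sentinels that must agree the master is down


def parse_monitor(line: str) -> MonitorDirective:
    """Parse a 'sentinel monitor <name> <host> <port> <quorum>' line."""
    parts = line.split()
    if len(parts) != 6 or parts[:2] != ["sentinel", "monitor"]:
        raise ValueError(f"not a sentinel monitor directive: {line!r}")
    _, _, name, host, port, quorum = parts
    return MonitorDirective(name, host, int(port), int(quorum))


if __name__ == "__main__":
    directive = parse_monitor("sentinel monitor mymaster 192.168.15.130 6379 2")
    print(directive)
```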
Start the three Sentinels:
./redis-sentinel ../sentinel.conf
The system should now be running three Redis servers and three Sentinels:
ps -ef | grep redis
./redis-server 192.168.15.130:6379
./redis-server 192.168.15.130:6380
./redis-server 192.168.15.130:6381
./redis-sentinel 192.168.15.130:26379 [sentinel]
./redis-sentinel 192.168.15.130:26380 [sentinel]
./redis-sentinel 192.168.15.130:26381 [sentinel]
Logging in to any of the Sentinels shows the same state. The status is ok, which indicates that Sentinel is set up correctly:
[bisal@bisal src]$ ./redis-cli -h 192.168.15.130 -p 26379 -a TestRedis info
# Server
redis_version:3.0.3
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:5a9fdf887b7c26a3
redis_mode:sentinel
os:Linux 3.10.0-862.el7.x86_64 x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:4.8.5
process_id:13743
run_id:7686d355aa2ca3aedb9bf5d4a8334770a2dcf26b
tcp_port:26379
uptime_in_seconds:1204
uptime_in_days:0
hz:12
lru_clock:2266534
config_file:/opt/oracle/app/redis/redis_6379/sentinel.conf
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
master0:name=mymaster,status=ok,address=192.168.15.130:6379,slaves=2,sentinels=3
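Besides info, a client can ask a Sentinel directly for the current master with the SENTINEL get-master-addr-by-name command. The sketch below does this with only the Python standard library by encoding the command in the Redis wire protocol and reading the two-element reply. The function names are invented for this example, and the reply parsing is deliberately simplified: it assumes the whole reply arrives in one read. A real application would normally use a client library with Sentinel support instead.

```python
import socket
from typing import Optional, Tuple


def encode_command(*args: str) -> bytes:
    """Encode a command as a Redis protocol array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def parse_master_addr(reply: bytes) -> Optional[Tuple[str, int]]:
    """Parse the [ip, port] array reply; return None for any other reply.

    Simplification: relies on line splitting rather than the length prefixes.
    """
    lines = reply.split(b"\r\n")
    if lines[0] != b"*2" or len(lines) < 5:
        return None  # for example a null reply when the master name is unknown
    return lines[2].decode(), int(lines[4])


def get_master_addr(sentinel_host: str, sentinel_port: int,
                    master_name: str, timeout: float = 1.0):
    """Ask one Sentinel for the current master of `master_name`."""
    with socket.create_connection((sentinel_host, sentinel_port), timeout=timeout) as sock:
        sock.sendall(encode_command("SENTINEL", "get-master-addr-by-name", master_name))
        return parse_master_addr(sock.recv(4096))
```

Against the setup in this article, the call would be get_master_addr("192.168.15.130", 26379, "mymaster"). If one Sentinel is unreachable, the client can try the next one in its list.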
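Now let's verify what Sentinel does: kill the Redis master and see whether Sentinel can pick a new master for this Redis group.
First, kill the Redis master process:
kill -9 <PID of the redis-server process on port 6379>
In the redis.log of 6380 we can see it retrying the old master for about 5 seconds. Then "Discarding previously cached master state" drops the cached state of the old master, "MASTER MODE enabled" switches it to master mode, and it synchronizes with 6381, which becomes its slave: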
16016:S 09 Feb 22:04:46.574 # Connection with master lost.
16016:S 09 Feb 22:04:46.574 * Caching the disconnected master state.
16016:S 09 Feb 22:04:46.867 * Connecting to MASTER 192.168.15.130:6379
16016:S 09 Feb 22:04:46.867 * MASTER <-> SLAVE sync started
16016:S 09 Feb 22:04:46.867 # Error condition on socket for SYNC: Connection refused
16016:S 09 Feb 22:04:47.877 * Connecting to MASTER 192.168.15.130:6379
16016:S 09 Feb 22:04:47.878 * MASTER <-> SLAVE sync started
16016:S 09 Feb 22:04:47.878 # Error condition on socket for SYNC: Connection refused
16016:S 09 Feb 22:04:48.886 * Connecting to MASTER 192.168.15.130:6379
16016:S 09 Feb 22:04:48.887 * MASTER <-> SLAVE sync started
16016:S 09 Feb 22:04:48.887 # Error condition on socket for SYNC: Connection refused
16016:S 09 Feb 22:04:49.892 * Connecting to MASTER 192.168.15.130:6379
16016:S 09 Feb 22:04:49.893 * MASTER <-> SLAVE sync started
16016:S 09 Feb 22:04:49.893 # Error condition on socket for SYNC: Connection refused
16016:S 09 Feb 22:04:50.902 * Connecting to MASTER 192.168.15.130:6379
16016:S 09 Feb 22:04:50.902 * MASTER <-> SLAVE sync started
16016:S 09 Feb 22:04:50.902 # Error condition on socket for SYNC: Connection refused
16016:S 09 Feb 22:04:51.908 * Connecting to MASTER 192.168.15.130:6379
16016:S 09 Feb 22:04:51.908 * MASTER <-> SLAVE sync started
16016:S 09 Feb 22:04:51.908 # Error condition on socket for SYNC: Connection refused
16016:M 09 Feb 22:04:51.940 * Discarding previously cached master state.
16016:M 09 Feb 22:04:51.941 * MASTER MODE enabled (user request)
16016:M 09 Feb 22:04:51.946 # CONFIG REWRITE executed with success.
16016:M 09 Feb 22:04:53.727 * Slave 192.168.15.130:6381 asks for synchronization
16016:M 09 Feb 22:04:53.727 * Full resync requested by slave 192.168.15.130:6381
16016:M 09 Feb 22:04:53.727 * Starting BGSAVE for SYNC with target: disk
16016:M 09 Feb 22:04:53.728 * Background saving started by pid 15376
15376:C 09 Feb 22:04:53.737 * DB saved on disk
15376:C 09 Feb 22:04:53.739 * RDB: 0 MB of memory used by copy-on-write
16016:M 09 Feb 22:04:53.824 * Background saving terminated with success
16016:M 09 Feb 22:04:53.824 * Synchronization with slave 192.168.15.130:6381 succeeded
In the redis.log of 6381, after a wait of about the same length, "SLAVE OF 192.168.15.130:6380 enabled" is executed, changing it from a slave of 6379 to a slave of 6380:
13516:S 09 Feb 22:04:46.575 # Connection with master lost.
13516:S 09 Feb 22:04:46.575 * Caching the disconnected master state.
13516:S 09 Feb 22:04:46.665 * Connecting to MASTER 192.168.15.130:6379
13516:S 09 Feb 22:04:46.665 * MASTER <-> SLAVE sync started
13516:S 09 Feb 22:04:46.665 # Error condition on socket for SYNC: Connection refused
13516:S 09 Feb 22:04:47.674 * Connecting to MASTER 192.168.15.130:6379
13516:S 09 Feb 22:04:47.675 * MASTER <-> SLAVE sync started
13516:S 09 Feb 22:04:47.675 # Error condition on socket for SYNC: Connection refused
13516:S 09 Feb 22:04:48.684 * Connecting to MASTER 192.168.15.130:6379
13516:S 09 Feb 22:04:48.684 * MASTER <-> SLAVE sync started
13516:S 09 Feb 22:04:48.684 # Error condition on socket for SYNC: Connection refused
13516:S 09 Feb 22:04:49.693 * Connecting to MASTER 192.168.15.130:6379
13516:S 09 Feb 22:04:49.693 * MASTER <-> SLAVE sync started
13516:S 09 Feb 22:04:49.693 # Error condition on socket for SYNC: Connection refused
13516:S 09 Feb 22:04:50.701 * Connecting to MASTER 192.168.15.130:6379
13516:S 09 Feb 22:04:50.702 * MASTER <-> SLAVE sync started
13516:S 09 Feb 22:04:50.702 # Error condition on socket for SYNC: Connection refused
13516:S 09 Feb 22:04:51.708 * Connecting to MASTER 192.168.15.130:6379
13516:S 09 Feb 22:04:51.709 * MASTER <-> SLAVE sync started
13516:S 09 Feb 22:04:51.709 # Error condition on socket for SYNC: Connection refused
13516:S 09 Feb 22:04:52.717 * Connecting to MASTER 192.168.15.130:6379
13516:S 09 Feb 22:04:52.717 * MASTER <-> SLAVE sync started
13516:S 09 Feb 22:04:52.717 # Error condition on socket for SYNC: Connection refused
13516:S 09 Feb 22:04:52.866 * Discarding previously cached master state.
13516:S 09 Feb 22:04:52.866 * SLAVE OF 192.168.15.130:6380 enabled (user request)
13516:S 09 Feb 22:04:52.868 # CONFIG REWRITE executed with success.
13516:S 09 Feb 22:04:53.723 * Connecting to MASTER 192.168.15.130:6380
13516:S 09 Feb 22:04:53.724 * MASTER <-> SLAVE sync started
13516:S 09 Feb 22:04:53.724 * Non blocking connect for SYNC fired the event.
13516:S 09 Feb 22:04:53.725 * Master replied to PING, replication can continue...
13516:S 09 Feb 22:04:53.727 * Partial resynchronization not possible (no cached master)
13516:S 09 Feb 22:04:53.729 * Full resync from master: 1c7fdca18b4b509b634f6050219394860858982f:1
13516:S 09 Feb 22:04:53.824 * MASTER <-> SLAVE sync: receiving 18 bytes from master
13516:S 09 Feb 22:04:53.825 * MASTER <-> SLAVE sync: Flushing old data
13516:S 09 Feb 22:04:53.825 * MASTER <-> SLAVE sync: Loading DB in memory
13516:S 09 Feb 22:04:53.825 * MASTER <-> SLAVE sync: Finished with success
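Now look at the sentinel.log of 6379 (the Sentinel on port 26379).
It finds that 6379 is unavailable and marks it subjectively down: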
13743:X 09 Feb 22:04:51.673 # +sdown master mymaster 192.168.15.130 6379
The current configuration version (epoch) is updated:
13743:X 09 Feb 22:04:51.742 # +new-epoch 5
It votes for the Sentinel that will lead the failover:
13743:X 09 Feb 22:04:51.746 # +vote-for-leader 3a00a8cb8433aaaf8a59c35a05b3af835827c5f2 5
The master is marked objectively down: 3 Sentinels report it down, against a required quorum of 2:
13743:X 09 Feb 22:04:51.750 # +odown master mymaster 192.168.15.130 6379 #quorum 3/2
13743:X 09 Feb 22:04:51.750 # Next failover delay: I will not start a failover before Tue Feb 9 22:10:52 2021
13743:X 09 Feb 22:04:52.869 # +config-update-from sentinel 192.168.15.130:26380 192.168.15.130 26380 @ mymaster 192.168.15.130 6379
The master address changes:
13743:X 09 Feb 22:04:52.869 # +switch-master mymaster 192.168.15.130 6379 192.168.15.130 6380
The slaves are detected and added to the slave list:
13743:X 09 Feb 22:04:52.869 * +slave slave 192.168.15.130:6381 192.168.15.130 6381 @ mymaster 192.168.15.130 6380
13743:X 09 Feb 22:04:52.869 * +slave slave 192.168.15.130:6379 192.168.15.130 6379 @ mymaster 192.168.15.130 6380
13743:X 09 Feb 22:04:57.875 # +sdown slave 192.168.15.130:6379 192.168.15.130 6379 @ mymaster 192.168.15.130 6380
The sentinel.log of 6381 is essentially the same as that of 6379:
13514:X 09 Feb 22:04:51.641 # +sdown master mymaster 192.168.15.130 6379
13514:X 09 Feb 22:04:51.744 # +new-epoch 5
13514:X 09 Feb 22:04:51.748 # +vote-for-leader 3a00a8cb8433aaaf8a59c35a05b3af835827c5f2 5
13514:X 09 Feb 22:04:52.802 # +odown master mymaster 192.168.15.130 6379 #quorum 3/2
13514:X 09 Feb 22:04:52.802 # Next failover delay: I will not start a failover before Tue Feb 9 22:10:52 2021
13514:X 09 Feb 22:04:52.864 # +config-update-from sentinel 192.168.15.130:26380 192.168.15.130 26380 @ mymaster 192.168.15.130 6379
13514:X 09 Feb 22:04:52.865 # +switch-master mymaster 192.168.15.130 6379 192.168.15.130 6380
13514:X 09 Feb 22:04:52.865 * +slave slave 192.168.15.130:6381 192.168.15.130 6381 @ mymaster 192.168.15.130 6380
13514:X 09 Feb 22:04:52.865 * +slave slave 192.168.15.130:6379 192.168.15.130 6379 @ mymaster 192.168.15.130 6380
13514:X 09 Feb 22:04:57.888 # +sdown slave 192.168.15.130:6379 192.168.15.130 6379 @ mymaster 192.168.15.130 6380
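The sentinel.log of 6380 records more, because this Sentinel was elected leader and carried out the failover, but the main flow is the same: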
13809:X 09 Feb 22:04:51.651 # +sdown master mymaster 192.168.15.130 6379
13809:X 09 Feb 22:04:51.735 # +odown master mymaster 192.168.15.130 6379 #quorum 2/2
13809:X 09 Feb 22:04:51.735 # +new-epoch 5
13809:X 09 Feb 22:04:51.735 # +try-failover master mymaster 192.168.15.130 6379
13809:X 09 Feb 22:04:51.738 # +vote-for-leader 3a00a8cb8433aaaf8a59c35a05b3af835827c5f2 5
13809:X 09 Feb 22:04:51.747 # 192.168.15.130:26379 voted for 3a00a8cb8433aaaf8a59c35a05b3af835827c5f2 5
13809:X 09 Feb 22:04:51.749 # 192.168.15.130:26381 voted for 3a00a8cb8433aaaf8a59c35a05b3af835827c5f2 5
13809:X 09 Feb 22:04:51.791 # +elected-leader master mymaster 192.168.15.130 6379
13809:X 09 Feb 22:04:51.791 # +failover-state-select-slave master mymaster 192.168.15.130 6379
13809:X 09 Feb 22:04:51.848 # +selected-slave slave 192.168.15.130:6380 192.168.15.130 6380 @ mymaster 192.168.15.130 6379
13809:X 09 Feb 22:04:51.848 * +failover-state-send-slaveof-noone slave 192.168.15.130:6380 192.168.15.130 6380 @ mymaster 192.168.15.130 6379
13809:X 09 Feb 22:04:51.940 * +failover-state-wait-promotion slave 192.168.15.130:6380 192.168.15.130 6380 @ mymaster 192.168.15.130 6379
13809:X 09 Feb 22:04:52.812 # +promoted-slave slave 192.168.15.130:6380 192.168.15.130 6380 @ mymaster 192.168.15.130 6379
13809:X 09 Feb 22:04:52.812 # +failover-state-reconf-slaves master mymaster 192.168.15.130 6379
13809:X 09 Feb 22:04:52.864 * +slave-reconf-sent slave 192.168.15.130:6381 192.168.15.130 6381 @ mymaster 192.168.15.130 6379
13809:X 09 Feb 22:04:53.891 * +slave-reconf-inprog slave 192.168.15.130:6381 192.168.15.130 6381 @ mymaster 192.168.15.130 6379
13809:X 09 Feb 22:04:53.891 * +slave-reconf-done slave 192.168.15.130:6381 192.168.15.130 6381 @ mymaster 192.168.15.130 6379
13809:X 09 Feb 22:04:53.956 # -odown master mymaster 192.168.15.130 6379
13809:X 09 Feb 22:04:53.957 # +failover-end master mymaster 192.168.15.130 6379
13809:X 09 Feb 22:04:53.957 # +switch-master mymaster 192.168.15.130 6379 192.168.15.130 6380
13809:X 09 Feb 22:04:53.957 * +slave slave 192.168.15.130:6381 192.168.15.130 6381 @ mymaster 192.168.15.130 6380
13809:X 09 Feb 22:04:53.957 * +slave slave 192.168.15.130:6379 192.168.15.130 6379 @ mymaster 192.168.15.130 6380
13809:X 09 Feb 22:04:59.013 # +sdown slave 192.168.15.130:6379 192.168.15.130 6379 @ mymaster 192.168.15.130 6380
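With this many log lines, the key event is +switch-master, which carries the old and new master addresses. The following sketch extracts it from Sentinel log lines with a regular expression. The pattern is written against the log format shown in this article and may need adjusting for other Redis versions; the names EVENT_RE and find_switch_master are made up for this example.

```python
import re

# pid:X <day> <month> <time> <#|*> <+event|-event> <details>
EVENT_RE = re.compile(
    r"^(?P<pid>\d+):X (?P<ts>\d{2} \w{3} [\d:.]+) [#*] "
    r"(?P<event>[+-][\w-]+) (?P<detail>.*)$"
)


def find_switch_master(lines):
    """Return (name, (old_host, old_port), (new_host, new_port)) for the
    last +switch-master event in `lines`, or None if there is none."""
    result = None
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if match and match.group("event") == "+switch-master":
            name, old_host, old_port, new_host, new_port = match.group("detail").split()
            result = (name, (old_host, int(old_port)), (new_host, int(new_port)))
    return result


# A few lines copied from the 6380 sentinel.log above.
SAMPLE_LOG = [
    "13809:X 09 Feb 22:04:51.651 # +sdown master mymaster 192.168.15.130 6379",
    "13809:X 09 Feb 22:04:51.747 # 192.168.15.130:26379 voted for 3a00a8cb8433aaaf8a59c35a05b3af835827c5f2 5",
    "13809:X 09 Feb 22:04:53.957 # +failover-end master mymaster 192.168.15.130 6379",
    "13809:X 09 Feb 22:04:53.957 # +switch-master mymaster 192.168.15.130 6379 192.168.15.130 6380",
    "13809:X 09 Feb 22:04:53.957 * +slave slave 192.168.15.130:6381 192.168.15.130 6381 @ mymaster 192.168.15.130 6380",
]

if __name__ == "__main__":
    print(find_switch_master(SAMPLE_LOG))
```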
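At this point 6380 is the Redis master and 6379 and 6381 are slaves, although 6379 has not been started yet.
Start the redis process for 6379 and look at its redis.log. Because the detection threshold was exceeded and the failover has already happened, 6379 is no longer the master. About 10 seconds after startup it logs "SLAVE OF 192.168.15.130:6380 enabled", and then "Connecting to MASTER 192.168.15.130:6380" shows it connecting to the new master and synchronizing data: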
15665:M 09 Feb 22:09:39.134 # Server started, Redis version 3.0.3
15665:M 09 Feb 22:09:39.135 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
15665:M 09 Feb 22:09:39.136 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
15665:M 09 Feb 22:09:39.136 * DB loaded from disk: 0.000 seconds
15665:M 09 Feb 22:09:39.136 * The server is now ready to accept connections on port 6379
15665:S 09 Feb 22:09:49.209 * SLAVE OF 192.168.15.130:6380 enabled (user request)
15665:S 09 Feb 22:09:49.210 # CONFIG REWRITE executed with success.
15665:S 09 Feb 22:09:49.260 * Connecting to MASTER 192.168.15.130:6380
15665:S 09 Feb 22:09:49.261 * MASTER <-> SLAVE sync started
15665:S 09 Feb 22:09:49.261 * Non blocking connect for SYNC fired the event.
15665:S 09 Feb 22:09:49.262 * Master replied to PING, replication can continue...
15665:S 09 Feb 22:09:49.263 * Partial resynchronization not possible (no cached master)
15665:S 09 Feb 22:09:49.265 * Full resync from master: 1c7fdca18b4b509b634f6050219394860858982f:61491
15665:S 09 Feb 22:09:49.363 * MASTER <-> SLAVE sync: receiving 18 bytes from master
15665:S 09 Feb 22:09:49.363 * MASTER <-> SLAVE sync: Flushing old data
15665:S 09 Feb 22:09:49.363 * MASTER <-> SLAVE sync: Loading DB in memory
15665:S 09 Feb 22:09:49.364 * MASTER <-> SLAVE sync: Finished with success
The redis.log of 6380 records the synchronization request from the slave:
16016:M 09 Feb 22:09:49.263 * Slave 192.168.15.130:6379 asks for synchronization
16016:M 09 Feb 22:09:49.263 * Full resync requested by slave 192.168.15.130:6379
16016:M 09 Feb 22:09:49.264 * Starting BGSAVE for SYNC with target: disk
16016:M 09 Feb 22:09:49.264 * Background saving started by pid 15677
15677:C 09 Feb 22:09:49.275 * DB saved on disk
15677:C 09 Feb 22:09:49.276 * RDB: 0 MB of memory used by copy-on-write
16016:M 09 Feb 22:09:49.361 * Background saving terminated with success
16016:M 09 Feb 22:09:49.362 * Synchronization with slave 192.168.15.130:6379 succeeded
The redis.log of 6381 has no new entries; it did nothing.
The sentinel.log files of the three Sentinels look a little odd.
The sentinel.log of 6379:
13743:X 09 Feb 22:09:39.300 # -sdown slave 192.168.15.130:6379 192.168.15.130 6379 @ mymaster 192.168.15.130 6380
The sentinel.log of 6380:
13809:X 09 Feb 22:09:39.268 # -sdown slave 192.168.15.130:6379 192.168.15.130 6379 @ mymaster 192.168.15.130 6380
13809:X 09 Feb 22:09:49.208 * +convert-to-slave slave 192.168.15.130:6379 192.168.15.130 6379 @ mymaster 192.168.15.130 6380
The sentinel.log of 6381:
13514:X 09 Feb 22:09:39.269 # -sdown slave 192.168.15.130:6379 192.168.15.130 6379 @ mymaster 192.168.15.130 6380
Now we can see that 6380 is the master:
[bisal@bisal src]$ ./redis-cli -h 192.168.15.130 -p 26379 -a TestRedis info
# Server
redis_version:3.0.3
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:5a9fdf887b7c26a3
redis_mode:sentinel
os:Linux 3.10.0-862.el7.x86_64 x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:4.8.5
process_id:13743
run_id:7686d355aa2ca3aedb9bf5d4a8334770a2dcf26b
tcp_port:26379
uptime_in_seconds:2409
uptime_in_days:0
hz:14
lru_clock:2267739
config_file:/opt/oracle/app/redis/redis_6379/sentinel.conf
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
master0:name=mymaster,status=ok,address=192.168.15.130:6380,slaves=2,sentinels=3
The role of 6379 is slave:
[bisal@bisal src]$ ./redis-cli -h 192.168.15.130 -p 6379 -a TestRedis info replication
# Replication
role:slave
master_host:192.168.15.130
master_port:6380
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:215403
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
The role of 6380 is master:
[bisal@bisal src]$ ./redis-cli -h 192.168.15.130 -p 6380 -a TestRedis info replication
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.15.130,port=6381,state=online,offset=217562,lag=0
slave1:ip=192.168.15.130,port=6379,state=online,offset=217705,lag=0
master_repl_offset:217705
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:217704
The role of 6381 is slave:
[bisal@bisal src]$ ./redis-cli -h 192.168.15.130 -p 6381 -a TestRedis info replication
# Replication
role:slave
master_host:192.168.15.130
master_port:6380
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:219864
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
Everything is calm again.