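I. Research question 1: once the timeout option is set in the redis configuration file, does it cause a large number of connections in the CLOSE_WAIT state?
Note: the redis configuration file documents the timeout option as follows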
# Close the connection after a client is idle for N seconds (0 to disable)
timeout 60
(1) Window 1: to set up the packet-capture test, create a redis client connection from python manage.py shell with the following commands:
>>> from django.core.cache import caches
>>> caches['default']
>>> cl = caches['default'].client
>>> con = cl.connect()
>>> connection1 = con.connection_pool.make_connection()
>>> connection1.connect()
>>> connection1._sock.getsockname()  # the TCP connection is now established; getsockname() reports the local port, 55310
('127.0.0.1', 55310)
Running netstat -anpl |grep 55310 at this point confirms that the TCP connection is established:
/home $ netstat -anpl |grep 55310
netstat: showing only processes with your user ID
tcp 0 0 127.0.0.1:6379 127.0.0.1:55310 ESTABLISHED 16693/redis-server
tcp 7 0 127.0.0.1:55310 127.0.0.1:6379 ESTABLISHED 2905/python
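(2) Open a second ssh session for the packet capture; we call it window 2.
The capture looks like this:
/home $ sudo ./tcpdump -i lo port 55310 -nn  # -i lo is used because the server and the client inside the container talk over the loopback interface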
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 65535 bytes
No request has been sent yet, so nothing is captured at this point.
Now send a PING request from the python shell in window 1:
>>> connection1.send_command("PING", check_health=False)
The PING packets show up in the capture in window 2:
/home $ sudo ./tcpdump -i lo port 55310 -nn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 65535 bytes
15:45:30.142212 IP 127.0.0.1.55310 > 127.0.0.1.6379: Flags [P.], seq 4047013069:4047013083, ack 3246937596, win 342, options [nop,nop,TS val 257508686 ecr 257484577], length 14
15:45:30.142386 IP 127.0.0.1.6379 > 127.0.0.1.55310: Flags [P.], seq 1:8, ack 14, win 342, options [nop,nop,TS val 257508687 ecr 257508686], length 7
15:45:30.142402 IP 127.0.0.1.55310 > 127.0.0.1.6379: Flags [.], ack 8, win 342, options [nop,nop,TS val 257508687 ecr 257508687], length 0
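The payload lengths in this capture (14 bytes sent, 7 bytes returned) match what the Redis serialization protocol (RESP) would produce for PING and its usual reply. The following is only a minimal sketch to show the arithmetic: encode_command is a hypothetical helper written for this note, not the redis-py implementation, and the reply is assumed to be the standard simple string +PONG.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings (hypothetical helper)."""
    parts = [b"*%d\r\n" % len(args)]          # array header: number of arguments
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))  # bulk string: length, then payload
    return b"".join(parts)


request = encode_command("PING")
reply = b"+PONG\r\n"  # assumed reply: the standard simple-string answer to PING

print(request, len(request))
print(reply, len(reply))
```

The request is 4 + 4 + 6 bytes and the reply is 7 bytes, which lines up with the "length 14" and "length 7" fields in the tcpdump output.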
(3) Because timeout is set to 60s, redis_server actively closes the connection once 60s have passed:
/home $ sudo ./tcpdump -i lo port 55310 -nn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 65535 bytes
15:45:30.142212 IP 127.0.0.1.55310 > 127.0.0.1.6379: Flags [P.], seq 4047013069:4047013083, ack 3246937596, win 342, options [nop,nop,TS val 257508686 ecr 257484577], length 14
15:45:30.142386 IP 127.0.0.1.6379 > 127.0.0.1.55310: Flags [P.], seq 1:8, ack 14, win 342, options [nop,nop,TS val 257508687 ecr 257508686], length 7
15:45:30.142402 IP 127.0.0.1.55310 > 127.0.0.1.6379: Flags [.], ack 8, win 342, options [nop,nop,TS val 257508687 ecr 257508687], length 0
15:46:31.584184 IP 127.0.0.1.6379 > 127.0.0.1.55310: Flags [F.], seq 8, ack 14, win 342, options [nop,nop,TS val 257570128 ecr 257508687], length 0
15:46:31.623320 IP 127.0.0.1.55310 > 127.0.0.1.6379: Flags [.], ack 9, win 342, options [nop,nop,TS val 257570168 ecr 257570128], length 0
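The idle time the server actually allowed can be read from the timestamps in the capture. A small sketch of the subtraction, using the last ACK from the client and the FIN from the server; the variable names are illustrative only:

```python
from datetime import datetime

FMT = "%H:%M:%S.%f"

# Timestamps copied from the capture above.
last_activity = datetime.strptime("15:45:30.142402", FMT)  # client ACK of the reply
server_fin = datetime.strptime("15:46:31.584184", FMT)     # FIN sent by redis_server

idle_seconds = (server_fin - last_activity).total_seconds()
print(f"idle before server FIN: {idle_seconds:.6f} s")
```

The result is a little over the configured 60s. That would be consistent with the server checking for idle clients periodically rather than at the exact deadline, but this is an assumption about the server's behavior, not something the capture itself proves.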
Now check the state of the TCP connection again:
/home $ netstat -anpl |grep 55310
netstat: showing only processes with your user ID
tcp 0 0 127.0.0.1:6379 127.0.0.1:55310 FIN_WAIT2 -
tcp 8 0 127.0.0.1:55310 127.0.0.1:6379 CLOSE_WAIT 2905/python
A little later the output changes to the following: the connection in FIN_WAIT2 has disappeared, but the connection in CLOSE_WAIT does not go away.
/home $ netstat -anpl |grep 55310
netstat: showing only processes with your user ID
tcp 8 0 127.0.0.1:55310 127.0.0.1:6379 CLOSE_WAIT 2905/pytho
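To answer the "large number of connections" part of the question on a real host, the netstat output can be tallied per state. A minimal sketch, assuming the column layout shown above (Proto, Recv-Q, Send-Q, local address, foreign address, state); count_tcp_states is a hypothetical helper and the sample text is copied from this test:

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connections per state in netstat -an style output."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: Proto Recv-Q Send-Q Local Foreign State [PID/Program]
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states


sample = """\
netstat: showing only processes with your user ID
tcp 0 0 127.0.0.1:6379 127.0.0.1:55310 FIN_WAIT2 -
tcp 8 0 127.0.0.1:55310 127.0.0.1:6379 CLOSE_WAIT 2905/python
"""

counts = count_tcp_states(sample)
print(counts["CLOSE_WAIT"], counts["FIN_WAIT2"])
```

In practice the input would come from running netstat -an on the host; a CLOSE_WAIT count that keeps growing over time would point to client-side sockets that are never closed.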
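The reason is as follows:
redis_server initiates the close, so it sends a FIN to redis_client first. redis_client acknowledges the FIN with an ACK, and the redis_server side of the connection moves to FIN_WAIT2. A connection in FIN_WAIT2 is waiting for the peer (here redis_client) to call close and send its own FIN. Because redis_client keeps holding the connection and never sends that FIN, the server side stays in FIN_WAIT2, but only briefly. Why briefly? redis_server has already called close, so the socket no longer belongs to any process (note the "-" in the PID column) and the application can no longer send or receive data on it. Linux does not keep such an orphaned connection around for long and discards it after a period controlled by tcp_fin_timeout.
After redis_client returns the ACK, its side of the connection is in CLOSE_WAIT. A connection can stay in this state for a very long time, because a half-closed connection is legitimate in TCP: the peer may have called shutdown to close only its sending direction, and the local side can still send data. Linux therefore places no limit on how long a connection stays in CLOSE_WAIT; the state ends only when the application closes the socket.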
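The half-close described above can be reproduced without redis. The sketch below is only an illustration under stated assumptions: a throwaway loopback listener stands in for redis_server and closes the connection right after accepting it (in place of the idle timeout), and server_closes_first and peer_has_closed are hypothetical helper names. The client learns that the peer has sent its FIN because recv returns an empty bytes object, and its socket leaves CLOSE_WAIT only when the client itself calls close.

```python
import socket
import threading


def server_closes_first(listener: socket.socket) -> None:
    """Accept one connection and close it at once (stand-in for the idle timeout)."""
    conn, _ = listener.accept()
    conn.close()  # sends FIN; the server side heads toward FIN_WAIT2


def peer_has_closed(sock: socket.socket, wait: float = 2.0) -> bool:
    """Return True if the peer's FIN has arrived (a peek returns b'')."""
    sock.settimeout(wait)
    try:
        # MSG_PEEK looks at the receive queue without consuming anything.
        return sock.recv(1, socket.MSG_PEEK) == b""
    except socket.timeout:
        return False  # neither data nor FIN arrived within the wait


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen(1)
worker = threading.Thread(target=server_closes_first, args=(listener,))
worker.start()

client = socket.create_connection(listener.getsockname())
worker.join()  # the server side has called close() by now

closed_by_peer = peer_has_closed(client)  # the client socket is now in CLOSE_WAIT
print("peer closed:", closed_by_peer)

client.close()    # only this sends the client's FIN and ends CLOSE_WAIT
listener.close()
```

If client.close() is never called, as with the idle pooled connection in this test, the socket stays in CLOSE_WAIT for as long as the process keeps it open.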
Solution:
Enable keep-alive on the connection by adding the keep-alive parameters to the cache configuration:
import socket

CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": "redis://%s:%s" % (url_ipv6_sup(REDIS_HOST), REDIS_PORT),
        "TIMEOUT": 60 * 60,
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
            "PASSWORD": REDIS_PASSWD,
            "CONNECTION_POOL_KWARGS": {
                "retry_on_timeout": True,
                "health_check_interval": REDIS_HEALTH_CHECK_INTERVAL,
                "socket_timeout": 120,
                "socket_keepalive": True,
                "socket_keepalive_options": {
                    socket.TCP_KEEPIDLE: REDIS_TCP_KEEPIDLE,
                    socket.TCP_KEEPINTVL: REDIS_TCP_KEEPINTVL,  # constant value is 10 min
                    socket.TCP_KEEPCNT: REDIS_TCP_KEEPCNT,
                },
            },
        },
    }
}
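With keep-alive enabled, the capture shows that redis_client sends a keep-alive probe 10 min later. Because redis_server has already closed the connection, it answers with a RST. Checking the redis_client connection again afterwards shows that the connection in CLOSE_WAIT is gone: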
/home $ sudo ./tcpdump -i lo port 55310 -nn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 65535 bytes
15:45:30.142212 IP 127.0.0.1.55310 > 127.0.0.1.6379: Flags [P.], seq 4047013069:4047013083, ack 3246937596, win 342, options [nop,nop,TS val 257508686 ecr 257484577], length 14
15:45:30.142386 IP 127.0.0.1.6379 > 127.0.0.1.55310: Flags [P.], seq 1:8, ack 14, win 342, options [nop,nop,TS val 257508687 ecr 257508686], length 7
15:45:30.142402 IP 127.0.0.1.55310 > 127.0.0.1.6379: Flags [.], ack 8, win 342, options [nop,nop,TS val 257508687 ecr 257508687], length 0
15:46:31.584184 IP 127.0.0.1.6379 > 127.0.0.1.55310: Flags [F.], seq 8, ack 14, win 342, options [nop,nop,TS val 257570128 ecr 257508687], length 0
15:46:31.623320 IP 127.0.0.1.55310 > 127.0.0.1.6379: Flags [.], ack 9, win 342, options [nop,nop,TS val 257570168 ecr 257570128], length 0
15:56:41.808327 IP 127.0.0.1.55310 > 127.0.0.1.6379: Flags [.], ack 9, win 342, options [nop,nop,TS val 258180353 ecr 257570128], length 0
15:56:41.808347 IP 127.0.0.1.6379 > 127.0.0.1.55310: Flags [R], seq 3246937604, win 0, length 0
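For reference, the keep-alive settings used in the cache configuration correspond to ordinary socket options. The sketch below applies them to a plain socket; it is an illustration only, not how redis-py applies them internally. enable_keepalive is a hypothetical helper, and the values 600, 60 and 3 are assumptions chosen for the example, since the values behind REDIS_TCP_KEEPIDLE, REDIS_TCP_KEEPINTVL and REDIS_TCP_KEEPCNT are not given here.

```python
import socket


def enable_keepalive(sock: socket.socket, idle: int = 600,
                     interval: int = 60, count: int = 3) -> None:
    """Turn on TCP keep-alive; the default values are assumptions for this sketch."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These option names exist on Linux; skip any that this platform lacks.
    for name, value in (("TCP_KEEPIDLE", idle),      # idle seconds before the first probe
                        ("TCP_KEEPINTVL", interval),  # seconds between probes
                        ("TCP_KEEPCNT", count)):      # unanswered probes before giving up
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
print("SO_KEEPALIVE:", keepalive_on)
sock.close()
```

Once a probe is sent on a connection the peer has already discarded, the peer replies with a RST, which matches the last two lines of the capture above.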
From: https://www.cnblogs.com/kevin-zsq/p/16844014.html