首页 > 其他分享 >msyql排查锁超时和死锁

msyql排查锁超时和死锁

时间:2024-09-19 16:14:53浏览次数:11  
标签:thread lock trx 死锁 msyql sql 超时 id schema

一、锁超时

  • 先查看当前事务,看看有没有事务时间超时的

    SELECT * FROM INFORMATION_SCHEMA.INNODB_TRX;

    查到如下结果:

 INSERT INTO information_schema.INNODB_TRX (trx_id,
 trx_state,
 trx_started,
 trx_requested_lock_id, 
 trx_wait_started, trx_weight,
 trx_mysql_thread_id, 
 trx_query, trx_operation_state,
 trx_tables_in_use,
 trx_tables_locked,
 trx_lock_structs,
 trx_lock_memory_bytes,
 trx_rows_locked,
 trx_rows_modified,
 trx_concurrency_tickets,
 trx_isolation_level,
 trx_unique_checks,
 trx_foreign_key_checks,
 trx_last_foreign_key_error,
 trx_adaptive_hash_latched,
 trx_adaptive_hash_timeout,
 trx_is_read_only,
 trx_autocommit_non_locking)
VALUES ('560205677', 'RUNNING', '2022-08-13 12:07:28', null, null, 4, 8297971, null, null, 0, 1, 3, 1136, 2, 1, 0, 'READ COMMITTED', 1, 1, null, 0, 0, 0, 0);

(上面的数据是我从数据库里直接copy出来的)
如果trx_started这个时间距离现在已经过去很久了,说明这个事务一直没提交,我们需要把这个进程杀掉,这个放在最后处理,因为我们还要查看多少事务被这个影响了,以及查到这个长事务正在执行什么sql?

  • 查看锁表或者锁行的事务
  SELECT * FROM INFORMATION_SCHEMA.INNODB_LOCKS;

(mysql 8.0以上这个表已经替换成了performance_schema.data_locks)
  • 查看等待锁释放的有哪些事务
  SELECT * FROM INFORMATION_SCHEMA.INNODB_LOCK_WAITS;

(mysql 8.0以上这个表已经替换成了performance_schema.data_lock_waits)
  • 查看超时的事务是在执行什么sql
  SELECT t.trx_mysql_thread_id                        AS connection_id
      ,t.trx_id                                     AS trx_id
      ,t.trx_state                                  AS trx_state
      ,t.trx_started                                AS trx_started
      ,TIMESTAMPDIFF(SECOND,t.trx_started, now())   AS "trx_run_time(s)"
      ,t.trx_requested_lock_id                      AS trx_requested_lock_id
      ,t.trx_operation_state                        AS trx_operation_state
      ,t.trx_tables_in_use                          AS trx_tables_in_use
      ,t.trx_tables_locked                          AS trx_tables_locked
      ,t.trx_rows_locked                            AS trx_rows_locked
      ,t.trx_isolation_level                        AS trx_isolation_level
      ,t.trx_is_read_only                           AS trx_is_read_only
      ,t.trx_autocommit_non_locking                 AS trx_autocommit_non_locking
      ,e.event_name                                 AS event_name
      ,e.timer_wait / 1000000000000                 AS timer_wait
      ,e.sql_text
FROM   information_schema.innodb_trx t,
       performance_schema.events_statements_current e,
       performance_schema.threads c
WHERE  t.trx_mysql_thread_id = c.processlist_id
   AND e.thread_id = c.thread_id;
  • 最后查到结果后,把长事务给杀掉

    把第一步查到的trx_mysql_thread_id拿出来,kill掉

    kill trx_mysql_thread_id;

  • mysql无法检测长事务,会导致线上sql一直被这个事务占据

    我们可以使用pt-kill写个脚本去自动处理,具体操作这里不明细

附上其他查询的sql

1.查询锁住数据库事务正在执行的sql

-- 查询锁住数据库事务正在执行的sql
select hs.THREAD_ID
     ,hs.EVENT_ID
     ,iet.trx_id
     ,hs.SQL_TEXT as historySql -- 这个事务历史执行的sql
     ,iet.SQL_TEXT as currentSql -- 这个事务当前执行的sql
     ,iet.trx_started -- 事务开始时间

, DATE_SUB(NOW(),INTERVAL (SELECT VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME='UPTIME')- TIMER_START*10e-13 second) AS start_time
, DATE_SUB(NOW(),INTERVAL (SELECT VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME='UPTIME')- TIMER_END*10e-13 second) AS end_time

from performance_schema.events_statements_history hs,
     (select e.thread_id,t.trx_id,t.trx_started,e.SQL_TEXT
FROM   information_schema.innodb_trx t,
       performance_schema.events_statements_current e,
       performance_schema.threads c
WHERE  t.trx_mysql_thread_id = c.processlist_id
   AND e.thread_id = c.thread_id) iet
where hs.THREAD_ID =iet.THREAD_ID
order by THREAD_ID,start_time asc;

2.查看被锁住的sql正在等待什么事务,执行什么sql

-- 查看被锁住的sql是在等待什么事务,执行什么sql
select hs.THREAD_ID
     ,hs.EVENT_ID
     ,iet.trx_id
     ,hs.SQL_TEXT as historySql -- 这个事务历史执行的sql
     ,iet.SQL_TEXT as currentSql -- 这个事务当前执行的sql
     ,iet.trx_started -- 事务开始时间

, DATE_SUB(NOW(),INTERVAL (SELECT VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME='UPTIME')- TIMER_START*10e-13 second) AS start_time
, DATE_SUB(NOW(),INTERVAL (SELECT VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME='UPTIME')- TIMER_END*10e-13 second) AS end_time

from performance_schema.events_statements_history hs,
     (select e.thread_id,t.trx_id,t.trx_started,e.SQL_TEXT
FROM   information_schema.innodb_trx t,
       performance_schema.events_statements_current e,
       performance_schema.threads c
WHERE  t.trx_mysql_thread_id = c.processlist_id
   AND e.thread_id = c.thread_id) iet,
    performance_schema.data_lock_waits dlw
where hs.THREAD_ID =iet.THREAD_ID
  and hs.THREAD_ID=dlw.BLOCKING_THREAD_ID
  and hs.EVENT_ID=dlw.BLOCKING_EVENT_ID -- 加上这行查的是最后一条执行的sql,这行注释掉就是查事务历史的执行sql
order by THREAD_ID,start_time asc;

二、死锁

死锁的排查和锁超时差不多,死锁的信息可以通过show innodb status命令从数据库查询到,里面有死锁执行的具体sql

------------------------
LATEST DETECTED DEADLOCK
------------------------
2023-08-09 08:53:11 140130527762176
*** (1) TRANSACTION:
TRANSACTION 18445008, ACTIVE 19 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 5 lock struct(s), heap size 1128, 4 row lock(s), undo log entries 1
MySQL thread id 22862, OS thread handle 140126242903808, query id 405981864 192.168.60.123 root updating
/* ApplicationName=DataGrip 2022.3.3 */ update edu_health_student.tb_student_attend_1 set school_name ="龙港市第九小学1" where student_base_id=206470646604544

*** (1) HOLDS THE LOCK(S):
RECORD LOCKS space id 1253 page no 16 n bits 792 index indx_student_id of table `edu_health_student`.`tb_student_attend_1` trx id 18445008 lock_mode X
Record lock, heap no 51 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 8; hex 8000bcefa1ab2b00; asc       + ;;
 1: len 8; hex 8000bcefa1ab2b00; asc       + ;;


*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1253 page no 16 n bits 792 index indx_student_id of table `edu_health_student`.`tb_student_attend_1` trx id 18445008 lock_mode X waiting
Record lock, heap no 4 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 8; hex 8000bbc8b18c2300; asc       # ;;
 1: len 8; hex 8000bbc8b18c2300; asc       # ;;


*** (2) TRANSACTION:
TRANSACTION 18445009, ACTIVE 15 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 5 lock struct(s), heap size 1128, 4 row lock(s), undo log entries 1
MySQL thread id 22833, OS thread handle 140126251333376, query id 405981874 192.168.60.123 root updating
/* ApplicationName=DataGrip 2022.3.3 */ update edu_health_student.tb_student_attend_1 set school_name ="温州市蒲鞋市小学龟湖校区1" where student_base_id=207737395555072

*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 1253 page no 16 n bits 792 index indx_student_id of table `edu_health_student`.`tb_student_attend_1` trx id 18445009 lock_mode X
Record lock, heap no 4 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 8; hex 8000bbc8b18c2300; asc       # ;;
 1: len 8; hex 8000bbc8b18c2300; asc       # ;;


*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1253 page no 16 n bits 792 index indx_student_id of table `edu_health_student`.`tb_student_attend_1` trx id 18445009 lock_mode X waiting
Record lock, heap no 51 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 8; hex 8000bcefa1ab2b00; asc       + ;;
 1: len 8; hex 8000bcefa1ab2b00; asc       + ;;

*** WE ROLL BACK TRANSACTION (2)

这里很清楚地能看到了死锁正在执行的sql,HOLDS THE LOCK(持有的锁)和WAITING FOR THIS LOCK TO BE GRANTED(等待被释放的锁),这种就是两个事务发生互相等待了,一般会有两种情况:

  • 业务逻辑上操作相反的sql
  • 逻辑上顺序一致,但是sql产生了间隙锁,两个sql互相锁住了间隙。可以看下上面日志里是否有gap关键字来判断

标签:thread,lock,trx,死锁,msyql,sql,超时,id,schema
From: https://www.cnblogs.com/leecoder5/p/18420838

相关文章

  • 线上锁超时排查(手动事务transactionTemplate导致的诡异锁超时)---此篇篇幅很长,带好瓜子
    事情起因起因是某天线上突然不停报锁超时,重启后又会变正常,但是在某个时刻又会重复发生。这个是报错的日志(日志对检测这种bug不一定有用,唯一的作用就是告诉我们啥表被锁了,但是看不出因为啥被锁的)###SQL:INSERTINTOt_check_record(id,create_time,update_time,creator,opera......
  • [Java手撕]手撕死锁
    一种不安全的写法publicclassMain{publicstaticfinalObjectlock1=newObject();publicstaticfinalObjectlock2=newObject();publicstaticvoidmain(String[]args){Threadthread1=newThread(newRunnable(){@Ov......
  • 优化 OR 条件过多导致的查询超时
    优化OR条件过多导致的查询超时文章目录优化OR条件过多导致的查询超时背景问题分析方案分析方案一:入参去重方案二:分页或者分批查询方案三:UNION代替OR方案四:IN代替OR1.分别对列进行`IN`查询,在代码中进行数据筛选。2.对多列进行`IN`查询。小结方案五:JOIN查......
  • SQL Server 的死锁
    死锁经常与正常阻塞混淆。当一个事务请求对另一个事务锁定的资源加锁时,请求加锁的事务会等待直到锁被释放。默认情况下,除非设置了LOCK_TIMEOUT,否则SQLServer事务不会超时。请求事务被阻塞,而不是死锁,因为请求事务没有做任何事情来阻塞持有锁的事务。最终,持有锁的事务将完成并释......
  • 排它锁、共享锁与死锁的全面解析
    ‍前言在数据库技术中,封锁是实现并发控制的核心手段。它保证了多个事务同时进行时的数据一致性,避免了数据冲突和错误。对于初学者来说,理解封锁技术是迈向数据库高手的第一步。这篇文章将通过简洁明了的方式,帮助你轻松掌握封锁的相关知识。什么是封锁?为什么重要?封锁,顾......
  • 完全解决git clone超时和git子模块无法下载问题
    1.解决gitclone连接超时的问题国内可用的镜像网站有:https://github.com.cnpmjs.org#服务器位于香港https://gitclone.com#服务器位于杭州https://doc.fastgit.org#服务器位于香港例如:将gitclonehttps://github.com/XXXXX修改为gitclonehttps://gi......
  • 美团面试:Redis锁如何续期?Redis锁超时,任务没完怎么办?
    文章很长,且持续更新,建议收藏起来,慢慢读!疯狂创客圈总目录博客园版为您奉上珍贵的学习资源:免费赠送:《尼恩Java面试宝典》持续更新+史上最全+面试必备2000页+面试必备+大厂必备+涨薪必备免费赠送:《尼恩技术圣经+高并发系列PDF》,帮你实现技术自由,完成职业升级,薪......
  • 谈一谈数据库中的死锁问题
    文章目录死锁是什么?死锁的四个必要条件避免死锁的策略本篇文章是基于《MySQL45讲》来写的个人理解与感悟。死锁是什么?死锁是指两个或两个以上的进程在执行过程中,由于竞争资源或者由于彼此通信而造成的一种阻塞的现象。若无外力作用,它们都将无法推进下去。此时称系......
  • 解决rabbitmq队列超时timeout问题【win环境】
    解决rabbitmq队列超时timeout问题【win环境】1.安装RabbitMQ-PluginscdC:\ProgramFiles\RabbitMQServer\rabbitmq_server-3.11.3\sbinrabbitmq-pluginsenablerabbitmq_management浏览器打开http://localhost:15672来访问web端的管理界面,用户名:guest,密码:guest进入管理......
  • 网络编程(setsockopt、超时检测)
    【1】setsockopt:设置套接字属性set:设置sock:套接字option:属性intsetsockopt(intsockfd,intlevel,intoptname,void*optval,socklen_toptlen)功能:获得/设置套接字属性参数:sockfd:套接字描述符level:协议层optname:选项名optval:选项值optlen:选项值大小返回值:......