首页 > 其他分享 >中断嵌套导致的panic

中断嵌套导致的panic

时间:2023-09-25 11:46:03浏览次数:41  
标签:11 casa 17 23 中断 1688029807 嵌套 panic 0629

串口日志:

[0629_17:01:42]casa_switch_client_tx: ioctl rc -1
[0629_17:01:42]casa_switch_client_tx: ioctl rc -1
[0629_17:01:42]casa_switch_client_tx: ioctl rc -1
[0629_17:01:42]casa_switch_client_tx: ioctl rc -1
[0629_17:01:42]casa_switch_client_tx: ioctl rc -1
[0629_17:01:42]casa_switch_client_tx: ioctl rc -1
[0629_17:01:42]casa_switch_client_tx: ioctl rc -1
[0629_17:02:03][Thu Jun 29 09:00:47.695 UTC 2023]-AL-SNMP-1: smm6: docs_ev_receive(pid=2182): docs onoff rb is full, wait cnt 2, log cnt 262145
[0629_17:02:03]docs onoff rb is full, wait cnt 2, log cnt 262145
[0629_17:02:16]casa_switch_client_tx: ioctl rc -1
[0629_17:02:16]casa_switch_client_tx: ioctl rc -1
[0629_17:02:16]casa_switch_client_tx: ioctl rc -1
[0629_17:02:16]casa_switch_client_tx: ioctl rc -1
[0629_17:02:16]casa_switch_client_tx: ioctl rc -1
[0629_17:02:18]casa_switch_client_tx: ioctl rc -1
[0629_17:02:18]casa_switch_client_tx: ioctl rc -1
[0629_17:07:48][Thu Jun 29 09:06:32.679 UTC 2023]-AL-SNMP-1: smm6: docs_ev_receive(pid=2182): docs onoff rb is full, wait cnt 1, log cnt 524289
[0629_17:07:48]docs onoff rb is full, wait cnt 1, log cnt 524289
[0629_17:11:23][1688029807] 0:NMI watchdog occurred during interrupt
[0629_17:11:23][1688029807] 0:[#1]:
[0629_17:11:23][1688029807] 0:Cpu 0
[0629_17:11:23][1688029807] 0:$ 0   : 00000000 1000dc00 00000001 00000001
[0629_17:11:23][1688029807] 0:$ 4   : 873ee08c 1000dc01 00000000 00000000
[0629_17:11:23][1688029807] 0:$ 8   : 8c887b58 00000008 87a90000 bef1f000
[0629_17:11:23][1688029807] 0:$12   : 00000000 8c887c38 82378210 00000000
[0629_17:11:23][1688029807] 0:$16   : 873ee000 00000022 00000000 00000000
[0629_17:11:23][1688029807] 0:$20   : 82048c98 82048c90 8201075c 8208cd64
[0629_17:11:23][1688029807] 0:$24   : 00000008 8201194c                  
[0629_17:11:23][1688029807] 0:$28   : 8c886000 8c887b40 82400950 c00b2534
[0629_17:11:23][1688029807] 0:Hi    : 00000000
[0629_17:11:23][1688029807] 0:Lo    : 00000000
[0629_17:11:23][1688029807] 0:epc   : 82010cd4 _write_unlock_irq+0x38/0x154
[0629_17:11:23][1688029807] 0:    Tainted: P         
[0629_17:11:23][1688029807] 0:ra    : c00b2534 casa_bkn_get_congestion_data+0x2370/0x2b9c [linux_bcm_knet]
[0629_17:11:23][1688029807] 0:Status: 1000dc05    KERNEL ERL IE 
[0629_17:11:23][1688029807] 0:Cause : 10800000
[0629_17:11:23][1688029807] 0:PrId  : 000c4003 (XLS616 Rev B1)
[0629_17:11:23][1688029807] 0:Modules linked in: kmod_util(P) kcasalog(P) linux_casa_knet(P) linux_bcm_knet(P) linux_uk_proxy(P) linux_user_bde(P) linux_kernel_bde(P) zbbusctr
[0629_17:11:23][1688029807] 0:Process cat (pid: 31182, threadinfo=8c886000, task=8ca8b6f0, tls=2aacbea0)
[0629_17:11:23][1688029807] 0:Stack : c0078ee0 c0078eb8 8201075c 8208cd64 c008c308 00000022 00000000 00000000
[0629_17:11:23][1688029807] 0:        82048c98 82048c90 c0081588 c0081574 89b21480 85c87218 85c87248 89b214b8
[0629_17:11:23][1688029807] 0:        8733b520 8208b940 8c887bd8 00000001 82412900 00000022 82412900 00000022
[0629_17:11:23][1688029807] 0:        8733b520 82412930 87b50000 8208b8a4 8208bb7c 8208bbfc 00000022 85c87218
[0629_17:11:23][1688029807] 0:        00000400 00400000 00000022 00000000 8c887e08 8ca9e000 ffffffff ffffffff
[0629_17:11:23][1688029807] 0:        ...
[0629_17:11:23][1688029807] 0:Call Trace:
[0629_17:11:23][1688029807] 0:[<82010cd4>] _write_unlock_irq+0x38/0x154
[0629_17:11:23][1688029807] 0:[<c00b2534>] casa_bkn_get_congestion_data+0x2370/0x2b9c [linux_bcm_knet]
[0629_17:11:23][1688029807] 0:
[0629_17:11:23][1688029807] 0:
[0629_17:11:23][1688029807] 0:Code: 1000fe9b  24020000  c0820000 <1440fffe> 24020001  1000fe9f  00000000  c0820000  1440fffe 
[0629_17:11:23][1688029807] 0: xsave_dmesg:[cat 31182]: preempt=10020000 ctxSwtch:[v=2 iv=19]
[0629_17:11:23][1688029807] 0: part_panic_wite mtd=[8e8505e0,DMESG-SAVE] write[off=0 len=82796] [erasesz=20000 writesz=1]
[0629_17:11:23][1688029807] 0: !!!cfi_casa_panic_write: mtd[8e859c00, Casa10G Flash]: write[87b136f8] to off=420000 len=82796 In_RANGE=1
[0629_17:11:23][1688029807] 0:!! map=82405ab0 cfi=8e859cc0 cfi->numchips=1 chip[0]=8e859cf8 shift=26
[0629_17:11:23][1688029807] 0: **cfi_casa_panic_write():Good to go for [82796 bytes)!!
[0629_17:11:23][1688029807] 0: **cfi_casa_panic_write():Done [82796 bytes in 0 ticks]!!
[0629_17:11:23][1688029807] 0: xsave_dmesg:[cat 31182]: preempt=10020000 ctxSwtch:[v=2 iv=19]
[0629_17:11:23][1688029807] 0: **XSAVE_DMESG: DONE saving 82796 bytes from 87b136f8 to dmesg_dump
[0629_17:11:23][1688029807] 0:Kernel panic - not syncing: Fatal exception in interrupt


当时测试正在跑脚本查看 /proc 下的内容:

croot@cts01brnsor:/proc/bcm/knet# while true ; do grep -a -A 8 "Tx DCB info"
dma && grep -i drop stats ; usleep 500000 ; done

发现在自己的环境上跑这个命令也能够复现
并且在串口上有这样的输出

[1688031351] 0:NMI watchdog occurred during interrupt
[1688031351] 0:[#1]:
[1688031351] 0:Cpu 0
console login: [1688037020] 0:NMI watchdog configured to reboot
[1688037020] 0:[#1]:
[1688037020] 0:Cpu 0
console login: [1688038743] 0:NMI watchdog occurred during interrupt
[1688038743] 0:[#1]:
[1688038743] 0:Cpu 0
[1688040014] 0:NMI watchdog occurred during interrupt
[1688040014] 0:[#1]:
[1688040014] 0:Cpu 0
console login: [1688046101] 0:NMI watchdog occurred during interrupt
[1688046101] 0:[#1]:
[1688046101] 0:Cpu 0


查看一下kernel里对应的源码:

313 //--- This is really the combined regular IRQ and NMI handler
314 // The cpu ack register doesn't tell who failed to report
315 void  watchdog_irq_handle(int isnmi, struct pt_regs *regs)
316 {
...............
368     if (in_interrupt()) {
369       kill = 1;
370       strcpy(str, "NMI watchdog occurred during interrupt\n");
371     }
372     else if (isnmi && (phoenix_wdt_status & 1)) {
373       kill = 1;
374       strcpy(str, "NMI watchdog configured to reboot\n");
375     }
"./drivers/watchdog/phoenix_wdt.c" 926L, 24481C

看起来是由于死锁导致喂狗失败导致的,并且 proc read dma 的实现中也有 spin_lock() 上锁
查了一下spin_lock()spin_lock_irqsave()的区别,发现spin_lock_irqsave()会禁用本地中断并且禁止内核抢占,而spin_lock()不会
关于NMI watchdog中断的说明如下:


看一下给出的Call Trace

[0629_17:11:23][1688029807] 0:Call Trace:
[0629_17:11:23][1688029807] 0:[<82010cd4>] _write_unlock_irq+0x38/0x154
[0629_17:11:23][1688029807] 0:[<c00b2534>] casa_bkn_get_congestion_data+0x2370/0x2b9c [linux_bcm_knet]

这里最顶层就是尝试去unlock,这是在另一个 check rx 线程里的动作
而在 proc read dma 中,read 的实现全程都占用了sinfo->lock
所以很明显的就是两个线程同时访问锁,导致死锁,进而看门狗超时,设备重启

改成用spin_lock_irqsave()上锁就好了

标签:11,casa,17,23,中断,1688029807,嵌套,panic,0629
From: https://www.cnblogs.com/fallenmoon/p/17727603.html

相关文章

  • 访问空指针出panic
    panic信息:<4>[1670581299]15:Failedonwritei2cdev=23(I2CDEV_RFSW0)<4>[1670581299]15:txretryfailed<4>[1670581299]15:Failedonwritei2cdev=23(I2CDEV_RFSW0)<1>[1670595592]4:CPU4Unabletohandlekernelpagingrequestatvir......
  • python基础 循环嵌套
    嵌套语句使用星号打印foriinrange(1,6):forxinrange(1,i+1):print('*',end='')print('')foriinrange(1,6):forxinrange(i,6):print('*',end='')print('')99乘法表  while......
  • 嵌入式专题研究1:USART和中断系统
    USART通信同步通信和异步通信:同步通信:通信双方根据同步信号进行通信的方法。异步通信:依赖于双方彼此的独立时钟,约定好通信速度。串行通信和并行通信寄存器介绍:控制寄存器USART_CR:数据位,停止位,奇偶校验波特率寄存器USART_BRR状态寄存器USART_SR:发送寄存器和接受寄存器的状......
  • 6 定时器 &中断管理&资源管理
    生成函数周期,多久触发一次定时器(从定时器启动开始计算) 触发什么呢?回调函数被调用,被谁调用呢1,tick中断去调用timer函数 Linux中使用这个 (定时器回调函数若执行时间过长,会阻碍其他定时器函数执行)2,freeRTOS中一般定时器守护函数设置优先级比较高使用定时器的话,需要定义......
  • label嵌套radio对齐方法
    1.暂时使用flex和margin-top来对齐<style>.rd{display:flex;align-items:center;}.rdinput{margin-top:1px}</style><div><labelclass="rd"><inputtype="radio"value="1"name="gender"......
  • INT13H 是一个基本的硬盘控制器中断,用于在实模式下进行对硬盘的读写操作。它提供了多
    INT13H是一个基本的硬盘控制器中断,用于在实模式下进行对硬盘的读写操作。它提供了多个功能来操控硬盘,包括磁盘读、磁盘写、磁盘参数查询等。以下是一些常见的INT13H功能:磁盘读取:功能号2:从磁盘上读取扇区到内存缓冲区。寄存器AH=02h,AL表示要读取的扇区数量,CH表示磁道号......
  • 中断-CH32V103如何关闭全局中断
    CH32V103相较于其他RISCV芯片,无法关闭全局中断。目前EVT源码工程中只预留了关闭单个线中断的函数,如下图所示:实现CH32V103关闭全局中断的操作方式如下:1、替换使用特定的起始文件和core_riscv.h文件,因博客无法上传附件,可在此链接下获取:https://www.wch.cn/bbs/thread-109625-1.ht......
  • 07_IO外部中断
    IO外部中断cubemx上拉,用杜邦线,将PA11连接GND,将打印gpioisinputcubemx下拉,用杜邦线,将PA11连接V3.3,将打印gpioisinput......
  • 汇编-dos中断大全
     中断INT21H:1、字符功能调用类(Character-OrientedFunction)01H、07H和08H—从标准输入设备输入字符02H—字符输出03H—辅助设备的输入04H—辅助设备的输出05H—打印输出06H—控制台输入/输出09H—显示字符串0AH—键盘缓冲输入0BH—检测输入状态0CH—清输入缓冲......
  • 汇编-BIOS中断大全:“INT 10H ~ INT 1AH”
     BIOS中断大全:“INT10H~INT1AH”1、显示服务(VideoService——INT10H)00H—设置显示器模式01H—设置光标形状02H—设置光标位置03H—读取光标信息04H—读取光笔位置05H—设置显示页06H、07H—初始化或滚屏08H—读光标处的字符及其属性09H—在光标处按指定属性......