首页 > 其他分享 >调试遗漏IoStartNextPacket引起的阻塞

调试遗漏IoStartNextPacket引起的阻塞

时间:2022-11-07 20:34:00浏览次数:59  
标签:NULL IRP 遗漏 00000000 线程 IoStartNextPacket devCtx irp 调试


    前面 ​​ driver verifier检测驱动死锁 ​​ 一文中本想检测一下驱动中潜在的死锁来解决驱动无响应的bug,然而并没有实质性的进展。后来通过一系列的调试终于找到了根源所在,本文用于记录查找问题的过程。

    下面是测试程序,主线程循环读驱动,子线程循环写驱动,由于是异步操作,每次读写结束后,线程都要调用WaitForSingleObject等待驱动完成IRP。可是由于驱动处理有误,ReadFile调用2次后,主线程就彻底卡死在Wait函数上了:

hDev = CreateFileA(interfaceBuff, 
GENERIC_ALL,
FILE_SHARE_READ | FILE_SHARE_WRITE,
NULL, OPEN_EXISTING,
FILE_ATTRIBUTE_NORMAL | FILE_FLAG_OVERLAPPED,
NULL);

InitializeCriticalSection(&cs);

CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)ThreadProc, NULL, 0, &threadId);

while (1)
{
memset(readBuff, 0, 4096);
BOOL bRead = ReadFile(hDev, readBuff, 4096, &readLen, &overlapRd); 1)
WaitForSingleObject(overlapRd.hEvent, INFINITE); 2)

if (readLen)
{
EnterCriticalSection(&cs);
printf("_tmain read:%s\n",readBuff);
LeaveCriticalSection(&cs);
}

}

CloseHandle(hDev);
return 0;
}

DWORD ThreadProc(void* param)
{
OVERLAPPED overlapRd = { 0 },overlapWr = {0};
overlapRd.hEvent = CreateEvent(NULL, FALSE, FALSE, NULL);
overlapWr.hEvent = CreateEvent(NULL, FALSE, FALSE, NULL);

char writeBuff[4096] = { "_ThreadProc" }, readBuff[4096] = { 0 };
DWORD len = 0, writeLen, readLen;

while (1)
{
WriteFile(hDev, writeBuff, strlen(writeBuff), &writeLen, &overlapWr);
WaitForSingleObject(overlapWr.hEvent, INFINITE);

}
}

    对于异步IRP,调用ReadFile会传入OVERLAPPED结构,之后WaitForSingleObject会在OVERLAPPED!HEVENT上等待。当驱动中完成读请求(调用IoCompleteRequest)时,会调用KeSetEvnet(irp->UserEvent);使得用户层的等待函数返回。基于上述分析,可以推知,ReadFile的IRP请求被挂起后一直没有被完成(悬而未决的IRP请求)。为了验证我的猜测,可以查看一下线程的IRP请求:

1.首先查看R3下线程ID,当前进程有3个线程,0/1分别对应读写线程,其ID号分别为:75c.3a8/75c.828;2号线程可能是windbg调试线程

0:002> ~*kb ;注意这里的0:002> 暗示当前是在R3下调试

0 Id: 75c.3a8 Suspend: 1 Teb: 7ffdf000 Unfrozen
ChildEBP RetAddr Args to Child
0042d53c 77165e6c 754f179c 0000007c 00000000 ntdll!KiFastSystemCallRet
0042d540 754f179c 0000007c 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
0042f7d0 001b400a 00000001 005f96d0 005f9738 onlyForWrite!main+0x22b [c:\studio\samplechar\onlyforwrite\onlyforwrite\testsamplechar.cpp @ 173]

1 Id: 75c.828 Suspend: 1 Teb: 7ffde000 Unfrozen
ChildEBP RetAddr Args to Child
01c7db38 77165e6c 754f179c 00000090 00000000 ntdll!KiFastSystemCallRet
01c7db3c 754f179c 00000090 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
01c7fd20 76ef1174 00000000 01c7fd6c 7717b3f5 onlyForWrite!ThreadProc+0x139 [c:\studio\samplechar\onlyforwrite\onlyforwrite\testsamplechar.cpp @ 200]

# 2 Id: 75c.8dc Suspend: 1 Teb: 7ffdd000 Unfrozen
ChildEBP RetAddr Args to Child
007efa94 771bd279 7764a5c0 00000000 00000000 ntdll!DbgBreakPoint
007efac4 76ef1174 00000000 007efb10 7717b3f5 ntdll!DbgUiRemoteBreakin+0x3c

2.利用上面的线程号,在R0 windbg上查看线程挂起的IRP:

kd> !process 0 0 ;查找所有进程  注意这里调试开头是kd>
**** NT ACTIVE PROCESS DUMP ****
PROCESS 89a4bd40 SessionId: 1 Cid: 075c Peb: 7ffd4000 ParentCid: 0f8c
DirBase: 7ff7e480 ObjectTable: 8a2a16d0 HandleCount: 57.
Image: onlyForWrite.exe ;找到onlyForWrite.exe进程的信息


kd> !process 89a4bd40 6 ;查找进程onlyForWrite.exe下所有线程
PROCESS 89a4bd40 SessionId: 1 Cid: 075c Peb: 7ffd4000 ParentCid: 0f8c
DirBase: 7ff7e480 ObjectTable: 8a2a16d0 HandleCount: 57.
Image: onlyForWrite.exe

THREAD 8a927d48 Cid 075c.03a8 Teb: 7ffdf000 Win32Thread: fd3c7dd8 WAIT: (Suspended) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1
8a927f10 Semaphore Limit 0x2
IRP List: ;这个线程有IRP队列
8773eef8: (0006,0100) Flags: 00060970 Mdl: 00000000


THREAD 8b3a37d8 Cid 075c.0828 Teb: 7ffde000 Win32Thread: 00000000 WAIT: (Suspended) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1
8b3a39a0 Semaphore Limit 0x2
Not impersonating
DeviceMap 946136a0
Owning Process 89a4bd40 Image: onlyForWrite.exe
Attached Process N/A Image: N/A


THREAD 8b328cd8 Cid 075c.08dc Teb: 7ffdd000 Win32Thread: 00000000 WAIT: (Executive) KernelMode Non-Alertable
SuspendCount 1
92ddd6e0 SynchronizationEvent
Not impersonating
DeviceMap 946136a0
Owning Process 89a4bd40 Image: onlyForWrite.exe
Attached Process N/A Image: N/A

对比windbg输出的3个线程信息,发现唯独Cid==075c.03a8线程有IRP队列。结合R3 windbg中查到的读线程ID:0x3a8,可以确定读线程的确出了什么问题。再进一步确认IRP是不是应用层发出的:

kd> !irp 8773eef8 ;8773ee8从前面THREAD 8a927d48的IRP List中获取
Irp is active with 2 stacks 2 is current (= 0x8773ef8c)
No Mdl: System buffer=8776e000: Thread 8a927d48: Irp stack trace.
cmd flg cl Device File Completion-Context
[N/A(0), N/A(0)]
0 0 00000000 00000000 00000000-00000000

Args: 00000000 00000000 00000000 00000000
>[IRP_MJ_READ(3), N/A(0)]
0 1 8a7762b8 876c7988 00000000-00000000 pending -->irp状态是挂起的
\Driver\SampleChar
Args: 00001000 00000000 00000000 00000000
kd> da 8776e000
8776e000 "................................"
8776e020 "................................"

果不其然,这是程序发出的IRP请求未被处理。至少可以肯定应该是驱动中StartIo函数有错误了。

重启电脑后,再次运行测试程序,在驱动程序读派遣函数和StartIo函数中下断点:

#pragma code_seg()
void SampleStartIo(PDEVICE_OBJECT devObj, PIRP irp)
{
KEVENT workEvt, completeEvt;
KIRQL origIrql;
NTSTATUS status = STATUS_SUCCESS;
unsigned long readLen;
SampleCharDevContext* devCtx = (SampleCharDevContext*)devObj->DeviceExtension;
IO_STACK_LOCATION* curStack = IoGetCurrentIrpStackLocation(irp);

KeInitializeEvent(&workEvt,SynchronizationEvent,FALSE);
KeInitializeEvent(&completeEvt, NotificationEvent, FALSE);

if (curStack->Parameters.Read.Length > 4096)
{
status = irp->IoStatus.Status = STATUS_BUFFER_OVERFLOW;
irp->IoStatus.Information = 0;

IoCompleteRequest(irp, IO_NO_INCREMENT);
IoStartNextPacket(devObj, FALSE);
return;
}

KeAcquireSpinLock(&devCtx->devSpinLock, &origIrql);

if(devCtx->buffPos != 0x00UL)
{
readLen = devCtx->buffPos >= curStack->Parameters.Read.Length ? curStack->Parameters.Read.Length : devCtx->buffPos;
RtlCopyMemory(irp->AssociatedIrp.SystemBuffer,
devCtx->SampleBuff,
readLen);
devCtx->buffRemained += readLen;
devCtx->buffPos -= readLen;
}
else
{
KeReleaseSpinLock(&devCtx->devSpinLock, origIrql);

irp->IoStatus.Status = STATUS_SUCCESS;
irp->IoStatus.Information = 0x00UL;
IoCompleteRequest(irp, IO_NO_INCREMENT); ->1)
//maybe blocked at here,if without this
//IoStartNextPacket(devObj, FALSE); ->4)
return; ->2)
}

KeReleaseSpinLock(&devCtx->devSpinLock, origIrql);

IoCopyCurrentIrpStackLocationToNext(irp);
IoSetCompletionRoutine(irp, IrpAsyncReadCompleteRoutine, &completeEvt, TRUE, TRUE, TRUE);

status = IoCallDriver(devCtx->lowerDev,irp);
if (status == STATUS_PENDING)
{
KeWaitForSingleObject(&completeEvt, Executive, KernelMode, FALSE, NULL);
}

irp->IoStatus.Status = STATUS_SUCCESS;
irp->IoStatus.Information = readLen;
IoCompleteRequest(irp, IO_NO_INCREMENT);

IoStartNextPacket(devObj,FALSE);
}

NTSTATUS SampleCharReadAsync(PDEVICE_OBJECT devObj, PIRP irp)
{
IoMarkIrpPending(irp);
IoStartPacket(devObj,irp,NULL,NULL); ->3)
return STATUS_PENDING;
}

第一次调用ReadFile时,驱动依次经过代码中标注3)->1)->2)的顺序处理IRP;当第二次调用ReadFile时,程序只经历了步骤3),之后StartIo就再也没有被调用。给我的感觉是设备默认的IRP队列出问题了。检查一下设备队列的情况:

kd> !devobj 8a7762b8 -------------->8a7762b8是前面!irp 命令输出的irp相关的devobj
Device object (8a7762b8) is for:
SampleChar \Driver\SampleChar DriverObject 8774cb18
Current Irp 8773eef8 RefCount 0 Type 00000022 Flags 0000204c
Dacl 8fb049c0 DevExt 8a776370 DevObjExt 8a7763f0
ExtensionFlags (0x30000000) DOE_RAW_FDO
Unknown flags 0x10000000
Characteristics (0x00000100) FILE_DEVICE_SECURE_OPEN
AttachedTo (Lower) 8761a400 \Driver\PnpManager
Device queue is busy - Queue flink = blink ------->windbg显示设备队列忙

    结合wrk源码,只有当设备队列为空闲时,Io管理器会将设备队列中的IRP出队交给StartIo处理。而调用IoStartPacket会置设备队列为Busy;调用IoStartNextPacket会置设备队列为Free。这么分析下来,就是说我的代码中少做一次IoStartNextPacket,进一步说,前一次处理IRP后,漏调用IoStartNextPacket....基于这样的猜测,错误就明朗了,只要把代码中标注为4)的位置前的注释符去掉,重新编译即可。再次验证,果然解决了驱动没有响应的情况。

标签:NULL,IRP,遗漏,00000000,线程,IoStartNextPacket,devCtx,irp,调试
From: https://blog.51cto.com/u_13927568/5831330

相关文章

  • windbg调试窗口过程WindowProc(winxp 32bit)
      ollydbg在调试窗口程序方面做得很便捷,虽然windbg在这方面不如od,但通过命令的组合也能达到类似的效果。我借winxp的calc.exe为例来谈谈如何用windbg调试窗口过程。 ......
  • 编写windbg调试器扩展 入门篇1
      我博客的左侧专栏曾经转过windows下编写调试器的一系列文章,这类文章是从零打造调试器,而这篇文章是介绍如何为windbg编写调试器扩展命令。0.前言  windbg的命令......
  • Linux性能调试——stress压测工具详解
    目录一.stress简介1.stress简介2.stress安装二.stress使用1.stress命令2.使用三.stress测试场景四.stress-ng简介1.stress-ng简介2.stress-ng安装3.stress-ng命令一.stres......
  • 移动端调试工具
    移动端调试工具一般都用的VConsole现在分享一个其他调试工具 eruda个人感觉挺好用的//调试工具https://cdn.jsdelivr.net/npm/eruda除VConsole外另一个移动端调试......
  • NodeJs断点调试详细说明
    node在运行js的时候,可以传递参数让其运行在调试模式,运行在调试模式后,引擎会向某个端口发送调试相关信息。一、在devTools中调试1.在chrome中接收调试信息在浏览器上打开:ch......
  • node.js inspect 浏览器 断点调试技巧与原理
    做前端开发你一定会用到浏览器自带的各种调试工具firebug谷歌的debugtools等,我们太过于熟悉使用这些工具了,以致于在node开发中同样的js文件我们是否也可以用浏览器就行调......
  • sshd开启调试模式
    今天使用私钥方式无法登录服务器,做了以下排查,都没解决问题:1.删除root目录下.ssh目录,重新解压自己秘钥2.查看.ssh目录及子文件权限3.替换正常的sshd_config文件只能尝试......
  • 【node专题】node几种调试方式
    【node专题】node几种调试方式前端工作日2021-11-24223点击上方【前端工作日】关注我哦~node项目调试和我们写web界面的调试基本是一致的,本质都是在某个地方查......
  • 11 Sonic - 调试 opencv
    sonic项目本身支持opencv图像识别库,但我在mac上调试却一直报错,最后在mac上也没有成功。没办法只能在windows上调试opencv代码了。一、mac上报错信息在mac上一......
  • stm32cubeIDE STLINK连接SWD接口调试,启动GDB server失败的处理
    stm32cubeIDESTLINK连接SWD接口调试,启动GDBserver失败的处理报警提示Failedtobindtoport61234,errorcode-1:NoerrorFailurestartingGDBserver:TCPpor......