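SystemServer dies after the device is repeatedly suspended and woken
Background:
During self-testing, a colleague found that repeatedly pressing the Power key intermittently kills the system.
Analysis
The system log shows: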
12-05 11:43:27.530 1509 1759 I Watchdog: Collecting Binder Transaction Status Information
12-05 11:43:29.704 1509 1759 E Watchdog: First set of traces taken from /data/anr/anr_2022-12-05-11-42-54-564
12-05 11:43:29.784 1509 1759 E Watchdog: Second set of traces taken from /data/anr/anr_2022-12-05-11-43-26-061
12-05 11:43:29.971 1509 1759 W Watchdog: *** WATCHDOG KILLING SYSTEM PROCESS: Blocked in monitor com.android.server.am.ActivityManagerService on foreground thread (android.fg), Blocked in handler on main thread (main), Blocked in handler on ui thread (android.ui), Blocked in handler on ActivityManager (ActivityManager), Blocked in handler on PowerManagerService (PowerManagerService)
12-05 11:43:29.972 1509 1759 W Watchdog: android.fg annotated stack trace:
12-05 11:43:29.972 1509 1759 W Watchdog: at com.android.server.am.ActivityManagerService.monitor(ActivityManagerService.java:26334)
12-05 11:43:29.973 1509 1759 W Watchdog: - waiting to lock <0x0ca849e2> (a com.android.server.am.ActivityManagerService)
12-05 11:43:29.973 1509 1759 W Watchdog: at com.android.server.Watchdog$HandlerChecker.run(Watchdog.java:212)
12-05 11:43:29.973 1509 1759 W Watchdog: at android.os.Handler.handleCallback(Handler.java:873)
12-05 11:43:29.973 1509 1759 W Watchdog: at android.os.Handler.dispatchMessage(Handler.java:99)
12-05 11:43:29.973 1509 1759 W Watchdog: at android.os.Looper.loop(Looper.java:193)
12-05 11:43:29.973 1509 1759 W Watchdog: at android.os.HandlerThread.run(HandlerThread.java:65)
12-05 11:43:29.973 1509 1759 W Watchdog: at com.android.server.ServiceThread.run(ServiceThread.java:44)
12-05 11:43:29.974 1509 1759 W Watchdog: main annotated stack trace:
12-05 11:43:29.974 1509 1759 W Watchdog: at com.android.server.am.ActivityManagerService.onWakefulnessChanged(ActivityManagerService.java:13424)
12-05 11:43:29.974 1509 1759 W Watchdog: - waiting to lock <0x0ca849e2> (a com.android.server.am.ActivityManagerService)
12-05 11:43:29.974 1509 1759 W Watchdog: at com.android.server.am.ActivityManagerService$LocalService.onWakefulnessChanged(ActivityManagerService.java:26567)
12-05 11:43:29.974 1509 1759 W Watchdog: at com.android.server.power.Notifier$1.run(Notifier.java:379)
12-05 11:43:29.974 1509 1759 W Watchdog: at android.os.Handler.handleCallback(Handler.java:873)
12-05 11:43:29.974 1509 1759 W Watchdog: at android.os.Handler.dispatchMessage(Handler.java:99)
12-05 11:43:29.975 1509 1759 W Watchdog: at android.os.Looper.loop(Looper.java:193)
12-05 11:43:29.975 1509 1759 W Watchdog: at com.android.server.SystemServer.run(SystemServer.java:467)
12-05 11:43:29.975 1509 1759 W Watchdog: at com.android.server.SystemServer.main(SystemServer.java:303)
12-05 11:43:29.975 1509 1759 W Watchdog: at java.lang.reflect.Method.invoke(Native Method)
12-05 11:43:29.975 1509 1759 W Watchdog: at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:493)
12-05 11:43:29.975 1509 1759 W Watchdog: at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:838)
12-05 11:43:29.975 1509 1759 W Watchdog: android.ui annotated stack trace:
12-05 11:43:29.976 1509 1759 W Watchdog: at com.android.server.wm.WindowManagerService.updateRotationUnchecked(WindowManagerService.java:3906)
12-05 11:43:29.976 1509 1759 W Watchdog: - waiting to lock <0x09a5faf0> (a com.android.server.wm.WindowHashMap)
12-05 11:43:29.976 1509 1759 W Watchdog: at com.android.server.wm.WindowManagerService.updateRotation(WindowManagerService.java:3859)
12-05 11:43:29.976 1509 1759 W Watchdog: at com.android.server.policy.PhoneWindowManager.updateRotation(PhoneWindowManager.java:8066)
12-05 11:43:29.976 1509 1759 W Watchdog: at com.android.server.policy.PhoneWindowManager$MyOrientationListener$UpdateRunnable.run(PhoneWindowManager.java:1121)
12-05 11:43:29.976 1509 1759 W Watchdog: at android.os.Handler.handleCallback(Handler.java:873)
12-05 11:43:29.976 1509 1759 W Watchdog: at android.os.Handler.dispatchMessage(Handler.java:99)
12-05 11:43:29.976 1509 1759 W Watchdog: at android.os.Looper.loop(Looper.java:193)
12-05 11:43:29.976 1509 1759 W Watchdog: at android.os.HandlerThread.run(HandlerThread.java:65)
12-05 11:43:29.976 1509 1759 W Watchdog: at com.android.server.ServiceThread.run(ServiceThread.java:44)
12-05 11:43:29.977 1509 1759 W Watchdog: at com.android.server.UiThread.run(UiThread.java:43)
12-05 11:43:29.978 1509 1759 W Watchdog: ActivityManager annotated stack trace:
12-05 11:43:29.978 1509 1759 W Watchdog: at com.android.server.wm.WindowManagerService.onDisplayChanged(WindowManagerService.java:6824)
12-05 11:43:29.978 1509 1759 W Watchdog: - waiting to lock <0x09a5faf0> (a com.android.server.wm.WindowHashMap)
12-05 11:43:29.978 1509 1759 W Watchdog: at com.android.server.am.ActivityStackSupervisor.handleDisplayChanged(ActivityStackSupervisor.java:4454)
12-05 11:43:29.979 1509 1759 W Watchdog: - locked <0x0ca849e2> (a com.android.server.am.ActivityManagerService)
12-05 11:43:29.979 1509 1759 W Watchdog: at com.android.server.am.ActivityStackSupervisor.access$200(ActivityStackSupervisor.java:197)
12-05 11:43:29.979 1509 1759 W Watchdog: at com.android.server.am.ActivityStackSupervisor$ActivityStackSupervisorHandler.handleMessage(ActivityStackSupervisor.java:4810)
12-05 11:43:29.979 1509 1759 W Watchdog: at android.os.Handler.dispatchMessage(Handler.java:106)
12-05 11:43:29.979 1509 1759 W Watchdog: at android.os.Looper.loop(Looper.java:193)
12-05 11:43:29.979 1509 1759 W Watchdog: at android.os.HandlerThread.run(HandlerThread.java:65)
12-05 11:43:29.979 1509 1759 W Watchdog: at com.android.server.ServiceThread.run(ServiceThread.java:44)
12-05 11:43:29.990 1509 1759 W Watchdog: PowerManagerService annotated stack trace:
12-05 11:43:29.990 1509 1759 W Watchdog: at android.view.SurfaceControl.openTransaction(SurfaceControl.java:734)
12-05 11:43:29.991 1509 1759 W Watchdog: - waiting to lock <0x04d6d38f> (a java.lang.Class)
12-05 11:43:29.991 1509 1759 W Watchdog: at com.android.server.display.ColorFade.createSurface(ColorFade.java:572)
12-05 11:43:29.991 1509 1759 W Watchdog: at com.android.server.display.ColorFade.prepare(ColorFade.java:153)
12-05 11:43:29.991 1509 1759 W Watchdog: at com.android.server.display.DisplayPowerState.prepareColorFade(DisplayPowerState.java:179)
12-05 11:43:29.991 1509 1759 W Watchdog: at com.android.server.display.DisplayPowerController.animateScreenStateChange(DisplayPowerController.java:1349)
12-05 11:43:29.991 1509 1759 W Watchdog: at com.android.server.display.DisplayPowerController.updatePowerState(DisplayPowerController.java:778)
12-05 11:43:29.991 1509 1759 W Watchdog: at com.android.server.display.DisplayPowerController.access$500(DisplayPowerController.java:81)
12-05 11:43:29.991 1509 1759 W Watchdog: at com.android.server.display.DisplayPowerController$DisplayControllerHandler.handleMessage(DisplayPowerController.java:1756)
12-05 11:43:29.992 1509 1759 W Watchdog: at android.os.Handler.dispatchMessage(Handler.java:106)
12-05 11:43:29.992 1509 1759 W Watchdog: at android.os.Looper.loop(Looper.java:193)
12-05 11:43:29.992 1509 1759 W Watchdog: at android.os.HandlerThread.run(HandlerThread.java:65)
12-05 11:43:29.992 1509 1759 W Watchdog: at com.android.server.ServiceThread.run(ServiceThread.java:44)
12-05 11:43:29.992 1509 1759 W Watchdog: *** GOODBYE!
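The log shows that Watchdog killed the SystemServer process because several of its threads were blocked, including the ActivityManager (AMS) and PowerManagerService threads. When Watchdog fires, it writes trace files for that moment under /data/anr/. Pulling those files shows where the system was stuck. The pid on the Watchdog log lines (1509 here) identifies the blocked process, and that pid can be used to find the matching trace file.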
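The Watchdog lines carry everything needed to pick the right trace file, so this step can be scripted. The following is a minimal sketch, assuming the logcat text uses the "date time pid tid level tag: message" layout seen above; the function name summarize_watchdog and the shortened sample are made up for illustration.

```python
import re

# Layout assumed from the log above: date time pid tid level tag: message
LINE_RE = re.compile(
    r"^(\d\d-\d\d) (\d\d:\d\d:\d\d\.\d+)\s+(\d+)\s+(\d+)\s+([VDIWEF])\s+Watchdog: (.*)$"
)
TRACE_RE = re.compile(r"set of traces taken from (\S+)")
KILL_RE = re.compile(r"\*\*\* WATCHDOG KILLING SYSTEM PROCESS: (.*)")


def summarize_watchdog(log_text):
    """Collect the pid, trace file paths and blocked-thread reasons from Watchdog lines."""
    summary = {"pid": None, "traces": [], "blocked": []}
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a Watchdog line in the expected layout
        summary["pid"] = int(m.group(3))
        message = m.group(6)
        trace = TRACE_RE.search(message)
        if trace:
            summary["traces"].append(trace.group(1))
        kill = KILL_RE.search(message)
        if kill:
            summary["blocked"] = [part.strip() for part in kill.group(1).split(", ")]
    return summary


# Shortened copy of the log lines shown above.
SAMPLE = """\
12-05 11:43:29.704 1509 1759 E Watchdog: First set of traces taken from /data/anr/anr_2022-12-05-11-42-54-564
12-05 11:43:29.784 1509 1759 E Watchdog: Second set of traces taken from /data/anr/anr_2022-12-05-11-43-26-061
12-05 11:43:29.971 1509 1759 W Watchdog: *** WATCHDOG KILLING SYSTEM PROCESS: Blocked in handler on main thread (main), Blocked in handler on ui thread (android.ui)
"""

info = summarize_watchdog(SAMPLE)
print(info["pid"])
for path in info["traces"]:
    print(path)
for reason in info["blocked"]:
    print(reason)
```

The pid and the two paths are what you would then pull from the device; the list of blocked threads tells you which thread names to search for first in the trace file.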
The key parts of the trace file for this failure are:
traces_SystemServer_WDT05_12_11_43_29.703_pid1509
"main" prio=5 tid=1 Blocked // the main thread is blocked
| group="main" sCount=1 dsCount=0 flags=1 obj=0x74fa9a78 self=0x70a5014c00
| sysTid=1549 nice=-2 cgrp=default sched=0/0 handle=0x712b094548
| state=S schedstat=( 3270263651 1775835539 8648 ) utm=205 stm=122 core=0 HZ=100
| stack=0x7fe8a4a000-0x7fe8a4c000 stackSize=8MB
| held mutexes=
at com.android.server.am.ActivityManagerService.broadcastIntent(ActivityManagerService.java:22082)
- waiting to lock <0x095134ef> (a com.android.server.am.ActivityManagerService) held by thread 12
at android.app.ActivityManager.broadcastStickyIntent(ActivityManager.java:4078)
at android.app.ActivityManager.broadcastStickyIntent(ActivityManager.java:4068)
at com.android.server.BatteryService.lambda$sendBatteryChangedIntentLocked$0(BatteryService.java:685)
at com.android.server.-$$Lambda$BatteryService$2x73lvpB0jctMSVP4qb9sHAqRPw.run(lambda:-1)
at android.os.Handler.handleCallback(Handler.java:873)
at android.os.Handler.dispatchMessage(Handler.java:99)
at android.os.Looper.loop(Looper.java:193)
at com.android.server.SystemServer.run(SystemServer.java:467)
at com.android.server.SystemServer.main(SystemServer.java:303)
at java.lang.reflect.Method.invoke(Native method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:493)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:838)
The main thread is blocked: its trace shows it waiting for lock 0x095134ef, which is held by thread 12. Searching the trace for that lock and for tid=12 leads to the following thread:
"ActivityManager" prio=5 tid=12 Blocked
| group="main" sCount=1 dsCount=0 flags=1 obj=0x131c0660 self=0x70a51ef000
| sysTid=1606 nice=-2 cgrp=default sched=0/0 handle=0x708847c4f0
| state=S schedstat=( 4268581309 1291222446 6398 ) utm=169 stm=257 core=3 HZ=100
| stack=0x7088379000-0x708837b000 stackSize=1041KB
| held mutexes=
at com.android.server.wm.WindowManagerService.containsShowWhenLockedWindow(WindowManagerService.java:2904)
- waiting to lock <0x0e0577cd> (a com.android.server.wm.WindowHashMap) held by thread 41
at com.android.server.am.ActivityRecord.canShowWhenLocked(ActivityRecord.java:2929)
at com.android.server.am.ActivityStack.checkKeyguardVisibility(ActivityStack.java:2055)
at com.android.server.am.ActivityStack.ensureActivitiesVisibleLocked(ActivityStack.java:1918)
at com.android.server.am.ActivityStackSupervisor.ensureActivitiesVisibleLocked(ActivityStackSupervisor.java:3789)
at com.android.server.am.ActivityStackSupervisor.ensureActivitiesVisibleLocked(ActivityStackSupervisor.java:3773)
at com.android.server.am.ActivityStackSupervisor.activityIdleInternalLocked(ActivityStackSupervisor.java:2080)
at com.android.server.am.ActivityStackSupervisor$ActivityStackSupervisorHandler.activityIdleInternal(ActivityStackSupervisor.java:4743)
- locked <0x095134ef> (a com.android.server.am.ActivityManagerService)
at com.android.server.am.ActivityStackSupervisor$ActivityStackSupervisorHandler.handleMessage(ActivityStackSupervisor.java:4773)
at android.os.Handler.dispatchMessage(Handler.java:106)
at android.os.Looper.loop(Looper.java:193)
at android.os.HandlerThread.run(HandlerThread.java:65)
at com.android.server.ServiceThread.run(ServiceThread.java:44)
This thread, in turn, is waiting for lock 0x0e0577cd, which is held by thread 41. Following the chain:
"UEventObserver" prio=5 tid=41 Native
| group="main" sCount=1 dsCount=0 flags=1 obj=0x131c21b8 self=0x70874a4800
| sysTid=1690 nice=-4 cgrp=default sched=0/0 handle=0x7085f384f0
| state=S schedstat=( 387472992 511139673 1912 ) utm=29 stm=9 core=0 HZ=100
| stack=0x7085e35000-0x7085e37000 stackSize=1041KB
| held mutexes=
kernel: __switch_to+0xac/0xb8
kernel: binder_thread_read+0x404/0x1284
kernel: binder_ioctl_write_read.constprop.47+0x1e0/0x31c
kernel: binder_ioctl+0x224/0x6d0
kernel: do_vfs_ioctl+0x774/0x85c
kernel: SyS_ioctl+0x6c/0x94
kernel: __sys_trace_return+0x0/0x4
native: #00 pc 000000000007cac8 /system/lib64/libc.so (__ioctl+4)
native: #01 pc 000000000002c8f0 /system/lib64/libc.so (ioctl+132)
native: #02 pc 000000000005ccb0 /system/lib64/libbinder.so (android::IPCThreadState::talkWithDriver(bool)+244)
native: #03 pc 000000000005da5c /system/lib64/libbinder.so (android::IPCThreadState::waitForResponse(android::Parcel*, int*)+60)
native: #04 pc 000000000005d8b0 /system/lib64/libbinder.so (android::IPCThreadState::transact(int, unsigned int, android::Parcel const&, android::Parcel*, unsigned int)+176)
native: #05 pc 00000000000518c8 /system/lib64/libbinder.so (android::BpBinder::transact(unsigned int, android::Parcel const&, android::Parcel*, unsigned int)+72)
native: #06 pc 000000000007c174 /system/lib64/libgui.so (android::BpSurfaceComposer::setTransactionState(android::Vector<android::ComposerState> const&, android::Vector<android::DisplayState> const&, unsigned int)+512)
native: #07 pc 0000000000095f14 /system/lib64/libgui.so (android::SurfaceComposerClient::Transaction::apply(bool)+584)
at android.view.SurfaceControl.nativeApplyTransaction(Native method)
at android.view.SurfaceControl.access$400(SurfaceControl.java:60)
at android.view.SurfaceControl$Transaction.apply(SurfaceControl.java:1397)
at android.view.SurfaceControl.closeTransaction(SurfaceControl.java:751)
- locked <0x0ffec3c9> (a java.lang.Class<android.view.SurfaceControl>)
at android.view.SurfaceControl.closeTransaction(SurfaceControl.java:770)
at com.android.server.wm.WindowManagerService.closeSurfaceTransaction(WindowManagerService.java:848)
- locked <0x0e0577cd> (a com.android.server.wm.WindowHashMap)
at com.android.server.wm.RootWindowContainer.performSurfacePlacement(RootWindowContainer.java:600)
at com.android.server.wm.WindowSurfacePlacer.performSurfacePlacementLoop(WindowSurfacePlacer.java:207)
at com.android.server.wm.WindowSurfacePlacer.performSurfacePlacement(WindowSurfacePlacer.java:155)
at com.android.server.wm.WindowSurfacePlacer.performSurfacePlacement(WindowSurfacePlacer.java:145)
at com.android.server.wm.WindowManagerService.updateRotationUnchecked(WindowManagerService.java:3915)
- locked <0x0e0577cd> (a com.android.server.wm.WindowHashMap)
at com.android.server.wm.WindowManagerService.updateRotation(WindowManagerService.java:3859)
at com.android.server.policy.PhoneWindowManager.updateRotation(PhoneWindowManager.java:8075)
at com.android.server.policy.PhoneWindowManager.setHdmiPlugged(PhoneWindowManager.java:6103)
at com.android.server.policy.PhoneWindowManager$3.onUEvent(PhoneWindowManager.java:999)
at android.os.UEventObserver$UEventThread.sendEvent(UEventObserver.java:210)
at android.os.UEventObserver$UEventThread.run(UEventObserver.java:187)
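The lock-chasing done by hand above (main, then thread 12, then thread 41) can also be automated. The following is a rough sketch, assuming the ART trace format shown in this article, where a thread header looks like "name" prio=5 tid=N State and a blocked thread has a "- waiting to lock <...> (...) held by thread N" line. The helper names parse_threads and follow_lock_chain are invented for this example.

```python
import re

HEADER_RE = re.compile(r'^"(?P<name>[^"]+)".*?\btid=(?P<tid>\d+)\s+(?P<state>\w+)')
WAIT_RE = re.compile(
    r"- waiting to lock <(?P<lock>0x[0-9a-f]+)> \([^)]*\) held by thread (?P<owner>\d+)"
)


def parse_threads(trace_text):
    """Map tid -> thread info, recording the first lock each thread is waiting for."""
    threads = {}
    current = None
    for raw in trace_text.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            current = {
                "name": header.group("name"),
                "state": header.group("state"),
                "waits_on": None,  # (lock id, owner tid) once known
            }
            threads[int(header.group("tid"))] = current
            continue
        wait = WAIT_RE.search(line)
        if wait and current is not None and current["waits_on"] is None:
            current["waits_on"] = (wait.group("lock"), int(wait.group("owner")))
    return threads


def follow_lock_chain(threads, start_tid):
    """Walk from a blocked thread to the thread at the end of the wait chain."""
    chain = []
    seen = set()
    tid = start_tid
    while tid in threads and tid not in seen:  # 'seen' guards against lock cycles
        seen.add(tid)
        thread = threads[tid]
        chain.append((tid, thread["name"], thread["state"]))
        if thread["waits_on"] is None:
            break
        tid = thread["waits_on"][1]
    return chain


# Abbreviated copy of the three threads shown above.
SAMPLE_TRACE = """\
"main" prio=5 tid=1 Blocked
  at com.android.server.am.ActivityManagerService.broadcastIntent(ActivityManagerService.java:22082)
  - waiting to lock <0x095134ef> (a com.android.server.am.ActivityManagerService) held by thread 12
"ActivityManager" prio=5 tid=12 Blocked
  at com.android.server.wm.WindowManagerService.containsShowWhenLockedWindow(WindowManagerService.java:2904)
  - waiting to lock <0x0e0577cd> (a com.android.server.wm.WindowHashMap) held by thread 41
"UEventObserver" prio=5 tid=41 Native
  at android.view.SurfaceControl.nativeApplyTransaction(Native method)
"""

chain = follow_lock_chain(parse_threads(SAMPLE_TRACE), start_tid=1)
for tid, name, state in chain:
    print(tid, name, state)
```

The last entry in the chain is the thread worth reading in full: it is not blocked on a Java monitor, so whatever it is doing (here, a native binder call) is what the others are ultimately waiting on.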
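The UEventObserver trace shows that the thread is in a binder call to android::BpSurfaceComposer::setTransactionState and is waiting for the reply. The next question is which process serves android::BpSurfaceComposer::setTransactionState. As covered in the Android graphics system document, the server side of BpSurfaceComposer is implemented by SurfaceFlinger, so the call ends up in SurfaceFlinger::setTransactionState. Searching the trace file for SurfaceFlinger's setTransactionState gives: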
"Binder:752_3" sysTid=1251
#00 pc 000000000001f3ac /system/lib64/libc.so (syscall+28)
#01 pc 00000000000225e4 /system/lib64/libc.so (__futex_wait_ex(void volatile*, bool, int, bool, timespec const*)+140)
#02 pc 0000000000092e3c /system/lib64/libc.so (NonPI::MutexLockWithTimeout(pthread_mutex_internal_t*, bool, timespec const*)+216)
#03 pc 00000000000c392c /system/lib64/libsurfaceflinger.so (android::SurfaceFlinger::setTransactionState(android::Vector<android::ComposerState> const&, android::Vector<android::DisplayState> const&, unsigned int)+124)
#04 pc 000000000007b67c /system/lib64/libgui.so (android::BnSurfaceComposer::onTransact(unsigned int, android::Parcel const&, android::Parcel*, unsigned int)+4412)
#05 pc 00000000000c74ec /system/lib64/libsurfaceflinger.so (android::SurfaceFlinger::onTransact(unsigned int, android::Parcel const&, android::Parcel*, unsigned int)+316)
#06 pc 000000000004fb1c /system/lib64/libbinder.so (android::BBinder::transact(unsigned int, android::Parcel const&, android::Parcel*, unsigned int)+136)
#07 pc 000000000005d1cc /system/lib64/libbinder.so (android::IPCThreadState::executeCommand(int)+520)
#08 pc 000000000005cf10 /system/lib64/libbinder.so (android::IPCThreadState::getAndExecuteCommand()+156)
#09 pc 000000000005d600 /system/lib64/libbinder.so (android::IPCThreadState::joinThreadPool(bool)+108)
#10 pc 000000000007fae8 /system/lib64/libbinder.so (android::PoolThread::threadLoop()+24)
#11 pc 00000000000100dc /system/lib64/libutils.so (android::Thread::_threadLoop(void*)+284)
#12 pc 0000000000092038 /system/lib64/libc.so (__pthread_start(void*)+36)
#13 pc 0000000000023968 /system/lib64/libc.so (__start_thread+68)
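As shown, a binder thread in the SurfaceFlinger process has called setTransactionState but is stuck in MutexLockWithTimeout, so it too is waiting for a lock. Next, check the source to see where this function takes a lock: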
/frameworks/native/services/surfaceflinger/SurfaceFlinger.cpp
void SurfaceFlinger::setTransactionState(
        const Vector<ComposerState>& states,
        const Vector<DisplayState>& displays,
        uint32_t flags)
{
    ATRACE_CALL();

    handleDPTransactionIfNeeded(displays);
    Mutex::Autolock _l(mStateLock); // mStateLock is acquired here
    uint32_t transactionFlags = 0;

    if (containsAnyInvalidClientState(states)) {
        return;
    }
    .....
}
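How do we find out who holds this lock and keeps setTransactionState from acquiring it? One way is to go through the candidates one by one: find the functions that acquire mStateLock and check whether each of them appears in the trace file.
Looking through the trace of the whole SurfaceFlinger process, many threads contain the following frame: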
#01 pc 00000000000225e4 /system/lib64/libc.so (__futex_wait_ex(void volatile*, bool, int, bool, timespec const*)+140)
This indicates that most of them are waiting for a lock, and most of those are binder threads.
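This elimination step can be narrowed down mechanically. The sketch below assumes the native dump format used in this article ("name" sysTid=N followed by #NN frames) and treats a thread as "waiting for a mutex" if any frame mentions __futex_wait_ex or MutexLockWithTimeout. That is only a heuristic, and the function names are hypothetical. Whether a remaining thread really holds mStateLock still has to be confirmed in the source.

```python
import re

NATIVE_HEADER_RE = re.compile(r'^"(?P<name>[^"]+)"\s+sysTid=(?P<tid>\d+)')
LOCK_WAIT_MARKERS = ("__futex_wait_ex", "MutexLockWithTimeout")


def split_native_threads(dump_text):
    """Group '#NN pc ...' frames under the thread header that precedes them."""
    threads = []
    for raw in dump_text.splitlines():
        line = raw.strip()
        header = NATIVE_HEADER_RE.match(line)
        if header:
            threads.append(
                {"name": header.group("name"), "tid": int(header.group("tid")), "frames": []}
            )
        elif line.startswith("#") and threads:
            threads[-1]["frames"].append(line)
    return threads


def threads_not_waiting_on_mutex(threads, must_mention="libsurfaceflinger.so"):
    """Return (tid, name) for threads that run code in the library but are not parked on a mutex."""
    candidates = []
    for thread in threads:
        frames = thread["frames"]
        waiting = any(marker in frame for frame in frames for marker in LOCK_WAIT_MARKERS)
        relevant = any(must_mention in frame for frame in frames)
        if relevant and not waiting:
            candidates.append((thread["tid"], thread["name"]))
    return candidates


# Heavily abbreviated frames from the SurfaceFlinger threads in this article.
SAMPLE_DUMP = """\
"Binder:752_3" sysTid=1251
#01 pc 00000000000225e4 /system/lib64/libc.so (__futex_wait_ex(void volatile*, bool, int, bool, timespec const*)+140)
#03 pc 00000000000c392c /system/lib64/libsurfaceflinger.so (android::SurfaceFlinger::setTransactionState(...)+124)
"surfaceflinger" sysTid=752
#00 pc 000000000007cac8 /system/lib64/libc.so (__ioctl+4)
#12 pc 00000000000bdc4c /system/lib64/libsurfaceflinger.so (android::SurfaceFlinger::handleMessageRefresh()+2552)
"surfaceflinger" sysTid=1216
#00 pc 000000000007cac8 /system/lib64/libc.so (__ioctl+4)
#07 pc 0000000000089c4c /system/lib64/libsurfaceflinger.so (android::HWComposer::setVsyncEnabled(int, HWC2::Vsync)+248)
"""

candidates = threads_not_waiting_on_mutex(split_native_threads(SAMPLE_DUMP))
for tid, name in candidates:
    print(tid, name)
```

On the abbreviated sample this leaves two threads to inspect by hand, sysTid 752 and sysTid 1216, which are the two the analysis goes on to examine.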
The main thread of the SurfaceFlinger process is handling a refresh request in onMessageReceived:
"surfaceflinger" sysTid=752
#00 pc 000000000007cac8 /system/lib64/libc.so (__ioctl+4)
#01 pc 000000000002c8f0 /system/lib64/libc.so (ioctl+132)
#02 pc 000000000001e784 /system/lib64/libhwbinder.so (android::hardware::IPCThreadState::talkWithDriver(bool)+200)
#03 pc 000000000001f388 /system/lib64/libhwbinder.so (android::hardware::IPCThreadState::waitForResponse(android::hardware::Parcel*, int*)+60)
#04 pc 000000000001b7fc /system/lib64/libhwbinder.so (android::hardware::BpHwBinder::transact(unsigned int, android::hardware::Parcel const&, android::hardware::Parcel*, unsigned int, std::__1::function<void (android::hardware::Parcel&)>)+132)
#05 pc 0000000000033a80 /system/lib64/android.hardware.graphics.composer@2.1.so (android::hardware::graphics::composer::V2_1::BpHwComposerClient::_hidl_executeCommands(android::hardware::IInterface*, android::hardware::details::HidlInstrumentor*, unsigned int, android::hardware::hidl_vec<android::hardware::hidl_handle> const&, std::__1::function<void (android::hardware::graphics::composer::V2_1::Error, bool, unsigned int, android::hardware::hidl_vec<android::hardware::hidl_handle> const&)>)+388)
#06 pc 0000000000034bdc /system/lib64/android.hardware.graphics.composer@2.1.so (android::hardware::graphics::composer::V2_1::BpHwComposerClient::executeCommands(unsigned int, android::hardware::hidl_vec<android::hardware::hidl_handle> const&, std::__1::function<void (android::hardware::graphics::composer::V2_1::Error, bool, unsigned int, android::hardware::hidl_vec<android::hardware::hidl_handle> const&)>)+160)
#07 pc 0000000000077128 /system/lib64/libsurfaceflinger.so (android::Hwc2::impl::Composer::execute()+1928)
#08 pc 0000000000079190 /system/lib64/libsurfaceflinger.so (android::Hwc2::impl::Composer::presentOrValidateDisplay(unsigned long, unsigned int*, unsigned int*, int*, unsigned int*)+248)
#09 pc 0000000000083c8c /system/lib64/libsurfaceflinger.so (HWC2::Display::presentOrValidate(unsigned int*, unsigned int*, android::sp<android::Fence>*, unsigned int*)+100)
#10 pc 000000000008a184 /system/lib64/libsurfaceflinger.so (android::HWComposer::prepare(android::DisplayDevice&)+356)
#11 pc 0000000000073bb8 /system/lib64/libsurfaceflinger.so (android::DisplayDevice::prepareFrame(android::HWComposer&)+32)
#12 pc 00000000000bdc4c /system/lib64/libsurfaceflinger.so (android::SurfaceFlinger::handleMessageRefresh()+2552)
#13 pc 00000000000f1354 /system/lib64/libsurfaceflinger.so (android::ExSurfaceFlinger::handleMessageRefresh()+16)
#14 pc 00000000000bd1c0 /system/lib64/libsurfaceflinger.so (android::SurfaceFlinger::onMessageReceived(int)+4080)
#15 pc 0000000000014e04 /system/lib64/libutils.so (android::Looper::pollInner(int)+336)
#16 pc 0000000000014c18 /system/lib64/libutils.so (android::Looper::pollOnce(int, int*, int*, void**)+60)
#17 pc 00000000000abbec /system/lib64/libsurfaceflinger.so (android::impl::MessageQueue::waitMessage()+84)
#18 pc 00000000000bb90c /system/lib64/libsurfaceflinger.so (android::SurfaceFlinger::run()+20)
#19 pc 00000000000031f0 /system/bin/surfaceflinger (main+932)
#20 pc 00000000000ca488 /system/lib64/libc.so (__libc_init+88)
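However, tracing the code path of onMessageReceived shows that it does not acquire mStateLock along the way, so we need to look at other threads. The following thread stands out: it is not waiting for any lock, and it is executing a specific SurfaceFlinger function: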
"surfaceflinger" sysTid=1216
#00 pc 000000000007cac8 /system/lib64/libc.so (__ioctl+4)
#01 pc 000000000002c8f0 /system/lib64/libc.so (ioctl+132)
#02 pc 000000000001e784 /system/lib64/libhwbinder.so (android::hardware::IPCThreadState::talkWithDriver(bool)+200)
#03 pc 000000000001f3f8 /system/lib64/libhwbinder.so (android::hardware::IPCThreadState::waitForResponse(android::hardware::Parcel*, int*)+172)
#04 pc 000000000001b7fc /system/lib64/libhwbinder.so (android::hardware::BpHwBinder::transact(unsigned int, android::hardware::Parcel const&, android::hardware::Parcel*, unsigned int, std::__1::function<void (android::hardware::Parcel&)>)+132)
#05 pc 0000000000032c1c /system/lib64/android.hardware.graphics.composer@2.1.so (android::hardware::graphics::composer::V2_1::BpHwComposerClient::_hidl_setVsyncEnabled(android::hardware::IInterface*, android::hardware::details::HidlInstrumentor*, unsigned long, android::hardware::graphics::composer::V2_1::IComposerClient::Vsync)+248)
#06 pc 0000000000078df0 /system/lib64/libsurfaceflinger.so (android::Hwc2::impl::Composer::setVsyncEnabled(unsigned long, android::hardware::graphics::composer::V2_1::IComposerClient::Vsync)+44)
#07 pc 0000000000089c4c /system/lib64/libsurfaceflinger.so (android::HWComposer::setVsyncEnabled(int, HWC2::Vsync)+248)
#08 pc 00000000000cde8c /system/lib64/libsurfaceflinger.so (_ZNSt3__110__function6__funcIZN7android14SurfaceFlinger4initEvE3$_7NS_9allocatorIS4_EEFvbEEclEOb$1221f720135e8529bbfab98f8d4a5a4d+92)
#09 pc 000000000009684c /system/lib64/libsurfaceflinger.so (android::impl::EventControlThread::threadMain()+88)
#10 pc 0000000000096a30 /system/lib64/libsurfaceflinger.so
#11 pc 0000000000092038 /system/lib64/libc.so (__pthread_start(void*)+36)
#12 pc 0000000000023968 /system/lib64/libc.so (__start_thread+68)
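This thread is the EventControlThread. SurfaceFlinger creates an EventControlThread in init() and passes it a function that calls setVsyncEnabled (the frame from SurfaceFlinger::init in the trace above), so when the EventControlThread runs it ends up in SurfaceFlinger::setVsyncEnabled, shown below: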
void SurfaceFlinger::setVsyncEnabled(int disp, int enabled) {
    ATRACE_CALL();
    Mutex::Autolock lock(mStateLock);
    getHwComposer().setVsyncEnabled(disp,
            enabled ? HWC2::Vsync::Enable : HWC2::Vsync::Disable);
}
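As shown above, this function acquires mStateLock and is then stuck in hwbinder communication with android.hardware.graphics.composer. The full chain is therefore: the SystemServer main thread waits for the AMS lock held by the ActivityManager thread, which waits for the WindowHashMap lock held by UEventObserver, which waits for SurfaceFlinger's setTransactionState to return, which waits for mStateLock held by the EventControlThread, which waits for the composer HAL. The next step is for colleagues on the ARM team to investigate why SurfaceFlinger's setVsyncEnabled request went unanswered for so long.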
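The failure pattern itself, a lock held across a blocking call to another process, is easy to reproduce in miniature. The following is a toy Python model, not Android code: state_lock stands in for mStateLock, hal_reply stands in for the composer HAL's reply, and both function names only borrow the names from the trace for readability.

```python
import threading

state_lock = threading.Lock()         # stands in for mStateLock
hal_reply = threading.Event()         # set when the simulated HAL call returns
holder_has_lock = threading.Event()   # lets the main thread wait until the lock is taken


def set_vsync_enabled():
    """Hold the lock across a blocking call, like the stuck setVsyncEnabled path."""
    with state_lock:
        holder_has_lock.set()
        hal_reply.wait()              # simulated hwbinder call that has not returned yet


def set_transaction_state(timeout_s):
    """Try to take the same lock; return False if it cannot be taken in time."""
    if not state_lock.acquire(timeout=timeout_s):
        return False
    state_lock.release()
    return True


holder = threading.Thread(target=set_vsync_enabled, daemon=True)
holder.start()
holder_has_lock.wait()

blocked_result = set_transaction_state(0.2)    # the "HAL" is still silent
hal_reply.set()                                # the "HAL" finally answers
holder.join()
recovered_result = set_transaction_state(0.2)  # the lock is free again

print(blocked_result, recovered_result)
```

In the model, the second caller only fails while the simulated HAL call is outstanding. That is why the investigation moves to the HAL side rather than to the callers that are merely waiting for the lock.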
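Summary:
When Watchdog kills a process like this, it is almost always because the program is stuck somewhere. The corresponding trace files pinpoint the process and thread involved, but finding the root cause still requires a good understanding of the modules concerned.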
Tags: Android9.0, SystemServerCrash
From: https://blog.51cto.com/u_16071150/7435479