
The Android Process Freezing Mechanism

Date: 2023-11-01 22:32:10
Tags: freeze, do, sysTid, freezer, trap, process, Android, ADJ, android

A strange ANR

Today I ran into a very interesting ANR. The application reported an ANR:

08-29 14:12:59.564898  7904  8341 I WindowManager: ANR in Window{3b0709 u0 me.linjw.demo.anr}. Reason:3b0709 me.linjw.demo.anr (server) is not responding. Waited 5001ms for FocusEvent(hasFocus=false)
08-29 14:13:11.713363  7904 27946 E ActivityManager: ANR in me.linjw.demo.anr

But the trace file contains no stack traces at all:

Subject: Input dispatching timed out (3b0709 me.linjw.demo.anr (server) is not responding. Waited 5001ms for FocusEvent(hasFocus=false))

--- CriticalEventLog ---
capacity: 20
timestamp_ms: 1693311179660
window_ms: 300000

libdebuggerd_client: failed to read status response from tombstoned: timeout reached?

----- Waiting Channels: pid 26859 at 2023-08-29 14:12:59.664895544+0200 -----
Cmd line: me.linjw.demo.anr

sysTid=26859     do_freezer_trap
sysTid=26864     do_freezer_trap
sysTid=26865     do_freezer_trap
sysTid=26866     do_freezer_trap
sysTid=26867     do_freezer_trap
sysTid=26868     do_freezer_trap
sysTid=26869     do_freezer_trap
sysTid=26870     do_freezer_trap
sysTid=26871     do_freezer_trap
sysTid=26872     do_freezer_trap
sysTid=26873     do_freezer_trap
sysTid=26874     do_freezer_trap
sysTid=26875     do_freezer_trap
sysTid=26877     do_freezer_trap
sysTid=26879     do_freezer_trap
sysTid=26880     do_freezer_trap
sysTid=26882     do_freezer_trap
sysTid=26883     do_freezer_trap
sysTid=26887     do_freezer_trap
sysTid=26912     do_freezer_trap
sysTid=26918     do_freezer_trap
sysTid=26919     do_freezer_trap
sysTid=26922     do_freezer_trap
sysTid=26923     do_freezer_trap
sysTid=26938     do_freezer_trap
sysTid=27772     do_freezer_trap
sysTid=27815     do_freezer_trap
sysTid=27826     do_freezer_trap
sysTid=27827     do_freezer_trap

----- end 26859 -----

libdebuggerd_client: failed to read status response from tombstoned: Try again

----- Waiting Channels: pid 26859 at 2023-08-29 14:13:09.677383215+0200 -----
Cmd line: me.linjw.demo.anr

sysTid=26859     do_freezer_trap
sysTid=26864     do_freezer_trap
sysTid=26865     do_freezer_trap
sysTid=26866     do_freezer_trap
sysTid=26867     do_freezer_trap
sysTid=26868     do_freezer_trap
sysTid=26869     do_freezer_trap
sysTid=26870     do_freezer_trap
sysTid=26871     do_freezer_trap
sysTid=26872     do_freezer_trap
sysTid=26873     do_freezer_trap
sysTid=26874     do_freezer_trap
sysTid=26875     do_freezer_trap
sysTid=26877     do_freezer_trap
sysTid=26879     do_freezer_trap
sysTid=26880     do_freezer_trap
sysTid=26882     do_freezer_trap
sysTid=26883     do_freezer_trap
sysTid=26887     do_freezer_trap
sysTid=26912     do_freezer_trap
sysTid=26918     do_freezer_trap
sysTid=26919     do_freezer_trap
sysTid=26922     do_freezer_trap
sysTid=26923     do_freezer_trap
sysTid=26938     do_freezer_trap
sysTid=27772     do_freezer_trap
sysTid=27815     do_freezer_trap
sysTid=27826     do_freezer_trap
sysTid=27827     do_freezer_trap

----- end 26859 -----

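Filtering the log by the process pid shows that the process was executing a task normally and, before the task finished, the process was frozen (am_freeze):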

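08-29 14:11:45.807967 26859 27815 V MessageEncoder: ... // log output from normal task execution
08-29 14:11:45.809835 26859 26859 D FloatView: ... // log output from normal task execution; the task had not finished, so more output should follow, but none does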
08-29 14:11:45.884625  7904  8331 D ActivityManager: freezing 26859 me.linjw.demo.anr
08-29 14:11:45.885503  7904  8331 I am_freeze: [26859,me.linjw.demo.anr]
08-29 14:12:59.660658  7904 27946 I am_anr  : [0,26859,me.linjw.demo.anr,545832517,Input dispatching timed out (3b0709 me.linjw.demo.anr (server) is not responding. Waited 5001ms for FocusEvent(hasFocus=false))]

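Because the process was frozen, it could not handle input events, which caused the ANR. For the same reason, the request asking the process to dump its stacks at ANR time was never handled either.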
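A quick way to recognize this pattern in a trace file is to check whether every thread in a "Waiting Channels" section is parked in do_freezer_trap. The following is only a minimal sketch: the class and method names are hypothetical, and it assumes the section uses the `sysTid=<tid>  <wait channel>` layout shown in the trace above.

```java
import java.util.List;

/** Sketch: decide whether a "Waiting Channels" dump shows a fully frozen process. */
final class FrozenTraceCheck {
    // Wait channel seen for every thread in the trace above (assumption: same spelling in other dumps).
    static final String FREEZER_WCHAN = "do_freezer_trap";

    /** Returns true if the dump has at least one sysTid line and all of them wait in do_freezer_trap. */
    static boolean allThreadsFrozen(List<String> traceLines) {
        int threads = 0;
        for (String line : traceLines) {
            String trimmed = line.trim();
            if (!trimmed.startsWith("sysTid=")) {
                continue; // section headers, blank lines, "----- end ... -----"
            }
            threads++;
            // parts[0] is "sysTid=<tid>", parts[1] (if present) is the wait channel
            String[] parts = trimmed.split("\\s+");
            if (parts.length < 2 || !parts[1].equals(FREEZER_WCHAN)) {
                return false;
            }
        }
        return threads > 0;
    }

    public static void main(String[] args) {
        List<String> dump = List.of(
                "----- Waiting Channels: pid 26859 at 2023-08-29 14:12:59.664895544+0200 -----",
                "Cmd line: me.linjw.demo.anr",
                "",
                "sysTid=26859     do_freezer_trap",
                "sysTid=26864     do_freezer_trap",
                "----- end 26859 -----");
        System.out.println("frozen: " + allThreadsFrozen(dump));
    }
}
```

If this returns true for an ANR trace that has no stacks, the freezer is a likely suspect and the am_freeze events in the log are worth checking next.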

Freeze

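Many processes keep occupying memory and CPU in the background long after they leave the foreground, which hurts the user experience. When memory runs low, lmk (the low memory killer) is triggered to reclaim memory, but when memory is plentiful, background processes are not killed so that switching back to an app stays fast. To stop apps from quietly consuming CPU in the background, newer Android versions implement a process freezing mechanism.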

The "Suspend execution for cached apps" entry in Developer options turns freezing of background processes on or off. It can also be controlled from the command line:

adb shell settings put global cached_apps_freezer <enabled|disabled|default>

  • enabled: turn freezing on
  • disabled: turn freezing off
  • default: let the system decide

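A process's OOM_ADJ (Out of Memory Adjustment) value not only decides whether the process is reclaimed when system memory is low; the freezing policy is also computed from it. The events below trigger a recomputation of a process's oom adj. Roughly, they are Activity changes, starting a broadcast receiver, binding a service, and UI visibility changes: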

// https://cs.android.com/android/platform/superproject/+/android-13.0.0_r74:frameworks/base/services/core/java/com/android/server/am/OomAdjuster.java
public class OomAdjuster {
    static final String TAG = "OomAdjuster";
    static final String OOM_ADJ_REASON_METHOD = "updateOomAdj";
    static final String OOM_ADJ_REASON_NONE = OOM_ADJ_REASON_METHOD + "_meh";
    static final String OOM_ADJ_REASON_ACTIVITY = OOM_ADJ_REASON_METHOD + "_activityChange";
    static final String OOM_ADJ_REASON_FINISH_RECEIVER = OOM_ADJ_REASON_METHOD + "_finishReceiver";
    static final String OOM_ADJ_REASON_START_RECEIVER = OOM_ADJ_REASON_METHOD + "_startReceiver";
    static final String OOM_ADJ_REASON_BIND_SERVICE = OOM_ADJ_REASON_METHOD + "_bindService";
    static final String OOM_ADJ_REASON_UNBIND_SERVICE = OOM_ADJ_REASON_METHOD + "_unbindService";
    static final String OOM_ADJ_REASON_START_SERVICE = OOM_ADJ_REASON_METHOD + "_startService";
    static final String OOM_ADJ_REASON_GET_PROVIDER = OOM_ADJ_REASON_METHOD + "_getProvider";
    static final String OOM_ADJ_REASON_REMOVE_PROVIDER = OOM_ADJ_REASON_METHOD + "_removeProvider";
    static final String OOM_ADJ_REASON_UI_VISIBILITY = OOM_ADJ_REASON_METHOD + "_uiVisibility";
    static final String OOM_ADJ_REASON_ALLOWLIST = OOM_ADJ_REASON_METHOD + "_allowlistChange";
    static final String OOM_ADJ_REASON_PROCESS_BEGIN = OOM_ADJ_REASON_METHOD + "_processBegin";
    static final String OOM_ADJ_REASON_PROCESS_END = OOM_ADJ_REASON_METHOD + "_processEnd";
    ...
}

Freeze flow

For example, when an Activity is destroyed, ActivityRecord.setState updates the process state, and updating the process state in turn updates the oom adj:

// https://cs.android.com/android/platform/superproject/+/android-13.0.0_r74:frameworks/base/services/core/java/com/android/server/wm/ActivityRecord.java
WindowProcessController app;      // if non-null, hosting application

void setState(State state, String reason) {
    ...
    switch (state) {
        ...
        case DESTROYING:
            if (app != null && !app.hasActivities()) {
                // Update any services we are bound to that might care about whether
                // their client may have activities.
                // No longer have activities, so update LRU list and oom adj.
                app.updateProcessInfo(true /* updateServiceConnectionActivities */,
                        false /* activityChange */, true /* updateOomAdj */,
                        false /* addPendingTopUid */);
            }
            break;
        ...
    }
    ...
}

// https://cs.android.com/android/platform/superproject/+/android-13.0.0_r74:frameworks/base/services/core/java/com/android/server/wm/WindowProcessController.java
void updateProcessInfo(boolean updateServiceConnectionActivities, boolean activityChange,
        boolean updateOomAdj, boolean addPendingTopUid) {
    if (addPendingTopUid) {
        addToPendingTop();
    }
    if (updateOomAdj) {
        prepareOomAdjustment();
    }
    // Posting on handler so WM lock isn't held when we call into AM.
    // The call to WindowProcessListener::updateProcessInfo on mListener is deferred here.
    // mListener is actually the ProcessRecord, which implements WindowProcessListener.
    final Message m = PooledLambda.obtainMessage(WindowProcessListener::updateProcessInfo,
            mListener, updateServiceConnectionActivities, activityChange, updateOomAdj);
    mAtm.mH.sendMessage(m);
}

// https://cs.android.com/android/platform/superproject/+/android-13.0.0_r74:frameworks/base/services/core/java/com/android/server/am/ProcessRecord.java
class ProcessRecord implements WindowProcessListener {
    ...
    @Override
    public void updateProcessInfo(boolean updateServiceConnectionActivities, boolean activityChange,
            boolean updateOomAdj) {
        ...
        if (updateOomAdj) {
            mService.updateOomAdjLocked(this, OomAdjuster.OOM_ADJ_REASON_ACTIVITY);
        }
        ...
    }
    ...
}

The oom adj recomputation eventually reaches OomAdjuster.applyOomAdjLSP, which calls updateAppFreezeStateLSP to update the freeze state of the process:

// https://cs.android.com/android/platform/superproject/+/android-13.0.0_r74:frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java
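// The two-argument overload is the one called from ProcessRecord.updateProcessInfo above.
final boolean updateOomAdjLocked(ProcessRecord app, String oomAdjReason) {
    return mOomAdjuster.updateOomAdjLocked(app, oomAdjReason);
}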

// https://cs.android.com/android/platform/superproject/+/android-13.0.0_r74:frameworks/base/services/core/java/com/android/server/am/OomAdjuster.java
boolean updateOomAdjLocked(ProcessRecord app, String oomAdjReason) {
    synchronized (mProcLock) {
        return updateOomAdjLSP(app, oomAdjReason);
    }
}

private boolean performUpdateOomAdjLSP(ProcessRecord app, String oomAdjReason) {
    ...
    applyOomAdjLSP(app, false, SystemClock.uptimeMillis(),
                        SystemClock.elapsedRealtime(), oomAdjReason);
    ...
}

private boolean applyOomAdjLSP(ProcessRecord app, boolean doingAll, long now,
            long nowElapsed, String oomAdjReson) {
  ...
  updateAppFreezeStateLSP(app);
  ...
}

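updateAppFreezeStateLSP calls freezeAppAsyncLSP when adj >= CACHED_APP_MIN_ADJ (900). An adj in the range 900 to 999 means the process only hosts activities that are not visible and can be killed at any time, so freezing it causes no disruption: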

// https://cs.android.com/android/platform/superproject/+/android-13.0.0_r74:frameworks/base/services/core/java/com/android/server/am/OomAdjuster.java
private void updateAppFreezeStateLSP(ProcessRecord app) {
    ...
    final ProcessStateRecord state = app.mState;
    // Use current adjustment when freezing, set adjustment when unfreezing.
    if (state.getCurAdj() >= ProcessList.CACHED_APP_MIN_ADJ && !opt.isFrozen()
            && !opt.shouldNotFreeze()) {
        mCachedAppOptimizer.freezeAppAsyncLSP(app);
    } else if (state.getSetAdj() < ProcessList.CACHED_APP_MIN_ADJ) {
        mCachedAppOptimizer.unfreezeAppLSP(app, oomAdjReason);
    }
}

// https://cs.android.com/android/platform/superproject/+/android-13.0.0_r74:frameworks/base/services/core/java/com/android/server/am/ProcessList.java

// This is a process only hosting activities that are not visible,
// so it can be killed without any disruption.
public static final int CACHED_APP_MAX_ADJ = 999;
public static final int CACHED_APP_MIN_ADJ = 900;

freezeAppAsyncLSP posts a message delayed by 10 minutes. When the delay expires, the process is frozen (that is, Process.setProcessFrozen is called 10 minutes later):

// https://cs.android.com/android/platform/superproject/+/android-13.0.0_r74:frameworks/base/services/core/java/com/android/server/am/CachedAppOptimizer.java
@VisibleForTesting static final long DEFAULT_FREEZER_DEBOUNCE_TIMEOUT = 600_000L;
@VisibleForTesting volatile long mFreezerDebounceTimeout = DEFAULT_FREEZER_DEBOUNCE_TIMEOUT;

void freezeAppAsyncLSP(ProcessRecord app) {
    final ProcessCachedOptimizerRecord opt = app.mOptRecord;
    if (opt.isPendingFreeze()) {
        // Skip redundant DO_FREEZE message
        return;
    }

    mFreezeHandler.sendMessageDelayed(
            mFreezeHandler.obtainMessage(
                SET_FROZEN_PROCESS_MSG, DO_FREEZE, 0, app),
            mFreezerDebounceTimeout);
    ...
}

public void handleMessage(Message msg) {
    switch (msg.what) {
        case SET_FROZEN_PROCESS_MSG:
            synchronized (mAm) {
                freezeProcess((ProcessRecord) msg.obj);
            }
            break;
        ...
    }
}

private void freezeProcess(final ProcessRecord proc) {
    ...
    Process.setProcessFrozen(pid, proc.uid, true);
    ...
}

// https://cs.android.com/android/platform/superproject/+/android-13.0.0_r74:frameworks/base/core/java/android/os/Process.java

/**
 * Freeze or unfreeze the specified process.
 *
 * @param pid Identifier of the process to freeze or unfreeze.
 * @param uid Identifier of the user the process is running under.
 * @param frozen Specify whether to free (true) or unfreeze (false).
 *
 * @hide
 */
public static final native void setProcessFrozen(int pid, int uid, boolean frozen);

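To summarize: when a process's oom adj is greater than or equal to CACHED_APP_MIN_ADJ, a 10-minute timer is started. If the oom adj does not drop back below CACHED_APP_MIN_ADJ within those 10 minutes, the process is frozen.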
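The debounce behaviour can be modelled in a few lines of plain Java. This is a simplified sketch, not the AOSP implementation: the class and its methods are hypothetical, time is passed in explicitly instead of using a Handler, and the distinction between the current adj (used for freezing) and the set adj (used for unfreezing) is collapsed into a single value. Only the two constants are taken from the excerpts above.

```java
/** Simplified model of the freeze debounce; an illustration only, not the AOSP code. */
final class FreezeDebounceModel {
    static final int CACHED_APP_MIN_ADJ = 900;         // ProcessList.CACHED_APP_MIN_ADJ
    static final long DEBOUNCE_TIMEOUT_MS = 600_000L;  // DEFAULT_FREEZER_DEBOUNCE_TIMEOUT

    private long freezeDeadlineMs = -1; // -1 means no freeze is pending
    private boolean frozen;

    /** Call whenever the adj is recomputed. */
    void onAdjUpdated(int adj, long nowMs) {
        if (adj >= CACHED_APP_MIN_ADJ) {
            // Start the timer once; a pending or completed freeze is left alone.
            if (!frozen && freezeDeadlineMs < 0) {
                freezeDeadlineMs = nowMs + DEBOUNCE_TIMEOUT_MS;
            }
        } else {
            // Below the cached range: cancel a pending freeze and thaw if needed.
            freezeDeadlineMs = -1;
            frozen = false;
        }
    }

    /** Advance the model clock; freezes once the pending deadline has passed. */
    void onTimeAdvanced(long nowMs) {
        if (freezeDeadlineMs >= 0 && nowMs >= freezeDeadlineMs) {
            frozen = true;
            freezeDeadlineMs = -1;
        }
    }

    boolean isFrozen() {
        return frozen;
    }

    boolean isPendingFreeze() {
        return freezeDeadlineMs >= 0;
    }

    public static void main(String[] args) {
        FreezeDebounceModel model = new FreezeDebounceModel();
        model.onAdjUpdated(905, 0L);             // process enters the cached range
        model.onTimeAdvanced(599_999L);
        System.out.println(model.isFrozen());    // false: still inside the 10 minutes
        model.onTimeAdvanced(600_000L);
        System.out.println(model.isFrozen());    // true: the timer expired
        model.onAdjUpdated(0, 700_000L);         // process comes back to the foreground
        System.out.println(model.isFrozen());    // false: unfrozen
    }
}
```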

Unfreeze flow

Likewise, when an Activity is started, ActivityRecord.setState calls WindowProcessController.updateProcessInfo to update the process state, which in turn updates the oom adj:

// https://cs.android.com/android/platform/superproject/+/android-13.0.0_r74:frameworks/base/services/core/java/com/android/server/wm/ActivityRecord.java
WindowProcessController app;      // if non-null, hosting application

void setState(State state, String reason) {
    ...
    switch (state) {
        ...
        case STARTED:
            ...
            app.updateProcessInfo(false /* updateServiceConnectionActivities */,
                    true /* activityChange */, true /* updateOomAdj */,
                    true /* addPendingTopUid */);
            ...
        ...
    }
    ...
}

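This also ends up in OomAdjuster.updateAppFreezeStateLSP. The call chain was traced in the freeze flow above, so it is omitted here. As the code shows, when adj is below CACHED_APP_MIN_ADJ, CachedAppOptimizer.unfreezeAppLSP is called to unfreeze the process: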

// https://cs.android.com/android/platform/superproject/+/android-13.0.0_r74:frameworks/base/services/core/java/com/android/server/am/OomAdjuster.java
private void updateAppFreezeStateLSP(ProcessRecord app) {
    ...
    final ProcessStateRecord state = app.mState;
    // Use current adjustment when freezing, set adjustment when unfreezing.
    if (state.getCurAdj() >= ProcessList.CACHED_APP_MIN_ADJ && !opt.isFrozen()
            && !opt.shouldNotFreeze()) {
        mCachedAppOptimizer.freezeAppAsyncLSP(app);
    } else if (state.getSetAdj() < ProcessList.CACHED_APP_MIN_ADJ) {
        mCachedAppOptimizer.unfreezeAppLSP(app, oomAdjReason);
    }
}

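Execution finally reaches CachedAppOptimizer.unfreezeAppInternalLSP. If the process is still inside the 10-minute grace period, the timer is simply cancelled with removeMessages. If the process is already frozen, Process.setProcessFrozen is called with frozen set to false to unfreeze it: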

// https://cs.android.com/android/platform/superproject/+/android-13.0.0_r74:frameworks/base/services/core/java/com/android/server/am/CachedAppOptimizer.java
void unfreezeAppLSP(ProcessRecord app, String reason) {
    synchronized (mFreezerLock) {
        unfreezeAppInternalLSP(app, reason);
    }
}

void unfreezeAppInternalLSP(ProcessRecord app, String reason) {
    final int pid = app.getPid();
    final ProcessCachedOptimizerRecord opt = app.mOptRecord;
    if (opt.isPendingFreeze()) {
        // Remove pending DO_FREEZE message
        mFreezeHandler.removeMessages(SET_FROZEN_PROCESS_MSG, app);
        opt.setPendingFreeze(false);
        ...
    }

    opt.setFreezerOverride(false);
    if (pid == 0 || !opt.isFrozen()) {
        return;
    }

    ...
    Process.setProcessFrozen(pid, app.uid, false);
    ...
}

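In the example above, the whole flow runs from leaving the Activity (the process is frozen) to re-entering the Activity (the process is unfrozen):

[Figure: flow from leaving the Activity and freezing the process to re-entering the Activity and unfreezing it]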

Locating the problem and working around it

The log shows that the adj of this process was 905 when it was killed:

08-29 14:13:11.716499  7904 27946 I ActivityManager: Killing 26859:me.linjw.demo.anr/1000 (adj 905): bg anr

And its start time and freeze time are almost exactly 10 minutes apart:

08-29 14:01:45.124651  7904  8283 I ActivityManager: Start proc 26859:me.linjw.demo.anr/1000 for service {me.linjw.demo.anr/me.linjw.demo.anr.RemoteService}
08-29 14:11:45.885503  7904  8331 I am_freeze: [26859,me.linjw.demo.anr]

In other words, the adj was already 905 when the application process started, so the 10-minute freeze timer was set right away.
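The gap between the two log lines can be checked mechanically. The sketch below is only an illustration with hypothetical helper names; it assumes both lines come from the same day and that the time of day is the second whitespace-separated field, as in the logcat lines above.

```java
import java.time.Duration;
import java.time.LocalTime;

/** Sketch: measure the time between two logcat lines from the same day. */
final class FreezeDelayCheck {
    /** Parses the HH:mm:ss.SSSSSS field, which is the second token of a logcat line. */
    static LocalTime timeOf(String logLine) {
        String[] tokens = logLine.trim().split("\\s+");
        return LocalTime.parse(tokens[1]);
    }

    /** Duration from the earlier line to the later line (same-day assumption). */
    static Duration between(String earlierLine, String laterLine) {
        return Duration.between(timeOf(earlierLine), timeOf(laterLine));
    }

    public static void main(String[] args) {
        String start = "08-29 14:01:45.124651  7904  8283 I ActivityManager: Start proc 26859:me.linjw.demo.anr/1000";
        String freeze = "08-29 14:11:45.885503  7904  8331 I am_freeze: [26859,me.linjw.demo.anr]";
        Duration delay = between(start, freeze);
        System.out.println("delay = " + delay.toMillis() + " ms");
    }
}
```

For the two lines above the delay is about 600.76 seconds, which is the 600,000 ms debounce timeout plus a little scheduling overhead.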

The catch is that our application really does have only a Service. It never starts an Activity; instead it adds a global floating window through WindowManager.addView.

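The addView source is large and I could not find where it updates the oom adj. However, when trying to reproduce the problem, reading the oom adj with cat /proc/{pid}/oom_adj showed a value that was not above 900, and the freeze after 10 minutes could not be reproduced either. (Note that the legacy oom_adj file reports values on a -17 to 15 scale; the value on the scale that 900 belongs to is in /proc/{pid}/oom_score_adj.)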
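When checking the adj by hand, the number has to be compared on the right scale. The sketch below assumes that the framework adj (0 to 1000) is what appears in /proc/<pid>/oom_score_adj, and that the text being parsed is the content of that file, for example the output of `adb shell cat /proc/<pid>/oom_score_adj`. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch: read an oom_score_adj value and test it against the cached range. */
final class OomScoreAdjCheck {
    static final int CACHED_APP_MIN_ADJ = 900; // ProcessList.CACHED_APP_MIN_ADJ

    /** Parses the content of an oom_score_adj file (a single integer plus a newline). */
    static int parseAdj(String fileContent) {
        return Integer.parseInt(fileContent.trim());
    }

    /** True if the adj is in the range that makes the process a freeze candidate. */
    static boolean isFreezeCandidate(int adj) {
        return adj >= CACHED_APP_MIN_ADJ;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the current process; pass a pid as the first argument otherwise.
        String pid = args.length > 0 ? args[0] : "self";
        Path file = Path.of("/proc", pid, "oom_score_adj");
        if (!Files.isReadable(file)) {
            System.out.println(file + " is not readable on this system");
            return;
        }
        int adj = parseAdj(Files.readString(file));
        System.out.println("adj=" + adj + " freezeCandidate=" + isFreezeCandidate(adj));
    }
}
```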

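So either the update logic really does not exist, or the update failed under some condition. There were no errors in the log, and handing the issue to the system team would probably not get it solved, so the only option was a workaround in the app.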

The workaround is simple: make the service a foreground service, which actively triggers an oom adj recomputation of type OOM_ADJ_REASON_UI_VISIBILITY:

// https://cs.android.com/android/platform/superproject/+/android-13.0.0_r74:frameworks/base/services/core/java/com/android/server/am/ActiveServices.java
private void updateServiceForegroundLocked(ProcessServiceRecord psr, boolean oomAdj) {
    ...
    mAm.updateProcessForegroundLocked(psr.mApp, anyForeground, fgServiceTypes, oomAdj);
    ...
}

// https://cs.android.com/android/platform/superproject/+/android-13.0.0_r74:frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java
final void updateProcessForegroundLocked(ProcessRecord proc, boolean isForeground,
        int fgServiceTypes, boolean oomAdj) {
    ...
    if (oomAdj) {
        updateOomAdjLocked(proc, OomAdjuster.OOM_ADJ_REASON_UI_VISIBILITY);
    }
}

From: https://blog.51cto.com/u_16175630/8133652
