A/B System Updates
IN THIS DOCUMENT
- Kernel patches
- Kernel command line arguments
- Recovery
- Build variables
- Partitions
- Fstab
- Kernel slot arguments
- OTA package generation
A/B system updates ensure a workable booting system remains on the disk during an over-the-air (OTA) update. This reduces the likelihood of an inactive device afterward, which means fewer device replacements and device reflashes at repair/warranty centers.
Customers can continue to use their devices during an OTA. The only downtime during an update is when the device reboots into the updated disk partition. If the OTA fails, the device is still usable because it boots into the pre-OTA disk partition, and the OTA download can be attempted again. A/B system updates implemented through OTA are recommended for new devices only.
A/B system updates affect:
- Interactions with the bootloader
- Partition selection
- The build process
- OTA update package generation
The existing dm-verity feature guarantees the device will boot an uncorrupted image. If a device doesn't boot, because of a bad OTA or dm-verity issue, the device can reboot into an old image.
The A/B system is robust because any errors (such as I/O errors) affect only the unused partition set and can be retried. Such errors also become less likely because the I/O load is deliberately low to avoid degrading the user experience.
OTA updates can occur while the system is running, without interrupting the user. This includes the app optimizations that occur after a reboot. Additionally, the cache partition is no longer used to store OTA update packages; there is no need for sizing the cache partition.
Overview
A/B system updates use a background daemon called update_engine and two sets of partitions. The two sets of partitions are referred to as slots, normally as slot A and slot B. The system runs from one slot, the current slot, while the partitions in the unused slot are not accessed by the running system (for normal operation).
The goal of this feature is to make updates fault resistant by keeping the unused slot as a fallback. If there is an error during an update or immediately after an update, the system can roll back to the old slot and continue to have a working system. To achieve this goal, none of the partitions used by the current slot should be updated as part of the OTA update (including partitions for which there is only one copy).
Each slot has a bootable attribute, which states whether the slot contains a correct system from which the device can boot. The current slot is clearly bootable when the system is running, but the other slot may have an old (still correct) version of the system, a newer version, or invalid data. Regardless of what the current slot is, there is one slot which is the active or preferred slot. The active slot is the one the bootloader will boot from on the next boot. Finally, each slot has a successful attribute set by the user space, which is only relevant if the slot is also bootable.
A successful slot should be able to boot, run, and update itself. A bootable slot that was not marked as successful after several boot attempts should be marked as unbootable by the bootloader, which should also change the active slot to another bootable slot (normally the slot that was running right before the attempt to boot into the new, active one). The specific details of the interface are defined in boot_control.h.
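The fallback behavior described above can be sketched in a few lines of Python. This is only an illustration of the rules in the text, not the actual bootloader logic or the boot_control.h interface: the Slot class, the tries_remaining field, and its default of 3 are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    bootable: bool = True
    successful: bool = False
    tries_remaining: int = 3  # hypothetical retry budget, not from boot_control.h


def choose_boot_slot(slots, active):
    """Return (slot_to_boot, new_active_slot), or (None, active) if no slot can boot.

    The active slot is tried first. A bootable slot that is not yet successful
    consumes one try per boot; once its tries are exhausted it is marked
    unbootable and the next bootable slot becomes the active one.
    """
    order = [active] + [name for name in slots if name != active]
    for name in order:
        slot = slots[name]
        if not slot.bootable:
            continue
        if slot.successful:
            return name, name
        if slot.tries_remaining > 0:
            slot.tries_remaining -= 1
            return name, name
        slot.bootable = False  # never confirmed by user space: give up on it
    return None, active
```

With slot A freshly updated (bootable, not successful) and slot B known good, repeated calls keep returning slot A until its tries run out, after which the function marks A unbootable and falls back to B.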
Bootloader state examples
The boot_control HAL is used by update_engine to control the boot process. The following scenarios illustrate the associated slot states:
- Normal case: The system is running from its current slot, either slot A or B. No updates have been applied so far. The system's current slot is bootable, successful, and the active slot.
- Update in progress: The system is running from slot B, so slot B is the bootable, successful, and active slot. Slot A was marked as unbootable since the contents of slot A are being updated but not yet completed. A reboot in this state should continue booting from slot B.
- Update applied, reboot pending: The system is running from slot B, slot B is bootable and successful, but slot A was marked as active (and therefore is marked as bootable). Slot A is not yet marked as successful and some number of attempts to boot from slot A should be made by the bootloader.
- System rebooted into new update: The system is running from slot A for the first time, slot B is still bootable and successful while slot A is only bootable, and still active but not successful. A user space daemon should mark slot A as successful after some checks are made.
Update Engine features
The update_engine daemon runs in the background and prepares the system to boot into a new, updated version. The update_engine daemon is not involved in the boot process itself and is limited in what it can do during an update. The update_engine daemon can:
- Read from the current slot A/B partitions and write any data to the unused slot A/B partitions as instructed by the OTA package
- Call the boot_control HAL interface
- Run a post-install program from the new partition after writing all the unused slot partitions, as instructed by the OTA package
The post-install step is described in detail below. Note that the update_engine daemon is limited by the SELinux policies and features in the current slot; those policies and features can't be updated until the system boots into a new version. To keep the update robust, the update process should not:
- Modify the partition table
- Modify the contents of partitions in the current slot
- Modify the contents of non-A/B partitions that can't be wiped with a factory reset
Life of an A/B update
The update process starts when an OTA package, referred to in code as a payload, is available for downloading. Policies in the device may defer the payload download and application based on battery level, user activity, whether it is connected to a charger, or other policies. But since the update runs in the background, the user might not know that an update is in progress and the process can be interrupted at any point due to policies or unexpected reboots.
The steps in the update process after a payload is available are as follows:
Step 1: The current slot (or "source slot") is marked as successful (if not already marked) with markBootSuccessful().
Step 2: The unused slot (or "target slot") is marked as unbootable by calling the function setSlotAsUnbootable().
The current slot is always marked as successful at the beginning of the update to prevent the bootloader from falling back to the unused slot, which will soon have invalid data. If the system has reached the point where it can start applying an update, the current slot is marked as successful even if other major components are broken (such as the UI in a crash loop) since it's possible to push new software to fix these major problems.
The update payload is an opaque blob with the instructions to update to the new version. The update payload consists of two parts: the metadata and the extra data associated with the instructions. The metadata is relatively small and contains a list of operations to produce and verify the new version on the target slot. For example, an operation could decompress a certain blob and write it to certain blocks in a target partition, or read from a source partition, apply a binary patch, and write to certain blocks in a target partition. The extra data associated with the operations, not included in the metadata, is the bulk of the update payload and would consist of the compressed blob or binary patch in these examples.
Step 3: The payload metadata is downloaded.
Step 4: For each operation defined in the metadata, in order, the associated data (if any) is downloaded to memory, the operation is applied, and the associated memory is discarded.
These two steps take most of the update time, as they involve writing and downloading large amounts of data, and are likely to be interrupted for reasons of policy or reboot.
Step 5: The whole partitions are re-read and verified against the expected hash.
Step 6: The post-install step (if any) is run.
In the case of an error during the execution of any step, the update fails and is re-attempted with possibly a different payload. If all the steps so far have succeeded, the update succeeds and the last step is executed.
Step 7: The unused slot is marked as active by calling setActiveBootSlot().
Marking the unused slot as active doesn't mean it will finish booting. The bootloader (or the system itself) can switch the active slot back if it doesn't read a successful state.
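The ordering of the seven steps can be summarized with a small Python sketch. It is not how update_engine is implemented; FakeBootControl is a hypothetical in-memory stand-in that only records calls, and the steps argument is an assumed list of callables representing steps 3 to 6. Only the three method names come from the text above.

```python
class FakeBootControl:
    """Hypothetical in-memory stand-in for the boot_control HAL; records calls only."""

    def __init__(self):
        self.calls = []

    def markBootSuccessful(self):
        self.calls.append("markBootSuccessful")

    def setSlotAsUnbootable(self, slot):
        self.calls.append(f"setSlotAsUnbootable({slot})")

    def setActiveBootSlot(self, slot):
        self.calls.append(f"setActiveBootSlot({slot})")


def apply_update(boot_control, target_slot, steps):
    """Run the update sequence; return True only if every step succeeded.

    `steps` holds callables for steps 3-6 (download metadata, apply
    operations, verify partitions, post-install), each returning True on success.
    """
    boot_control.markBootSuccessful()               # Step 1: protect the current slot
    boot_control.setSlotAsUnbootable(target_slot)   # Step 2: target is about to be overwritten
    for step in steps:                              # Steps 3-6
        if not step():
            return False  # target stays unbootable; the current slot is untouched
    boot_control.setActiveBootSlot(target_slot)     # Step 7: only after everything succeeded
    return True
```

The point of the ordering is that the target slot is never made active unless every earlier step has passed, so an interrupted or failed update leaves the device booting from the current slot.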
Post-install step
The post-install step consists of running a program from the "new update" version while still running in the old version. If defined in the OTA package, this step is mandatory and the program must return with exit code 0; otherwise, the update fails.
For every partition where a post-install step is defined, update_engine mounts the new partition into a specific location and executes the program specified in the OTA relative to the mounted partition. For example, if the post-install program is defined as usr/bin/postinstall in the system partition, this partition from the unused slot will be mounted in a fixed location (for example, in /postinstall_mount) and the /postinstall_mount/usr/bin/postinstall command is executed. Running a new program on the old system has several implications:
- The old kernel needs to be able to mount the new filesystem format. The filesystem type cannot change unless there's support for it in the old kernel (which includes details such as the compression algorithm used if using a compressed filesystem like SquashFS).
- The old kernel needs to understand the new partition's post-install program format. If using an ELF binary, it should be compatible with the old kernel (consider, for example, a 64-bit new program running on an old 32-bit kernel if the architecture switched from 32- to 64-bit builds). Also, the libraries will be loaded from the old system image, not the new one, unless the loader (ld) is instructed to use other paths or the program is built as a static binary.
- The new post-install program will be limited by the SELinux policies defined in the old system.
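As a small illustration of how the program path is resolved against the mounted partition, the following Python sketch joins a mount point with the relative path from the OTA package and checks the exit code rule. The function names are hypothetical and the mount point is the example value used in this section; update_engine itself does not expose these functions.

```python
import posixpath


def postinstall_command(mount_point, program_path):
    """Return the absolute path of the post-install program on the mounted new partition."""
    if posixpath.isabs(program_path):
        # The OTA package specifies the path relative to the partition root.
        raise ValueError("post-install path must be relative to the partition root")
    return posixpath.join(mount_point, program_path)


def postinstall_succeeded(exit_code):
    """A mandatory post-install step passes only when the program exits with code 0."""
    return exit_code == 0
```

For the example in the text, postinstall_command("/postinstall_mount", "usr/bin/postinstall") yields /postinstall_mount/usr/bin/postinstall.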
An example case is to use a shell script as a post-install program (interpreted by the old system's shell binary with a #! marker at the top).
Another example case is to run the post-install step from a dedicated smaller partition, so the filesystem format in the main system partition can be updated without incurring backward compatibility issues or stepping-stone updates, allowing users to update straight to the latest version from a factory image.
Due to the SELinux policies, the post-install step is suitable for performing tasks required by design on a given device or other best-effort tasks: update the A/B-capable firmware or bootloader, prepare copies of some databases for the new version, etc. This step is not suitable for one-off bug fixes before reboot that require unforeseen permissions.
The selected post-install program runs in the postinstall SELinux context. All the files in the new mounted partition will be tagged with postinstall_file, regardless of what their attributes are after rebooting into that new system. Changes to the SELinux attributes in the new system won't impact the post-install step. If the post-install program needs extra permissions, those must be added to the post-install context.
Implementation
OEMs and SoC vendors who wish to implement the feature must add the following support to their bootloaders:
- Pass the correct parameters to the kernel
- Implement the boot_control HAL (https://android.googlesource.com/platform/hardware/libhardware/+/master/include/hardware/boot_control.h)
- Implement the state machine as shown in Figure 1:
Figure 1.
The boot control HAL can be tested using the bootctl utility.
Some tests have been implemented for Brillo:
- https://android.googlesource.com/platform/system/extras/+/refs/heads/master/tests/bootloader/
- https://chromium.googlesource.com/chromiumos/third_party/autotest/+/master/server/site_tests/brillo_BootLoader/brillo_BootLoader.py
Kernel patches
- https://android-review.googlesource.com/#/c/158491/
- https://android-review.googlesource.com/#/q/status:merged+project:kernel/common+branch:android-3.18+topic:A_B_Changes_3.18
Kernel command line arguments
The kernel command line arguments must contain the following extra arguments:
skip_initramfs rootwait ro init=/init root="/dev/dm-0 dm=system none ro,0 1 \
android-verity <public-key-id> <path-to-system-partition>"
The <public-key-id> value identifies the public key used for verification; the X.509 certificate containing that key must be present in the system keyring. To convert a PEM-formatted X.509 certificate to DER format:
openssl x509 -in <x509-pem-certificate> -outform der -out <x509-der-certificate>
To check that the key is in the system keyring, read /proc/keys on the device:
angler:/# cat /proc/keys
1c8a217e I------ 1 perm 1f010000 0 0 asymmetri
Android: 7e4333f9bba00adfe0ede979e28ed1920492b40f: X509.RSA 0492b40f []
2d454e3e I------ 1 perm 1f030000 0 0 keyring
.system_keyring: 1/4
Successful inclusion of the X.509 certificate indicates the presence of the public key in the system keyring. In the output above, the public key ID is Android: 7e4333f9bba00adfe0ede979e28ed1920492b40f.
As the next step, replace the space with # and pass the result as <public-key-id> in the kernel command line. For example, in the above case, the following is passed in place of <public-key-id>:
Android:#7e4333f9bba00adfe0ede979e28ed1920492b40f
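The space-to-# substitution is easy to get wrong by hand, so a short Python sketch may help. The helper names are hypothetical, and key_description_from_proc_keys assumes the description line has the same layout as the /proc/keys output shown above (key ID, then ": ", then the key type details).

```python
def key_description_from_proc_keys(line):
    """Strip the trailing ': X509.RSA ...' details from a /proc/keys description line.

    Assumes the layout shown in this section's example output.
    """
    return line.strip().rsplit(": ", 1)[0]


def public_key_id(key_description):
    """Replace spaces with '#' so the ID can be passed on the kernel command line."""
    return key_description.replace(" ", "#")


line = "Android: 7e4333f9bba00adfe0ede979e28ed1920492b40f: X509.RSA 0492b40f []"
print(public_key_id(key_description_from_proc_keys(line)))
# prints Android:#7e4333f9bba00adfe0ede979e28ed1920492b40f
```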
Recovery
The recovery RAM disk is now contained in the boot.img file. When going into recovery, the bootloader cannot put the skip_initramfs option on the kernel command line.
Build variables
Must define for the A/B target:
Fstab
The slotselect flag must be on the fstab line for each A/B partition. For example:
<path-to-block-device>/vendor /vendor ext4 ro wait,verify=<path-to-block-device>/metadata,slotselect
Note that there should be no partition named vendor; instead, the partition vendor_a or vendor_b is selected and mounted on the /vendor mount point.
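The effect of slotselect can be pictured as appending the current slot suffix to the block device name before mounting. The Python sketch below only illustrates that idea; the function name and the /dev/block/by-name/vendor path in the usage note are hypothetical, and the real resolution happens inside the platform's fstab handling.

```python
def resolve_block_device(device, fs_mgr_flags, slot_suffix):
    """Return the block device to mount for an fstab entry.

    `fs_mgr_flags` is the comma-separated flag field of the entry. When it
    contains `slotselect`, the current slot suffix is appended to the device name.
    """
    flags = [flag.strip() for flag in fs_mgr_flags.split(",")]
    if "slotselect" in flags:
        return device + slot_suffix
    return device
```

For instance, with a hypothetical device path /dev/block/by-name/vendor, flags wait,slotselect, and suffix _a, the function returns /dev/block/by-name/vendor_a; without slotselect the path is returned unchanged.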
Kernel slot arguments
The current slot suffix should be passed either through a specific DT node (/firmware/android/slot_suffix) or through the androidboot.slot_suffix kernel command line argument.
Optionally, if the bootloader implements fastboot, the following commands and variables should be supported:
Commands
set_active <slot-suffix>
Variables
has-slot:<partition-base-name-without-any-suffix>
current-slot
slot-suffixes
slot-successful:<slot-suffix>
slot-unbootable:<slot-suffix>
slot-retry-count:
These variables should all appear in the output of:
fastboot getvar all
OTA package generation
The OTA package tools follow the same commands as the commands for non-A/B devices. The target_files.zip file must be generated with the build variables for the A/B target defined.
For example, use the following to generate a full OTA:
./build/tools/releasetools/ota_from_target_files \
dist_output/tardis-target_files.zip ota_update.zip
Or, generate an incremental OTA:
./build/tools/releasetools/ota_from_target_files \
-i PREVIOUS-tardis-target_files.zip \
dist_output/tardis-target_files.zip incremental_ota_update.zip
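When these invocations are scripted, the only difference between a full and an incremental OTA is the -i option. A minimal Python sketch that builds the argument list is shown below; the ota_command helper is hypothetical and simply mirrors the two commands above without running them.

```python
def ota_command(target_files, output, previous_target_files=None):
    """Build the ota_from_target_files argument list for a full or incremental OTA."""
    cmd = ["./build/tools/releasetools/ota_from_target_files"]
    if previous_target_files is not None:
        # An incremental OTA is generated against a previous target_files.zip.
        cmd += ["-i", previous_target_files]
    cmd += [target_files, output]
    return cmd
```

The resulting list can be passed to subprocess.run from the root of the source tree.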
Configuration
Partitions
The Update Engine can update any pair of A/B partitions defined in the same disk.
A pair of partitions has a common prefix (such as system or boot) and a per-slot suffix (such as _a or -a) as defined by the boot_control HAL in the function getSuffix(). The list of partitions for which the payload generator defines an update is configured by the AB_OTA_PARTITIONS make variable. For example, if a pair of partitions bootloader_a and bootloader_b are included (assuming _a and _b are the slot suffixes), they can be updated by specifying the following:
AB_OTA_PARTITIONS := \
boot \
system \
bootloader
All the partitions updated by the Update Engine must not be modified by the rest of the system. During incremental or delta updates, the binary data from the current slot is used to generate the data in the new slot. Any modification may cause the new slot data to fail verification during the update process, and therefore fail the update.
Post-install
The post-install step can be configured differently for each updated partition using a set of key-value pairs.
To run a program located at /system/usr/bin/postinst in a new image, specify the path relative to the root of the filesystem in the system partition. For example, usr/bin/postinst is system/usr/bin/postinst (if not using a RAM disk). Additionally, specify the filesystem type to pass to the mount(2) system call. Add the following to the product or device .mk file:
AB_OTA_POSTINSTALL_CONFIG += \
RUN_POSTINSTALL_system=true \
POSTINSTALL_PATH_system=usr/bin/postinst \
FILESYSTEM_TYPE_system=ext4
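Because each key carries the partition name as a suffix, the key-value pairs can be grouped per partition. The following Python sketch shows one way to do that grouping; it is not the build system's parser, and the function name and the returned dictionary layout are assumptions made for illustration. The key prefixes are the ones that appear in this document.

```python
# Key prefixes seen in AB_OTA_POSTINSTALL_CONFIG entries in this document.
_PREFIXES = (
    "RUN_POSTINSTALL_",
    "POSTINSTALL_PATH_",
    "FILESYSTEM_TYPE_",
    "POSTINSTALL_OPTIONAL_",
)


def parse_postinstall_config(pairs):
    """Group KEY_<partition>=value pairs into {partition: {KEY: value}}."""
    config = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected KEY=value, got {pair!r}")
        for prefix in _PREFIXES:
            if key.startswith(prefix):
                partition = key[len(prefix):]
                config.setdefault(partition, {})[prefix.rstrip("_")] = value
                break
        else:
            raise ValueError(f"unknown post-install key: {key!r}")
    return config
```

Applied to the three entries above, this produces a single system entry holding the run flag, the program path, and the filesystem type.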
App compilation in background
Compiling apps in the background for A/B updates requires the following two additions to the product's device configuration (in the product's device.mk):
- Include the native components in the build. This ensures the compilation script and binaries are compiled and included in the system image.
# A/B OTA dexopt package
PRODUCT_PACKAGES += otapreopt_script
- Connect the compilation script to update_engine such that it is run as a post-install step.
# A/B OTA dexopt update_engine hookup
AB_OTA_POSTINSTALL_CONFIG += \
RUN_POSTINSTALL_system=true \
POSTINSTALL_PATH_system=system/bin/otapreopt_script \
FILESYSTEM_TYPE_system=ext4 \
POSTINSTALL_OPTIONAL_system=true
See First boot installation of DEX_PREOPT files to install the preopted files in the unused second system partition.
From: https://blog.51cto.com/u_16248677/7385190