Facing the Challenges of Updating Complex Systems Putting it all together FOSDEM 2018, Enrico Jörns, Pengutronix e.K. 1/31
About Me ● Enrico Jörns ● Embedded Software Engineer ● RAUC update framework co-maintainer 2/31
Motivation Updating is a solved topic!..? device data application init watchdog bootloader 3/31
Motivation deployment testing server device data application init watchdog bootloader 4/31
Bootloader Support – Barebox select disk0.1 algorithm config disk0.2 boot targets persistent status bootchooser 6/31
Bootchooser Framework watchdog failed reset disk0.1 boot disk0.2 reset: power-on attempts -- system0: disk0.1 system1: disk0.2 highest priority attempts > 0 attempts > 0 attempts:=3 power on priority: 20 priority: 10 attempts: 3 attempts: 3 system0 system1 7/31
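The selection rule on the bootchooser slide can be sketched in a few lines of Python. This is only an illustration of the logic shown in the diagram (highest priority wins, attempts must be above zero, a power-on reset sets attempts back to 3). It is not barebox code, and the names Target, choose and reset_attempts are made up for this example.

```python
from dataclasses import dataclass

DEFAULT_ATTEMPTS = 3  # "attempts:=3" on the slide


@dataclass
class Target:
    name: str
    device: str
    priority: int
    attempts: int


def reset_attempts(targets):
    """Power-on reset: give every target its full attempts budget again."""
    for target in targets:
        target.attempts = DEFAULT_ATTEMPTS


def choose(targets):
    """Return the highest-priority target that still has attempts left.

    The attempts counter is decremented before booting, so a target that
    keeps failing (for example through watchdog resets) is skipped once
    the counter reaches zero. Returns None if nothing is bootable.
    """
    candidates = [t for t in targets if t.attempts > 0]
    if not candidates:
        return None
    best = max(candidates, key=lambda t: t.priority)
    best.attempts -= 1
    return best


# Values taken from the slide.
targets = [
    Target("system0", "disk0.1", priority=20, attempts=3),
    Target("system1", "disk0.2", priority=10, attempts=3),
]
```

Calling choose(targets) repeatedly without ever marking a boot as good walks through the fallback path: system0 is tried until its attempts are used up, then system1 takes over.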
X86 – Pure UEFI Boot BootNext 0001 BootOrder 0001,0002 BootEntries HD(1,GPT,<UUID-1>/File(KernelA),rootfs=.... 0001 HD(1,GPT,<UUID-2>/File(KernelA),rootfs=.... 0002 kernel A kernel B rootfs A rootfs B system A system B 8/31
Updating The Bootloader? boot0 boot.img extCSD boot1 system user eMMC 10/31
Updating The Bootloader? boot0 boot.img extCSD boot1 system user eMMC 11/31
Updating The Bootloader? atomic boot0 boot.img extCSD system user eMMC 12/31
Detecting Freezes – Watchdogs! ROM Loader, Bootloader, Kernel, System 18/31
Detecting Freezes – Watchdogs! Watchdog start reset ROM Loader, Bootloader, Kernel, System 19/31
systemd – Watchdog Multiplexer app1.service WatchdogSec=10 watchdog.conf RuntimeWatchdogSec=10 ShutdownWatchdogSec=300 app2.service WatchdogSec=20 mux HW-Watchdog app2.service WatchdogSec=30 SW-Watchdogs 20/31
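From the application side, the software watchdogs in this picture work roughly as follows: a service started with WatchdogSec= finds the timeout in the WATCHDOG_USEC environment variable and has to send WATCHDOG=1 to the socket named in NOTIFY_SOCKET before it expires. Below is a minimal sketch of that in plain Python, without the libsystemd bindings. The function names watchdog_interval and notify are chosen for this example only.

```python
import os
import socket


def watchdog_interval():
    """Return half of the configured timeout in seconds, or None.

    Pinging at half the WatchdogSec= interval is the commonly
    recommended safety margin.
    """
    usec = os.environ.get("WATCHDOG_USEC")
    if not usec:
        return None
    return int(usec) / 1_000_000 / 2


def notify(message):
    """Send one datagram to systemd's notification socket.

    Returns False when the process is not supervised by systemd
    (NOTIFY_SOCKET unset), True after the message was sent.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract namespace socket.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode(), path)
    return True
```

A service would call notify("WATCHDOG=1") from its main loop every watchdog_interval() seconds, and only while it considers itself healthy. If the pings stop, systemd applies the service's failure handling, while systemd itself keeps feeding the hardware watchdog.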
systemd ● Central control and overview! ● Service Failure Configuration – Restart – RestartSec – ... ● Watchdog Multiplexer ● /system-update – bootstrapping config / data 21/31
Data Storage / Migration Data in rootfs ● copy by updater! 2. /etc ● migration: simple 1. ● fallback: old data! update rootfs rootfs 22/31
Data Storage / Migration Data in separate slot ● no copying datafs ● mount to /data /data /data ● migration: simple update ● fallback: tricky! rootfs rootfs 23/31
Data Storage / Migration Data in two separate slots ● copying by updater datafs datafs ● migration by /data /data app application ● mounting: tricky update rootfs rootfs ● fallback: old data! 24/31
Updating and Trusted Boot dm-verity dm-integrity Build System image tar hash tree extract install ext4 r/o dm-integrity journal tags block device r/w Target 25/31
Testing Updates – Labgrid update-test.py - provide update - trigger install - power cycle - test bootloader Bundle - test linux Labgrid HW / Qemu ShellDriver Linux BareboxDriver Barebox Power PowerDriver 26/31
casync ● Image updates over Network – Too large (slow connection) – Temporary storage required → delta updates → do not reinvent the wheel “casync (content-addressable synchronisation) is a Linux software utility designed to distribute frequently-updated file system images over the Internet.” [Wikipedia] 29/31
casync – Chunking block device / directory tree serialized stream hashing (ID) compressing #b389 #4a23 #007c #7f2b #ba32 #2ef5 .caidx index file chunk store 30/31
casync – Extracting chunk store https #b389 #4a23 #007c #7f2b #ba32 #2ef5 .caidx index file serialized stream block device / directory tree 31/31
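The chunking and extracting slides describe a round trip: cut the serialized stream into chunks, name each chunk by its hash, compress it into a store, and keep the ordered list of IDs as the index. The following toy version shows the idea only. It assumes a simple rolling byte sum, SHA-256 and zlib, which are stand-ins and not the algorithms or on-disk formats casync actually uses; all names and constants are illustrative.

```python
import hashlib
import zlib

WINDOW = 16      # bytes covered by the rolling sum (illustrative)
MASK = 0x3F      # cut where the low 6 bits of the sum are all set
MIN_CHUNK = 16   # never emit chunks shorter than this, except the last


def split(stream: bytes):
    """Yield chunks whose boundaries are derived from the content itself."""
    start = 0
    rolling = 0
    for i, byte in enumerate(stream):
        rolling += byte
        if i - start >= WINDOW:
            rolling -= stream[i - WINDOW]  # drop the byte leaving the window
        if i - start + 1 >= MIN_CHUNK and (rolling & MASK) == MASK:
            yield stream[start:i + 1]
            start = i + 1
            rolling = 0
    if start < len(stream):
        yield stream[start:]


def make_index(stream: bytes, store: dict):
    """Fill the chunk store and return the ordered list of chunk IDs."""
    index = []
    for chunk in split(stream):
        chunk_id = hashlib.sha256(chunk).hexdigest()
        store.setdefault(chunk_id, zlib.compress(chunk))  # identical chunks stored once
        index.append(chunk_id)
    return index


def extract(index, store):
    """Rebuild the serialized stream from the index and the chunk store."""
    return b"".join(zlib.decompress(store[chunk_id]) for chunk_id in index)
```

Because boundaries depend on nearby content and not on absolute offsets, inserting a few bytes early in an image tends to change only the chunks around the insertion, which is what makes the scheme useful for delta updates.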
casync – RAUC .caidx metadata chunk store update install slot A slot B seed store 32/31
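The role of the seed on this slide can be illustrated like this: while installing into the inactive slot, every chunk that is already present in the active slot is taken from there, and only the missing chunks are fetched from the remote chunk store. This is a sketch under simplifying assumptions (chunks are passed in pre-split, SHA-256 serves as the ID); the install and fetch names are hypothetical and are not RAUC or casync interfaces.

```python
import hashlib


def chunk_id(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()


def install(index, seed_chunks, fetch):
    """Rebuild the new image, calling fetch() only for chunks the seed lacks.

    index       -- ordered chunk IDs describing the new image
    seed_chunks -- chunks of the currently installed image (the seed)
    fetch       -- callable returning the chunk for an ID (e.g. over HTTPS)
    """
    seed = {chunk_id(c): c for c in seed_chunks}
    image = bytearray()
    downloaded = []
    for cid in index:
        if cid in seed:
            image += seed[cid]
            continue
        chunk = fetch(cid)
        if chunk_id(chunk) != cid:
            raise ValueError(f"chunk {cid} failed verification")
        downloaded.append(cid)
        image += chunk
    return bytes(image), downloaded


# Example: only the application chunk differs between old and new image.
old_chunks = [b"kernel-v1", b"rootfs-common", b"app-v1"]
new_chunks = [b"kernel-v1", b"rootfs-common", b"app-v2"]
remote_store = {chunk_id(c): c for c in new_chunks}
new_index = [chunk_id(c) for c in new_chunks]
```

With these example values, install(new_index, old_chunks, remote_store.__getitem__) reconstructs the full new image while requesting a single chunk from the remote store.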
Field Deployment update 33/31
HawkBit – Deployment Server Management Web UI API Device Integration API 34/31
Field Deployment – HawkBit error threshold group 1 group 2 group 3 35/31
Field Deployment – HawkBit error threshold group 1 group 2 group 3 36/31
Field Deployment – HawkBit error threshold group 1 group 2 group 3 stop! 37/31
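The staged rollout in these three slides boils down to a simple rule: update one group after the other and stop as soon as the share of failed devices in a group exceeds the error threshold. A minimal sketch of that rule follows. It does not use the HawkBit API; run_rollout, the update callback and the 20 % threshold are assumptions made for the example.

```python
def run_rollout(groups, update, error_threshold=0.2):
    """Update groups in order and stop once a group fails too often.

    groups          -- list of (group name, list of devices)
    update          -- callable returning True if the device updated fine
    error_threshold -- highest tolerated failure ratio per group
    Returns (list of (group name, failure ratio), completed flag).
    """
    finished = []
    for name, devices in groups:
        failures = sum(1 for device in devices if not update(device))
        ratio = failures / len(devices) if devices else 0.0
        finished.append((name, ratio))
        if ratio > error_threshold:
            return finished, False  # "stop!": later groups stay untouched
    return finished, True


# Example: two devices in group 2 fail to update.
groups = [
    ("group 1", ["a", "b", "c", "d", "e"]),
    ("group 2", ["f", "g", "h", "i", "j"]),
    ("group 3", ["k", "l", "m", "n", "o"]),
]
broken = {"f", "g"}
```

Running run_rollout(groups, lambda device: device not in broken) processes group 1 and group 2, then stops before group 3 is touched, so a bad update reaches only part of the fleet.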
Conclusion ● Update Frameworks cannot provide full solutions ● Not just stacking components ● Fine-grained configuration ● Updating is highly use-case specific 38/31
Questions? 39/31
Links ● github.com/rauc ● rauc.readthedocs.io ● github.com/systemd/casync ● github.com/labgrid-project/labgrid ● github.com/eclipse/hawkbit 40/31