kumquat-buildroot/system
Yann E. MORIN 10c637ab06 package/skeleton-init-systemd: add option to use overlayfs on /var
Systemd requires /var to be writeable [1]. With read-only rootfs, we
need a solution that makes sure /var is writeable. We already have a
solution using a factory, with systemd-tmpfiles. This approach has a few
limitations:

- The behaviour of what happens when the rootfs is updated and the
  contents of the factory /var changes are not very intuitive.

- systemd-tmpfiles is not started super early in the boot, so there's a
  relatively long time that /var is not writeable. There is also no easy
  way in systemd to express dependencies on the subdirectories of /var
  to have been populated from the factory.

- The contents of /var is duplicated. If it is big, the rootfs size
  increases unnecessarily and it takes a long time before the copying is
  done. This is also not done atomically.

This commit adds an alternative using an overlay filesystem that has the
following characteristics:

-   Don't depend on anything being available, except the
    API File Systems [2]. In other words, this can be done very early in
    the boot process. This is useful because /var is meant to be
    available before normal and even some early services are running.

-   Be a clean drop-in, that can be trivially added / removed.

-   Make sure that overlayfs is available in the kernel.

-   Units are (partially) reusable for custom solutions. This goal is
    actually not fully reached yet: for that the service file should be
    converted into a template, and the mount unit should use a specifier
    for all repeated references to /var.

Mounting the overlay is slightly acrobatic and requires a few steps:

- First, we have to make sure the directories for overlayfs's upper,
  lower and work directories are available on a tmpfs. Note that
  "upper" and "work" must be on the same filesystem.

- The writeable overlay upper directory must be mounted.

- The original contents of /var must be bind-mounted to the overlay
  lower directory.

- Finally, the overlay must be mounted on /var.

For the overlayfs directories, we create a tree on /run. Since there is
no standard name convention for this, we create a new directory
"/run/buildroot" with subdirectory "mounts" for everything
mount-related. Below that, a subdirectory is created for every mount
point that needs helper directories. Thus, we arrive to
/run/buildroot/mounts/var as the base directory for the overlay. Below
this, the directories lower, upper and work are created.

The bind-mount of /var is done in the same service as the one creating
the overlay lower, upper and work directories. Creating those
directories can't be done in a mount unit, and bind-mounting /var in a
mount unit would create a circular dependency. Indeed, if we had a mount
unit to do the bind mount, then it sould look like:
    # run-buildroot-mounts-var-lower.mount
    [Mount]
    What=/var
    Where=/run/buildroot/mounts/var/lower
    Options=bind

and then the var.mount unit would need to have a dependency on that
unit:
    # var.mount
    [Unit]
    After=run-buildroot-mounts-var-lower.mount
    [Mount]
    Where=/var

However, the What=/var of the first unit automatically adds an implicit
dependency on /var, and since there is a unit providing Where=/var, we
would have run-buildroot-mounts-var-lower.mount depend on var.mount, but
we need var.mount to depend on run-buildroot-mounts-var-lower.mount, so
this is a circular dependency. There is no way to tell systemd no to add
the implicit dependency. So we do the bind mont manually in the service
unit that prepares the overlay structure.

For the writeable upper layer, we don't need to do anything. In the
default configuration, the upper layer is supposed to be a tmpfs, and
/run/buildroot/mounts/var/upper is already a tmpfs so it can serve as
is. To make it persistent, we suggest to the user to mount a writeable,
persistent filesystem on /run/buildroot/mounts/var. The
RequiresMountsFor dependency in the prepare-var-overlay service makes
sure that that mount is performed before the overlay is started. Using
/run/buildroot/mounts/var/upper as the mount point sounds more logical
at first, but since the work directory is supposed to be on the same
filesystem as the upper directory, this wouldn't work very well.

As example, consider using /dev/sdc1 as upper layer for var, this can be
achieved by adding the following line to fstab:

/dev/sdc1	/run/buildroot/mounts/var	ext4	defaults

Systemd will convert this into a mount unit with all the proper
dependencies.

Norbert provided some systemd units as a starting point, and that was
quite a huge help in understanding how to fit all those things together.

[1] - https://www.freedesktop.org/wiki/Software/systemd/APIFileSystems/

Co-authored-by: Norbert Lange <nolange79@gmail.com>
Signed-off-by: Yann E. MORIN <yann.morin@orange.com>
Cc: Norbert Lange <nolange79@gmail.com>
Cc: Romain Naour <romain.naour@smile.fr>
Cc: Jérémy Rosen <jeremy.rosen@smile.fr>
Signed-off-by: Yann E. MORIN <yann.morin.1998@free.fr>
[Arnout:
 - Merge commit messages from Yann and from Norbert.
 - Remove the run-buildroot-mounts-var.mount unit; instead, just reuse
   the existing tmpfs for the upper layer in the default case.
 - Update the help text to explain how to mount a custom upper layer
   with fstab.
]
Signed-off-by: Arnout Vandecappelle <arnout@mind.be>
2023-10-08 20:12:01 +02:00
..
skeleton system/skeleton: provide run/lock directory 2022-01-12 20:38:09 +01:00
Config.in package/skeleton-init-systemd: add option to use overlayfs on /var 2023-10-08 20:12:01 +02:00
device_table_dev.txt
device_table.txt
system.mk