kumquat-buildroot/support/download
Yann E. MORIN 3035fc23de support/download: even more reproducible archives (until next time)
Currently, when we generate archives, we rely on a few assumptions and
mechanisms to ensure reproducibility. So far, we mostly accounted for
the content (i.e. content, filenames, and path) of the files we
archived, and this is OK (git and svn should provide reproducible
content by design, and cargo and go vendoring are also supposed to be
generating reproducible content).

However, tarballs do not only contain the content of the files; they
also carry some metadata about those files. Beyond filenames and paths,
which are already reproducible, there is the timestamp, the user and
group name and ID. Those are also accounted for and made reproducible.

The final touch (so far!) is that files have access rights (aka mode),
and those too are stored in tarballs. So far we accounted for those by
ensuring that Buildroot would always run under a known umask, thus
generating files with reproducible modes.
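As a minimal sketch of why a fixed umask gives predictable modes (this is
an illustration, not Buildroot's code; the helper name mode_for_umask is
hypothetical, and `stat -c` assumes GNU coreutils on a filesystem without
a default ACL on the temporary directory):

```shell
#!/bin/sh
set -eu

work="$(mktemp -d)"

# Hypothetical helper: create an empty file under the given umask, in a
# subshell so the caller's umask is untouched, and print its octal mode.
mode_for_umask() {
    (
        umask "$1"
        : >"${work}/f-$1"
        stat -c '%a' "${work}/f-$1"
    )
}

strict="$(mode_for_umask 0022)"
loose="$(mode_for_umask 0002)"

printf 'umask 0022 -> %s\n' "${strict}"
printf 'umask 0002 -> %s\n' "${loose}"

rm -rf "${work}"
```

The same file ends up with a different mode depending on the umask in
effect, which is why pinning the umask makes the modes predictable.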

That falls short in one case that we did not envision, though: a shared
download directory, where extended attributes are set to provide a
default ACL that is permissive, to allow two or more users (with
different uid and gid) to all read and write to such a directory. This
is trivially achieved with something like:

    $ mkdir -p "${BR2_DL_DIR}"
    $ setfacl -m 'default:user::rwx' "${BR2_DL_DIR}"
    $ setfacl -m 'default:group::rwx' "${BR2_DL_DIR}"
    $ setfacl -m 'default:other::rwx' "${BR2_DL_DIR}"

This has the effect that:

  - files below BR2_DL_DIR are all set with user, group, and world read
    and write access,
  - files executable by the owner will also be group and world
    executable,
  - directories are user, group, and world readable, writable, and
    searchable.

This means that all the archives we generate from files in BR2_DL_DIR
will have modes that are different from those generated on other systems,
where only the traditional umask is used.

There are various ways to solve that issue:

  - detect the situation and abort: that's not nice, because users have
    a legitimate reason to want to share that directory,

  - find a solution for each affected download mechanism: git, svn, hg,
    cvs, bzr... and for each of the affected vendoring mechanisms: go and
    cargo [0]; this is not nice, because it means a lot of repetition,
    with the risk that they diverge over time (e.g. one is fixed for a
    newer issue, while the others are left out due to an oversight...)

  - find a single, common solution that works in all cases, whatever the
    download mechanism and/or vendoring: this is the best, because we
    can extend and fix it once and everything else benefits from it.

We obviously go for the third option.

The common solution is rather simple. When creating the tarball in
support/download/helpers, give an option to tar to set the group and
other permissions to those of the user, but without write permission.
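A minimal sketch of that idea, assuming GNU tar and its --mode option
with a symbolic mode; the exact option string used in
support/download/helpers is not shown here, so 'go=u,go-w' is an
assumption chosen to match the description above (copy the user bits to
group and other, then drop write). The file names are hypothetical:

```shell
#!/bin/sh
set -eu

work="$(mktemp -d)"
mkdir "${work}/src"

# Three files with deliberately different on-disk modes.
printf 'data\n' >"${work}/src/private.txt"
printf '#!/bin/sh\n' >"${work}/src/tool.sh"
printf 'data\n' >"${work}/src/shared.txt"
chmod 0600 "${work}/src/private.txt"
chmod 0700 "${work}/src/tool.sh"
chmod 0666 "${work}/src/shared.txt"

# Normalise the modes stored in the archive: group and other get the
# user's permissions, minus write. The files on disk are not modified.
tar -C "${work}" --mode='go=u,go-w' -cf "${work}/out.tar" \
    src/private.txt src/tool.sh src/shared.txt

# Keep only the mode string and the member name of each entry.
modes="$(tar -tvf "${work}/out.tar" | awk '{ print $1, $NF }')"
printf '%s\n' "${modes}"

rm -rf "${work}"
```

Whatever the on-disk group and other bits were, the archive only
depends on the user bits, so a permissive default ACL no longer leaks
into the tarball.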

This implies that we must bump the version-suffix for the download
backends [1] and for the vendoring post-processes. It also implies that
the hash may change, under the following circumstances:

- Symlinks normally have permissions 0777 (because symlink permissions
  are in fact meaningless). They will now have permission 0755 in the
  tarball.
- If the original tarball (for vendored go and cargo packages) contained
  files that are readable or executable by owner but not by group or
  other, they will now be readable resp. executable by group and other
  too. Note that for writeable it is not the case, because those were
  already handled by our 0022 umask (which makes them not writeable by
  group and other).
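
The symlink case can be checked with a short sketch like the following,
again assuming GNU tar on Linux and the hypothetical 'go=u,go-w' mode
string (the archive and file names are made up for the illustration):

```shell
#!/bin/sh
set -eu

work="$(mktemp -d)"
mkdir "${work}/src"
printf 'data\n' >"${work}/src/target.txt"
ln -s target.txt "${work}/src/link"

# Archive the symlink itself, once as-is and once with the mode fix.
tar -C "${work}" -cf "${work}/plain.tar" src/link
tar -C "${work}" --mode='go=u,go-w' -cf "${work}/fixed.tar" src/link

# The first column of the verbose listing is the stored mode string.
plain_mode="$(tar -tvf "${work}/plain.tar" | awk '{ print $1 }')"
fixed_mode="$(tar -tvf "${work}/fixed.tar" | awk '{ print $1 }')"

printf 'without --mode: %s\n' "${plain_mode}"
printf 'with --mode:    %s\n' "${fixed_mode}"

rm -rf "${work}"
```

The two archives differ only in the stored symlink mode, which is enough
to change the hash of the tarball.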

Because the hash may change, we need to update the BR_FMT_VERSION for
everything that creates tarballs. Go and cargo didn't have one up to
now, but the previous commit added the possibility to give one. The ones
for git and svn have to be updated. Since it is now possible to have a
suffix for both the VCS and the post-processing, change the suffix to
something more descriptive than "-brX", i.e. -git3 for git, -go1 for
golang, etc.

The hash updates and filename changes will be handled in a follow-up
commit.

[0] Note however that the vendoring is currently not done in a
sub-directory of BR2_DL_DIR, but the cargo and go caches are located
there. Files that get copied from there to the vendoring area would be
tainted as well, and thus we want to address that situation as well.

[1] we currently do not have a CVS version suffix, because we do not
guarantee the reproducibility of CVS archives (we can't); for hg, we are
currently using hg's own archive tool, and presumably that does not have
the mode issue because it is not using the checked-out files. Still,
doing the mode fix in a single location will help extend those two
backends in the future (if that ever happens...).

Reported-by: Peter Korsgaard <peter@korsgaard.com>
Signed-off-by: Yann E. MORIN <yann.morin.1998@free.fr>
Signed-off-by: Arnout Vandecappelle <arnout@mind.be>
2024-05-09 22:44:59 +02:00
..
bzr
cargo-post-process support/download: fix the cargo post-process in face of failed vendoring 2023-02-12 09:39:19 +01:00
check-hash support/download/check-hash: fix shellcheck errors 2024-04-01 17:03:17 +02:00
cvs
dl-wrapper support/download: teach dl-wrapper to handle more than one hash file 2023-11-07 11:48:45 +01:00
file
git support/download/git: handle git attributes 2024-05-09 22:44:51 +02:00
go-post-process support/download/go-post-process: drop -o pipefail 2022-01-09 11:07:37 +01:00
helpers support/download: even more reproducible archives (until next time) 2024-05-09 22:44:59 +02:00
hg
scp
sftp
svn support/download: add support to exclude svn externals 2023-08-06 16:35:52 +02:00
wget