9fbd3d8574
Commit 768f9f80f6 (support/download: generate even more reproducible tarballs) causes non-reproducibility in tarballs we previously generated, especially the archives for two cargo-vendored packages, ripgrep and sentry-cli.

The cause is that those two packages eventually vendor a file that has the u+x bit set, but is otherwise go-x. With 768f9f80f6, the files are now go+x, so the hash for those generated archives has changed.

Besides, that commit was wrong: it did not account for the 'r' bit for the go part, leaving some non-reproducibility still unaccounted for.

So, to generate really reproducible archives, we would need to fix that read bit as well, and that has the potential to affect all the archives we generated so far. If we wanted to do so, we'd need a way to version all generated archives, like we do for git and svn, but now for all the different VCSes, as well as for all the vendoring post-processes.

For 768f9f80f6, all that was of concern was the working copies of VCSes (i.e. git, svn, cvs...) that we cache in the Buildroot download dir, not the temporary files during post-processing. Indeed, in that latter case, the user has virtually no way to tamper with the mode of the intermediate extract before repack. And we do have a big fat warning that users should not attempt to meddle with the git tree that Buildroot caches.

However, what 768f9f80f6 demonstrates is that it took quite a long time between the introduction of the git caching and the time someone eventually discovered they could meddle in there. This shows that the issue is not actually critical in most setups.

Also, the tar manual [0] hints at a better solution to handle reproducibility, which even avoids touching the files on disk, which is even nicer:

    ‘--mode='go+u,go-w'’
        Omit irrelevant information about file permissions.

If we were to actually handle the mode bits for reproducibility, we'd need to:
  - introduce archive versioning for all download backends and post-processing
  - use the method officially suggested by the tar manual

So, revert that change, as it was incomplete, was not really fixing many issues, and causes actual issues.

This reverts commit 768f9f80f6.

[0] https://www.gnu.org/software/tar/manual/tar.html#Reproducibility

Thanks to Vincent and Arnout for pointing at the tar manual.

Reported-by: Antoine Coutant <antoine.coutant@smile.fr>
Reported-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Yann E. MORIN <yann.morin.1998@free.fr>
Cc: Vincent Fazio <vfazio@xes-inc.com>
Cc: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be>
Tested-by: Romain Naour <romain.naour@smile.fr>
Signed-off-by: Antoine Coutant <antoine.coutant@smile.fr>
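
As an illustration only (this is not part of the commit, and not what Buildroot does after this revert), the `--mode='go+u,go-w'` option quoted from the tar manual can be tried in isolation. The sketch below assumes GNU tar and coreutils; `pack_normalised` and the file name `script` are hypothetical names made up for the demonstration. It packs two files that differ only in their group/other permission bits and compares the resulting archives, without changing the modes on disk.

```shell
#!/usr/bin/env bash
set -eu

work="$(mktemp -d)"
trap 'rm -rf "${work}"' EXIT

# Two trees with the same content, differing only in the group/other bits.
mkdir "${work}/a" "${work}/b"
printf 'hello\n' >"${work}/a/script"
printf 'hello\n' >"${work}/b/script"
chmod 0700 "${work}/a/script"   # u+x, but go-x
chmod 0755 "${work}/b/script"   # u+x and go+x

# Hypothetical helper: archive "script" from directory $1 to stdout, letting
# tar derive the group/other bits from the user bits in the archive only.
pack_normalised() {
    tar cf - -C "${1}" \
        --mode='go+u,go-w' \
        --numeric-owner --owner=0 --group=0 \
        --mtime='1970-01-01T00:00:00+00:00' \
        --format=posix \
        --pax-option='delete=atime,delete=ctime,delete=mtime' \
        script
}

sum_a="$(pack_normalised "${work}/a" | sha256sum | cut -d' ' -f1)"
sum_b="$(pack_normalised "${work}/b" | sha256sum | cut -d' ' -f1)"
mode_a="$(pack_normalised "${work}/a" | tar tvf - | cut -d' ' -f1)"

printf 'a: %s\nb: %s\nmode in archive: %s\n' "${sum_a}" "${sum_b}" "${mode_a}"
```

Both checksums should come out identical, since the mode stored in the archive depends only on the user bits. As the message above notes, adopting this for real would still change the hashes of archives generated so far, hence the need for archive versioning first.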
99 lines
2.9 KiB
Bash
Executable File
# Generate a reproducible archive from the content of a directory
#
# $1    : input directory
# $2    : leading component in archive
# $3    : ISO8601 date: YYYY-MM-DDThh:mm:ssZZ
# $4    : output file
# $5... : globs of filenames to exclude from the archive, suitable for
#         find's -path option, and relative to the input directory $1
#
# Notes :
#   - the timestamp is truncated to whole seconds (i.e. any sub-second
#     fractional part is ignored)
#   - must not be called with CWD as, or below, the input directory
#   - some temporary files are created in CWD, and removed at the end
#
# Example:
#   $ find /path/to/temp/dir
#   /path/to/temp/dir/
#   /path/to/temp/dir/some-file
#   /path/to/temp/dir/some-dir/
#   /path/to/temp/dir/some-dir/some-other-file
#
#   $ mk_tar_gz /path/to/temp/dir \
#         foo_bar-1.2.3 \
#         1970-01-01T00:00:00Z \
#         /path/to/foo.tar.gz \
#         '.git/*' '.svn/*'
#
#   $ tar tzf /path/to/foo.tar.gz
#   foo_bar-1.2.3/some-dir/some-other-file
#   foo_bar-1.2.3/some-file
#
mk_tar_gz() {
    local in_dir="${1}"
    local base_dir="${2}"
    local date="${3}"
    local out="${4}"
    shift 4
    local glob tmp pax_options
    local -a find_opts

    for glob; do
        find_opts+=( -or -path "./${glob#./}" )
    done

    # Drop sub-second precision to play nice with GNU tar's valid_timespec check
    date="$(date -d "${date}" -u +%Y-%m-%dT%H:%M:%S+00:00)"

    pax_options="delete=atime,delete=ctime,delete=mtime"
    pax_options+=",exthdr.name=%d/PaxHeaders/%f,exthdr.mtime={${date}}"

    tmp="$(mktemp --tmpdir="$(pwd)")"
    pushd "${in_dir}" >/dev/null

    # Establish the list of non-directory entries, minus the excluded globs
    find . -not -type d -and -not \( -false "${find_opts[@]}" \) >"${tmp}.list"
    # Sort list for reproducibility
    LC_ALL=C sort <"${tmp}.list" >"${tmp}.sorted"

    # Create POSIX tarballs, since that is the most reproducible format
    tar cf - --transform="s#^\./#${base_dir}/#S" \
        --numeric-owner --owner=0 --group=0 --mtime="${date}" \
        --format=posix --pax-option="${pax_options}" \
        -T "${tmp}.sorted" >"${tmp}.tar"

    # Compress the archive
    gzip -6 -n <"${tmp}.tar" >"${out}"

    # Also remove the file created by mktemp itself, not just its siblings
    rm -f "${tmp}"{,.list,.sorted,.tar}

    popd >/dev/null
}

# Unpack a gzip-compressed tarball for post-processing
#
# $1 : destination directory, created here
# $2 : tarball to extract; its leading path component is stripped
#
# The mtime of the first regular file (in C-locale sort order) is recorded
# on "${1}.timestamp", for later use by post_process_repack.
post_process_unpack() {
    local dest="${1}"
    local tarball="${2}"
    local one_file

    mkdir "${dest}"
    tar -C "${dest}" --strip-components=1 -xzf "${tarball}"
    one_file="$(find "${dest}" -type f -print0 |LC_ALL=C sort -z |sed 's/\x0.*//')"
    touch -r "${one_file}" "${dest}.timestamp"
}

# Repack a post-processed directory into a reproducible archive
#
# $1 : directory containing the unpacked tree and its .timestamp file
# $2 : name of the unpacked tree, also used as leading component in archive
# $3 : output file
post_process_repack() {
    local in_dir="${1}"
    local base_dir="${2}"
    local out="${3}"
    local date

    date="@$(stat -c '%Y' "${in_dir}/${base_dir}.timestamp")"

    mk_tar_gz "${in_dir}/${base_dir}" "${base_dir}" "${date}" "${out}"
}

# Keep this line and the following as last lines in this file.
# vim: ft=bash
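
The exclusion list that mk_tar_gz builds from its trailing glob arguments can be tried on its own. The following is a minimal sketch that assumes GNU find and a throw-away directory; the `demo` directory and the file names in it are hypothetical. It reuses the same `-or -path` construction and the same C-locale sort as the function above.

```shell
#!/usr/bin/env bash
set -eu

demo="$(mktemp -d)"
trap 'rm -rf "${demo}"' EXIT

mkdir -p "${demo}/.git" "${demo}/some-dir"
touch "${demo}/some-file" "${demo}/some-dir/some-other-file" "${demo}/.git/config"

# Same construction as in mk_tar_gz: one "-or -path" term per glob, with any
# leading "./" dropped from the glob before it is anchored to "./".
find_opts=()
for glob in '.git/*' './.svn/*'; do
    find_opts+=( -or -path "./${glob#./}" )
done

# "-false" seeds the OR-chain, so every "-or -path ..." term has a left operand.
listing="$(cd "${demo}" && \
    find . -not -type d -and -not \( -false "${find_opts[@]}" \) | LC_ALL=C sort)"

printf '%s\n' "${listing}"
```

Only `./some-dir/some-other-file` and `./some-file` remain in the listing, in that order: directories are skipped by `-not -type d`, and everything below `.git/` matches the first glob.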