[PING^4][PATCH v2] Generate reproducible output independently of the build-path

Ximin Luo infinity0@pwned.gg
Fri Jul 21 16:16:00 GMT 2017


(Please keep me on CC, I am not subscribed)


Proposal
========

This patch series adds a new environment variable BUILD_PATH_PREFIX_MAP. When
it is set, GCC treats its value as extra implicit "-fdebug-prefix-map=$value"
command-line arguments that precede any explicit ones. This makes the final
binary output reproducible, and also keeps the unreproducible value (the source
path prefixes) out of CFLAGS et al., which many build tools (understandably)
embed as-is into their build output.
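
To illustrate the CFLAGS point, here is a minimal Python sketch. It is not part
of the patch: the build directories and the `recorded_flags` helper are
hypothetical, and the environment variable value is a placeholder rather than
the real encoding, which is defined by the specification. It only models a
build tool that records CFLAGS verbatim in its output.

```python
def recorded_flags(build_dir, use_env_var):
    """Model a build tool that embeds CFLAGS as-is into its output.

    Returns (cflags_string, environment). Hypothetical helper for illustration.
    """
    cflags = ["-O2", "-g"]
    env = {}
    if use_env_var:
        # The build path travels in the environment, not in CFLAGS.
        # Placeholder value: the real encoding is given by the specification.
        env["BUILD_PATH_PREFIX_MAP"] = "<mapping for %s>" % build_dir
    else:
        # The build path is passed explicitly and ends up in CFLAGS.
        cflags.append("-fdebug-prefix-map=%s=." % build_dir)
    return " ".join(cflags), env


# Same source, two different (hypothetical) build directories.
flag_a, _ = recorded_flags("/tmp/build-a", use_env_var=False)
flag_b, _ = recorded_flags("/tmp/build-b", use_env_var=False)
env_a, _ = recorded_flags("/tmp/build-a", use_env_var=True)
env_b, _ = recorded_flags("/tmp/build-b", use_env_var=True)

print(flag_a == flag_b)  # False: the recorded CFLAGS differ per build path
print(env_a == env_b)    # True: the recorded CFLAGS are identical
```

With the explicit flag, anything that stores CFLAGS inherits the build path;
with the environment variable, the stored string is the same in both builds.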

This environment variable also acts on the __FILE__ macro, mapping it in the
same way that debug-prefix-map works for debug symbols. We have seen that
__FILE__ is also a very large source of unreproducibility, and is represented
quite heavily in the 3k+ figure given earlier.

Finally, we tweak the mapping algorithm so that it applies only to whole path
components when matching prefixes. This is justified in further detail in the
patch header. It is an optional part of the patch series and could be dropped
if the GCC maintainers are not convinced by our arguments there.
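
As a rough illustration of what matching on whole path components means, here
is a small Python sketch. It is not the code from the patch: the function names
and example paths are hypothetical, and it assumes "/" as the separator and an
old prefix without a trailing slash.

```python
def remap_plain(path, old, new):
    """Plain string-prefix matching, shown for comparison."""
    if path.startswith(old):
        return new + path[len(old):]
    return path


def remap_components(path, old, new):
    """Remap only when `old` matches a whole number of path components."""
    if path == old:
        return new
    if path.startswith(old + "/"):
        return new + path[len(old):]
    return path


# A path that really is below the prefix is remapped by both variants.
print(remap_plain("/build/foo/src/a.c", "/build/foo", "pkg"))       # pkg/src/a.c
print(remap_components("/build/foo/src/a.c", "/build/foo", "pkg"))  # pkg/src/a.c

# A sibling directory that merely shares a string prefix is only touched
# by the plain variant.
print(remap_plain("/build/foobar/a.c", "/build/foo", "pkg"))        # pkgbar/a.c
print(remap_components("/build/foobar/a.c", "/build/foo", "pkg"))   # /build/foobar/a.c
```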


Background
==========

We have prepared a document that describes how this works in detail, so that
projects can be confident that they are interoperable:

https://reproducible-builds.org/specs/build-path-prefix-map/

The specification is currently in DRAFT status, awaiting some final feedback,
including what the GCC maintainers think about it.

We have written up some more detailed discussions on the topic, including a
thorough justification on why we chose the mechanism of environment variables:

https://wiki.debian.org/ReproducibleBuilds/StandardEnvironmentVariables

The previous iteration of the patch series, essentially the same as the current
re-submission, is here:

https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00513.html

An older version, that explains some GCC-specific background, is here:

https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00182.html

The current patch series applies cleanly to GCC-8 snapshot 20170716.


Reproducibility testing
=======================

Over the past 3 months, we have tested this patch backported to Debian GCC-6.
Together with a patched dpkg that sets the environment variable appropriately,
it allows us to reproduce ~1800 extra packages.

This is about 6.8% of ~26400 Debian source packages, and just over 1/2 of the
ones whose irreproducibility is due to build-path issues.
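
The arithmetic behind these figures can be checked with a few lines of Python.
The inputs are the approximate counts quoted above; the upper bound is only
what "just over 1/2" implies, not a measured number.

```python
extra_reproducible = 1800  # "~1800 extra packages"
total_sources = 26400      # "~26400 Debian source packages"

share = 100.0 * extra_reproducible / total_sources
print("%.1f%%" % share)  # 6.8%

# "Just over 1/2" means extra_reproducible / n > 1/2, so the number n of
# packages affected by build-path issues is below twice the extra count.
upper_bound = 2 * extra_reproducible
print(upper_bound)  # 3600
```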

https://tests.reproducible-builds.org/debian/issues/unstable/gcc_captures_build_path_issue.html
https://tests.reproducible-builds.org/debian/unstable/index_suite_amd64_stats.html

The first major increase around 2017-04 is due to us deploying this patch. The
next major increase later in 2017-04 is unrelated, due to us deploying a patch
for R. The dip in late 2017-06 is due to unpatched and patched packages getting
out of sync, partly because of extra admin work around the Debian stretch
release. We believe that the green will soon return to its previous high once
this situation settles.


Unit testing
============

I've tested these patches on a Debian unstable x86_64-linux-gnu schroot running
inside a Debian jessie system, on a full-bootstrap build. The output of
contrib/compare_tests is as follows:

~~~~
gcc-8-20170716$ contrib/compare_tests ../gcc-build-{0,1}
# Comparing directories
## Dir1=../gcc-build-0: 8 sum files
## Dir2=../gcc-build-1: 8 sum files

# Comparing 8 common sum files
## /bin/sh contrib/compare_tests  /tmp/gxx-sum1.13468 /tmp/gxx-sum2.13468
New tests that PASS:

gcc.dg/cpp/build_path_prefix_map-1.c (test for excess errors)
gcc.dg/cpp/build_path_prefix_map-1.c execution test
gcc.dg/cpp/build_path_prefix_map-2.c (test for excess errors)
gcc.dg/cpp/build_path_prefix_map-2.c execution test
gcc.dg/debug/dwarf2/build_path_prefix_map-1.c (test for excess errors)
gcc.dg/debug/dwarf2/build_path_prefix_map-1.c scan-assembler DW_AT_comp_dir: "DWARF2TEST/gcc
gcc.dg/debug/dwarf2/build_path_prefix_map-2.c (test for excess errors)
gcc.dg/debug/dwarf2/build_path_prefix_map-2.c scan-assembler DW_AT_comp_dir: "/

# No differences found in 8 common sum files
~~~~

I can also provide the full logs on request.


Fuzzing
=======

I've also fuzzed the prefix-map code using AFL with ASAN enabled. Due to how
AFL works, I did not fuzz this patch directly but rather a smaller program
containing just the parser and remapper, available here:

https://anonscm.debian.org/cgit/reproducible/build-path-prefix-map-spec.git/tree/consume

Over the course of about 4k cycles, no crashes were found.

To reproduce, you could run something like:

$ echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
$ make CC=afl-gcc clean reset-fuzz-pecsplit.c fuzz-pecsplit.c


Copyright disclaimer
====================

I've signed a copyright disclaimer and the FSF has this on record. (RT #1209764)


