This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[RFC PATCH] Enhancements to profiledbootstrap
- From: Arvind Sankar <nivedita at alum dot mit dot edu>
- To: gcc-patches at gcc dot gnu dot org
- Date: Thu, 16 May 2019 17:35:10 -0400
- Subject: [RFC PATCH] Enhancements to profiledbootstrap
Hi, I've been playing some with the PGO build infrastructure and have a
few changes I thought I'd share and get feedback on whether they're
completely crazy or not. I'm not terribly familiar with the innards of
the build infra, so would appreciate any comments and suggestions.
First, a recap of the current PGO build process -- please let me know if
I'm wrong about anything:
For a profiledbootstrap build, we replace the normal stage{2,3,4}
respectively with an instrumented stageprofile, a non-instrumented
stagetrain, and finally a stagefeedback that uses the profile created
when building stagetrain.
I had two main goals in doing these changes:
1. profiledbootstrap does not do any comparison build, unlike the
regular bootstrap, so it is possible that the end product is actually
broken. Goal 1: try to incorporate this.
2. The profiling data comes from the stageprofile -> stagetrain build
and that run does not include many optimization passes (at least by
default at -O2) because those would only get enabled when profiling data
is available. Goal 2: try to create a bootstrap target that would
incorporate data from these passes.
Goal 1: Comparison stage
I started on goal 1 with the idea that if we built stagetrain with
instrumentation as well, we could just compare stageprofile and
stagetrain like we do with stage2/3.
This runs into a few roadblocks, however, and I would appreciate it if
someone could comment on them:
a) profiling data starts getting generated while building stageprofile,
since parts of that process involve running newly compiled executables.
In the current build that doesn't cause any issues: we just throw that
data away and only use the profiling data generated during the
profile->train build, which should be extensive enough.
This will not work if stagetrain is built with instrumentation: during
the train->feedback build it would be appending profiling data to the
same files that it is reading. To resolve this, I changed the build to
save profiling data in an external directory, so the two stages write
profiling data into different places. Unfortunately, this results in the
path to that location getting saved into each object file, which makes
it impossible to compare them. It should be possible to compare just the
.text sections, or to pass different GCOV_PREFIX overrides for build vs
host tools. Instead, I decided to add the possibility of rebuilding
stagefeedback a second time using the profile->train data and to use
that as the comparison. This should be the right comparison to do
anyway, as it compares the final build product.
It may also be possible to solve this by saving the profile data in the
same place for the two stages and making a copy of it to use for the
train->feedback run, but I haven't explored this yet. It would result in
profiling data that is a mix of the stageprofile and stagetrain
compilers, but that might be okay given that they should be identical in
control flow.
b) I do get a few differences that are somewhat random: it looks like in
some cases the second run arranges functions in a different order from
the first run even though it is using the same profile data. Is this
known/is there a way to prevent it?
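
As a rough illustration of the "compare just the .text sections" idea in
(a), the sketch below extracts .text from two object files with objcopy
and compares the raw bytes with cmp. The function name compare_text is
made up for this example and is not part of the patch or of the existing
do-compare logic. It assumes each object keeps its code in a single
.text section, and since relocations are not part of the extracted
bytes, it checks less than a full object comparison would.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch: compare only the .text
# sections of two object files and ignore everything else (such as an
# embedded profile directory path).
compare_text() {
  tmpdir=$(mktemp -d) || return 2
  status=0
  # Dump the raw contents of .text from each object into a flat file,
  # then compare the two dumps byte for byte.
  {
    objcopy -O binary --only-section=.text "$1" "$tmpdir/a.text" &&
    objcopy -O binary --only-section=.text "$2" "$tmpdir/b.text" &&
    cmp -s "$tmpdir/a.text" "$tmpdir/b.text"
  } || status=$?
  rm -rf "$tmpdir"
  return "$status"
}
```

Called with two object files as arguments, it returns 0 when the code
bytes match and non-zero otherwise.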
Goal 2: Second feedback stage
Nothing special here: it builds a new stagefeedbackfull using the
train->feedback profile. It does produce a different compiler, so there
is some effect, but I haven't benchmarked it to see whether it is
measurably better.
Testing was done on x86_64-pc-linux-gnu, with default configure settings
except for --enable-languages=c,c++ --disable-werror. I've bootstrapped
PGO with/without --with-build-config=bootstrap-lto.
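
For reference, the two configurations described above correspond roughly
to the commands printed by the sketch below. It only prints the commands
rather than running them. The source path ../gcc and the function name
print_build_commands are placeholders for this example.

```shell
#!/bin/sh
# Dry run only: prints the commands, does not start a build.
# "../gcc" is a placeholder for the path to the GCC source tree.
srcdir=${srcdir:-../gcc}

print_build_commands() {
  # $1: build config name, or empty for a PGO build without LTO
  opts="--enable-languages=c,c++ --disable-werror"
  if [ -n "$1" ]; then
    opts="$opts --with-build-config=$1"
  fi
  echo "$srcdir/configure $opts"
  echo "make profiledbootstrap"
}

print_build_commands ""
print_build_commands bootstrap-lto
```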
Summary of changes:
a) Add three new stages -- feedbackcompare, feedbackfull, and
feedbackfullcompare -- with the two *compare stages to be used for
comparing with the previous ones. Question about gcc/*/Make-lang.in: I
see that these have rules at the end for, e.g., c.stage*. Are these
necessary or vestigial? stagetrain is not there currently, and I didn't
add any of the new ones either.
b) Modify stagetrain to be built instrumented, and change profiling
output directories. Note that this is currently wasteful of build time
if you're going to stop with profiledbootstrap, so perhaps this should
be controlled via a build-config so it is enabled only for the *full
bootstraps.
c) Cleaned up bootstrap-lto{-lean}.mk a bit. It appears unnecessary to
set all the individual stage flags -- if someone wants to customize them
they can just override STAGE{2,3,4}_FLAGS to get the same effect. I also
added STAGE4_CFLAGS in there, and added -frandom-seed=1 and do-compare3
in bootstrap-lto-lean in case the user wants to do a bootstrap4. In
bootstrap-lto-noplugin.mk I noticed that the profiling stages were added
without -ffat-lto-objects. The patch should fix that, although it
appears unlikely that anyone would be doing such a build.
d) If one does a non-LTO PGO build currently, the LTO frontend doesn't
get profiled. I modified the main Makefile to add the LTO flag during
the generator build, similar to bootstrap-lto-lean.mk. For a c/c++
bootstrap the remaining unprofiled files that are warned about are
mostly libiberty.
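
To make the resulting stage graph easier to follow, the sketch below
walks the prev= links of the bootstrap_stage entries in the patched
Makefile.def and prints the stages behind each feedback stage. The
helper names prev_stage and stage_chain are invented for this
illustration, and the table is transcribed by hand from the diff rather
than read from Makefile.def.

```shell
#!/bin/sh
# prev_stage ID: the prev= field of a bootstrap_stage entry, copied by
# hand from the patched Makefile.def (empty for stage 1 or unknown ids).
prev_stage() {
  case "$1" in
    profile)             echo 1 ;;
    train)               echo profile ;;
    feedback)            echo train ;;
    feedbackcompare)     echo feedback ;;
    feedbackfull)        echo feedback ;;
    feedbackfullcompare) echo feedbackfull ;;
  esac
}

# stage_chain ID: all stages from stage 1 up to and including ID.
stage_chain() {
  chain=$1
  cur=$1
  while p=$(prev_stage "$cur"); [ -n "$p" ]; do
    chain="$p $chain"
    cur=$p
  done
  echo "$chain"
}

for id in feedback feedbackcompare feedbackfull feedbackfullcompare; do
  echo "stage$id: $(stage_chain "$id")"
done
```

Note that feedbackfull branches off feedback directly, so
feedbackcompare is not part of the chain leading to feedbackfull or
feedbackfullcompare.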
The patch is attached; the top-level configure and Makefile.in need to
be regenerated.
Thank you.
diff --git a/Makefile.def b/Makefile.def
index 1aab271d8aa..d4312c9de52 100644
--- a/Makefile.def
+++ b/Makefile.def
@@ -628,12 +628,23 @@ bootstrap_stage = {
compare_target=compare3 ;
bootstrap_target=bootstrap4 ; };
bootstrap_stage = {
- id=profile ; prev=1 ; };
+ id=profile ; prev=1 ; profilegen=profile ; };
bootstrap_stage = {
- id=train; prev=profile ; } ;
+ id=train; prev=profile ; lean=1 ; profilegen=train ; } ;
bootstrap_stage = {
- id=feedback ; prev=train;
+ id=feedback ; prev=train; lean=profile ; profileuse=profile ;
bootstrap_target=profiledbootstrap ; };
+bootstrap_stage = {
+ id=feedbackcompare ; prev=feedback; lean=train ; profileuse=profile ;
+ compare_target=comparefeedback ;
+ bootstrap_target=profiledbootstrapcompare ; };
+bootstrap_stage = {
+ id=feedbackfull ; prev=feedback; lean=train ; profileuse=train ;
+ bootstrap_target=profiledbootstrapfull ; };
+bootstrap_stage = {
+ id=feedbackfullcompare ; prev=feedbackfull; lean=feedback ; profileuse=train ;
+ compare_target=comparefeedbackfull ;
+ bootstrap_target=profiledbootstrapfullcompare ; };
bootstrap_stage = {
id=autoprofile ; prev=1 ;
autoprofile="$$s/gcc/config/i386/$(AUTO_PROFILE)" ; };
diff --git a/Makefile.tpl b/Makefile.tpl
index 1cdc023c82f..229164da8b0 100644
--- a/Makefile.tpl
+++ b/Makefile.tpl
@@ -481,14 +481,23 @@ STAGE2_TFLAGS += -fno-checking
STAGE3_CFLAGS += -fchecking=1
STAGE3_TFLAGS += -fchecking=1
-STAGEprofile_CFLAGS = $(STAGE2_CFLAGS) -fprofile-generate
+STAGEprofile_CFLAGS = $(STAGE2_CFLAGS) -fprofile-exclude-files=conftest -fprofile-generate=$$r/$(HOST_SUBDIR)/profile-stageprofile
STAGEprofile_TFLAGS = $(STAGE2_TFLAGS)
-STAGEtrain_CFLAGS = $(filter-out -fchecking=1,$(STAGE3_CFLAGS))
-STAGEtrain_TFLAGS = $(filter-out -fchecking=1,$(STAGE3_TFLAGS))
+STAGEtrain_CFLAGS = $(filter-out -fchecking=1,$(STAGE3_CFLAGS)) -fprofile-exclude-files=conftest -fprofile-generate=$$r/$(HOST_SUBDIR)/profile-stagetrain
+STAGEtrain_TFLAGS = $(filter-out -fchecking=1,$(STAGE3_TFLAGS))
-STAGEfeedback_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use
-STAGEfeedback_TFLAGS = $(STAGE4_TFLAGS)
+[+ FOR bootstrap-stage +][+ IF profileuse +]
+STAGE[+id+]_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use=$$r/$(HOST_SUBDIR)/profile-stage[+profileuse+]
+STAGE[+id+]_TFLAGS = $(STAGE4_TFLAGS)
+[+ ENDIF profileuse +][+ ENDFOR bootstrap-stage +]
+
+# If we are building lto, but not using it during the build it will never get profiled.
+# Force add lto flags to the generators (like in config/bootstrap-lto-lean.mk)
+ifneq (,$(filter lto,@languages@))
+STAGEtrain_GENERATOR_CFLAGS += -flto=jobserver
+STAGEfeedback_GENERATOR_CFLAGS += -flto=jobserver
+endif
STAGEautoprofile_CFLAGS = $(STAGE2_CFLAGS) -g
STAGEautoprofile_TFLAGS = $(STAGE2_TFLAGS)
@@ -498,6 +507,9 @@ STAGEautofeedback_TFLAGS = $(STAGE3_TFLAGS)
do-compare = @do_compare@
do-compare3 = $(do-compare)
+[+ FOR bootstrap-stage +][+ IF profileuse +][+ IF compare_target +]
+do-[+compare_target+] = $(do-compare3)
+[+ ENDIF compare_target +][+ ENDIF profileuse +][+ ENDFOR bootstrap-stage +]
# -----------------------------------------------
# Programs producing files for the TARGET machine
@@ -1657,6 +1669,13 @@ stage[+id+]-bubble:: [+ IF prev +]stage[+prev+]-bubble[+ ENDIF +]
fi[+ IF compare-target +]
$(MAKE) $(RECURSE_FLAGS_TO_PASS) [+compare-target+][+ ENDIF compare-target +]
+[+ IF profilegen +]
+.PHONY: clean-stage[+id+]-profile
+clean-stage[+id+]: clean-stage[+id+]-profile
+clean-stage[+id+]-profile:
+ rm -rf $(HOST_SUBDIR)/profile-stage[+id+]
+[+ ENDIF profilegen +]
+
.PHONY: all-stage[+id+] clean-stage[+id+]
do-clean: clean-stage[+id+]
@@ -1736,7 +1755,8 @@ distclean-stage[+id+]::
@: $(MAKE); $(stage)
@test "`cat stage_last`" != stage[+id+] || rm -f stage_last
rm -rf stage[+id+]-* [+
- IF compare-target +][+compare-target+] [+ ENDIF compare-target +]
+ IF compare-target +][+compare-target+] [+ ENDIF compare-target +][+
+ IF profilegen +]$(HOST_SUBDIR)/profile-stage[+id+] [+ ENDIF profilegen +]
[+ IF cleanstrap-target +]
.PHONY: [+cleanstrap-target+]
@@ -1755,19 +1775,6 @@ distclean-stage[+id+]::
[+ ENDFOR bootstrap-stage +]
-stageprofile-end::
- $(MAKE) distclean-stagefeedback
-
-stagefeedback-start::
- @r=`${PWD_COMMAND}`; export r; \
- s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
- for i in prev-*; do \
- j=`echo $$i | sed s/^prev-//`; \
- cd $$r/$$i && \
- { find . -type d | sort | sed 's,.*,$(SHELL) '"$$s"'/mkinstalldirs "../'$$j'/&",' | $(SHELL); } && \
- { find . -name '*.*da' | sed 's,.*,$(LN) -f "&" "../'$$j'/&",' | $(SHELL); }; \
- done
-
@if gcc-bootstrap
do-distclean: distclean-stage1
diff --git a/config/bootstrap-lto-lean.mk b/config/bootstrap-lto-lean.mk
index 79cea50a4c6..e751779f992 100644
--- a/config/bootstrap-lto-lean.mk
+++ b/config/bootstrap-lto-lean.mk
@@ -1,10 +1,10 @@
# This option enables LTO for stage4 and LTO for generators in stage3 with profiledbootstrap.
# Otherwise, LTO is used in only stage3.
-STAGE3_CFLAGS += -flto=jobserver
+STAGE3_CFLAGS += -flto=jobserver -frandom-seed=1
+STAGE4_CFLAGS += -flto=jobserver -frandom-seed=1
override STAGEtrain_CFLAGS := $(filter-out -flto=jobserver,$(STAGEtrain_CFLAGS))
STAGEtrain_GENERATOR_CFLAGS += -flto=jobserver
-STAGEfeedback_CFLAGS += -flto=jobserver
# assumes the host supports the linker plugin
LTO_AR = $$r/$(HOST_SUBDIR)/prev-gcc/gcc-ar$(exeext) -B$$r/$(HOST_SUBDIR)/prev-gcc/
@@ -15,3 +15,5 @@ LTO_EXPORTS = AR="$(LTO_AR)"; export AR; \
LTO_FLAGS_TO_PASS = AR="$(LTO_AR)" RANLIB="$(LTO_RANLIB)"
do-compare = /bin/true
+do-compare3 = $(SHELL) $(srcdir)/contrib/compare-lto $$f1 $$f2
+extra-compare = gcc/lto1$(exeext)
diff --git a/config/bootstrap-lto-noplugin.mk b/config/bootstrap-lto-noplugin.mk
index 0f50708e49d..613e0dc09d1 100644
--- a/config/bootstrap-lto-noplugin.mk
+++ b/config/bootstrap-lto-noplugin.mk
@@ -3,7 +3,5 @@
STAGE2_CFLAGS += -flto=jobserver -frandom-seed=1 -ffat-lto-objects
STAGE3_CFLAGS += -flto=jobserver -frandom-seed=1 -ffat-lto-objects
-STAGEprofile_CFLAGS += -flto=jobserver -frandom-seed=1
-STAGEtrain_CFLAGS += -flto=jobserver -frandom-seed=1
-STAGEfeedback_CFLAGS += -flto=jobserver -frandom-seed=1
+STAGE4_CFLAGS += -flto=jobserver -frandom-seed=1 -ffat-lto-objects
do-compare = /bin/true
diff --git a/config/bootstrap-lto.mk b/config/bootstrap-lto.mk
index 4de07e5b226..2f410e1b39a 100644
--- a/config/bootstrap-lto.mk
+++ b/config/bootstrap-lto.mk
@@ -1,10 +1,8 @@
-# This option enables LTO for stage2 and stage3 in slim mode
+# This option enables LTO for stage2 onward in slim mode
STAGE2_CFLAGS += -flto=jobserver -frandom-seed=1
STAGE3_CFLAGS += -flto=jobserver -frandom-seed=1
-STAGEprofile_CFLAGS += -flto=jobserver -frandom-seed=1
-STAGEtrain_CFLAGS += -flto=jobserver -frandom-seed=1
-STAGEfeedback_CFLAGS += -flto=jobserver -frandom-seed=1
+STAGE4_CFLAGS += -flto=jobserver -frandom-seed=1
# assumes the host supports the linker plugin
LTO_AR = $$r/$(HOST_SUBDIR)/prev-gcc/gcc-ar$(exeext) -B$$r/$(HOST_SUBDIR)/prev-gcc/
diff --git a/configure.ac b/configure.ac
index 9db4fd14aa2..ab31ca5f541 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2112,6 +2112,8 @@ Supported languages are: ${potential_languages}])
fi
AC_SUBST(stage1_languages)
+ languages=`echo "$enable_languages" | sed -e "s/,/ /g"`
+ AC_SUBST(languages)
ac_configure_args=`echo " $ac_configure_args" | sed -e "s/ '--enable-languages=[[^ ]]*'//g" -e "s/$/ '--enable-languages="$enable_languages"'/" `
fi