This is the mail archive of the gcc-patches
mailing list for the GCC project.
Re: [PATCH] rs6000: Enable -fvariable-expansion-in-unroller by default
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Bill Schmidt <wschmidt at linux dot ibm dot com>
- Cc: Segher Boessenkool <segher at kernel dot crashing dot org>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Mon, 1 Jul 2019 12:30:41 +0200
- Subject: Re: [PATCH] rs6000: Enable -fvariable-expansion-in-unroller by default
- References: <email@example.com> <CAFiYyc3W=A1aKgL8d6PHgyhw65M7zYr+JVnJD+Pf736+Q00j7g@mail.gmail.com> <20190627114529.GU7313@gate.crashing.org> <firstname.lastname@example.org>
On Thu, Jun 27, 2019 at 2:19 PM Bill Schmidt <email@example.com> wrote:
> On 6/27/19 6:45 AM, Segher Boessenkool wrote:
> > On Thu, Jun 27, 2019 at 11:33:45AM +0200, Richard Biener wrote:
> >> On Thu, Jun 27, 2019 at 5:23 AM Bill Schmidt <firstname.lastname@example.org> wrote:
> >>> We've done some experimenting and realized that the subject option almost
> >>> always provides improved performance for Power when the loop unroller is
> >>> enabled. So this patch turns that flag on by default for us.
> >> I guess it creates more freedom for combine (more single-uses) and register
> >> allocation. I wonder in which cases this might pessimize things? I guess
> >> the pre-RA scheduler might make RA's life harder by creating overlapping
> >> life-ranges.
> >> I guess you didn't actually investigate the nature of the improvements you saw?
> > It cuts the length of dependency chains by a factor equal to the unroll
> > factor. I do not know why this doesn't help a lot everywhere. It of
> > course raises register pressure, maybe that is just it?
> Right, it's all about breaking dependencies to more efficiently exploit
> the microarchitecture. By default, variable expansion in GCC is quite
> conservative, creating only two reduction streams out of one, so it's
> pretty rare for it to cause spills. This can be adjusted upwards with
> --param max-variable-expansions-in-unroller=n. Our experiments show
> that raising n to 4 starts to cause some minor degradations, which are
> almost certainly due to pressure, so the default setting looks appropriate.
But it's probably only an issue for targets which enable pre-RA scheduling
by default? It might also increase RA compile-time (more allocnos).
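
[Editorial illustration, not part of the original message.] The effect
discussed above can be sketched at the source level. This is a hand-written
approximation in C, not GCC output: the real pass works on RTL inside the
loop unroller. The function names sum_unrolled and sum_expanded are
hypothetical, as is the unroll factor of 4. The accumulator is split into two
streams, matching the "two reduction streams out of one" described in the
thread. Unsigned integers are used so that splitting the accumulator cannot
change the result.

```c
#include <stddef.h>

/* Loop unrolled by 4 with a single accumulator.  Every add depends on
   the result of the previous add, so the adds form one serial chain.  */
unsigned long sum_unrolled(const unsigned long *a, size_t n)
{
    unsigned long acc = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        acc += a[i];
        acc += a[i + 1];
        acc += a[i + 2];
        acc += a[i + 3];
    }
    for (; i < n; i++)          /* leftover iterations */
        acc += a[i];
    return acc;
}

/* Same loop with the accumulator expanded into two independent streams
   (an assumed expansion count of one extra copy).  Each chain is half
   as long, and the two chains can execute in parallel, at the cost of
   one extra live register.  */
unsigned long sum_expanded(const unsigned long *a, size_t n)
{
    unsigned long acc0 = 0, acc1 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        acc0 += a[i];
        acc1 += a[i + 1];
        acc0 += a[i + 2];
        acc1 += a[i + 3];
    }
    for (; i < n; i++)          /* leftover iterations */
        acc0 += a[i];
    return acc0 + acc1;         /* combine the streams after the loop */
}
```

The second version shows both sides of the trade-off raised in the thread:
shorter dependency chains, but more simultaneously live values, which is
where the register pressure concern comes from. With GCC, the transformation
itself is requested with -funroll-loops -fvariable-expansion-in-unroller,
and the number of extra copies is bounded by
--param max-variable-expansions-in-unroller=n.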
> >> Do we want to adjust the flags documentation, saying whether this is enabled
> >> by default depends on the target (or even list them)?
> > Good idea, thanks.
> OK, I'll update the docs and make the change that Segher requested.
> Thanks for the reviews!
> > Segher