Re: [PATCH 8/9] shrink-wrap: shrink-wrapping for separate concerns


On Mon, Jul 18, 2016 at 07:03:04PM +0200, Bernd Schmidt wrote:
> >>>+  /* The frequency of executing the prologue for this BB and all BBs
> >>>+     dominated by it.  */
> >>>+  gcov_type cost;
> >>
> >>Is this frequency consideration the only thing that attempts to prevent
> >>placing prologue insns into loops?
> >
> >Yes.  The algorithm makes sure the prologues are executed as infrequently
> >as possible.  If a block that would get a prologue has the same frequency
> >as one of its predecessors, and that predecessor always reaches that block
> >eventually, the prologue is moved up to the predecessor (this handles the
> >case where both have a frequency of zero, and other cases where the
> >limited range of the frequency values makes them compare equal).
> 
> Ugh, that is really scaring me. I'd much prefer a classification of 
> valid blocks based on cfg structure alone - I'll need serious convincing 
> that the frequency data is reliable enough for what you are trying to do.

But you need the profile to make even reasonably good decisions.

The standard example:

   1
  / \
 2   3
  \ /
   4
  / \
 5   6
  \ /
   7

where blocks 3 and 6 need some prologue and the rest do not.
If freq(3) + freq(6) > freq(1), it is better to put the prologue at 1;
if not, it is better to place it at 3 and 6.

If you do not use the profile, you cannot do better than the status quo,
i.e. always placing the prologue at 1.
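
For concreteness, here is a minimal sketch of that comparison (this is
not the patch's code; the block numbers and frequencies are made up for
illustration):

  #include <stdio.h>

  int
  main (void)
  {
    /* Execution frequencies of blocks 1..7 in the example CFG above
       (index 0 is unused).  */
    long freq[8] = { 0, 100, 60, 40, 100, 70, 30, 100 };

    /* A prologue at block 1 is executed freq(1) times; prologues at
       blocks 3 and 6 are executed freq(3) + freq(6) times in total.  */
    if (freq[3] + freq[6] > freq[1])
      printf ("cheaper to place the prologue at block 1\n");
    else
      printf ("cheaper to place the prologue at blocks 3 and 6\n");

    return 0;
  }

With these made-up numbers, 40 + 30 < 100, so the prologue goes to 3 and 6.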

In the general case, you have the choice between putting the prologue at
some basic block X and putting it at certain blocks dominated by X.  The
algorithm chooses whichever option executes the prologue the least often
in total, and that is really all there is to it.
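
As a rough sketch only (this is not the patch's implementation; the
dominator-tree structure and the numbers are invented for illustration),
that choice can be viewed as a walk over the dominator tree:

  #include <stdio.h>

  #define MAX_BB 16

  struct bb
  {
    long freq;            /* execution frequency of this block */
    int needs_prologue;   /* does this block itself need the prologue? */
    int nchildren;        /* children in the dominator tree */
    int children[MAX_BB];
  };

  static struct bb bbs[MAX_BB];

  /* Minimal total prologue execution frequency needed to cover every
     block in the dominator subtree rooted at X.  */
  static long
  prologue_cost (int x)
  {
    long sunk = 0;
    int i;

    /* If X itself needs the prologue, it has to go here (or earlier).  */
    if (bbs[x].needs_prologue)
      return bbs[x].freq;

    for (i = 0; i < bbs[x].nchildren; i++)
      sunk += prologue_cost (bbs[x].children[i]);

    /* Hoist to X, or keep the prologues sunk into the subtrees,
       whichever is executed less often in total.  */
    return sunk < bbs[x].freq ? sunk : bbs[x].freq;
  }

  int
  main (void)
  {
    /* The example CFG above as a dominator tree: 1 dominates 2, 3 and 4;
       4 dominates 5, 6 and 7.  Blocks 3 and 6 need the prologue.  */
    bbs[1] = (struct bb) { 100, 0, 3, { 2, 3, 4 } };
    bbs[2] = (struct bb) {  60, 0, 0, { 0 } };
    bbs[3] = (struct bb) {  40, 1, 0, { 0 } };
    bbs[4] = (struct bb) { 100, 0, 3, { 5, 6, 7 } };
    bbs[5] = (struct bb) {  70, 0, 0, { 0 } };
    bbs[6] = (struct bb) {  30, 1, 0, { 0 } };
    bbs[7] = (struct bb) { 100, 0, 0, { 0 } };

    printf ("minimal prologue frequency: %ld\n", prologue_cost (1));
    return 0;
  }

The real pass of course works on GCC's own CFG and profile data, for each
separate concern, rather than on a toy structure like this.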


Yes, our profile data sometimes is, uh, less than optimal.  But:
- All our other passes use it, too;
- What matters most here is comparing the execution frequency locally,
  and that is not usually messed up so badly;
- All our other passes use it, too;
- The important cases (loops, exceptional cases) normally have a pretty
  reasonable profile;
- All our other passes use it, too;
- Benchmarking shows big wins with this patch.


Segher

