This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH v2 0/9] Separate shrink-wrapping

From: Segher Boessenkool <segher at kernel dot crashing dot org>
To: Jeff Law <law at redhat dot com>
Cc: Bernd Schmidt <bschmidt at redhat dot com>, gcc-patches at gcc dot gnu dot org
Date: Fri, 9 Sep 2016 10:28:06 -0500
Subject: Re: [PATCH v2 0/9] Separate shrink-wrapping
Authentication-results: sourceware.org; auth=none
References: <cover.1470015604.git.segher@kernel.crashing.org> <81710c02-05bf-fb65-dedc-8ba389c0d8e8@redhat.com> <20160826145001.GA21746@gate.crashing.org> <cd56e044-1061-ea55-8e2a-2932c76a64aa@redhat.com> <902e82bf-16c4-b302-ed44-90fb09b7c013@redhat.com>

On Thu, Sep 08, 2016 at 10:58:13AM -0600, Jeff Law wrote:
> >And that comment puzzles me. Surely prologue and epilogue are executed
> >only once currently, so how does frequency come into it? Again - please
> >provide an example.
> Right, they're executed once currently.  But the prologue could be sunk 
> into locations where they are not executed every time the function is 
> called.  That's the basis behind shrink wrapping.

Right.

> Segher's code essentially allows individual components of the prologue 
> to sink to different points within the function rather than forcing the 
> prologue to be sunk as an atomic unit.

It also allows prologue an epilogue components to be placed in multiple
places, and even allows them to be executed more than once, if that is
cheaper.

> >>The full-prologue algorithm makes as many blocks run without prologue as
> >>possible, by duplicating blocks where that helps.  If you do this for
> >>every component you can and up with 2**40 blocks for just 40 components,
> >
> >Ok, so why wouldn't we use the existing code with the duplication part
> >disabled? That's a later addition anyway and isn't necessary to do
> >shrink-wrapping in the first place.
> I think the concern here is the balance between code explosion and the 
> runtime gains.

The code explosion can be terrible if you have only a few components,
already.  The runtime gains are not so great either.  Here's a simple
example, showing part of a cfg, the exits to the right are calls to
abort and need some prologue component:

|
1
|\
| \
2  X1
|\
| \
|  X2
|

With the "old" algorithm, you need to place the prologue at 1 (because
you can only have one), and then the fall-through path also necessarily
gets that component's prologue.  With the "separate" algorithm, the
component prologues can be placed at X1 and X2 instead (and that will
most likely be cheapest according to the cost model, too).

You can also put this in a loop.  I'll try to make a simple example
with that.

Segher

Follow-Ups:
- Re: [PATCH v2 0/9] Separate shrink-wrapping
  - From: Jeff Law

References:
- Re: [PATCH v2 0/9] Separate shrink-wrapping
  - From: Jeff Law

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]