This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH v2 0/9] Separate shrink-wrapping
- From: Segher Boessenkool <segher at kernel dot crashing dot org>
- To: Jeff Law <law at redhat dot com>
- Cc: Bernd Schmidt <bschmidt at redhat dot com>, gcc-patches at gcc dot gnu dot org
- Date: Fri, 9 Sep 2016 10:17:18 -0500
- References: <cover.1470015604.git.segher@kernel.crashing.org> <81710c02-05bf-fb65-dedc-8ba389c0d8e8@redhat.com> <20160826145001.GA21746@gate.crashing.org> <cd56e044-1061-ea55-8e2a-2932c76a64aa@redhat.com> <20160826162709.GA30044@gate.crashing.org> <2c1fee68-4753-779c-5d75-90e6c7f86776@redhat.com>
On Thu, Sep 08, 2016 at 10:41:37AM -0600, Jeff Law wrote:
> So can you expand on the malloc example a bit -- I'm pretty sure I
> understand what you're trying to do, but a concrete example may help
> Bernd and be useful for archival purposes.
Sure, but it's big (which is the problem :-) )
> I also know that Carlos is interested in the malloc example -- so I'd
> like to be able to pass that along to him.
>
> Given the multiple early exit and fast paths through the allocator, I'm
> not at all surprised that sinking different components of the prologue
> to different locations is useful.
>
> Also if there's a case where sinking into a loop occurs, definitely
> point that out.
Not sure that happens in there, I'll find out.
> >>That's a later addition anyway and isn't necessary to do
> >>shrink-wrapping in the first place.
> >
> >No, it always did that, just not as often (it only duplicated straight-line
> >code before).
> Presumably (I haven't looked yet), the duplication is so that we can
> isolate one or more paths which in turn allows sinking the prologue
> further on some of those paths.
It duplicates as many blocks as it needs to, so that as many exits as
possible are reachable without *any* prologue/epilogue.
As the header comment before the older code says:
/* Try to perform a kind of shrink-wrapping, making sure the
   prologue/epilogue is emitted only around those parts of the
   function that require it.

   There will be exactly one prologue, and it will be executed either
   zero or one time, on any path. Depending on where the prologue is
   placed, some of the basic blocks can be reached via both paths with
   and without a prologue. Such blocks will be duplicated here, and the
   edges changed to match.

   Paths that go to the exit without going through the prologue will use
   a simple_return instead of the epilogue. We maximize the number of
   those, making sure to only duplicate blocks that can be duplicated.
   If the prologue can then still be placed in multiple locations, we
   place it as early as possible.
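For the archive, here is a small source-level sketch of the shape of function that comment is about. It is only an illustration: the names slow_scale and scaled are made up, and whether GCC really duplicates the tail block or uses a simple_return here depends on the target, the optimization level, and options such as -fshrink-wrap. One path makes a call with a value live across it, so it needs the prologue. The other path needs nothing saved. The final block is reachable from both.

```c
/* Hypothetical slow helper, not from the patch series.  noinline keeps
   the call a real call, so the caller has something to save around.  */
__attribute__((noinline))
static int slow_scale(int x)
{
    int r = 0;
    for (int i = 0; i < x; i++)
        r += 3;
    return r;
}

/* Hypothetical example function.  */
int scaled(int x, int bias)
{
    int r;

    if (x > 0)
        /* bias is live across the call, so this path needs a saved
           register or stack slot, i.e. the prologue.  */
        r = slow_scale(x) + bias;
    else
        /* No call and nothing to save: this path can run without a
           prologue.  */
        r = 0;

    /* Shared tail: reachable both with and without the prologue.  This
       is the kind of block the comment says gets duplicated, so that
       the copy on the no-prologue path can end in a simple_return.  */
    return r * 2 + 1;
}
```

Comparing the assembly from -O2 with and without -fno-shrink-wrap is one way to see what the compiler actually did on a given target.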
> This is something I'll definitely want to look at -- block duplication
> to facilitate code elimination (or in this case avoid code insertion)
> hits several areas of interest to me -- and how we balance duplication
> vs runtime savings is always interesting.
Yeah. If you use separate shrink-wrapping you still *also* get the
"old" algorithm, if that wasn't clear.
Segher