[PATCH 4/5] shrink-wrap: Shrink-wrapping for separate components

Jeff Law law@redhat.com
Mon Oct 10 21:21:00 GMT 2016

On 09/30/2016 04:34 AM, Segher Boessenkool wrote:
> [ whoops, message too big, resending with the attachment compressed ]
> On Tue, Sep 27, 2016 at 03:14:51PM -0600, Jeff Law wrote:
>> With transposition issue addressed, the only blocker I see are some
>> simple testcases we can add to the suite.  They don't have to be real
>> extensive.  And one motivating example for the list archives, ideally
>> the glibc malloc case.
> And here is the malloc testcase.
> A very important (for performance) function is _int_malloc, which starts
> with
[ ... ]
Thanks.  What I think is important to note with this example are the bits
that were pushed into the path with the sysmalloc/alloc_perturb calls.
That's an unlikely path.

We have to extrapolate a bit from the assembly provided.  In the version
without separate shrink-wrapping, we have a full prologue of stores and
two instances of a full epilogue (though only one ever executes).

With separate shrink wrapping the (presumably) very cold path where we 
error has virtually no prologue/epilogue.  That's probably a nop from a 
performance standpoint.

More interesting is the path where we call sysmalloc/alloc_perturb.  It's
a cold path, but not as cold as the error path.  We save/restore 4 regs
in that case, rather than executing a full prologue/epilogue.  So there's
clearly a savings there, though again, given the expect, it's a cold path.

Where we have to extrapolate is the hot path.  Presumably on the hot
path we're saving/restoring ~4 fewer registers.  I haven't verified
that, but that is kind of the whole point here.
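
For the archives, here is a minimal sketch of the shape being discussed.
It is not the glibc code; fast_or_slow and slow_refill are hypothetical
names made up for illustration.  The hot path returns without making a
call, while the unlikely path makes a call with values live across it,
which is the kind of place where callee-saved register saves/restores
could plausibly be sunk by separate shrink-wrapping.

```c
#include <stddef.h>

/* Hypothetical stand-in for a slow-path helper (think sysmalloc).
   Marked noinline so the call stays a real call in the caller. */
__attribute__((noinline))
static size_t slow_refill(size_t n)
{
    return n * 2;
}

/* Hypothetical allocator-like function with a hot and a cold path. */
size_t fast_or_slow(size_t avail, size_t request)
{
    /* Hot path: the request fits.  No call is made here, so nothing
       on this path needs to live in a callee-saved register. */
    if (__builtin_expect(request <= avail, 1))
        return avail - request;

    /* Cold path: avail and request are live across the call, so the
       compiler will likely keep them in callee-saved registers (or
       spill them).  Only this path needs those saves/restores. */
    size_t grown = slow_refill(request);
    return grown + avail - request;
}
```

Comparing the assembly for this function with and without
-fshrink-wrap-separate should show whether the saves move onto the cold
path on a given target; I have not checked what any particular target
does with it.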

