[PATCH][AArch64] Separate shrink wrapping hooks implementation

Fri Nov 11 10:18:00 GMT 2016

On 10/11/16 23:39, Segher Boessenkool wrote:
> On Thu, Nov 10, 2016 at 02:42:24PM -0800, Andrew Pinski wrote:
>> On Thu, Nov 10, 2016 at 6:25 AM, Kyrill Tkachov
>>> I ran SPEC2006 on a Cortex-A72. Overall scores were neutral but there were
>>> some interesting swings.
>>> 458.sjeng     +1.45%
>>> 471.omnetpp   +2.19%
>>> 445.gobmk     -2.01%
>>>
>>> On SPECFP:
>>> 453.povray    +7.00%
>>
>> Wow, this looks really good.  Thank you for implementing this.  If I
>> get some time I am going to try it out on other processors than A72
>> but I doubt I have time any time soon.
> I'd love to hear what causes the slowdown for gobmk as well, btw.

I haven't yet gotten a direct answer for that (through performance analysis tools),
but I have noticed that load/store pairs are not generated as aggressively as I had hoped.
They are being merged by the sched fusion pass and the peepholes (which run after this),
but those still miss cases. I've hacked the SWS hooks to generate pairs explicitly, and that
increases the number of pairs and helps code size to boot. It complicates the logic of
the hooks a bit, but not too much.
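
To illustrate the idea of pairing explicitly, here is a rough sketch in Python
(not the actual hook code from the patch). It assumes a hypothetical helper,
emit_saves, that receives the registers to be saved at one point together with
their 8-byte stack slot offsets, and greedily merges neighbouring slots of the
same register class into a single stp. The register names, offsets and the
offset range check are illustrative assumptions only.

```python
def emit_saves(slots):
    """Hypothetical sketch: turn (register, sp_offset) pairs into AArch64 stores.

    Each slot is assumed to be 8 bytes wide. Two registers are merged into one
    stp when their slots are adjacent, they are in the same register class
    (both x or both d), and the offset fits the stp immediate range.
    """
    slots = sorted(slots, key=lambda s: s[1])  # walk the frame in offset order
    out = []
    i = 0
    while i < len(slots):
        reg, off = slots[i]
        if i + 1 < len(slots):
            next_reg, next_off = slots[i + 1]
            adjacent = next_off == off + 8
            same_class = reg[0] == next_reg[0]
            # 64-bit stp takes a signed 7-bit immediate scaled by 8.
            in_range = -512 <= off <= 504 and off % 8 == 0
            if adjacent and same_class and in_range:
                out.append(f"stp {reg}, {next_reg}, [sp, #{off}]")
                i += 2
                continue
        # No suitable partner: fall back to a single store.
        out.append(f"str {reg}, [sp, #{off}]")
        i += 1
    return out


if __name__ == "__main__":
    for insn in emit_saves([("x19", 16), ("x20", 24), ("x22", 40), ("d8", 48)]):
        print(insn)
```

In this sketch x19 and x20 become one stp, while x22 and d8 stay as single
stores because they are in different register classes. The real hooks can only
pair two registers when both components end up being saved at the same place.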

I'll make those changes and re-benchmark; hopefully that
will help performance.

Thanks,
Kyrill

>
> Segher