[poweprc] RFA: patch changing expected code generation for test vsx-simode2.c

Sat Feb 9 22:14:00 GMT 2019

On Sat, Feb 09, 2019 at 04:13:57PM -0500, Vladimir Makarov wrote:
> 
> On 2019-02-09 8:28 a.m., Segher Boessenkool wrote:
> >Hi Vlad,
> >
> >On Fri, Feb 08, 2019 at 02:18:40PM -0500, Vladimir Makarov wrote:
> >>Recently I committed a patch solving
> >>
> >>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88560
> >>
> >>The patch resulted in test vsx-simode2.c failure.Â  Here is the
> >>difference in generated code:
> >>
> >>@@ -13,9 +13,8 @@ foo:
> >> Â .LFB0:
> >> Â Â Â Â Â Â Â  .cfi_startproc
> >> Â Â Â Â Â Â Â  std 3,-16(1)
> >>-Â Â Â Â Â Â  ori 2,2,0
> >>-Â Â Â Â Â Â  lwz 9,-12(1)
> >>-Â Â Â Â Â Â  mtvsrwz 32,9
> >>+Â Â Â Â Â Â  addi 9,1,-12
> >>+Â Â Â Â Â Â  lxsiwzx 32,0,9
> >>
> >>The new version is one insn less.Â  So I propose the following patch
> >>changing the expected code generation.
> >>
> >>Is it ok to commit it?
> >This is not okay.  The test is supposed to test that we get a direct
> >move instruction instead of going via memory.  But, trunk does the
> >std+lwz as you see; this is because IRA decides this pseudo needs to
> >go to memory:
> >
> >     r125: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS
> >
> >   a1(r125,l0) costs: BASE_REGS:14004,14004 GENERAL_REGS:14004,14004 
> >   LINK_REGS:24010,24010 CTR_REGS:24010,24010 LINK_OR_CTR_REGS:24010,24010 
> >   SPEC_OR_GEN_REGS:24010,24010 MEM:12000,12000
> 
> Thank you for informing me what we expect from the test.
> 
> Apparently, the test did not catch what was supposed to be catched.

Yes, exactly.

> Although the new generated code is better than the old one (2 insns vs 3 
> insns, one insn is a load in the both cases), I see there is no sense 
> for this patch.Â  Simply, the test did not fail before even if the code 
> was bad.Â  Now the test fails as it should be.
> 
> >Is there something wrong in our tuning?
> >
> I have no idea.Â  It needs more investigation.

Where do the above costs come from?  Regs 14k, mem 12k.

> >For reference, 7 and 8 do just
> >
> >         mtvsrwz 32,3
> >#APP
> >  # 10 "vsx-simode2.c" 1
> >         xxlor 32,32,32  # v, v constraints
> >  # 0 "" 2
> >#NO_APP
> >         mfvsrwz 3,32
> >         blr
> >
> >which is the expected code.  The test really should check there is no
> >memory used, or that there are no extra insns other than the 4 expected.
> >
> >Your patch seems to be fine btw, this breakage was really there already,
> >just not detected by the testcase.
> >
> Yes, the patch is fine in a sense that the code is a bit better.

If we decided to assign memory to this pseudo, it now uses better code
for that.  (Not really fewer insns though, the ori 2,2,0 went missing,
and that is still required for good performance on Power8 at least).

> But 
> still the generated code is bad and the test started to fail. I don't 
> think we need to change the test.Â  The original test now reminds us to 
> fix the bad code generation.

Yeah.  And I'll improve the test a bit so it would have failed earlier.

It didn't fail before because combine used to usurp the RA job, doing some
kind of greedy register allocation, increasing the lifetime of argument
registers.

I opened PR89271 (and put you on cc:).

Segher