[Patch][loop-invariant.c] Fix a couple of bugs regarding loop invariant motion discovered by spec2k6 on aarch64

Mon May 18 08:26:00 GMT 2015

Hi Jeff,

Thanks for the suggestion. I did a bootstrap x86_64 build before and after my
patch and looked for differences in the last stage object files and there were
plenty of them. I chose a nice simple function (check_callers) from
ipa-inline-analysis.c and reduced it to a small test case. Hope this is ok.

Testing done:

 *  aarch64 built, "make check" no regressions
 *  aarch64_be built, "make check" no regressions
 *  x86_64 built, "make check" no regressions

ChangeLog:
    2015-05-15  David Sherwood  <david.sherwood@arm.com>

        * loop-invariant.c (create_new_invariant): Don't calculate address cost
        if mode is not a scalar integer.
         (get_inv_cost): Increase computational cost for unused invariants.
        * testsuite/gcc.dg/loop-invariant.c: New testcase.

Regards,
David.

> -----Original Message-----
> From: Jeff Law [mailto:law@redhat.com]
> Sent: 11 May 2015 20:27
> To: David Sherwood; gcc-patches@gcc.gnu.org
> Subject: Re: [Patch][loop-invariant.c] Fix a couple of bugs regarding loop invariant motion
discovered by
> spec2k6 on aarch64
> 
> On 05/11/2015 06:02 AM, David Sherwood wrote:
> > Hi,
> >
> > This patch fixes a couple of issues I saw during the compilation of shell_lam.f
> > for 410.bwaves test in spec2006:
> >
> > * create_new_invariant: We shouldn't bother attempting to calculate the address
> > cost for something that clearly isn't an address. Use a SCALAR_INT_MODE_P check
> > to eliminate the most obvious cases, such as vector modes and so on.
> > * get_inv_cost: Change the code so that we don't treat something as an address
> > if it is never actually used as an address, i.e. we didn't use it at all. This
> > occurs due to the way we process the innermost loop first - invariants get
> > pulled out of inner most loop and then get processed during next innermost loop.
> > Here is more detail from shell_lam.f.210r.loop2_invariant dump file that
> > shows what currently happens:
> >
> >     4427 *****ending processing of loop 27 ******
> >
> >     4442 (const_vector:V2DF [
> >     4443         (const_double:DF 4.0e+0 [0x0.8p+3])
> >     4444         (const_double:DF 4.0e+0 [0x0.8p+3])
> >     4445     ])
> >     4446
> >     4447 Hot cost: 8 (final)
> >     4448 (set (reg:V2DF 3179)
> >     4449     (const_vector:V2DF [
> >     4450             (const_double:DF 4.0e+0 [0x0.8p+3])
> >     4451             (const_double:DF 4.0e+0 [0x0.8p+3])
> >     4452         ]))
> >     4453
> >     4454 Hot cost: 8 (final)
> >     4455 Set in insn 2764 is invariant (1), cost 8, depends on
> >
> >     4471 Decided to move invariant 1 -- gain 8
> >
> > move_invariant_reg moves invariant (the const_vector above) into 3490 and
> > replaces use of 3179 with 3490, but conservatively keeps the definition of 3179,
> > which gets processed in outer loop, but is never used.
> >
> > Debug for outer loop now follows:
> >
> >     4497 *****starting processing of loop 26 ******
> >
> >     6379 (insn 2764 2748 2766 110 (set (reg:V2DF 3490)
> >     6380         (const_vector:V2DF [
> >     6381                 (const_double:DF 4.0e+0 [0x0.8p+3])
> >     6382                 (const_double:DF 4.0e+0 [0x0.8p+3])
> >     6383             ])) shell_lam.f:287 819 {*aarch64_simd_movv2df}
> >     6384      (nil))
> >     6385 ;;   UD chains for insn luid 88 uid 2766
> >
> >     6604 ;;   UD chains for insn luid 21 uid 3836
> >     6605 ;;      reg 3490 { d419(bb 110 insn 2764) }
> >     6606 (insn 3836 2763 2765 111 (set (reg:V2DF 3179)
> >     6607         (reg:V2DF 3490)) shell_lam.f:287 -1
> >     6608      (nil))
> > ^^^ additional insn generated by move_invariant_reg from loop 27
> >
> >     6836 ;;   UD chains for insn luid 40 uid 2790
> >     6837 ;;      reg 3161 { d272(bb 111 insn 2746) }
> >     6838 ;;      reg 3201 { d305(bb 111 insn 2788) }
> >     6839 ;;      reg 3490 { d419(bb 110 insn 2764) }
> >     6840 ;;   eq_note reg 3161 { d272(bb 111 insn 2746) }
> >     6841 ;;   eq_note reg 3201 { d305(bb 111 insn 2788) }
> >     6842 (insn 2790 2788 2791 111 (set (reg:V2DF 3203 [ vect__930.427 ])
> >     6843         (fma:V2DF (reg:V2DF 3161 [ MEM[base: vectp_q.371_2137, index: ivtmp.679_3214,
> > offset: 0B] ])
> >     6844             (neg:V2DF (reg:V2DF 3490))
> >     6845             (reg:V2DF 3201 [ vect__928.424 ]))) shell_lam.f:287 1219 {fnmav2df4}
> >     6846      (expr_list:REG_DEAD (reg:V2DF 3201 [ vect__928.424 ])
> >     6847         (expr_list:REG_DEAD (reg:V2DF 3179) ...
> > ^^^ 3179 marked as DEAD
> >
> >     8366 *****ending processing of loop 26 ******
> >
> >     8472 (const_vector:V2DF [
> >     8473         (const_double:DF 4.0e+0 [0x0.8p+3])
> >     8474         (const_double:DF 4.0e+0 [0x0.8p+3])
> >     8475     ])
> >     8476
> >     8477 Hot cost: 8 (final)
> >     8478 (set (reg:V2DF 3490)
> >     8479     (const_vector:V2DF [
> >     8480             (const_double:DF 4.0e+0 [0x0.8p+3])
> >     8481             (const_double:DF 4.0e+0 [0x0.8p+3])
> >     8482         ]))
> >     8483
> >     8484 Hot cost: 8 (final)
> >     8485 Set in insn 2764 is invariant (12), cost 8, depends on
> >
> >     8505 (set (reg:V2DF 3179)
> >     8506     (reg:V2DF 3490))
> >     8507
> >     8508 Hot cost: 8 (final)
> >     8509 Set in insn 3836 is invariant (15), cost 8, depends on 12
> > ^^^ Never moves 3179 invariant out, even though it's never used.
> >
> > After my patch we now get this:
> >
> > 8586 Decided to move invariant 15 -- gain 16
> > 8587 Decided to move dependent invariant 12
> >
> > since get_inv_cost sees the number of uses is zero and the dependent invariant
> > (12) gets moved too. Of course, without my patch the definition of 3179 would
> > ultimately be eliminated as dead code in a later pass anyway.
> >
> > Is this ok to go in?
> >
> > Regards,
> > David Sherwood.
> >
> > ChangeLog entry follows ...
> >
> >      2015-05-08  David Sherwood  <david.sherwood@arm.com>
> >
> >          * loop-invariant.c (create_new_invariant): Don't calculate address cost
> >          if mode is not scalar integers.
> >          (get_inv_cost): Increase computational cost for unused invariants.
> So you'd need to bootstrap and regression test this change for it to be
> fully ready for review.
> 
> It would also be good to include a test for the testsuite where you can
> show the improvements in behaviour due to this change.  I'm not offhand
> sure of the license on bwaves, so you may or may not be able to reduce
> it and use the reduced code in the GCC testsuite.
> 
> With some instrumentation, you may easily find the second case
> triggering code differences in GCC itself, which you could then reduce
> and use as a test for the second change.
> 
> Conceptually both hunks seem reasonable, so really it's just a matter of
> building some testcases to verify behaviour and the bootstrap/regression
> test.
> 
> There's a number of ARM engineers contributing to GCC these days that
> can probably help.
> 
> Jeff
> 
> 
> >

-------------- next part --------------
A non-text attachment was scrubbed...
Name: loop-invariant2.patch
Type: application/octet-stream
Size: 2177 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20150518/e547735c/attachment.obj>