[Patch][loop-invariant.c] Fix a couple of bugs regarding loop invariant motion discovered by spec2k6 on aarch64

Jeff Law law@redhat.com
Mon May 11 19:26:00 GMT 2015


On 05/11/2015 06:02 AM, David Sherwood wrote:
> Hi,
>
> This patch fixes a couple of issues I saw during the compilation of shell_lam.f
> for 410.bwaves test in spec2006:
>
> * create_new_invariant: We shouldn't bother attempting to calculate the address
> cost for something that clearly isn't an address. Use a SCALAR_INT_MODE_P check
> to eliminate the most obvious cases, such as vector modes and so on.
> * get_inv_cost: Change the code so that we don't treat something as an address
> if it is never actually used as an address, i.e. we didn't use it at all. This
> occurs due to the way we process the innermost loop first - invariants get
> pulled out of inner most loop and then get processed during next innermost loop.
> Here is more detail from shell_lam.f.210r.loop2_invariant dump file that
> shows what currently happens:
>
>     4427 *****ending processing of loop 27 ******
>
>     4442 (const_vector:V2DF [
>     4443         (const_double:DF 4.0e+0 [0x0.8p+3])
>     4444         (const_double:DF 4.0e+0 [0x0.8p+3])
>     4445     ])
>     4446
>     4447 Hot cost: 8 (final)
>     4448 (set (reg:V2DF 3179)
>     4449     (const_vector:V2DF [
>     4450             (const_double:DF 4.0e+0 [0x0.8p+3])
>     4451             (const_double:DF 4.0e+0 [0x0.8p+3])
>     4452         ]))
>     4453
>     4454 Hot cost: 8 (final)
>     4455 Set in insn 2764 is invariant (1), cost 8, depends on
>
>     4471 Decided to move invariant 1 -- gain 8
>
> move_invariant_reg moves invariant (the const_vector above) into 3490 and
> replaces use of 3179 with 3490, but conservatively keeps the definition of 3179,
> which gets processed in outer loop, but is never used.
>
> Debug for outer loop now follows:
>
>     4497 *****starting processing of loop 26 ******
>
>     6379 (insn 2764 2748 2766 110 (set (reg:V2DF 3490)
>     6380         (const_vector:V2DF [
>     6381                 (const_double:DF 4.0e+0 [0x0.8p+3])
>     6382                 (const_double:DF 4.0e+0 [0x0.8p+3])
>     6383             ])) shell_lam.f:287 819 {*aarch64_simd_movv2df}
>     6384      (nil))
>     6385 ;;   UD chains for insn luid 88 uid 2766
>
>     6604 ;;   UD chains for insn luid 21 uid 3836
>     6605 ;;      reg 3490 { d419(bb 110 insn 2764) }
>     6606 (insn 3836 2763 2765 111 (set (reg:V2DF 3179)
>     6607         (reg:V2DF 3490)) shell_lam.f:287 -1
>     6608      (nil))
> ^^^ additional insn generated by move_invariant_reg from loop 27
>
>     6836 ;;   UD chains for insn luid 40 uid 2790
>     6837 ;;      reg 3161 { d272(bb 111 insn 2746) }
>     6838 ;;      reg 3201 { d305(bb 111 insn 2788) }
>     6839 ;;      reg 3490 { d419(bb 110 insn 2764) }
>     6840 ;;   eq_note reg 3161 { d272(bb 111 insn 2746) }
>     6841 ;;   eq_note reg 3201 { d305(bb 111 insn 2788) }
>     6842 (insn 2790 2788 2791 111 (set (reg:V2DF 3203 [ vect__930.427 ])
>     6843         (fma:V2DF (reg:V2DF 3161 [ MEM[base: vectp_q.371_2137, index: ivtmp.679_3214,
> offset: 0B] ])
>     6844             (neg:V2DF (reg:V2DF 3490))
>     6845             (reg:V2DF 3201 [ vect__928.424 ]))) shell_lam.f:287 1219 {fnmav2df4}
>     6846      (expr_list:REG_DEAD (reg:V2DF 3201 [ vect__928.424 ])
>     6847         (expr_list:REG_DEAD (reg:V2DF 3179) ...
> ^^^ 3179 marked as DEAD
>
>     8366 *****ending processing of loop 26 ******
>
>     8472 (const_vector:V2DF [
>     8473         (const_double:DF 4.0e+0 [0x0.8p+3])
>     8474         (const_double:DF 4.0e+0 [0x0.8p+3])
>     8475     ])
>     8476
>     8477 Hot cost: 8 (final)
>     8478 (set (reg:V2DF 3490)
>     8479     (const_vector:V2DF [
>     8480             (const_double:DF 4.0e+0 [0x0.8p+3])
>     8481             (const_double:DF 4.0e+0 [0x0.8p+3])
>     8482         ]))
>     8483
>     8484 Hot cost: 8 (final)
>     8485 Set in insn 2764 is invariant (12), cost 8, depends on
>
>     8505 (set (reg:V2DF 3179)
>     8506     (reg:V2DF 3490))
>     8507
>     8508 Hot cost: 8 (final)
>     8509 Set in insn 3836 is invariant (15), cost 8, depends on 12
> ^^^ Never moves 3179 invariant out, even though it's never used.
>
> After my patch we now get this:
>
> 8586 Decided to move invariant 15 -- gain 16
> 8587 Decided to move dependent invariant 12
>
> since get_inv_cost sees the number of uses is zero and the dependent invariant
> (12) gets moved too. Of course, without my patch the definition of 3179 would
> ultimately be eliminated as dead code in a later pass anyway.
>
> Is this ok to go in?
>
> Regards,
> David Sherwood.
>
> ChangeLog entry follows ...
>
>      2015-05-08  David Sherwood  <david.sherwood@arm.com>
>
>          * loop-invariant.c (create_new_invariant): Don't calculate address cost
>          if mode is not scalar integers.
>          (get_inv_cost): Increase computational cost for unused invariants.
So you'd need to bootstrap and regression test this change for it to be 
fully ready for review.

It would also be good to include a test for the testsuite where you can 
show the improvements in behaviour due to this change.  I'm not offhand 
sure of the license on bwaves, so you may or may not be able to reduce 
it and use the reduced code in the GCC testsuite.

With some instrumentation, you may easily find the second case 
triggering code differences in GCC itself, which you could then reduce 
and use as a test for the second change.

Conceptually both hunks seem reasonable, so really it's just a matter of 
building some testcases to verify behaviour and the bootstrap/regression 
test.

There's a number of ARM engineers contributing to GCC these days that 
can probably help.

Jeff


>



More information about the Gcc-patches mailing list