[Patch][loop-invariant.c] Fix a couple of bugs regarding loop invariant motion discovered by spec2k6 on aarch64
David Sherwood
david.sherwood@arm.com
Mon May 18 08:26:00 GMT 2015
Hi Jeff,
Thanks for the suggestion. I did a bootstrap x86_64 build before and after my
patch and looked for differences in the last stage object files and there were
plenty of them. I chose a nice simple function (check_callers) from
ipa-inline-analysis.c and reduced it to a small test case. Hope this is ok.
Testing done:
* aarch64 built, "make check" no regressions
* aarch64_be built, "make check" no regressions
* x86_64 built, "make check" no regressions
ChangeLog:
2015-05-15 David Sherwood <david.sherwood@arm.com>
* loop-invariant.c (create_new_invariant): Don't calculate address cost
if mode is not a scalar integer.
(get_inv_cost): Increase computational cost for unused invariants.
* testsuite/gcc.dg/loop-invariant.c: New testcase.
Regards,
David.
> -----Original Message-----
> From: Jeff Law [mailto:law@redhat.com]
> Sent: 11 May 2015 20:27
> To: David Sherwood; gcc-patches@gcc.gnu.org
> Subject: Re: [Patch][loop-invariant.c] Fix a couple of bugs regarding loop invariant motion
discovered by
> spec2k6 on aarch64
>
> On 05/11/2015 06:02 AM, David Sherwood wrote:
> > Hi,
> >
> > This patch fixes a couple of issues I saw during the compilation of shell_lam.f
> > for 410.bwaves test in spec2006:
> >
> > * create_new_invariant: We shouldn't bother attempting to calculate the address
> > cost for something that clearly isn't an address. Use a SCALAR_INT_MODE_P check
> > to eliminate the most obvious cases, such as vector modes and so on.
> > * get_inv_cost: Change the code so that we don't treat something as an address
> > if it is never actually used as an address, i.e. we didn't use it at all. This
> > occurs due to the way we process the innermost loop first - invariants get
> > pulled out of inner most loop and then get processed during next innermost loop.
> > Here is more detail from shell_lam.f.210r.loop2_invariant dump file that
> > shows what currently happens:
> >
> > 4427 *****ending processing of loop 27 ******
> >
> > 4442 (const_vector:V2DF [
> > 4443 (const_double:DF 4.0e+0 [0x0.8p+3])
> > 4444 (const_double:DF 4.0e+0 [0x0.8p+3])
> > 4445 ])
> > 4446
> > 4447 Hot cost: 8 (final)
> > 4448 (set (reg:V2DF 3179)
> > 4449 (const_vector:V2DF [
> > 4450 (const_double:DF 4.0e+0 [0x0.8p+3])
> > 4451 (const_double:DF 4.0e+0 [0x0.8p+3])
> > 4452 ]))
> > 4453
> > 4454 Hot cost: 8 (final)
> > 4455 Set in insn 2764 is invariant (1), cost 8, depends on
> >
> > 4471 Decided to move invariant 1 -- gain 8
> >
> > move_invariant_reg moves invariant (the const_vector above) into 3490 and
> > replaces use of 3179 with 3490, but conservatively keeps the definition of 3179,
> > which gets processed in outer loop, but is never used.
> >
> > Debug for outer loop now follows:
> >
> > 4497 *****starting processing of loop 26 ******
> >
> > 6379 (insn 2764 2748 2766 110 (set (reg:V2DF 3490)
> > 6380 (const_vector:V2DF [
> > 6381 (const_double:DF 4.0e+0 [0x0.8p+3])
> > 6382 (const_double:DF 4.0e+0 [0x0.8p+3])
> > 6383 ])) shell_lam.f:287 819 {*aarch64_simd_movv2df}
> > 6384 (nil))
> > 6385 ;; UD chains for insn luid 88 uid 2766
> >
> > 6604 ;; UD chains for insn luid 21 uid 3836
> > 6605 ;; reg 3490 { d419(bb 110 insn 2764) }
> > 6606 (insn 3836 2763 2765 111 (set (reg:V2DF 3179)
> > 6607 (reg:V2DF 3490)) shell_lam.f:287 -1
> > 6608 (nil))
> > ^^^ additional insn generated by move_invariant_reg from loop 27
> >
> > 6836 ;; UD chains for insn luid 40 uid 2790
> > 6837 ;; reg 3161 { d272(bb 111 insn 2746) }
> > 6838 ;; reg 3201 { d305(bb 111 insn 2788) }
> > 6839 ;; reg 3490 { d419(bb 110 insn 2764) }
> > 6840 ;; eq_note reg 3161 { d272(bb 111 insn 2746) }
> > 6841 ;; eq_note reg 3201 { d305(bb 111 insn 2788) }
> > 6842 (insn 2790 2788 2791 111 (set (reg:V2DF 3203 [ vect__930.427 ])
> > 6843 (fma:V2DF (reg:V2DF 3161 [ MEM[base: vectp_q.371_2137, index: ivtmp.679_3214,
> > offset: 0B] ])
> > 6844 (neg:V2DF (reg:V2DF 3490))
> > 6845 (reg:V2DF 3201 [ vect__928.424 ]))) shell_lam.f:287 1219 {fnmav2df4}
> > 6846 (expr_list:REG_DEAD (reg:V2DF 3201 [ vect__928.424 ])
> > 6847 (expr_list:REG_DEAD (reg:V2DF 3179) ...
> > ^^^ 3179 marked as DEAD
> >
> > 8366 *****ending processing of loop 26 ******
> >
> > 8472 (const_vector:V2DF [
> > 8473 (const_double:DF 4.0e+0 [0x0.8p+3])
> > 8474 (const_double:DF 4.0e+0 [0x0.8p+3])
> > 8475 ])
> > 8476
> > 8477 Hot cost: 8 (final)
> > 8478 (set (reg:V2DF 3490)
> > 8479 (const_vector:V2DF [
> > 8480 (const_double:DF 4.0e+0 [0x0.8p+3])
> > 8481 (const_double:DF 4.0e+0 [0x0.8p+3])
> > 8482 ]))
> > 8483
> > 8484 Hot cost: 8 (final)
> > 8485 Set in insn 2764 is invariant (12), cost 8, depends on
> >
> > 8505 (set (reg:V2DF 3179)
> > 8506 (reg:V2DF 3490))
> > 8507
> > 8508 Hot cost: 8 (final)
> > 8509 Set in insn 3836 is invariant (15), cost 8, depends on 12
> > ^^^ Never moves 3179 invariant out, even though it's never used.
> >
> > After my patch we now get this:
> >
> > 8586 Decided to move invariant 15 -- gain 16
> > 8587 Decided to move dependent invariant 12
> >
> > since get_inv_cost sees the number of uses is zero and the dependent invariant
> > (12) gets moved too. Of course, without my patch the definition of 3179 would
> > ultimately be eliminated as dead code in a later pass anyway.
> >
> > Is this ok to go in?
> >
> > Regards,
> > David Sherwood.
> >
> > ChangeLog entry follows ...
> >
> > 2015-05-08 David Sherwood <david.sherwood@arm.com>
> >
> > * loop-invariant.c (create_new_invariant): Don't calculate address cost
> > if mode is not scalar integers.
> > (get_inv_cost): Increase computational cost for unused invariants.
> So you'd need to bootstrap and regression test this change for it to be
> fully ready for review.
>
> It would also be good to include a test for the testsuite where you can
> show the improvements in behaviour due to this change. I'm not offhand
> sure of the license on bwaves, so you may or may not be able to reduce
> it and use the reduced code in the GCC testsuite.
>
> With some instrumentation, you may easily find the second case
> triggering code differences in GCC itself, which you could then reduce
> and use as a test for the second change.
>
> Conceptually both hunks seem reasonable, so really it's just a matter of
> building some testcases to verify behaviour and the bootstrap/regression
> test.
>
> There's a number of ARM engineers contributing to GCC these days that
> can probably help.
>
> Jeff
>
>
> >
-------------- next part --------------
A non-text attachment was scrubbed...
Name: loop-invariant2.patch
Type: application/octet-stream
Size: 2177 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20150518/e547735c/attachment.obj>
More information about the Gcc-patches
mailing list