This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

RE: [PATCH][compare-elim] Merge zero-comparisons with normal ops


Patch updated with all relevant comments and suggestions.

Bootstrapped and tested on arm-none-linux-gnueabihf, and aarch64-none-linux-gnu and x86_64.

Ok for trunk?

2017-08-05  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
	    Michael Collison <michael.collison@arm.com>

	* compare-elim.c: Include emit-rtl.h.
	(can_merge_compare_into_arith): New function.
	(try_validate_parallel): Likewise.
	(try_merge_compare): Likewise.
	(try_eliminate_compare): Call the above when no previous clobber
	is available.
	(execute_compare_elim_after_reload): Add DF_UD_CHAIN and DF_DU_CHAIN
	dataflow problems.

2017-08-05  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
	    Michael Collison <michael.collison@arm.com>
	    
	* gcc.target/aarch64/cmpelim_mult_uses_1.c: New test.

-----Original Message-----
From: Segher Boessenkool [mailto:segher@kernel.crashing.org] 
Sent: Saturday, September 2, 2017 12:07 AM
To: Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>
Cc: Jeff Law <law@redhat.com>; Michael Collison <Michael.Collison@arm.com>; gcc-patches@gcc.gnu.org; nd <nd@arm.com>
Subject: Re: [PATCH][compare-elim] Merge zero-comparisons with normal ops

Hi!

On Tue, Aug 29, 2017 at 09:39:06AM +0100, Kyrill Tkachov wrote:
> On 28/08/17 19:26, Jeff Law wrote:
> >On 08/10/2017 03:14 PM, Michael Collison wrote:
> >>One issue that we keep encountering on aarch64 is GCC not making 
> >>good use of the flag-setting arithmetic instructions like ADDS, 
> >>SUBS, ANDS etc. that perform an arithmetic operation and compare the 
> >>result against zero.
> >>They are represented in a fairly standard way in the backend as 
> >>PARALLEL
> >>patterns:
> >>(parallel [(set (reg x1) (op (reg x2) (reg x3)))
> >>            (set (reg cc) (compare (op (reg x2) (reg x3)) (const_int 
> >>            0)))])

That is incorrect: the compare has to come first.  From md.texi:

@cindex @code{compare}, canonicalization of [ ... ]

@item
For instructions that inherently set a condition code register, the @code{compare} operator is always written as the first RTL expression of the @code{parallel} instruction pattern.  For example, [ ... ]

aarch64.md seems to do this correctly, fwiw.

> >>GCC isn't forming these from separate arithmetic and comparison 
> >>instructions as aggressively as it could.
> >>A particular pain point is when the result of the arithmetic insn is 
> >>used before the comparison instruction.
> >>The testcase in this patch is one such example where we have:
> >>(insn 7 35 33 2 (set (reg/v:SI 0 x0 [orig:73 <retval> ] [73])
> >>         (plus:SI (reg:SI 0 x0 [ x ])
> >>             (reg:SI 1 x1 [ y ]))) "comb.c":3 95 {*addsi3_aarch64}
> >>      (nil))
> >>(insn 33 7 34 2 (set (reg:SI 1 x1 [77])
> >>         (plus:SI (reg/v:SI 0 x0 [orig:73 <retval> ] [73])
> >>             (const_int 2 [0x2]))) "comb.c":4 95 {*addsi3_aarch64}
> >>      (nil))
> >>(insn 34 33 17 2 (set (reg:CC 66 cc)
> >>         (compare:CC (reg/v:SI 0 x0 [orig:73 <retval> ] [73])
> >>             (const_int 0 [0]))) "comb.c":4 391 {cmpsi}
> >>      (nil))
> >>
> >>This scares combine away as x0 is used in insn 33 as well as the 
> >>comparison in insn 34.
> >>I think the compare-elim pass can help us here.
> >Is it the multiple use or the hard register that combine doesn't 
> >appreciate.  The latter would definitely steer us towards compare-elim.
> 
> It's the multiple use IIRC.

Multiple use, and multiple set (of x1), and more complications...

7+33 won't combine to an existing insn.

7+34 will not even be tried (insn 33 is the first use of x0, not insn 34).
But it cannot work anyway, since x1 in insn 7 is clobbered in insn 33, so
7 cannot be merged into 34.

7+33+34 results in a parallel of a compare with the same invalid insn
as in the 7+33 case.  Combine would try to split it to two insns again, except it already has two insns (the arith and the compare).  It does not see that when it splits the insn it can combine the first half with the compare.

What would be needed is pulling insn 34 before insn 33 (which is fine, no conflicts there), and then we could combine 7+34 just fine.  But combine tries to be linear complexity, and it really cannot change insns around anyway.


Segher

Attachment: pr5198v2.patch
Description: pr5198v2.patch


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]