RTL CSE picking simpler but more expensive expressions (re: PR 65932)

Jeff Law law@redhat.com
Wed Jan 13 19:28:00 GMT 2016


On 01/13/2016 04:33 AM, Kyrill Tkachov wrote:
>
> I've been able to get it to do the right thing by changing the line
> where it initially folds the source of the SET. That is line 4639 in
> cse.c: /* Simplify and foldable subexpressions in SRC.  Then get the
> fully- simplified result, which may not necessarily be valid.  */
> src_folded = fold_rtx (src, insn);
>
> In this instance SRC is: (plus:SI (mult:SI (sign_extend:SI (subreg:HI
> (reg:SI 136) 0)) (sign_extend:SI (subreg:HI (reg:SI 138) 0))) (reg:SI
> 141))
>
> and the resulting src_folded is: (plus:SI (mult:SI (reg:SI 136)
> (reg:SI 138)) (reg:SI 141))
>
> However, fold_rtx also modifies src itself, so after the call to
> fold_rtx both src and src_folded contain the plus+mult form without
> the extends. So further down in cse_insn where it does the cost
> analysis of src, src_folded and other expressions it never considers
> the original form of src. Changing that call to fold_rtx to not
> modify its argument like so: src_folded = fold_rtx (src, 0); // An
> argument of 0 means "make a copy of src before modifying"
>
> fixes the testcase and allows CSE to properly select the cheaper
> multiply-extend-add form and doesn't seem to regress codegen in any
> way on SPEC2006 for arm. Archeology says this line has been that way
> since forever, so does anyone know of the rationale for passing insn
> to fold_rtx in that callsite?
That callsite was added during gcc-2 development, in an era where we 
didn't even have public lists where such a change might have been 
discussed.  I didn't try to dig into the old, private gcc2 archives as 
it's unlikely there's going to be any rationale there.

ISTM the code clearly expects that SRC and SRC_FOLDED could be 
different.  I think you could make a case that INSN should just be 
replaced with NULL based on that alone.  Verifying across a reasonable 
body of code that there aren't any undesirable effects would be wise.

Alternately you could compute stuff for "SRC" prior to the call to 
fold_rtx.  It's less likely to have unexpected side effects.

My inclination would be to go with changing INSN to NULL though.  It 
seems to match the overall intent here better.

Jeff



More information about the Gcc mailing list