This is the mail archive of the
mailing list for the GCC project.
Re: Fix RTL sharing found during PPC bootstrap
> Actually, GCSE does not track values of pseudos. Classic PRE eliminates
> lexically equivalent expressions, but it can't track value equivalences.
> We fail to eliminate a number of value-equivalent common sub-expressions
> because they are not lexically equivalent. In this case, the effort of
> the RTL expanders is a bad thing.
Hmm. We must have some kind of miscommunication. According to
gcc/gcse.c (and backed up by its implementation):
>> Expressions we are interested in GCSE-ing are of the form
>> (set (pseudo-reg) (expression)).
>> Function want_to_gcse_p says what these are.
So when given the RTL:
(set (reg1) (plus (reg) (subreg)))
we'll never see "(subreg)" as an independent expression. You'll
notice that we never call want_to_gcse_p on subexpressions or terms
within a complex operator/expression.
So when we previously generated:
(set (reg1) (plus (reg2) (FOO)))
(set (reg3) (plus (reg4) (FOO)))
we're unable to recognize that FOO is repeated and may usefully be
hoisted, for example if it's a MEM, or non-trivial SUBREG, and so on.
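To make the limitation concrete, here is a toy C model of a scanner that
only compares whole SET sources. The names (toy_set, toy_count_top_level,
toy_count_anywhere, toy_before) are made up for illustration and are not
code from gcse.c. Expressions are plain strings, and the substring test
merely stands in for a scanner that would descend into operands.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an insn of the form (set (dest) (src)).
   Hypothetical model for illustration, not GCC's rtx representation. */
struct toy_set {
    const char *dest;  /* printed destination, e.g. "(reg1)" */
    const char *src;   /* printed source expression */
};

/* Count insns whose *entire* source is lexically equal to EXPR.
   Operands nested inside a larger source are never inspected. */
size_t toy_count_top_level(const struct toy_set *insns, size_t n,
                           const char *expr)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (strcmp(insns[i].src, expr) == 0)
            hits++;
    return hits;
}

/* Count insns whose source mentions EXPR anywhere, i.e. what a scanner
   that also looked at operands could have seen. */
size_t toy_count_anywhere(const struct toy_set *insns, size_t n,
                          const char *expr)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (strstr(insns[i].src, expr) != NULL)
            hits++;
    return hits;
}

/* The two insns from the example above. */
const struct toy_set toy_before[] = {
    { "(reg1)", "(plus (reg2) (FOO))" },
    { "(reg3)", "(plus (reg4) (FOO))" },
};
```

In this model toy_count_top_level (toy_before, 2, "(FOO)") is 0 while
toy_count_anywhere (toy_before, 2, "(FOO)") is 2: FOO occurs twice, but
never as a whole source, so a top-level-only scanner has nothing to match.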
Hence the RTL expanders, for the benefit of the RTL optimizers, like to
call "force_reg" on operands of unknown heritage, giving:
(set (regTMP) (FOO))
(set (reg1) (plus (reg2) (regTMP)))
(set (reg3) (plus (reg4) (regTMP)))
This exposes the pseudo "regTMP" to the optimizers; the expression
assigned to it is more likely to match lexically against other instances
of FOO, and if necessary combine, reload (and maybe fwprop) can un-CSE
this expression.
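As a rough sketch of that rewrite, the toy C function below copies a chosen
operand into a temporary first and then replaces its nested uses. It only
models the end state shown in the RTL above; it is not GCC's force_reg, and
toy_insn and toy_force_operand are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Toy insn: (set (dest) (op arg0 arg1)), or a plain copy
   (set (dest) (arg0)) when op is NULL.  Hypothetical model only. */
struct toy_insn {
    const char *dest;
    const char *op;
    const char *arg0;
    const char *arg1;
};

/* Write to OUT a copy of IN (N insns) in which operand EXPR has first
   been copied into TMP and every nested use of EXPR is replaced by TMP.
   OUT must have room for N + 1 insns.  Returns the number written. */
size_t toy_force_operand(const struct toy_insn *in, size_t n,
                         struct toy_insn *out,
                         const char *expr, const char *tmp)
{
    size_t k = 0;

    /* (set (tmp) (expr)): EXPR is now a whole source of its own. */
    out[k++] = (struct toy_insn){ tmp, NULL, expr, NULL };

    for (size_t i = 0; i < n; i++) {
        struct toy_insn insn = in[i];
        if (insn.op != NULL) {  /* only rewrite operands of an operator */
            if (strcmp(insn.arg0, expr) == 0)
                insn.arg0 = tmp;
            if (insn.arg1 != NULL && strcmp(insn.arg1, expr) == 0)
                insn.arg1 = tmp;
        }
        out[k++] = insn;
    }
    return k;
}
```

Fed the two "plus" insns with expr "(FOO)" and tmp "(regTMP)", this yields
three insns: the copy into regTMP followed by the two additions using
regTMP, which is the shape a top-level-only scanner can work with.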
I'm a little confused by your "GCSE does not track values of pseudos".
If you look at hash_scan_set, you'll notice that we don't even record
"(set (subreg reg) (const_int 0))", i.e. by testing REG_P (dest), gcse.c
pretty much doesn't track the values of anything other than pseudos!
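The effect of that destination test can be sketched with a small filter.
This is a simplified model that assumes the register check is the only
condition; the real hash_scan_set applies further tests, and the names
toy_dest_kind, toy_dest_set and toy_count_recorded are invented here.

```c
#include <stddef.h>

/* Kinds of SET destination distinguished by this toy model. */
enum toy_dest_kind { TOY_DEST_REG, TOY_DEST_SUBREG, TOY_DEST_MEM };

/* Hypothetical stand-in for (set (dest) (src)). */
struct toy_dest_set {
    enum toy_dest_kind dest_kind;
    const char *src;
};

/* Count the sets this model would record: only those whose destination
   is a plain register, mirroring the REG_P (dest) test in spirit. */
size_t toy_count_recorded(const struct toy_dest_set *sets, size_t n)
{
    size_t recorded = 0;
    for (size_t i = 0; i < n; i++)
        if (sets[i].dest_kind == TOY_DEST_REG)
            recorded++;
    return recorded;
}
```

A set with a SUBREG destination, such as the "(set (subreg reg)
(const_int 0))" case, is simply skipped by this filter.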
I know this isn't new to you, so I suspect that your comments about
"classical PRE" may refer to the algorithm's theoretical behaviour and
perhaps not the implementation that we currently have in the tree.
It's in the best interest of RTL expanders to keep the insn stream
RISC-like after expansion, exposing the semantics clearly, and then
relying on combine (or improved instruction selection) to reconstitute
more complex (but harder to analyse) instructions later.