This is the mail archive of the
mailing list for the GCC project.
Re: Fix RTL sharing found during PPC bootstrap
On 12/17/06, email@example.com <firstname.lastname@example.org> wrote:
We're taking this a bit off-topic, but you may find this interesting, so...
> Actually, GCSE does not track values of pseudos. Classic PRE eliminates
> lexically equivalent expressions, but it can't track value equivalences.
> We fail to eliminate a number of value-equivalent common sub-expressions
> because they are not lexically equivalent. In this case, the effort of
> the RTL expanders is a bad thing.
Hmm. We must have some kind of miscommunication.
Yes. When you speak of "value" you mean the expression assigned to a
pseudo register. But that is not the value but *just* the expression.
When I speak of "value", I mean the runtime evaluation result of an
Different expressions may represent the same value. This is the kind
of value equivalencing that CSE does, and what classic GCSE does.
We don't have classic GCSE anymore for RTL in GCC. We have LCM-based
PRE. PRE can only see expressions, but not values.
Note that this limitation of PRE is a classic point of discussion in
compiler-related publications. See e.g. Taylor Simpson's PhD thesis
"Value-Driven Redundancy Elimination" for some discussion about it.
Inventing a PRE algorithm that works on values is sort-of the holy
grail of compiler engineering. GVN-PRE is, as far as I know, the first
algorithm that achieved this goal. That is what we have for GIMPLE
now. For RTL, we're stuck with classic PRE that only looks at
gcc/gcse.c (and backed up by it's implementation):
>> Expressions we are interested in GCSE-ing are of the form
>> (set (pseudo-reg) (expression)).
>> Function want_to_gcse_p says what these are.
So when given the RTL:
(set (reg1) (plus (reg) (subreg)))
we'll never see "(subreg)" as an independent expression.
notice that we never call want_to_gcse_p on subexpressions or terms
within a complex operator/expression.
And as a result, something like this will never be optimized by PRE:
(set (pc) (if_then_else (cond) (label1) (label2)))
(set (reg0) (const 123)
(set (reg1) (plus (reg0) (subreg (reg2))))
(set (reg8) (const 123)
(set (reg 9) (plus (reg8) (subreg (reg2))))
Clearly, from a *value-equivalence* point of view, reg 9 and reg 1
would hold the same value (namely: "(plus (const 123) (subreg
(reg2)))"), so the set to reg 9 is partially redundant.
But PRE never sees this. PRE works on expressions, so even though the
value of the two expressions is the same, PRE doesn't notice that reg
8 and reg 0 hold the same value so it treats the two expressions as
In GCC, we currently can't optimize the above in gcse.c.
So when, we previously generate:
(set (reg1) (plus (reg2) (FOO)))
(set (reg3) (plus (reg4) (FOO)))
we're unable to recognize that FOO is repeated and may usefully be
hoisted, for example if it's a MEM, or non-trivial SUBREG, and so on.
Hence the RTL expanders, for the benefit of RTL optimizers like to
call "force_reg" on operands of unknown heritage.
Yes, that is useful for passes like CSE.
I'm a little confused by your "GCSE does not track values of pseudos".
If you look at hash_scan_set, you'll notice that we don't even record
"(set (subreg reg) (const_int 0))", i.e. by testing REG_P (dest), gcse.c
pretty much doesn't track the values of anything other than pseudos!
It tracks what expressions are assigned to the pseudos, but not the
value of the pseudos.
I know this isn't new to you, so I suspect that your comments about
"classical PRE" may refer to the algorithm's theoretical behaviour and
perhaps not the implementation that we currently have in the tree.
I don't do theory. I'm quite familiar with what we have in the tree,
thank you very much ;-) This is what we have in gcse.c now.
It's in the best interest of RTL expanders to keep the insn stream
RISC-like after expansion, exposing the semantics clearly, and then
relying on combine (or improved instruction selection) to reconstitute
more complex (but harder to analyse) instructions later.
Agreed. But the "harder to analyze" depends on the pass. For passes
like CSE, what we have now is excelent. For PRE (theory, or
implemented as in gcse.c) it would be helpful if equivalent
expressions would always end up in the same pseudo.
Compilers wouldn't be as much fun if all algorithms required the same
kind of input to produce optimal results :-)