This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Fix RTL sharing found during PPC bootstrap
> Hi Steven,
>
> > Actually, GCSE does not track values of pseudos. Classic PRE eliminates
> > lexically equivalent expressions, but it can't track value equivalences.
> > We fail to eliminate a number of value-equivalent common sub-expressions
> > because they are not lexically equivalent. In this case, the effort of
> > the RTL expanders is a bad thing.
>
> Hmm. We must have some kind of miscommunication. According to
> gcc/gcse.c (and backed up by its implementation):
>
> >> Expressions we are interested in GCSE-ing are of the form
> >> (set (pseudo-reg) (expression)).
> >> Function want_to_gcse_p says what these are.
>
>
> So when given the RTL:
>
> (set (reg1) (plus (reg) (subreg)))
>
> we'll never see "(subreg)" as an independent expression. You'll
> notice that we never call want_to_gcse_p on subexpressions or terms
> within a complex operator/expression.
Actually I think what Steven referred to is that we do PRE rather than
value numbering in GCSE. I.e. if the RTL expanders produce
(set (reg1) (expr reg2))
...
(set (reg3) (expr reg2))
GCSE, assuming that the first set dominates the second and nothing
changes in between, will replace the second set with
(set (reg3) (reg1))
possibly saving computation for expensive EXPRs, possibly increasing
register pressure for no-op EXPRs (as the vast majority of subregs
are). Notice that want_to_gcse_p unconditionally returns 0 for
SUBREG. We perhaps want to care about the expensive subregs there by
checking rtx_cost (x) > 0.
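
As a rough illustration of the replacement described above, here is a
toy Python model (not GCC code). The insn representation as
(dest, expr) tuples and the names lexical_cse and is_free are
hypothetical, and the block assumes straight-line code in which every
register is set only once, so "dominates and nothing changes" holds
trivially:

```python
# Toy model only: an insn is (dest_reg, expr) and expr is a tuple such as
# ("expr", "reg2"). Assumes straight-line code, each register set once.

def is_free(expr):
    """Hypothetical stand-in for a cost check: subregs count as zero-cost."""
    return expr[0] == "subreg"

def lexical_cse(insns):
    """Replace a repeated, lexically identical expression with a copy."""
    available = {}  # expr -> first register that computed it
    out = []
    for dest, expr in insns:
        if not is_free(expr) and expr in available:
            # Same text seen before: reuse the earlier result.
            out.append((dest, ("reg", available[expr])))
        else:
            out.append((dest, expr))
            if not is_free(expr):
                available[expr] = dest
    return out

insns = [("reg1", ("expr", "reg2")), ("reg3", ("expr", "reg2"))]
result = lexical_cse(insns)
# result[1] is now ("reg3", ("reg", "reg1"))
```

Swapping "expr" for "subreg" in the input leaves both sets untouched,
which mirrors want_to_gcse_p rejecting SUBREG.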
What we will fail on, however, is:
(set (reg1) (subreg reg2))
(set (reg3) (some_expr reg1))
(set (reg4) (subreg reg2))
(set (reg5) (some_expr reg4))
The GCSE algorithm won't be able to notice that reg3 and reg5 are
equivalent, since it will never commonize the zero-cost subregs and
thus the expressions will be lexically different.
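
To make the contrast concrete, here is a minimal value-numbering sketch
in Python over the same four insns. It is a toy model, not what GCSE or
GVN-PRE actually implements; the tuple encoding and the name
value_numbers are hypothetical, and it assumes straight-line code with
each register set once:

```python
import itertools

# Toy model only: an insn is (dest_reg, (op, operand_reg, ...)).
# Assumes straight-line code in which each register is set once.

def value_numbers(insns):
    """Map each register to a value number; equal numbers mean equal values."""
    fresh = itertools.count()
    reg_vn = {}   # register -> value number
    expr_vn = {}  # (op, operand value numbers...) -> value number

    def vn(reg):
        # Registers never set in this block get a fresh number on first use.
        if reg not in reg_vn:
            reg_vn[reg] = next(fresh)
        return reg_vn[reg]

    for dest, (op, *operands) in insns:
        # Key on the operands' values, not on their register names.
        key = (op,) + tuple(vn(r) for r in operands)
        if key not in expr_vn:
            expr_vn[key] = next(fresh)
        reg_vn[dest] = expr_vn[key]
    return reg_vn

chain = [
    ("reg1", ("subreg", "reg2")),
    ("reg3", ("some_expr", "reg1")),
    ("reg4", ("subreg", "reg2")),
    ("reg5", ("some_expr", "reg4")),
]
vns = value_numbers(chain)
same_value = vns["reg3"] == vns["reg5"]     # True
same_text = chain[1][1] == chain[3][1]      # False
```

The two some_expr operands differ as text (reg1 versus reg4) but carry
the same value number, so only the value-based view sees that reg3 and
reg5 are equivalent.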
The fact that GCSE can't deal with cascaded insn chains is pretty
important. Once upon a time I got GCSE working on i386, and it
produced a lot of register pressure just because of the extra global
temporaries created when the cascaded insn chains were commonized
only to a fixed depth of 2.
Without a quite substantial rewrite of GCSE, all we can hope for is
that, with GVN-PRE on trees and not too much noise introduced by the
RTL expanders, all we have to deal with are the narrow expressions
GCSE can deal with in one pass.
Honza