This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Thoughts on TER (was: [PATCH] Fix multiply-add regressions after expand-from-SSA)


Hi,

On Thu, 30 Apr 2009, Adam Nemet wrote:

> > That at least followed also my ideas how to implement something 
> > similar to TER with expand-from-SSA.
> 
> I see now.  To be honest, I just took the code from 'case SSA_NAME:' 
> which is probably something that was added now?

Exactly.

> > In an ideal world we would get rid of TER altogether, that's why 
> > knowing which cases are currently really helped by it is useful.
> 
> But it seems that the complexity comes from the fact that you either map 
> all RTL expanders to tree codes (maybe that's already the case) or you 
> would have to be able to look at the expression trees at expansion time 
> which is not as easy since we're no longer in SSA.

Sort of, yes.  We aren't in SSA form anymore, but not yet in RTL either, 
it's something in between, where SSA names exist, but they are related 
together in some partitions which themself have RTL expressions (pseudos 
or MEMs) assigned.  So looking up expression trees does work.

The problem TER tries to solve is that our RTL expander doesn't try to 
find a cover of the whole expression trees formed from gimple 
instructions, so it can't find an ideal mapping from gimple instructions 
to RTL instructions, if one RTL insn has the effect of multiple gimple 
insns.  TER tries to work around this problem by simply building 
expression trees as large as possible and hoping for the best, namely that 
expand will do something useful with these large trees; often it doesn't.

Then at the RTL side we also have combine which tries something similar 
(pulling multiple RTL insns into one with the combined effect).

So TER partly does something which later would have been done at combine.  
As combine has the three instruction limitation TER also does something 
more than combine could do.  widening multiplication is one case.

But TER also does significantly less due to its restriction of following 
only single-use chains.  E.g. in the widening mult example TER wouldn't 
have helped this case:

  short s = ...
  int i = (int)s * (int) s;

Optimization would have moved out the common subexpression "(int) s", and 
TER wouldn't have put it back in (because that temporary has then two 
uses).  So expand would have seen just the muliplication of two ints, 
where it would have been better if it saw that those were really just 
extended shorts.

Together, TER does things that can be done by combine, and things that 
can't be done by combine.  These latter things it does in a restricted 
way.  And for doing all that TER constructs large trees which most of the 
time are simply expanded again into individual instructions by expand, 
just undoing the effects of TER instead of expanding to less instructions.

So, one goal for the future would be to learn which transformations that 
TER does really help (probably not that many), and then either:
(a) convince ourself that these will be done by other RTL transformations,
    e.g. combine
(b) implement them in the expanders itself, via looking up use-def
    chains explicitely in the (hopefully few) places where it helps.
(c) move over the useful transformation to gimple, so that other 
    tree optimization passes can benefit from them, not only the RTL
    passes.

I currently know of two things that are helped by TER:
1) condition chains, when we have something like this in gimple:
     cond = A > B;
     if (cond != 0) ...
   without TER these are expanded to two jumps (and hence additional basic 
   blocks) hindering RTL transformers somewhat.  That's why I put in a 
   hack for this, with the expectation that this is solved somewhere else 
   in the future.
2) widening multiplication where it's usefull for expand to know that
   some operands are in reality just sign- or zero-extended variants of 
   short operands.  TER helps this case only sometimes (single-use 
   chains), so it better be implemented somewhere else anyway.

There surely are some more, but I'd hope that they all can be sorted into 
only few general themes, so that the decision between a/b/c is easy.


Ciao,
Michael.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]