This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [patch] for PR 18040


On Sun, 2004-10-17 at 20:21 -0700, Zack Weinberg wrote:
> kenner@vlsi1.ultra.nyu.edu (Richard Kenner) writes:
> 
> >     Your opinion on my alternative suggestion - hoist the type conversion
> >     to the outermost type, thus reducing the nested case to the unnested
> >     case - would be appreciated (and please read the whole message before
> >     answering the question, because I address another objection below).
> >
> > Aside from the issue of possible quadratic behavior, I don't completely
> > understand the proposal here, so I can't fully comment on it.  I should
> > point out, though, that some of these expressions in practice are
> > quite complex: there are lots of implicit conversions and dereferences
> > generated by the front end.
> 
> If you provide an example which produces a type conversion in the
> middle of a chain of dereference expressions, I will endeavor to
> clarify what I mean.
> 
> 
> >     > This stuff is *very* tricky and, as I said, we've been
> >     > throught it before.
> >
> >     The PR indicates that the solution which has already been
> >     implemented does not work.  Thus, the entire topic is open for
> >     reexamination.
> >
> > The PR indicates that a particular optimizer has problems with the
> > overall approach.  One way of dealing with that is to change the
> > approach.  But a more local way is to fix that particular optimizer.
> 
> Fair point, however, comments upthread indicate that lots of people
> are not happy with the overall approach.

Inserting STRIP_NOPS into everywhere that breaks apart component_refs
and looking for variables and things is not a pleasant idea, unless it
was hidden in component_ref iterators of some sort that you could tell
to auto-skip them (the way we have ssa operand iterators).

The optimizer's function in question (for_each_index in tree-ssa-loop-
ivopts.c) was written to assume what the gimple grammar specified, which
is that ((cast)var).field is not legal.
All of the optimizers are written in this manner (IE assuming that the
gimple grammar written is the gimple grammar).

Thus, if you want ((cast)var).field to be legal, you are going to have
to go through every single function in the optimizers that walks
component_refs and fix them.  A lot of them are simply returning NULL or
(whatever they have don't know) in the case where they hit the NOP_EXPR,
instead of aborting.  Some are probably silently doing bad things :)

A quick check of a random function handling component_refs shows they
will also be missing optimizations if you have 

[s/VAR_DECL/SSA_NAME/ is applicable for all the ssa optimizers i'm about
to reference.]

COMPONENT_REF <NOP_EXPR <VAR_DECL> > 
or 
COMPONENT_REF <NOP_EXPR <INDIRECT_REF <VAR_DECL>>

Let me give an example from tree-ssa-dom.c:

if (INDIRECT_REF_P (t))
          {
            tree op = TREE_OPERAND (t, 0);

            /* If the pointer is a SSA variable, then enter new
               equivalences into the hash table.  */
            while (TREE_CODE (op) == SSA_NAME)
              {

This will miss the indirect_ref (and an optimization opportunity) if you
have COMPONENT_REF <NOP_EXPR <INDIRECT_REF <SSA_NAME>>>AR_DECL>>

It will also fail if you have COMPONENT_REF <INDIRECT_REF <NOP_EXPR
<SSA_NAME>>>

Note that to fix both of these with your approach of making
((cast)var).field legal, you'd have to add 2 STRIP_NOPS to just this
small 5 lines of code.

In fact, you'd have to add STRIP_NOPS *everywhere*, and be vigilant
about it, or you'd miss optimizations or generate wrong code.
And that's a mess.
And a mess that other compilers don't have, so it must be possible to do
without it.
In fact, to be honest, i've never, looking at other compiler source,
seen the equivalent of NOP_EXPR (a conversion that doesn't generate
code) around.

For example, for intel, before lowering, etc (IE straight from the front
end) looks like this for zach's example:
Original Source line 9:
        return ((r *) &var)->f1 + var[2000];

9       1               return ( &var.1_V$2->f1_V$3 + var.1_V$2[2000] );

Note the lack of casts. :)



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]