This is the mail archive of the mailing list for the GCC project.


Re: [PATCH] Optimize zero_extend into paradoxical subreg

> On Sat, 28 Sep 2002, Richard Earnshaw wrote:
> > > A recent discussion on GCC's semantics for paradoxical subregs on
> > > this list suggested there was a need for additional optimizations
> > > to optimize their use.  In that thread I suggested that simplify_rtx
> > > could eliminate explicit zero_extension and sign_extension operations
> > > on targets with LOAD_EXTEND_OP when the extension operand was a MEM.
> > > The patch below implements this suggestion.
> > >
> > I'm not convinced this is a good idea.  Surely it would be better to go
> > the other way -- i.e. translate (subreg:M1 (mem:M2) 0) into zero_ or
> > sign_extend if M1 is wider than M2.
> >
> > There's a general principle for RTL that it's better to be explicit about
> > an operation than rely on implicit meaning -- your patch seems to go
> > against that.
> However, on many GCC targets zero_extend or sign_extend require real
> instructions to be generated, whereas automatic sign/zero extension,
> when present in hardware, does not.  Hence the optimization above takes
> advantage of a special case (with well-defined semantics) to optimize
> away those instructions.

You miss the point. 

1) Paradoxical subregs of a mem are *only* defined on machines which 
define LOAD_EXTEND_OP; in all other cases we can only have a paradoxical 
subreg of another register.

2) LOAD_EXTEND_OP documentation says:

     Define this macro to be a C expression indicating when insns that
     read memory in MODE, an integral mode narrower than a word, set the
     bits outside of MODE to be either the sign-extension or the
     zero-extension of the data read.  Return `SIGN_EXTEND' for values
     of MODE for which the insn sign-extends, `ZERO_EXTEND' for which
     it zero-extends, and `NIL' for other modes.

So there would be no point in defining this on a machine where zero/sign 
extension from memory is not a natural side effect of a load (and 
therefore free), not least because if you did you'd almost certainly get 
wrong code.  So the issue of some machines needing several instructions 
to zero/sign extend from memory is not relevant here.  (Indeed, on the ARM 
prior to Architecture v4, loading a half-word from memory did not extend 
the result to a word, so LOAD_EXTEND_OP returned NIL to indicate this.)
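
To make the macro's contract concrete, here is a minimal C model of a
byte load into a 32-bit register. It is only a sketch, not GCC code: the
enum, the function names and the 32-bit word size are hypothetical, and
the "no extension" case simply keeps the stale upper bits as one possible
outcome, since the documentation quoted above leaves them unspecified.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the three answers LOAD_EXTEND_OP can give. */
enum load_extend { EXT_NONE, EXT_ZERO, EXT_SIGN };

/* Model the full 32-bit register contents after a byte load.
   old_reg is whatever the register held before the load. */
static uint32_t load_byte(const uint8_t *mem, enum load_extend op,
                          uint32_t old_reg)
{
    uint32_t b = *mem;

    switch (op) {
    case EXT_ZERO:
        /* Bits outside the byte are the zero-extension of the data. */
        return b;
    case EXT_SIGN:
        /* Bits outside the byte are copies of the byte's sign bit. */
        return (b & 0x80u) ? (0xFFFFFF00u | b) : b;
    default:
        /* Assumption: the load leaves the upper bits alone.  Real
           hardware may do anything here, so nothing may rely on them. */
        return (old_reg & 0xFFFFFF00u) | b;
    }
}

/* Small self-check of the three cases. */
static void load_extend_demo(void)
{
    uint8_t m = 0x80;

    assert(load_byte(&m, EXT_ZERO, 0xDEADBEEFu) == 0x00000080u);
    assert(load_byte(&m, EXT_SIGN, 0xDEADBEEFu) == 0xFFFFFF80u);
    assert(load_byte(&m, EXT_NONE, 0xDEADBEEFu) == 0xDEADBE80u);
}
```

In the first two cases the upper bits are a function of the loaded byte
alone, which is what makes the extension free on such a machine. In the
third they depend on unrelated state, which is why the macro has to
return NIL there.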

> My interpretation of this principle is that front and middle ends
> should keep the RTL as explicit as possible.  But it's the job of
> the optimizers to take advantage of machine idioms where they
> exist.  Going from zero-extend to subreg is a lowering, and if
> the RTL is explicit about what it's doing there shouldn't be a
> need to go the other way.

No, the point of the optimizers is never to throw away information, just 
to find a more efficient way of expressing the computation.  If we start 
throwing away important information then we can end up with ambiguities 
(which seems to be the root cause of all the problems here).

> In an ideal world we probably wouldn't need LOAD_EXTEND_OP,
> target patterns would appropriately recognize "(zero_extend (mem ...))"
> and GCC would recognize "(?rshift (?lshift ...))" as extend insns.

I think working on the elimination of LOAD_EXTEND_OP might be a good move. 
It might mean that we can eliminate another case of paradoxical subregs, 
which (IMO) would be good.
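
For reference, the shift-pair idiom mentioned in the quoted text is
equivalent to an explicit extension. The following C sketch illustrates
this for an 8-bit value in a 32-bit word. The function names are
hypothetical, and the arithmetic right shift is written out by hand
because C leaves right shifts of negative signed values
implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Logical right shift of a left shift: keeps the low byte and
   zero-fills the rest, i.e. a zero extension from 8 to 32 bits. */
static uint32_t zext8_by_shifts(uint32_t x)
{
    return (x << 24) >> 24;
}

/* Arithmetic right shift of a left shift: a sign extension from 8 to
   32 bits.  The sign fill is spelled out explicitly for portability. */
static uint32_t sext8_by_shifts(uint32_t x)
{
    uint32_t shifted = x << 24;
    uint32_t logical = shifted >> 24;
    uint32_t fill = (shifted & 0x80000000u) ? 0xFFFFFF00u : 0u;

    return logical | fill;
}

/* Small self-check: the upper 24 bits of the input never matter. */
static void shift_idiom_demo(void)
{
    assert(zext8_by_shifts(0x12345680u) == 0x00000080u);
    assert(sext8_by_shifts(0x12345680u) == 0xFFFFFF80u);
    assert(sext8_by_shifts(0x1234567Fu) == 0x0000007Fu);
}
```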

