This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Combine performance regression (was: Fix PR target/18701)

Hans-Peter Nilsson wrote:
> On Thu, 20 Jan 2005, Ulrich Weigand wrote:
> > A crucial optimization for the bzip2 case is thus to combine a
> > mem-to-reg load into such an AND and form a mem-to-reg zero-extend
> > pattern from it.  However, the AND in question necessarily uses a
> > paradoxical subreg, so we have this situation:
> No, I don't think it's necessary, or at least that the bugs
> that make it seem necessary should be fixed.

If you use AND to express zero-extend (which is b.t.w. the default
method optab uses), then you necessarily need to use a paradoxical
subreg (of a REG).  I do not see how this constitutes a bug ...

> > Combine would now substitute the (mem ...) for (reg 79) in
> > insn 226, and further optimization would detect that the
> > (and (subreg (mem))) pattern is actually equivalent to a
> > zero-extend.
> And that's my beef: it should have been expressed as a
> zero_extend here too.

Well, it would have *been* simplified to a zero_extend, during
this very invocation of try_combine, if your patch hadn't prevented
that simplification from being performed!

> > Thus, I do not like this patch ;-)  In fact, I rather disagree
> > with the reasoning leading up to it.
> What reasoning?  Do you mean that it's ok to widen MEM accesses
> or do you mean that paradoxical mem subregs are a valid method
> to express sign- and zero extension?

I mean the reasoning that a paradoxical subreg of a MEM necessarily
implies "widening a MEM access".  As I said, a paradoxical MEM
subreg (in the absence of LOAD_EXTEND_OP) is supposed to have exactly
the same meaning as loading that MEM (in its original mode) into a
register, and then accessing a paradoxical subreg of that register.

> > After combine has finished the whole simplifcation process it
> > will re-recognize the insn.  If a target doesn't want to accept
> > this particular construct, it should be rejected *here*, if it
> > is still present, not earlier.
> Well, the widening of a mem access must be stopped at the point
> where the widening, the change to a real access happens; it
> can't be done later, because then we no longer know that it's
> widened. :-)

Well, as I see it, simply forming a paradoxical MEM subreg
constitutes neither a widening nor any other change to the
'real access' to memory, so nothing needs to be stopped.

If some *later* pass thinks to convert a paradoxical MEM subreg
into an actual widened MEM access, then *that* pass needs to verify
whether this is OK.  (E.g. as reload does when it converts paradoxical 
subregs of spilled pseudos to widened MEMs.)

> >  (Alternatively, it might be
> > accepted, and later on cleaned up by reload -- there is code in
> > reload to do so, have you investigated why it didn't fix your
> > problem?)
> This suggestion suggests that you did not look at the whole of
> the analysis.  This was a secondary patch, for a bug discovered
> through the *other* bug.

I see I had misinterpreted the somewhat lengthy discussion; I
had thought that after the gen_lowpart_for_combine patch (which I
agree is correct) was applied, you were still seeing a bug.

> The whole point of *this* patch is to
> stop a paradoxical mem subreg from being *widened*.  If the
> s390x port relies on that happening, I suggest that's incorrect.

The only thing s390 relies upon is that combine is able to
combine a mem->reg load with a reg->reg zero-extend (expressed
using AND) into a mem->reg zero-extend (expressed as ZERO_EXTEND).

> I don't see your argument why actually widening is ok.  Perhaps
> I missed the forest for the trees?

Actually widening the MEM is *not* OK.  B.t.w. this is also what
gen_lowpart_for_combine did wrong before you patch: it would
replace a mem:HI with a mem:SI in your example.  This is *wrong*,
even if it had computed the address properly.  After your 
patch it should now return a (subreg:SI (mem:HI)), which is
just fine in my opinion.  This construct can then be either
rejected in recog, or else fixed up by reload.


  Dr. Ulrich Weigand
  Linux on zSeries Development

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]