This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Combine performance regression (was: Fix PR target/18701)


Hans-Peter Nilsson wrote:
> On Thu, 20 Jan 2005, Ulrich Weigand wrote:
> > A crucial optimization for the bzip2 case is thus to combine a
> > mem-to-reg load into such an AND and form a mem-to-reg zero-extend
> > pattern from it.  However, the AND in question necessarily uses a
> > paradoxical subreg, so we have this situation:
> 
> No, I don't think it's necessary, or at least that the bugs
> that make it seem necessary should be fixed.

If you use AND to express zero-extend (which is b.t.w. the default
method optab uses), then you necessarily need to use a paradoxical
subreg (of a REG).  I do not see how this constitutes a bug ...

> > Combine would now substitute the (mem ...) for (reg 79) in
> > insn 226, and further optimization would detect that the
> > (and (subreg (mem))) pattern is actually equivalent to a
> > zero-extend.
> 
> And that's my beef: it should have been expressed as a
> zero_extend here too.

Well, it would have *been* simplified to a zero_extend, during
this very invocation of try_combine, if your patch hadn't prevented
that simplification from being performed!

> > Thus, I do not like this patch ;-)  In fact, I rather disagree
> > with the reasoning leading up to it.
> 
> What reasoning?  Do you mean that it's ok to widen MEM accesses
> or do you mean that paradoxical mem subregs are a valid method
> to express sign- and zero extension?

I mean the reasoning that a paradoxical subreg of a MEM necessarily
implies "widening a MEM access".  As I said, a paradoxical MEM
subreg (in the absence of LOAD_EXTEND_OP) is supposed to have exactly
the same meaning as loading that MEM (in its original mode) into a
register, and then accessing a paradoxical subreg of that register.

> > After combine has finished the whole simplifcation process it
> > will re-recognize the insn.  If a target doesn't want to accept
> > this particular construct, it should be rejected *here*, if it
> > is still present, not earlier.
> 
> Well, the widening of a mem access must be stopped at the point
> where the widening, the change to a real access happens; it
> can't be done later, because then we no longer know that it's
> widened. :-)

Well, as I see it, simply forming a paradoxical MEM subreg
constitutes neither a widening nor any other change to the
'real access' to memory, so nothing needs to be stopped.

If some *later* pass thinks to convert a paradoxical MEM subreg
into an actual widened MEM access, then *that* pass needs to verify
whether this is OK.  (E.g. as reload does when it converts paradoxical 
subregs of spilled pseudos to widened MEMs.)

> >  (Alternatively, it might be
> > accepted, and later on cleaned up by reload -- there is code in
> > reload to do so, have you investigated why it didn't fix your
> > problem?)
> 
> This suggestion suggests that you did not look at the whole of
> the analysis.  This was a secondary patch, for a bug discovered
> through the *other* bug.

I see I had misinterpreted the somewhat lengthy discussion; I
had thought that after the gen_lowpart_for_combine patch (which I
agree is correct) was applied, you were still seeing a bug.

> The whole point of *this* patch is to
> stop a paradoxical mem subreg from being *widened*.  If the
> s390x port relies on that happening, I suggest that's incorrect.

The only thing s390 relies upon is that combine is able to
combine a mem->reg load with a reg->reg zero-extend (expressed
using AND) into a mem->reg zero-extend (expressed as ZERO_EXTEND).

> I don't see your argument why actually widening is ok.  Perhaps
> I missed the forest for the trees?

Actually widening the MEM is *not* OK.  B.t.w. this is also what
gen_lowpart_for_combine did wrong before you patch: it would
replace a mem:HI with a mem:SI in your example.  This is *wrong*,
even if it had computed the address properly.  After your 
patch it should now return a (subreg:SI (mem:HI)), which is
just fine in my opinion.  This construct can then be either
rejected in recog, or else fixed up by reload.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  Linux on zSeries Development
  Ulrich.Weigand@de.ibm.com


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]