This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: RFC: Combine of compare & and oddity


On 09/03/2015 07:18 AM, Segher Boessenkool wrote:
On Thu, Sep 03, 2015 at 12:43:34PM +0100, Wilco Dijkstra wrote:
Combine canonicalizes certain AND masks in a comparison with zero into extracts of the
widest register type. During matching these are expanded into a very inefficient sequence
that fails to match. For example (x & 2) == 0 is matched in combine like this:

Failed to match this instruction:
(set (reg:CC 66 cc)
     (compare:CC (zero_extract:DI (subreg:DI (reg/v:SI 76 [ xD.2641 ]) 0)
             (const_int 1 [0x1])
             (const_int 1 [0x1]))
         (const_int 0 [0])))

Yes.  Some processors even have specific instructions to do this.

However there are 2 issues with this, one is the spurious subreg,

Combine didn't make that up out of thin air; something already used
DImode here.  It could simplify it to SImode in this case, that is
true, don't know why it doesn't; it isn't necessarily faster code to
do so, it can be slower, it might not match, etc.
Right. It may also be the case that we're on a 64-bit target, but the underlying object is 32 bits and combine wanted to do things in word_mode.

But yes, there's a reason why the subreg is in there and there are times when the subregs get in the way of the hand-written pattern matching that occurs in combine.c and elsewhere.

So it's generally useful to squash away the subregs when we can. However, it's also the case that the subregs can't always be squashed away -- so it's also helpful to dig into the transformations in combine.c that you want to fire and figure out if and how that code can be extended to handle the embedded subregs.


(*) I think that is another issue in combine - when both alternatives match you
want to select the lowest cost one, not the first one that matches.

That's recog, not combine.  And quite a few backends rely on "first match
wins", because it always has been that way.  It also is very easy to write
such patterns accidentally (sometimes even the exact same one twice in the
same machine description, etc.)
Note that it's also been documented that first match wins for 20+ years.
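
To illustrate the difference between the two selection policies discussed above, here is a minimal C sketch. It is not how recog is implemented (the real matcher is generated from the machine description by genrecog); `struct pattern`, `first_match`, `cheapest_match`, the predicates and the costs are all hypothetical names and numbers made up for this example.

```c
#include <stddef.h>

/* Hypothetical stand-in for one machine-description pattern.  */
struct pattern
{
  const char *name;
  int cost;                       /* made-up cost, lower is better */
  int (*matches) (unsigned insn); /* nonzero if the pattern accepts INSN */
};

/* Toy predicates: INSN is just an AND mask here.  */
static int
match_any (unsigned insn)
{
  (void) insn;
  return 1;
}

static int
match_single_bit (unsigned insn)
{
  return insn != 0 && (insn & (insn - 1)) == 0;
}

/* Declaration order matters: the generic pattern comes first.  */
static const struct pattern example_patterns[] = {
  { "generic_and", 4, match_any },
  { "single_bit_test", 1, match_single_bit },
};

/* "First match wins": return the index of the first pattern, in
   declaration order, that accepts INSN, or -1 if none does.  */
static int
first_match (const struct pattern *pats, size_t n, unsigned insn)
{
  for (size_t i = 0; i < n; i++)
    if (pats[i].matches (insn))
      return (int) i;
  return -1;
}

/* The alternative raised in the thread: look at every pattern and
   return the cheapest one that accepts INSN, or -1 if none does.  */
static int
cheapest_match (const struct pattern *pats, size_t n, unsigned insn)
{
  int best = -1;
  for (size_t i = 0; i < n; i++)
    if (pats[i].matches (insn)
        && (best < 0 || pats[i].cost < pats[best].cost))
      best = (int) i;
  return best;
}
```

With this toy table, a single-bit mask such as 2 selects `generic_and` under the first policy and `single_bit_test` under the second, which is why a backend written with "first match wins" in mind could change behaviour if the policy changed.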




So my question is, is it combine's job to try all possible permutations that
constitute a bit or mask test?

Combine converts the merged instructions to what it thinks is the
canonical or cheapest form, and uses that.  It does not try multiple
options (the zero_ext* -> and+shift rewriting is not changing the
semantics of the pattern at all).
Right. Once combine finds something that works, it's done and moves on to the next set of insns to combine.



Neither matches the AArch64 patterns for ANDS/TST (which is just compare and AND). If the
immediate is not a power of 2 or a power of 2 -1 then it matches correctly as expected.
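
The mask condition described above can be written down as a small C check. This is only a sketch of the classification as stated in the message, not code taken from combine.c; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if M has exactly one bit set, e.g. 2, 8, 0x8000.  */
static bool
is_pow2 (uint64_t m)
{
  return m != 0 && (m & (m - 1)) == 0;
}

/* True if M is a contiguous run of low bits, e.g. 1, 3, 0xff.  */
static bool
is_pow2_minus1 (uint64_t m)
{
  return m != 0 && (m & (m + 1)) == 0;
}

/* Hypothetical predicate: masks that, per the description above, get
   turned into an extract instead of staying a plain AND.  */
static bool
mask_becomes_extract (uint64_t m)
{
  return is_pow2 (m) || is_pow2_minus1 (m);
}
```

Under these assumptions a mask such as 6 or 0x50 stays an AND and would be expected to match ANDS/TST, while 2 or 0xff falls into the problematic class.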

I don't understand how ((x >> 1) & 1) != 0 could be a useful expansion

It is zero_extract(x,1,1) really.  This is convenient for (old and embedded)
processors that have special bit-test instructions.  If we now want combine
to not do this, we'll have to update all backends that rely on it.
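
For readers following along, the equivalence of the forms mentioned above can be sketched in C. This is a model under stated assumptions, not GCC code: `zero_extract` here is a hypothetical helper that mimics RTL (zero_extract x size pos) with little-endian bit numbering, and the paradoxical subreg is modelled as a zero extension, which is harmless because only bit 1 is read.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: take LEN bits of X starting at bit POS and
   zero-extend them, i.e. (x >> pos) & ((1 << len) - 1).  */
static uint64_t
zero_extract (uint64_t x, unsigned len, unsigned pos)
{
  assert (len >= 1 && pos < 64 && len <= 64 - pos);
  uint64_t mask = len == 64 ? ~(uint64_t) 0 : (((uint64_t) 1 << len) - 1);
  return (x >> pos) & mask;
}

/* The source-level test from the thread: (x & 2) == 0.  */
static int
bit1_clear_and (uint32_t x)
{
  return (x & 2) == 0;
}

/* The shape combine produces: the SImode value viewed in DImode, then
   a one-bit extract at position 1 compared with zero.  */
static int
bit1_clear_extract (uint32_t x)
{
  return zero_extract ((uint64_t) x, 1, 1) == 0;
}
```

Both functions return the same result for every input; the difference is only in which shape a backend's patterns are written to recognize.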

Would any backend actually rely on this given it only does some specific masks,
has a redundant shift with 0 for the mask case and the odd subreg as well?

Such backends match the zero_extract patterns, of course.  Random example:
the h8300 patterns for the "btst" instruction.
PA, m68k and almost certainly others. I suspect it's fairly common in older ports.


Jeff

