This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/30829] extra register zero extends
- From: "dje at google dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 9 Nov 2007 00:57:45 -0000
- Subject: [Bug target/30829] extra register zero extends
- References: <bug-30829-5748@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #1 from dje at google dot com 2007-11-09 00:57 -------
I looked into what's going on here.

This is a problem in the i386.md lshr+zext combiner patterns (or a problem in
the combine pass, depending on one's point of view). There are patterns in
i386.md that are supposed to handle this case, but they're not being used.

lshrsi3_1_one_bit_zext doesn't handle the lshr(1) case. I don't understand why
the lshr is outside of the zero_extend in its definition. Looking at the dump
of the combine pass, I see it trying to match a zero_extract, and if I recode
lshrsi3_1_one_bit_zext to use a zero_extract then I get the expected code (i.e.,
the superfluous move is gone).
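
To make the one-bit case concrete, here is a minimal C sketch of why a
zero_extract can stand in for zero_extend(lshr(x, 1)). The exact RTL that
combine tried is not quoted above, so the field chosen here (31 bits starting
at bit 1) is an assumption, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Models zero_extend:DI of (lshr:SI x 1): shift in 32 bits, then widen. */
static uint64_t shift_then_extend(uint32_t x)
{
    return (uint64_t)(x >> 1);
}

/* Models a zero_extract-style field read: `width` bits starting at bit
   `pos`, zero-filled above. Assumes width < 64 and pos < 64. */
static uint64_t extract_bits(uint64_t value, unsigned width, unsigned pos)
{
    return (value >> pos) & ((UINT64_C(1) << width) - 1);
}

/* Returns 1 when both forms give the same 64-bit result for x. */
static int forms_agree(uint32_t x)
{
    return shift_then_extend(x) == extract_bits((uint64_t)x, 31, 1);
}

/* Spot-check a few boundary values. */
void check_one_bit_forms(void)
{
    assert(forms_agree(0u));
    assert(forms_agree(1u));
    assert(forms_agree(0x80000001u));
    assert(forms_agree(0xffffffffu));
}
```

Since the two forms compute the same value, a pattern written either way
describes the same instruction; which one matches depends only on the form
combine happens to produce.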

lshrsi3_1_zext doesn't handle the lshr(n) case. Intuitively it should match, but
combine first tries to "simplify" this case to (and:DI (subreg:DI (lshr ...) 0)
0xffffffff), and there's no pattern that matches this. If I add a pattern to
match this, it still doesn't work, because the rtx_cost of the two separate insns
lshr, zext (= 4 + 1) is less than the rtx_cost of the combined pattern (= 4 + 3
+ 4), so combine rejects the change because it thinks it's more expensive.
ix86_rtx_costs has code to treat zero_extend SI->DI as a cheap case, but
unfortunately it doesn't get used here because the zero_extend gets converted to
an AND (which isn't to say that's bad for the general case).
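
The two points above can be sketched in C. This is only an illustration under
stated assumptions: the function names are hypothetical, the undefined upper
half of the subreg is modeled as an arbitrary `garbage_high` value, and the
acceptance rule is a simplified stand-in for combine's cost check (reject when
the replacement costs more). The cost numbers are the ones quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Models zero_extend:DI of (lshr:SI x n). Assumes n < 32. */
static uint64_t zext_of_shift(uint32_t x, unsigned n)
{
    return (uint64_t)(x >> n);
}

/* Models (and:DI (subreg:DI (lshr ...) 0) 0xffffffff). The upper 32 bits
   of the widened value are treated as unknown, so they are filled with
   caller-supplied garbage before the AND masks them off. Assumes n < 32. */
static uint64_t and_of_subreg(uint32_t x, unsigned n, uint64_t garbage_high)
{
    uint64_t wide = (garbage_high << 32) | (uint64_t)(x >> n);
    return wide & UINT64_C(0xffffffff);
}

/* Simplified acceptance rule: keep the combined insn only if it is not
   more expensive than the insns it replaces. */
static int replacement_accepted(int old_cost, int new_cost)
{
    return new_cost <= old_cost;
}

/* The AND form computes the same value, yet is rejected on cost. */
void check_and_form(void)
{
    assert(and_of_subreg(0xf0000000u, 4, 0xdeadbeefu)
           == zext_of_shift(0xf0000000u, 4));
    assert(!replacement_accepted(4 + 1, 4 + 3 + 4));
}
```

With the quoted costs, the separate insns total 5 and the combined pattern
totals 11, so the replacement is rejected even though both forms produce the
same value.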

There are two other related patterns that I also don't understand:
lshrsi3_cmp_one_bit_zext and lshrsi3_cmp_zext. It's not clear to me that they
will ever match anything.

It's unfortunate that the intuitive pattern can't work here: for the one-bit
shift one needs to use a zero_extract, and for the N-bit shift one needs to
use AND and hack ix86_rtx_costs. Maybe if the "simplified" pattern doesn't
work, combine could try a "less simplified" pattern, or maybe combine could be
told not to simplify this particular case (either via target-dependent means or
by detecting the specific case of SI->DI). Maybe I'm missing something.
--
dje at google dot com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dje at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30829