This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[patch] i386.c: Fix PR rtl-optimization/17931
- From: Kazu Hirata <kazu at cs dot umass dot edu>
- To: gcc-patches at gcc dot gnu dot org
- Date: Tue, 12 Oct 2004 10:10:21 -0400 (EDT)
- Subject: [patch] i386.c: Fix PR rtl-optimization/17931
Hi,
Attached is a patch to fix PR rtl-optimization/17931, a regression
relative to all recent GCC releases.
Consider:
struct flags {
  unsigned f0 : 1;
  unsigned f1 : 1;
};

_Bool
foo (struct flags *p)
{
  if (p->f0)
    return 1;

  return p->f1;
}
Without the patch, I get
foo:
        movl    4(%esp), %eax
        movb    (%eax), %dl
        movb    %dl, %al
        andl    $1, %eax        <- Notice!
        testb   %al, %al        <- Notice!
        jne     .L7
        xorl    %eax, %eax
        testb   $2, %dl
        setne   %al
        ret
        .p2align 2,,3
.L7:
        movl    $1, %eax
        ret
Notice that "andl" is immediately followed by "testb". These two
instructions could be combined into "testb $1, %al". It turns out that
the combiner does try to form "testb $1, %al", expressed as
(set (reg:CCZ 17 flags)
     (compare:CCZ (zero_extract:SI (subreg:SI (reg:QI 63) 0)
                                   (const_int 1 [0x1])
                                   (const_int 0 [0x0]))
                  (const_int 0 [0x0])))
but combine_validate_cost rejects this insn because the cost of the
combined testb appears too high. The problem is that ix86_rtx_costs
does not handle a COMPARE with a ZERO_EXTRACT in it.
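For readers less familiar with the RTL above, the following C sketch models what the combined pattern computes. It is an illustration only, not GCC code: the function names are hypothetical, and it assumes bit position 0 is the least significant bit (as on i386) and a 32-bit operand. It shows that comparing an extracted bit-field against zero gives the same answer as a test with a mask, which is why a single test instruction can implement the pattern.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low `size` bits set.  Assumes 0 < size <= 32.  */
uint32_t field_mask(unsigned size)
{
    return (size == 32) ? UINT32_MAX : ((UINT32_C(1) << size) - 1);
}

/* Model of (zero_extract:SI op size pos): take `size` bits of `op`
   starting at bit `pos`.  Assumes size + pos <= 32.  */
uint32_t zero_extract(uint32_t op, unsigned size, unsigned pos)
{
    return (op >> pos) & field_mask(size);
}

/* Model of the combined insn: compare the extracted field with zero.  */
bool field_is_nonzero(uint32_t op, unsigned size, unsigned pos)
{
    return zero_extract(op, size, pos) != 0;
}

/* Model of "test $mask, op": AND with a mask in place, no shift.  */
bool test_with_mask(uint32_t op, unsigned size, unsigned pos)
{
    return (op & (field_mask(size) << pos)) != 0;
}
```

With size 1 and position 0, as in the RTL above, test_with_mask reduces to checking `op & 1`, which corresponds to "testb $1, %al".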
The patch fixes this problem by handling the above COMPARE construct
in the same way as AND.
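The effect of the missing case can be illustrated with a toy cost model in C. This is a sketch under stated assumptions, not GCC code: the types, the function names, and every cost number are invented. It assumes that an expression with no special handling costs one insn plus the cost of its operands, that an unhandled ZERO_EXTRACT costs two insns, and that a combination is rejected when it costs more than the insns it replaces.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy expression codes, only loosely modeled on RTL codes.  */
enum toy_code { TOY_REG, TOY_CONST, TOY_AND, TOY_ZERO_EXTRACT, TOY_COMPARE };

struct toy_expr {
    enum toy_code code;
    const struct toy_expr *op[3];   /* unused operand slots are NULL */
};

/* Invented numbers: one simple insn costs 4 units, and a ZERO_EXTRACT
   with no special handling is assumed to cost two insns.  */
enum { TOY_INSN = 4, TOY_EXTRACT = 2 * TOY_INSN };

int toy_cost(const struct toy_expr *x, bool handle_compare)
{
    int total;

    if (x == NULL || x->code == TOY_REG || x->code == TOY_CONST)
        return 0;

    /* Special case: a COMPARE of a ZERO_EXTRACT is costed like an AND,
       i.e. one insn plus the cost of the operand being tested.  */
    if (handle_compare
        && x->code == TOY_COMPARE
        && x->op[0] != NULL
        && x->op[0]->code == TOY_ZERO_EXTRACT)
        return TOY_INSN + toy_cost(x->op[0]->op[0], handle_compare);

    /* Generic path: cost of this node plus the cost of every operand.  */
    total = (x->code == TOY_ZERO_EXTRACT) ? TOY_EXTRACT : TOY_INSN;
    for (size_t i = 0; i < 3; i++)
        total += toy_cost(x->op[i], handle_compare);
    return total;
}

/* Assumed acceptance rule: reject a combination that costs more than
   the insns it replaces.  */
bool toy_combine_rejects(int old_cost, int new_cost)
{
    return new_cost > old_cost;
}

/* Cost of the original pair: an AND insn followed by a COMPARE insn.  */
int toy_original_cost(void)
{
    const struct toy_expr reg = { TOY_REG, { NULL, NULL, NULL } };
    const struct toy_expr one = { TOY_CONST, { NULL, NULL, NULL } };
    const struct toy_expr zero = { TOY_CONST, { NULL, NULL, NULL } };
    const struct toy_expr and_insn = { TOY_AND, { &reg, &one, NULL } };
    const struct toy_expr test_insn = { TOY_COMPARE, { &reg, &zero, NULL } };

    return toy_cost(&and_insn, false) + toy_cost(&test_insn, false);
}

/* Cost of the combined COMPARE-of-ZERO_EXTRACT, with or without the
   special case enabled.  */
int toy_combined_cost(bool handle_compare)
{
    const struct toy_expr reg = { TOY_REG, { NULL, NULL, NULL } };
    const struct toy_expr one = { TOY_CONST, { NULL, NULL, NULL } };
    const struct toy_expr zero = { TOY_CONST, { NULL, NULL, NULL } };
    const struct toy_expr extract = { TOY_ZERO_EXTRACT, { &reg, &one, &zero } };
    const struct toy_expr combined = { TOY_COMPARE, { &extract, &zero, NULL } };

    return toy_cost(&combined, handle_compare);
}
```

In this toy model the original pair costs 8 units. Without the special case the combined form costs 12 and is rejected; with it the combined form costs 4 and is accepted. The real costs in GCC differ; only the shape of the comparison is meant to match the description above.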
Tested on i686-pc-linux-gnu. OK to apply?
p.s.
It may be very interesting to see what combine_validate_cost rejects.
We may be missing a few good optimization opportunities there.
Kazu Hirata
2004-10-12  Kazu Hirata  <kazu@cs.umass.edu>

        * config/i386/i386.c (ix86_rtx_costs): Handle COMPARE with
        ZERO_EXTRACT in it.

Index: config/i386/i386.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/i386/i386.c,v
retrieving revision 1.734
diff -u -r1.734 i386.c
--- config/i386/i386.c 1 Oct 2004 07:43:00 -0000 1.734
+++ config/i386/i386.c 11 Oct 2004 16:23:58 -0000
@@ -14326,6 +14326,21 @@
        *total = COSTS_N_INSNS (ix86_cost->add);
       return false;
 
+    case COMPARE:
+      if (GET_CODE (XEXP (x, 0)) == ZERO_EXTRACT
+          && XEXP (XEXP (x, 0), 1) == const1_rtx
+          && GET_CODE (XEXP (XEXP (x, 0), 2)) == CONST_INT
+          && XEXP (x, 1) == const0_rtx)
+        {
+          /* This kind of construct is implemented using test[bwl].
+             Treat it as if we had an AND.  */
+          *total = (COSTS_N_INSNS (ix86_cost->add)
+                    + rtx_cost (XEXP (XEXP (x, 0), 0), outer_code)
+                    + rtx_cost (const1_rtx, outer_code));
+          return true;
+        }
+      return false;
+
     case FLOAT_EXTEND:
       if (!TARGET_SSE_MATH || !VALID_SSE_REG_MODE (mode))
         *total = 0;