This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

avoid ambiguous/misleading paradoxical subreg in compares


This is an updated version of a patch that I first posted on May 28,
2000.  Except that this time it hit on frv, while compiling samba.
The reduced testcase is:

#include <stdlib.h>

static void __attribute__((__noinline__))
     foo (unsigned char *p) {
  *p = 0x81;
}

static void __attribute__((__noinline__))
     bar (int x) {
  asm ("");
}

int main() {
  unsigned char b;

  foo(&b);
  if (b & 0x80)
    {
      bar (b & 0x7f);
      exit (0);
    }
  else
    {
      bar (b & 1);
      abort ();
    }
}


The problem, again, is that combine assumes a paradoxical subreg is
equivalent to both zero- and sign-extensions, but it can only be
considered as such if we can actually perform the compare in the
narrower mode.  If we don't, then we need to preserve the accurate
extension, instead of the ambiguous paradoxical subreg that combine
attempts to introduce that leads to incorrect results.

More specifically, before combine, we have:

(insn 14 12 15 0 (set (reg:QI 171)
        (mem/f:QI (plus:SI (reg/f:SI 2 fp)
                (const_int -1 [0xffffffffffffffff])) [0 S1 A8])) 0 {*movqi_load} (nil)
    (nil))

(insn 15 14 16 0 (set (reg:SI 172)
        (sign_extend:SI (reg:QI 171))) 30 {extendqisi2} (insn_list 14 (nil))
    (nil))

(insn 16 15 63 0 (set (reg:CC 173)
        (compare:CC (reg:SI 172)
            (const_int 0 [0x0]))) 88 {cmpsi_cc} (insn_list 15 (nil))
    (expr_list:REG_DEAD (reg:SI 172)
        (nil)))

(insn 63 16 17 0 (set (reg:SI 180)
        (zero_extend:SI (reg:QI 171))) 33 {zero_extendqisi2} (nil)
    (expr_list:REG_DEAD (reg:QI 171)
        (nil)))

(jump_insn 17 63 56 0 (set (pc)
        (if_then_else:CC (ge:CC (reg:CC 173)
                (const_int 0 [0x0]))
            (label_ref 33)
            (pc))) 92 {*branch_signed_true} (insn_list 16 (nil))
    (expr_list:REG_DEAD (reg:CC 173)
        (expr_list:REG_BR_PROB (const_int 100 [0x64])
            (nil))))

reg 180 is used for the bitand operations in both branches of the if.
Combine changes the following two insns:

(note 15 14 16 0 NOTE_INSN_DELETED)

(insn 16 15 63 0 (set (reg:CC 173)
        (compare:CC (subreg:SI (reg:QI 171) 0)
            (const_int 0 [0x0]))) 88 {cmpsi_cc} (insn_list 14 (nil))
    (nil))

Note how it now uses a SImode compare with 0 to determine whether a
reg 171 is negative.  So far so good, since the load in insn 14 is
actually a sign-extend, as specified in frv.h.

Later on, regmove modifies the following insns:

(insn 14 12 15 0 (set (reg:SI 171)
        (zero_extend:SI (mem/f:QI (plus:SI (reg/f:SI 2 fp)
                    (const_int -1 [0xffffffffffffffff])) [0 S1 A8]))) 33 {zero_extendqisi2} (nil)
    (nil))

(insn 16 15 63 0 (set (reg:CC 173)
        (compare:CC (reg:SI 171)
            (const_int 0 [0x0]))) 88 {cmpsi_cc} (insn_list 14 (nil))
    (nil))

(insn 63 16 17 0 (set (reg:SI 180)
        (reg:SI 171)) 7 {*movsi_internal} (nil)
    (expr_list:REG_DEAD (reg:SI 171)
        (nil)))

This change is perfectly acceptable: reg 171 was used in one case for
a zero extension, and in another with a don't-care-extension (which is
one of the several meanings of paradoxical subregs).

If you'll remember that May 2000 posting, the mn10300 problem I had
back then occurred later on in reload, so there are several points in
which GCC does indeed make the assumption that paradoxical subregs are
don't-care in terms of extension.

Therefore, the only solution I see for the problem is still to
guarantee that we don't turn specific kinds of extends into
paradoxical subregs in combine, which was exactly the solution I
posted back then, and that fixed this problem too.  I don't see that
this change can possibly get us to generate worse code, since if there
is an instruction that performs the extension we need or a compare
instruction that performs the comparison in a narrower mode, we'll
still use them; it's only in the case in which we inappropriately
removed a necessary extension that the optimization would have made a
difference, but it's in these very cases that the optimization is
incorrect.

I should mention that the insns quoted above are from gcc 3.4.CVS as
of earlier today.  Mainline, surprisingly, generates correct code,
although the patch below actually applies to mainline.  The reason is
that mainline actually generates worse rtl initially, and then combine
doesn't get a chance to mis-optimize the compare.  Before combine, we
have something like this:

(insn 13 11 14 0 (set (reg:QI 181 [ b ])
        (mem/i:QI (plus:SI (reg/f:SI 2 fp)
                (const_int -1 [0xffffffffffffffff])) [0 b+0 S1 A8])) 0 {*movqi_load} (nil)
    (nil))

(insn 14 13 15 0 (set (reg:SI 178 [ b.0 ])
        (zero_extend:SI (reg:QI 181 [ b ]))) 34 {zero_extendqisi2} (insn_list:REG_DEP_TRUE 13 (nil))
    (expr_list:REG_DEAD (reg:QI 181 [ b ])
        (nil)))

(insn 15 14 16 0 (set (reg:SI 182 [ b.0 ])
        (sign_extend:SI (subreg/u:QI (reg:SI 178 [ b.0 ]) 3))) 31 {extendqisi2}
(insn_list:REG_DEP_TRUE 14 (nil))
    (nil))

(insn 16 15 17 0 (set (reg:CC_NZ 183)
        (compare:CC_NZ (reg:SI 182 [ b.0 ])
            (const_int 0 [0x0]))) 89 {*cmpsi_cc_nz} (insn_list:REG_DEP_TRUE 15 (nil))
    (expr_list:REG_DEAD (reg:SI 182 [ b.0 ])
        (nil)))

Note insn 15, in which we have a sign-extend from the subreg of the
pseudo holding the zero-extended mem, instead of a sign-extend from
the original pseudo.  Combine then only goes as far as this:

(note 13 11 14 0 NOTE_INSN_DELETED)

(insn 14 13 15 0 (set (reg:SI 178 [ b.0 ])
        (zero_extend:SI (mem/i:QI (plus:SI (reg/f:SI 2 fp)
                    (const_int -1 [0xffffffffffffffff])) [0 b+0 S1 A8]))) 34 {zero_extendqisi2} (nil)
    (nil))

(insn 15 14 16 0 (set (reg:SI 182 [ b.0 ])
        (sign_extend:SI (subreg/u:QI (reg:SI 178 [ b.0 ]) 3))) 31 {extendqisi2}
(insn_list:REG_DEP_TRUE 14 (nil))
    (nil))

(insn 16 15 17 0 (set (reg:CC_NZ 183)
        (compare:CC_NZ (reg:SI 182 [ b.0 ])
            (const_int 0 [0x0]))) 89 {*cmpsi_cc_nz} (insn_list:REG_DEP_TRUE 15 (nil))
    (expr_list:REG_DEAD (reg:SI 182 [ b.0 ])
        (nil)))

However, if combine was ever modified to see through the zero_extend
in insn 14 and substitute the loaded pseudo into the sign_extend
operand, it would fail just the same.  Mainline has just completed
bootstrapping on x86_64-linux-gnu with this patch.  Ok to install in
mainline?  Ok for 3.4 if it passes too?

Index: gcc/ChangeLog
from  Alexandre Oliva  <aoliva@redhat.com>

	* combine.c (simplify_comparison, case SIGN_EXTEND, ZERO_EXTEND):
	Do not drop the extend if we'd have to add a paradoxical subreg
	later.  Include optabs.h and insn-codes.h.
	* Makefile.in (combine.o): Depend on $(OPTABS_H).

Index: gcc/Makefile.in
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Makefile.in,v
retrieving revision 1.1435
diff -u -p -r1.1435 Makefile.in
--- gcc/Makefile.in 19 Dec 2004 04:42:07 -0000 1.1435
+++ gcc/Makefile.in 20 Dec 2004 21:39:09 -0000
@@ -2047,7 +2047,7 @@ et-forest.o : et-forest.c $(CONFIG_H) $(
 combine.o : combine.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
    $(FLAGS_H) function.h insn-config.h $(INSN_ATTR_H) $(REGS_H) $(EXPR_H) \
    rtlhooks-def.h $(BASIC_BLOCK_H) $(RECOG_H) real.h hard-reg-set.h \
-   toplev.h $(TM_P_H) $(TREE_H) $(TARGET_H) output.h
+   toplev.h $(TM_P_H) $(TREE_H) $(TARGET_H) output.h $(OPTABS_H)
 regclass.o : regclass.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
    hard-reg-set.h $(FLAGS_H) $(BASIC_BLOCK_H) $(REGS_H) insn-config.h \
    $(RECOG_H) reload.h real.h toplev.h function.h output.h $(GGC_H) \
Index: gcc/combine.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/combine.c,v
retrieving revision 1.463
diff -u -p -r1.463 combine.c
--- gcc/combine.c 15 Dec 2004 01:02:55 -0000 1.463
+++ gcc/combine.c 20 Dec 2004 21:39:17 -0000
@@ -90,6 +90,8 @@ Software Foundation, 59 Temple Place - S
 #include "real.h"
 #include "toplev.h"
 #include "target.h"
+#include "optabs.h"
+#include "insn-codes.h"
 #include "rtlhooks-def.h"
 /* Include output.h for dump_file.  */
 #include "output.h"
@@ -10009,16 +10011,22 @@ simplify_comparison (enum rtx_code code,
 	  break;
 
 	case SIGN_EXTEND:
-	  /* Can simplify (compare (zero/sign_extend FOO) CONST)
-	     to (compare FOO CONST) if CONST fits in FOO's mode and we
-	     are either testing inequality or have an unsigned comparison
-	     with ZERO_EXTEND or a signed comparison with SIGN_EXTEND.  */
-	  if (! unsigned_comparison_p
-	      && (GET_MODE_BITSIZE (GET_MODE (XEXP (op0, 0)))
-		  <= HOST_BITS_PER_WIDE_INT)
+	  /* Can simplify (compare (zero/sign_extend FOO) CONST) to
+	     (compare FOO CONST) if CONST fits in FOO's mode and we
+	     are either testing inequality or have an unsigned
+	     comparison with ZERO_EXTEND or a signed comparison with
+	     SIGN_EXTEND.  But don't do it if we don't have a compare
+	     insn of the given mode, since we'd have to revert it
+	     later on, and then we wouldn't know whether to sign- or
+	     zero-extend.  */
+	  mode = GET_MODE (XEXP (op0, 0));
+	  if (mode != VOIDmode && GET_MODE_CLASS (mode) == MODE_INT
+	      && ! unsigned_comparison_p
+	      && (GET_MODE_BITSIZE (mode) <= HOST_BITS_PER_WIDE_INT)
 	      && ((unsigned HOST_WIDE_INT) const_op
-		  < (((unsigned HOST_WIDE_INT) 1
-		      << (GET_MODE_BITSIZE (GET_MODE (XEXP (op0, 0))) - 1)))))
+		  < (((unsigned HOST_WIDE_INT) 1 
+		      << (GET_MODE_BITSIZE (mode) - 1))))
+	      && cmp_optab->handlers[(int) mode].insn_code != CODE_FOR_nothing)
 	    {
 	      op0 = XEXP (op0, 0);
 	      continue;
@@ -10067,11 +10075,12 @@ simplify_comparison (enum rtx_code code,
 	  /* ... fall through ...  */
 
 	case ZERO_EXTEND:
-	  if ((unsigned_comparison_p || equality_comparison_p)
-	      && (GET_MODE_BITSIZE (GET_MODE (XEXP (op0, 0)))
-		  <= HOST_BITS_PER_WIDE_INT)
-	      && ((unsigned HOST_WIDE_INT) const_op
-		  < GET_MODE_MASK (GET_MODE (XEXP (op0, 0)))))
+	  mode = GET_MODE (XEXP (op0, 0));
+	  if (mode != VOIDmode && GET_MODE_CLASS (mode) == MODE_INT
+	      && (unsigned_comparison_p || equality_comparison_p)
+	      && (GET_MODE_BITSIZE (mode) <= HOST_BITS_PER_WIDE_INT)
+	      && ((unsigned HOST_WIDE_INT) const_op < GET_MODE_MASK (mode))
+	      && cmp_optab->handlers[(int) mode].insn_code != CODE_FOR_nothing)
 	    {
 	      op0 = XEXP (op0, 0);
 	      continue;
-- 
Alexandre Oliva             http://www.ic.unicamp.br/~oliva/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]