[Bug target/28946] [4.0/4.1/4.2 Regression] assembler shifts set the flag ZF, no need to re-test to zero

uros at kss-loka dot si gcc-bugzilla@gcc.gnu.org
Tue Sep 5 09:36:00 GMT 2006

------- Comment #5 from uros at kss-loka dot si  2006-09-05 09:35 -------
The problem here is the following:

We already have a pattern (*lshrsi3_cmp) that would match the combined
instruction in the above testcase. However, the combiner rejects the combined
instruction because the register that holds the shifted result is unused!
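
For reference, a testcase of this shape can be reconstructed from the assembly
at the end of this comment. This is only a sketch: the exact source attached to
the PR may differ, the function name shift_and_branch is hypothetical, and the
last_called marker and the bodies of fct1/fct2 are assumptions added so that
the example is self-contained.

```c
#include <assert.h>

/* Hypothetical marker recording which callee ran (not part of the PR). */
static int last_called = 0;

/* Stand-ins for the external functions named in the assembly output. */
void fct1(void) { last_called = 1; }
void fct2(void) { last_called = 2; }

/* Only the zero/non-zero status of the shifted value is used; the value
   itself is dead after the comparison.  On x86 a shift by a non-zero
   count sets ZF according to its result, so no separate test against
   zero should be needed before the branch.  */
void shift_and_branch(unsigned int a)
{
  if (a >> 5)
    fct1();
  else
    fct2();
}
```

The shifted value is never stored or reused here, which is exactly the case
where the register holding the result is unused.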

The problematic part is in combine.c, around line 2236 (please read the
comment, which describes exactly the situation we have here). This part of the
code is activated only when the register that holds the result of the
arithmetic operation is kept alive. This is quite strange: even if the result
is unused, the resulting code will still be smaller, as we avoid an extra
CC-setting instruction.

The patch below (currently under testing, but so far OK) forces generation of
the combined instruction even if the arithmetic result is unused.

Index: combine.c
--- combine.c   (revision 116691)
+++ combine.c   (working copy)
@@ -2244,7 +2244,7 @@
      needed, and make the PARALLEL by just replacing I2DEST in I3SRC with
      I2SRC.  Later we will make the PARALLEL that contains I2.  */

-  if (i1 == 0 && added_sets_2 && GET_CODE (PATTERN (i3)) == SET
+  if (i1 == 0 && GET_CODE (PATTERN (i3)) == SET
       && GET_CODE (SET_SRC (PATTERN (i3))) == COMPARE
       && XEXP (SET_SRC (PATTERN (i3)), 1) == const0_rtx
       && rtx_equal_p (XEXP (SET_SRC (PATTERN (i3)), 0), i2dest))
@@ -2254,6 +2254,13 @@
       enum machine_mode compare_mode;

+      /* To force generation of the combined comparison and arithmetic
+        operation PARALLEL, pretend that the set in I2 is to be used,
+        even if it is dead after I2. This results in better generated
+        code, as only CC setting arithmetic instruction will be
+        emitted in conditionals.  */
+      added_sets_2 = 1;
       newpat = PATTERN (i3);
       SUBST (XEXP (SET_SRC (newpat), 0), i2src);

Compiling the testcase with this patch results in the following code:

        movl 4(%esp), %eax
        shrl $5, %eax
        je  .L2
        jmp fct1
        .p2align 4,,7
.L2:
        jmp fct2


