This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH] [Bug middle-end/33187] Missed cmove opportunity


Hello!

The attached patch teaches the combine pass to handle load + float_extend sequences produced by the compress_float_constant optimization. Combine tried to fully simplify the float_extend of a constant load into a direct constant load, but the match failed because a direct FP constant load is usually not supported. The resulting non-simplified load + float_extend sequences blocked the cmove conversion in ifcvt.
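As background, compress_float_constant can keep a double constant at float width when no precision is lost and widen it on load, which is why the constants in the testcase below show up as SFmode constant-pool loads followed by float_extend. The exactness condition can be sketched in plain C roughly as follows. The helper name fits_in_float is made up for this illustration; the real code works on RTL constants and also performs target checks that this sketch ignores.

```c
#include <stdbool.h>

/* Hypothetical helper, not GCC code: returns true when the double
   constant D survives a round trip through float unchanged, i.e. when
   it could be stored as a 4-byte float and widened on load.
   Assumes D is finite and within the range of float.  */
bool
fits_in_float (double d)
{
  /* volatile forces the narrowed value to be rounded to float
     precision even on targets that evaluate in wider registers.  */
  volatile float narrowed = (float) d;

  return (double) narrowed == d;
}
```

Under this sketch, 1.0 and -1.0 qualify, while a value such as 0.1 does not, because it changes when rounded to float.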

In the following testcase (-O2 -march=i686 -ffast-math):

--cut here--
double sgn (double __x)
{
 return __x >= 0.0 ? 1.0 : -1.0;
}
--cut here--

combine tried to combine:

(insn 11 10 12 3 cmov7.c:14 (set (reg:SF 63)
        (mem/u/c/i:SF (symbol_ref/u:SI ("*.LC1") [flags 0x2]) [3 S4 A32])) 97 {*movsf_1}
     (expr_list:REG_EQUAL (const_double:SF -1.0e+0 [-0x0.8p+1])
        (nil)))


(insn 12 11 36 3 cmov7.c:14 (set (reg:DF 58 [ D.1658 ])
        (float_extend:DF (reg:SF 63))) 134 {*extendsfdf2_i387}
     (expr_list:REG_DEAD (reg:SF 63)
        (expr_list:REG_EQUAL (const_double:DF -1.0e+0 [-0x0.8p+1])
            (nil))))


directly into:

Failed to match this instruction:
(set (reg:DF 58 [ D.1658 ])
   (const_double:DF -1.0e+0 [-0x0.8p+1]))

With the attached patch, combine generates:

Successfully matched this instruction:
(set (reg:DF 61)
(float_extend:DF (mem/u/c/i:SF (symbol_ref/u:SI ("*.LC0") [flags 0x2]) [3 S4 A32])))


This insn is further simplified in the post-reload split pass into:

(insn 51 50 41 2 cmov7.c:14 (set (reg:DF 9 st(1) [67])
       (const_double:DF 1.0e+0 [0x0.8p+1])) 101 {*movdf_nointeger} (nil))

which yields the expected fld1 x87 insn. In addition, in the ifcvt pass that runs after combine, the patch enables generation of an fcmov insn, producing:

sgn:
       fldl    4(%esp)
       fldz
       fcomip  %st(1), %st
       fstp    %st(0)
       fld1
       fchs
       fld1
       fcmovnbe        %st(1), %st
       fstp    %st(1)
       ret
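For readers less familiar with the x87 stack, the data flow of the sequence above can be modelled in C roughly as follows. This is only an illustrative sketch, not compiler output: the function name sgn_select is made up here, the `if` stands in for the conditional move (a compiler may or may not turn it into fcmov), and NaN inputs are not considered because the testcase is built with -ffast-math.

```c
/* Hypothetical C model of the fcmov sequence; not compiler output.  */
double
sgn_select (double x)
{
  /* Both candidate results are materialized unconditionally.  */
  double minus_one = -1.0;   /* fld1; fchs */
  double result = 1.0;       /* fld1 */

  /* fcomip compared 0.0 against x earlier; fcmovnbe then copies
     the other register when 0.0 is above x.  */
  if (0.0 > x)
    result = minus_one;

  return result;
}
```

The point of the model is that the result is chosen by a data move between two already loaded values rather than by a jump around one of the loads.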

In further testing, this patch reduced the size of povray-3.6.1 by 2728 bytes, and the test execution time was 0.38% shorter (using -O3 -mfpmath=387 -ffast-math on x86_64). The number of fcmov insns rose from 1583 to 1679.

2007-08-27 Uros Bizjak <ubizjak@gmail.com>

       PR middle-end/33187
       * combine.c (subst): Do not try to simplify X if it represents load
       of FP constant from the constant pool via float extension.

testsuite/ChangeLog:

2007-08-27 Uros Bizjak <ubizjak@gmail.com>

       PR middle-end/33187
       * gcc.target/i386/cmov7.c: New file.

The patch was bootstrapped and regression tested on x86_64, with and without -m32. OK for mainline?

Uros.
Index: testsuite/gcc.target/i386/cmov7.c
===================================================================
--- testsuite/gcc.target/i386/cmov7.c	(revision 0)
+++ testsuite/gcc.target/i386/cmov7.c	(revision 0)
@@ -0,0 +1,15 @@
+/* PR middle-end/33187 */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=k8 -ffast-math -mfpmath=387" } */
+/* { dg-final { scan-assembler "fcmov" } } */
+
+/* compress_float_constant generates load + float_extend
+   sequence which combine pass failed to combine into
+   (set (reg:DF) (float_extend:DF (mem:SF (symbol_ref...)))).  */
+
+double
+sgn (double __x)
+{
+  return __x >= 0.0 ? 1.0 : -1.0;
+}
Index: combine.c
===================================================================
--- combine.c	(revision 127835)
+++ combine.c	(working copy)
@@ -4478,6 +4478,18 @@ subst (rtx x, rtx from, rtx to, int in_d
 	}
     }
 
+  /* Check if we are loading something from the constant pool via float
+     extension; in this case we would undo compress_float_constant
+     optimization and degenerate constant load to an immediate value.  */
+  if (GET_CODE (x) == FLOAT_EXTEND
+      && MEM_P (XEXP (x, 0))
+      && MEM_READONLY_P (XEXP (x, 0)))
+    {
+      rtx tmp = avoid_constant_pool_reference (x);
+      if (x != tmp)
+	return x;
+    }
+
   /* Try to simplify X.  If the simplification changed the code, it is likely
      that further simplification will help, so loop, but limit the number
      of repetitions that will be performed.  */
