This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH] [Bug middle-end/33187] Missed cmove opportunity
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Mon, 27 Aug 2007 22:20:01 +0200
- Subject: [PATCH] [Bug middle-end/33187] Missed cmove opportunity
Hello!
The attached patch teaches the combine pass to handle the load + float_extend
sequences produced by the compress_float_constant optimization. Combine
tried to fully simplify the float_extend of a constant load, but failed because
a direct constant load is usually not supported. The unsimplified load +
float_extend sequences then blocked the cmove ifcvt conversion.
In the following testcase (-O2 -march=i686 -ffast-math):
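For background, compress_float_constant can only store a wider FP constant in a narrower mode when the narrowing is exact. A minimal sketch of that condition in plain C follows; it is not GCC's implementation, and the helper name fits_in_float is hypothetical:

```c
/* Hypothetical helper, not part of GCC: returns 1 if the double
   constant C survives a round trip through float, i.e. it could be
   stored in single precision and extended on load without changing
   its value.  */
int
fits_in_float (double c)
{
  /* volatile forces a real store, so x87 excess precision cannot
     hide the rounding.  */
  volatile float narrowed = (float) c;
  return (double) narrowed == c;
}
```

Both 1.0 and -1.0 pass this check, which is why the constants in the testcase below end up as SFmode loads followed by float_extend.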
--cut here--
double sgn (double __x)
{
return __x >= 0.0 ? 1.0 : -1.0;
}
--cut here--
combine tried to combine:
(insn 11 10 12 3 cmov7.c:14 (set (reg:SF 63)
        (mem/u/c/i:SF (symbol_ref/u:SI ("*.LC1") [flags 0x2]) [3 S4 A32])) 97 {*movsf_1} (expr_list:REG_EQUAL (const_double:SF -1.0e+0 [-0x0.8p+1])
        (nil)))
(insn 12 11 36 3 cmov7.c:14 (set (reg:DF 58 [ D.1658 ])
        (float_extend:DF (reg:SF 63))) 134 {*extendsfdf2_i387} (expr_list:REG_DEAD (reg:SF 63)
        (expr_list:REG_EQUAL (const_double:DF -1.0e+0 [-0x0.8p+1])
            (nil))))
directly into:
Failed to match this instruction:
(set (reg:DF 58 [ D.1658 ])
(const_double:DF -1.0e+0 [-0x0.8p+1]))
With the attached patch, combine generates:
Successfully matched this instruction:
(set (reg:DF 61)
        (float_extend:DF (mem/u/c/i:SF (symbol_ref/u:SI ("*.LC0") [flags 0x2]) [3 S4 A32])))
This insn is further simplified in the post-reload split pass into:
(insn 51 50 41 2 cmov7.c:14 (set (reg:DF 9 st(1) [67])
(const_double:DF 1.0e+0 [0x0.8p+1])) 101 {*movdf_nointeger} (nil))
and generates the expected fld1 x87 asm insn. In the ifcvt pass that runs
after combine, however, this patch enables generation of the fcmov insn, producing:
sgn:
fldl 4(%esp)
fldz
fcomip %st(1), %st
fstp %st(0)
fld1
fchs
fld1
fcmovnbe %st(1), %st
fstp %st(1)
ret
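
To make the branchless sequence easier to follow, here is a rough C model of what the asm above computes. The function name sgn_select and the local variable names are hypothetical; the comments map each step to the insns shown:

```c
/* Hypothetical C model of the x87 sequence above: both candidate
   results are materialized unconditionally, then one is selected
   without a branch.  */
double
sgn_select (double x)
{
  int zero_above_x = 0.0 > x;       /* fldz; fcomip %st(1), %st */
  double neg = -1.0;                /* fld1; fchs */
  double pos = 1.0;                 /* fld1 */
  return zero_above_x ? neg : pos;  /* fcmovnbe %st(1), %st */
}

/* The testcase as written in the source.  */
double
sgn (double __x)
{
  return __x >= 0.0 ? 1.0 : -1.0;
}
```

The two functions agree for every non-NaN input. They differ for NaN, which -ffast-math allows the compiler to ignore.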
In further testing, this patch reduced the size of povray-3.6.1 by 2728
bytes, and the test execution time was 0.38% shorter (using -O3
-mfpmath=387 -ffast-math on x86_64). The number of fcmov insns rose from
1583 to 1679.
2007-08-27 Uros Bizjak <ubizjak@gmail.com>
PR middle-end/33187
* combine.c (subst): Do not try to simplify X if it represents load
of FP constant from the constant pool via float extension.
testsuite/ChangeLog:
2007-08-27 Uros Bizjak <ubizjak@gmail.com>
PR middle-end/33187
* gcc.target/i386/cmov7.c: New file.
The patch was bootstrapped and regression tested on x86_64 with and without
-m32. OK for mainline?
Uros.
Index: testsuite/gcc.target/i386/cmov7.c
===================================================================
--- testsuite/gcc.target/i386/cmov7.c (revision 0)
+++ testsuite/gcc.target/i386/cmov7.c (revision 0)
@@ -0,0 +1,15 @@
+/* PR middle-end/33187 */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=k8 -ffast-math -mfpmath=387" } */
+/* { dg-final { scan-assembler "fcmov" } } */
+
+/* compress_float_constant generates load + float_extend
+ sequence which combine pass failed to combine into
+ (set (reg:DF) (float_extend:DF (mem:SF (symbol_ref...)))). */
+
+double
+sgn (double __x)
+{
+ return __x >= 0.0 ? 1.0 : -1.0;
+}
Index: combine.c
===================================================================
--- combine.c (revision 127835)
+++ combine.c (working copy)
@@ -4478,6 +4478,18 @@ subst (rtx x, rtx from, rtx to, int in_d
}
}
+ /* Check if we are loading something from the constant pool via float
+ extension; in this case we would undo compress_float_constant
+ optimization and degenerate constant load to an immediate value. */
+ if (GET_CODE (x) == FLOAT_EXTEND
+ && MEM_P (XEXP (x, 0))
+ && MEM_READONLY_P (XEXP (x, 0)))
+ {
+ rtx tmp = avoid_constant_pool_reference (x);
+ if (x != tmp)
+ return x;
+ }
+
/* Try to simplify X. If the simplification changed the code, it is likely
that further simplification will help, so loop, but limit the number
of repetitions that will be performed. */
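
The guard added to subst in the combine.c hunk above can be illustrated with a toy model. Everything below is a hypothetical stand-in written for illustration: the toy_ types and functions are not GCC's RTL definitions, and the pool lookup only mimics the effect that avoid_constant_pool_reference has in this one case.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for GCC's RTL.  */
enum toy_code { TOY_REG, TOY_MEM, TOY_FLOAT_EXTEND, TOY_CONST_DOUBLE };

struct toy_rtx
{
  enum toy_code code;
  struct toy_rtx *op0;         /* operand of TOY_FLOAT_EXTEND */
  int readonly;                /* TOY_MEM: models MEM_READONLY_P */
  struct toy_rtx *pool_value;  /* TOY_MEM: constant-pool entry, or NULL */
  double value;                /* TOY_CONST_DOUBLE */
};

/* Toy counterpart of the constant-pool lookup: returns the pooled
   constant when X is a float extension of a constant-pool load,
   otherwise X itself.  */
static struct toy_rtx *
toy_avoid_pool_reference (struct toy_rtx *x)
{
  if (x->code == TOY_FLOAT_EXTEND
      && x->op0 != NULL
      && x->op0->code == TOY_MEM
      && x->op0->pool_value != NULL)
    return x->op0->pool_value;
  return x;
}

/* Models the new guard: returns 1 when X should be returned unchanged
   instead of being simplified down to an immediate constant.  */
int
toy_keep_float_extend_load (struct toy_rtx *x)
{
  if (x->code == TOY_FLOAT_EXTEND
      && x->op0 != NULL
      && x->op0->code == TOY_MEM
      && x->op0->readonly)
    return toy_avoid_pool_reference (x) != x;
  return 0;
}
```

In this model, a float extension of a read-only constant-pool load is kept as is, while a float extension of a register falls through to the normal simplification path.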