This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[RFC PATCH] Fix 15187
- From: Uros Bizjak <uros at kss-loka dot si>
- To: gcc-patches at gcc dot gnu dot org
- Date: Thu, 16 Sep 2004 14:55:46 +0200
- Subject: [RFC PATCH] Fix 15187
Hello!
There is a problem with code like:
double test(double x) {
if (x > 0.0)
return cos(x);
else
return sin(x);
}
This code used to compile to quite bad asm code, as it is shown in
PR15187 [0]. A fix was introduced [1], which fixed !TARGET_CMOV, however
targets with cmov insn (i686) still generate bad code.
From above example, gcc -O2 -ffast-math -march=i686 produces following
code:
test:
pushl %ebp
movl %esp, %ebp
fldl 8(%ebp)
fldz
fld %st(1)
fld %st(2)
fxch %st(1)
fcos
fxch %st(3)
popl %ebp
fcomip %st(2), %st
fstp %st(1)
fsin
fcmovnbe %st(1), %st
fstp %st(1)
ret
There is a good chance that this is the most un-optimized piece of code
I have ever seen. Both, sin() and cos() are calculated, and then cmove
picks one of them. The problem is in noce_try_cmove_arith() in ifcvt.c
source. This function is triggered when 'true' and 'false' branch
consists of only one insn. If this instruction is only moving a register
or constant around, everything is OK. However, with -ffast-math on i686
builtins like sin(), cos(), sqrt(), etc... also count as one
instruction, even when they can last considerably more than hundred
clock cycles.
Attached patch fixes this problem by checking if it is faster to
calculate both branches or use normal conditional jump. The same trick
as [1] was used, and above example now becomes:
test:
pushl %ebp
movl %esp, %ebp
fldl 8(%ebp)
fldz
fcomip %st(1), %st
jae .L2
fsin
popl %ebp
ret
.L2:
fcos
popl %ebp
ret
When both branches are smaller than cost of BRANCH_COST, like:
test (double x)
{
if (x > 0.0)
return 10.0;
else
return 1.0;
}
cmov sequence is still produced:
test:
pushl %ebp
movl %esp, %ebp
fldl 8(%ebp)
fldz
fcomip %st(1), %st
fstp %st(0)
flds .LC2
fld1
fcmovb %st(1), %st
fstp %st(1)
popl %ebp
ret
The patch was bootstrapped on i686 for c,c++, no regressions.
BTW: I think that cond_exec_process_if_block() function in ifcvt.c
should be changed from using count_bb_insns() to use total_bb_rtx_cost()
for the same reason that find_if_case_1() and find_if_case_2() were
changed. Unfortunatelly, ix86 is !HAVE_conditional_execution, so I can't
fix it.
2004-09-16 Uros Bizjak <uros@kss-loka.si>
* ifcvt.c (noce_try_cmove_arith): Exit early if total
insn_rtx_cost of both branches > BRANCH_COST
[0] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15187
[1] http://gcc.gnu.org/ml/gcc-patches/2004-07/msg00654.html
Uros.
Index: ifcvt.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/ifcvt.c,v
retrieving revision 1.164
diff -u -p -r1.164 ifcvt.c
--- ifcvt.c 29 Aug 2004 22:10:42 -0000 1.164
+++ ifcvt.c 16 Sep 2004 12:09:47 -0000
@@ -1214,6 +1214,7 @@ noce_try_cmove_arith (struct noce_if_inf
rtx insn_a, insn_b;
rtx tmp, target;
int is_mem = 0;
+ int insn_cost;
enum rtx_code code;
/* A conditional move from two memory sources is equivalent to a
@@ -1247,6 +1248,25 @@ noce_try_cmove_arith (struct noce_if_inf
insn_a = if_info->insn_a;
insn_b = if_info->insn_b;
+ /* Total insn_rtx_cost should be smaller than branch cost. Exit
+ if insn_rtx_cost can't be estimated. */
+ if (insn_a)
+ {
+ insn_cost = insn_rtx_cost (PATTERN (insn_a));
+ if (insn_cost == 0 || insn_cost > COSTS_N_INSNS (BRANCH_COST))
+ return FALSE;
+ }
+ else
+ {
+ insn_cost = 0;
+ }
+
+ if (insn_b) {
+ insn_cost += insn_rtx_cost (PATTERN (insn_b));
+ if (insn_cost == 0 || insn_cost > COSTS_N_INSNS (BRANCH_COST))
+ return FALSE;
+ }
+
/* Possibly rearrange operands to make things come out more natural. */
if (reversed_comparison_code (if_info->cond, if_info->jump) != UNKNOWN)
{