This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


[RFC] - Regression exposed by recent change to compress_float_constant


The following patch has exposed an optimization shortcoming:

2005-07-12 Dale Johannesen <dalej@apple.com>

        * expr.c (compress_float_constant):  Add cost check.
        * config/rs6000.c (rs6000_rtx_cost):  Adjust FLOAT_EXTEND cost.

This patch causes worse code to be generated for the following test case:

1) Test case:

struct S {
        float d1, d2, d3;
};

S ms()
{
        struct S s = {0,0,0};
        return s;
}

With -O1 -mdynamic-no-pic -march=pentium4 -mtune=prescott, gcc now generates:

        pxor    %xmm0, %xmm0
        movsd   %xmm0, (%eax)
        ...


Instead of:

        movl    $0, (%eax)
        movl    $0, 4(%eax)
        ....
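
Both sequences write the same bytes, because under IEEE 754 (which x86 uses) positive zero is the all-zero bit pattern. The following is a minimal sketch of that equivalence, not part of the original report; `float_bits` and `zero_struct_is_all_zero_bytes` are hypothetical helper names, and the code assumes a 32-bit `float` and a `struct S` without padding.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct S {
    float d1, d2, d3;
};

/* Return the raw representation of a float.
   Assumption: float is 32 bits wide, as on x86. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Return nonzero if a zero-initialized struct S has the same memory
   image as one filled with integer zero bytes, which is what the
   movl $0 sequence stores. */
static int zero_struct_is_all_zero_bytes(void)
{
    struct S s = {0, 0, 0};
    struct S z;
    memset(&z, 0, sizeof z);
    return memcmp(&s, &z, sizeof s) == 0;
}
```

Note that this holds only for +0.0; negative zero has the sign bit set, so it could not be stored with `movl $0`.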

This is because the change to compress_float_constant now produces an RTL pattern that cse cannot optimize.

Before the above patch, compress_float_constant generated:

(insn 12 7 13 0 (set (reg:SF 59)
        (mem/u/i:SF (symbol_ref/u:SI ("*LC0") [flags 0x2]) [0 S4 A32])) -1 (nil)
    (nil))


(insn 13 12 14 0 (set (mem/s/j:DF (reg/f:SI 58 [ D.1929 ]) [0 <result>.d1+0 S8 A32])
        (float_extend:DF (reg:SF 59))) -1 (nil)
    (nil))


cse was then able to constant-propagate the double 0.0 into the store, resulting in the following pattern:

(insn 13 7 15 0 (set (mem/s/j:DF (reg/f:SI 58 [ D.1929 ]) [0 <result>.d1+0 S8 A32])
        (const_double:DF 0.0 [0x0.0p+0])) 64 {*movdf_nointeger} (nil)
    (nil))
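
The SF load followed by float_extend:DF is only a valid replacement for a DF constant when the value survives narrowing unchanged, which 0.0 does. As a rough illustration in plain C (GCC itself works on its internal real-number representation, so this is not its implementation), a hypothetical `fits_in_float` check could look like this:

```c
#include <assert.h>

/* Return nonzero if the double d can be narrowed to float and widened
   back without changing its value. Hypothetical helper for
   illustration only. The volatile store forces the value to be
   rounded to float precision even on targets that keep intermediates
   in wider registers. Values outside the range of float are not
   handled here. */
static int fits_in_float(double d)
{
    volatile float narrowed = (float) d;
    return (double) narrowed == d;
}
```

Under this check 0.0 and 0.5 qualify, while 0.1 does not, because its double value has no exact single-precision equivalent.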


With the latest gcc (which includes the above patch), compress_float_constant's new cost computation rejects the float_extend:DF form, so cse is faced with this pattern instead:


(insn 12 11 13 0 s.C:7 (set (reg:DF 59)
        (mem/u/i:DF (symbol_ref/u:SI ("*LC0") [flags 0x2]) [0 S8 A64])) -1 (nil)
    (nil))


(insn 13 12 14 0 s.C:7 (set (mem/s/j:DF (reg/f:SI 58 [ D.1929 ]) [0 <result>.d1+0 S8 A32])
        (reg:DF 59)) -1 (nil)
    (nil))


As soon as cse sees a REG node as the source, it gives up.

What is the right way to restore this optimization?

1) Can cse be taught to constant-propagate when the source is a REG rtx? Why was this never attempted before? Fixing cse seems to fix both the double float case and the single float case (which is not affected by the above patch).

2) For the double float case (as illustrated by the above test case), can we tweak compress_float_constant to skip the cost computation on the RHS when the LHS is a store to memory? This fixes the performance regression and caused no new regressions. The attached patch is what I tried.

3) Any other approach?
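
To make option 2 concrete, here is a small standalone model of just the cost decision, before and after the proposed change. It is a sketch only: the function names and the integer costs are hypothetical, and the predicate checks that surround the cost test in compress_float_constant are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Current behavior: use the narrowed constant plus float_extend only
   when it is not more expensive than the original constant load. */
static bool use_narrow_constant_current(int oldcost, int newcost)
{
    return !(oldcost < newcost);
}

/* Proposed behavior (option 2): when the destination is memory, skip
   the cost check so that the float_extend form is still emitted and
   cse can fold it into a constant store. */
static bool use_narrow_constant_proposed(int oldcost, int newcost,
                                         bool dest_is_mem)
{
    if (dest_is_mem)
        return true;
    return !(oldcost < newcost);
}
```

With a hypothetical oldcost of 1 and newcost of 2, the current logic rejects the narrowed form in all cases, while the proposed logic still accepts it for a store to memory and rejects it for a register destination.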

- Thanks, fariborz

Index: expr.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/expr.c,v
retrieving revision 1.778.4.5
diff -c -p -r1.778.4.5 expr.c
*** expr.c 13 Jul 2005 01:07:47 -0000 1.778.4.5
--- expr.c 10 Aug 2005 18:55:49 -0000
*************** compress_float_constant (rtx x, rtx y)
*** 3187,3196 ****
the extension. */
if (! (*insn_data[ic].operand[1].predicate) (trunc_y, srcmode))
continue;
! /* This is valid, but may not be cheaper than the original. */
! newcost = rtx_cost (gen_rtx_FLOAT_EXTEND (dstmode, trunc_y), SET);
! if (oldcost < newcost)
! continue;
}
else if (float_extend_from_mem[dstmode][srcmode])
{
--- 3187,3199 ----
the extension. */
if (! (*insn_data[ic].operand[1].predicate) (trunc_y, srcmode))
continue;
! if (!MEM_P (x))
! {
! /* This is valid, but may not be cheaper than the original. */
! newcost = rtx_cost (gen_rtx_FLOAT_EXTEND (dstmode, trunc_y), SET);
! if (oldcost < newcost)
! continue;
! }
}
else if (float_extend_from_mem[dstmode][srcmode])
{







