[Bug target/34163] [4.3/4.4/4.5 Regression] 10% performance regression since Nov 1 on Polyhedron's "NF" on AMD64

Fri Jul 3 11:06:00 GMT 2009

------- Comment #19 from rguenth at gcc dot gnu dot org  2009-07-03 11:05 -------
In fact, in this case we have the C equivalent of

  int i;
  long j = (long)(i - 1);

vs.

  long j = (long)i - 1;

which I believe are equivalent if overflow is undefined (or i - 1 does not
wrap).

It is just that fold obviously considers (long)i - 1 to be more expensive
than (long)(i - 1) and thus does not transform the latter into the former.
It cannot do the reverse, transforming (long)i - 1 into (long)(i - 1),
because the fact that (long)i - 1 does not overflow gives no guarantee
that i - 1 does not overflow either.
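
To make the asymmetry concrete, here is a minimal C sketch. The function
names are made up for illustration, and it assumes an LP64 target such as
AMD64, where long is wider than int. The two forms agree whenever i - 1
does not overflow in int; for i == INT_MIN only the widened form is
defined, and its result does not fit in an int.

```c
#include <limits.h>

/* Narrow form: subtract in int, then widen.
   Undefined behaviour for i == INT_MIN, so never call it with that value. */
long widen_after_sub(int i)
{
    return (long)(i - 1);
}

/* Wide form: widen first, then subtract in long.
   Defined for every int when long is wider than int (LP64 assumption). */
long widen_before_sub(int i)
{
    return (long)i - 1;
}

/* Returns 1 if both forms agree on samples for which i - 1 cannot overflow. */
int forms_agree_on_samples(void)
{
    const int samples[] = { INT_MIN + 1, -1, 0, 1, 42, INT_MAX };
    for (unsigned k = 0; k < sizeof samples / sizeof samples[0]; k++) {
        if (widen_after_sub(samples[k]) != widen_before_sub(samples[k]))
            return 0;
    }
    return 1;
}

/* Returns 1 if the wide form applied to INT_MIN yields a value below the
   int range, i.e. a value the narrow form could never produce.
   Returns 0 on targets where long is not wider than int. */
int wide_form_exceeds_int_at_min(void)
{
#if LONG_MAX > INT_MAX
    return widen_before_sub(INT_MIN) < (long)INT_MIN;
#else
    return 0;
#endif
}
```

So going from the narrow form to the wide form is safe under undefined
overflow, while the opposite direction would introduce an overflow that
the original expression did not have.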

We should be able to do the former transformation, (long)(i - 1) to
(long)i - 1, during SCEV analysis though.
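
As a rough illustration of why this matters for loops (a hypothetical
example, not the NF source and not what the patch does internally):
when the index is widened first, it is a plain affine function of the
loop counter, so the address advances by a constant stride. The loop
bounds below keep i - 1 from overflowing, which is the kind of fact a
loop-level analysis can use and fold cannot.

```c
/* Index computed in int and then widened, as in (long)(i - 1).
   i runs from 1 to n, so i - 1 never overflows here. */
float sum_narrow_index(const float *a, int n)
{
    float s = 0.0f;
    for (int i = 1; i <= n; i++)
        s += a[(long)(i - 1)];
    return s;
}

/* Index widened first, as in (long)i - 1: the subscript is an affine
   function of the counter, stepping by one element per iteration. */
float sum_wide_index(const float *a, int n)
{
    float s = 0.0f;
    for (long i = 1; i <= n; i++)
        s += a[i - 1];
    return s;
}
```

Both functions read the same elements in the same order, so they return
the same sum for any valid a and n.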

I have a patch which, with -O3 -ffast-math -funroll-loops, results in the
following loop:

.L6:
        mulss   (%rcx), %xmm0
        movss   (%rdx), %xmm5
        movss   4(%rdx), %xmm4
        addl    $4, %ebp
        subss   %xmm0, %xmm5
        movss   8(%rdx), %xmm0
        mulss   (%rsi), %xmm5
        movss   %xmm5, (%rdx)
        mulss   4(%rcx), %xmm5
        subss   %xmm5, %xmm4
        mulss   4(%rsi), %xmm4
        movss   %xmm4, 4(%rdx)
        movss   8(%rcx), %xmm3
        mulss   %xmm4, %xmm3
        subss   %xmm3, %xmm0
        mulss   8(%rsi), %xmm0
        movss   %xmm0, 8(%rdx)
        movss   12(%rcx), %xmm2
        addq    $16, %rcx
        mulss   %xmm0, %xmm2
        movss   12(%rdx), %xmm0
        subss   %xmm2, %xmm0
        mulss   12(%rsi), %xmm0
        addq    $16, %rsi
        movss   %xmm0, 12(%rdx)
        addq    $16, %rdx
        cmpl    %r8d, %ebp
        jne     .L6

-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |rguenth at gcc dot gnu dot
                   |dot org                     |org
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2008-04-21 07:11:35         |2009-07-03 11:05:43
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34163