Bug 18777 - Redundant loop count insns in simple vectorized loop
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 4.0.0
Importance: P2 normal
Target Milestone: 4.1.0
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on: 18557
Blocks:
Reported: 2004-12-02 08:47 UTC by Uroš Bizjak
Modified: 2005-07-23 22:49 UTC

See Also:
Host: i686-pc-linux-gnu
Target: i686-pc-linux-gnu
Build: i686-pc-linux-gnu
Known to work:
Known to fail:
Last reconfirmed:


Description Uroš Bizjak 2004-12-02 08:47:08 UTC
The modified testcase from PR18767 shows a problem where loop count
instructions still remain in the vectorized loop.
Compiling the modified testcase with 'g++ -O2 -march=pentium4 -ftree-vectorize'
produces the following code for the first loop:

        ...
        leal    -24(%ebp), %esi
        leal    -40(%ebp), %ebx
        leal    -56(%ebp), %ecx
        xorl    %eax, %eax
        xorl    %edx, %edx
.L2:
        addl    $1, %eax
        movaps  (%edx,%esi), %xmm0
        mulps   (%ebx,%edx), %xmm0
        movaps  %xmm0, (%edx,%ecx)
        addl    $16, %edx
        cmpl    $1, %eax
        jne     .L2
        ...

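It looks like the compiler does not figure out that the loop body runs exactly
once: %eax is already 1 when it is compared against 1, so the conditional jump
is never taken and the counter in %eax is redundant.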
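
For readers without the PR18767 testcase at hand, here is a rough sketch of the kind of source that would lead to a loop like the one above. It is NOT the actual testcase: the function name foobar_sketch, the array size N, and the vectorization factor VF are assumptions, chosen so that four floats fill exactly one 16-byte SSE register and the vectorized loop has a trip count of 1.

```cpp
#include <cstddef>

// Hypothetical reconstruction for illustration only; the real testcase
// from PR18767 is not reproduced in this report.
constexpr std::size_t N = 4;   // assumed: four floats = one 16-byte vector
constexpr std::size_t VF = 4;  // assumed vectorization factor (floats per movaps)

// After vectorization the first loop below iterates N / VF times.
constexpr std::size_t vector_trip_count = N / VF;
static_assert(vector_trip_count == 1,
              "a single vector iteration: the loop counter and branch are dead");

float foobar_sketch(const float (&a)[N], const float (&b)[N]) {
    float c[N];

    // First loop: element-wise multiply. With VF == 4 this is a single
    // movaps / mulps / movaps sequence once the counter is removed.
    for (std::size_t i = 0; i < N; ++i)
        c[i] = a[i] * b[i];

    // Second loop: scalar sum of the products.
    float sum = 0.0f;
    for (std::size_t i = 0; i < N; ++i)
        sum += c[i];
    return sum;
}
```

Since the trip count is a compile-time constant equal to 1, the increment, compare and jump in the listing above do no useful work.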

However, with 'g++ -O2 -march=pentium4 -ftree-vectorize -funroll-loops' the
generated code is a lot better:

        ...
        movaps  -24(%ebp), %xmm0
        mulps   -40(%ebp), %xmm0
        movaps  %xmm0, -56(%ebp)
        ...

Uros.
Comment 1 Andrew Pinski 2004-12-02 13:31:28 UTC
I think this is related to PR 18557.
Comment 2 Andrew Pinski 2005-06-12 03:19:15 UTC
Fixed on the mainline:
_Z6foobarv:
.LFB2:
        pushl   %ebp
.LCFI0:
        movl    %esp, %ebp
.LCFI1:
        subl    $56, %esp
.LCFI2:
        movaps  -40(%ebp), %xmm0
        mulps   -24(%ebp), %xmm0
        movaps  %xmm0, -56(%ebp)
        fldz
        fadds   -56(%ebp)
        fadds   -52(%ebp)
        fadds   -48(%ebp)
        fadds   -44(%ebp)
        leave
        ret