This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.

[Bug tree-optimization/51499] vectorizer missing simple case

From: "fb.programming at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
To: gcc-bugs at gcc dot gnu dot org
Date: Sun, 11 Dec 2011 08:33:40 +0000
Subject: [Bug tree-optimization/51499] vectorizer missing simple case
Auto-submitted: auto-generated
References: <bug-51499-4@http.gcc.gnu.org/bugzilla/>

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51499

--- Comment #2 from fb.programming at gmail dot com 2011-12-11 08:33:40 UTC ---
(In reply to comment #1)

g++-4.6.2 -S -Wall -O3 -ftree-vectorize -ftree-vectorizer-verbose=2 \
          -ffast-math  -fno-vect-cost-model

gives me exactly the same assembly code as above (which surprises me
a bit, since -funsafe-math-optimizations could just as well have
eliminated the loop completely).

The optimal assembly, however, I would expect to be something like:

.L3:
    addq    $1, %rax
    addpd    %xmm0, %xmm3
    cmpq    %rdi, %rax
    addpd    %xmm0, %xmm2
    addpd    %xmm0, %xmm1
    jne    .L3

Here the vector (sum1,sum2) is held in xmm1, (sum3,sum4) in xmm2,
etc., and (a,a) in xmm0. This speeds the loop up by a factor of 2 and
is completely equivalent to the scalar case, because each lane performs
exactly the same sequence of additions as its scalar counterpart. So I
don't see why -ffast-math (which implies -funsafe-math-optimizations)
should be necessary in this case, either.
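
To illustrate the equivalence claim, here is a minimal C++ sketch. The
original test case is not quoted in this message, so the loop shape (six
independent accumulators, each adding the same value a) is only an
assumption inferred from the three addpd instructions above. Vec2,
add_pd, scalar_sums, packed_sums and results_match are hypothetical
names; Vec2 merely models one xmm register holding two doubles, it is not
real SIMD code.

```cpp
#include <cstddef>

// Hypothetical stand-in for one xmm register holding two doubles.
struct Vec2 {
    double lo;
    double hi;
};

// Lane-wise addition, modelling what a single addpd does.
inline Vec2 add_pd(Vec2 x, Vec2 y) {
    return Vec2{x.lo + y.lo, x.hi + y.hi};
}

// Assumed scalar shape of the loop: six independent accumulators,
// each incremented by `a` on every iteration.
void scalar_sums(double a, std::size_t n, double sums[6]) {
    for (std::size_t i = 0; i < n; ++i) {
        for (int k = 0; k < 6; ++k) {
            sums[k] += a;
        }
    }
}

// Packed shape: accumulators paired as (sum1,sum2), (sum3,sum4),
// (sum5,sum6), with (a,a) playing the role of xmm0.
void packed_sums(double a, std::size_t n, double sums[6]) {
    const Vec2 va{a, a};
    Vec2 v1{sums[0], sums[1]};
    Vec2 v2{sums[2], sums[3]};
    Vec2 v3{sums[4], sums[5]};
    for (std::size_t i = 0; i < n; ++i) {
        v1 = add_pd(v1, va);
        v2 = add_pd(v2, va);
        v3 = add_pd(v3, va);
    }
    sums[0] = v1.lo; sums[1] = v1.hi;
    sums[2] = v2.lo; sums[3] = v2.hi;
    sums[4] = v3.lo; sums[5] = v3.hi;
}

// Returns true when both versions agree exactly in every accumulator.
bool results_match(double a, std::size_t n) {
    double s[6] = {0.0, 1.0, 2.0, 3.0, 4.0, 5.0};
    double p[6] = {0.0, 1.0, 2.0, 3.0, 4.0, 5.0};
    scalar_sums(a, n, s);
    packed_sums(a, n, p);
    for (int k = 0; k < 6; ++k) {
        if (s[k] != p[k]) {
            return false;
        }
    }
    return true;
}
```

No addition is reordered or reassociated here: every accumulator sees the
same operands in the same order in both versions. That is why the packed
form should not, in principle, depend on -funsafe-math-optimizations.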

References:
- [Bug tree-optimization/51499] New: vectorizer missing simple case
  - From: fb.programming at gmail dot com
