This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug c/56083] Vectorizer uses xor/movlps/movhps rather than movups


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56083

Uros Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID

--- Comment #1 from Uros Bizjak <ubizjak at gmail dot com> 2013-01-23 16:05:40 UTC ---
(In reply to comment #0)
> Unnecessarily complex machine code is generated on x86-64. Perhaps there is a
> reason for this but to me it seems like the compiler is failing to optimize
> properly. Asm code labels changed and comments added, other than that they are
> produced by the respective compilers for this C code:

This is a tuning decision; use -march= for targets that benefit from unaligned
loads and stores:

  /* X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL */
  m_COREI7 | m_AMDFAM10 | m_BDVER | m_BTVER,

  /* X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL */
  m_COREI7 | m_BDVER,

-O3 -march=corei7 produces:

        movups  (%rdi), %xmm0
        xorps   .LC0(%rip), %xmm0
        movups  %xmm0, (%rdi)

This is the same as your hand-optimized code.
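
For readers without comment #0 at hand: the reporter's C code is not reproduced in this message, so the following is only a hypothetical stand-in, not the original test case. It assumes a function that flips the sign of four floats in place, which is the kind of loop a vectorizer may turn into a load, an xorps against a sign-bit mask, and a store. The names negate4 and negate4_selfcheck are made up for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for the reporter's function: negate four
   floats in place.  The pointer carries no alignment guarantee, so
   the compiler has to choose how to perform a possibly unaligned
   16-byte load and store. */
void negate4(float *v)
{
    for (int i = 0; i < 4; i++)
        v[i] = -v[i];
}

/* Small self-check so the behaviour can be verified independently
   of whichever instruction sequence the compiler picks. */
void negate4_selfcheck(void)
{
    float v[4] = { 1.0f, -2.0f, 3.5f, 8.0f };

    negate4(v);

    assert(v[0] == -1.0f);
    assert(v[1] == 2.0f);
    assert(v[2] == -3.5f);
    assert(v[3] == -8.0f);
}
```

Compiling with something like "gcc -O3 -S negate4.c" and again with "gcc -O3 -march=corei7 -S negate4.c", then comparing the two .s files, should show whether the tuning flags above change the load/store sequence on a given GCC version. The exact output may differ between releases.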

