This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/85572] New: faster code for absolute value of __v2di


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85572

            Bug ID: 85572
           Summary: faster code for absolute value of __v2di
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kretz at kde dot org
  Target Milestone: ---

GCC only generates optimized code for the absolute value of 64-bit integer
SSE vectors (`__v2di`) when AVX512VL is available. Test case (compile with
`-O2 -ffast-math` and one of `-mavx512vl`, `-msse4`, or `-msse2`):

#include <x86intrin.h>

__v2di abs(__v2di x) {
    return x < 0 ? -x : x;
}

With SSE4 I suggest:

abs(long long __vector(2)):
  pxor %xmm1, %xmm1
  pcmpgtq %xmm0, %xmm1
  pxor %xmm1, %xmm0
  psubq %xmm1, %xmm0
  ret

in C++:
    auto neg = reinterpret_cast<__v2di>(x < 0);
    return (x ^ neg) - neg;
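
The mask identity behind this sequence can be checked in isolation. The following is a minimal sketch, assuming the GCC/Clang vector extensions; `v2di` and `abs_mask` are hypothetical names standing in for `__v2di` and the suggested function, and a C-style cast is used instead of `reinterpret_cast` only so the snippet is accepted by both compilers. Whether a given compiler turns it into exactly the instructions listed above is not guaranteed.

```cpp
// Hypothetical stand-in for __v2di, declared with the GCC/Clang vector
// extension so the sketch does not depend on <x86intrin.h>.
typedef long long v2di __attribute__((vector_size(16)));

// A vector comparison yields -1 (all bits set) in lanes where it holds
// and 0 elsewhere. For a negative lane, x ^ -1 is ~x, and ~x - (-1) is
// ~x + 1, which equals -x. For a non-negative lane the mask is 0, so
// both the xor and the subtraction leave x unchanged.
v2di abs_mask(v2di x) {
    v2di zero = {0, 0};
    v2di neg = (v2di)(x < zero);  // 0 or -1 per lane
    return (x ^ neg) - neg;
}
```

Lanes holding the most negative 64-bit value have no representable absolute value, so the sketch should not be relied on for that input.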


Without SSE4:

abs(long long __vector(2)):
  movdqa %xmm0, %xmm2
  pxor %xmm1, %xmm1
  psrlq $63, %xmm2
  psubq %xmm2, %xmm1
  pxor %xmm1, %xmm0
  paddq %xmm2, %xmm0
  ret

in C++:
    auto neg = -reinterpret_cast<__v2di>(reinterpret_cast<__v2du>(x) >> 63);
    return (x ^ neg) - neg;
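
The shift-based mask can be sketched the same way. This assumes the GCC/Clang vector extensions; `v2di`, `v2du`, and `abs_shift` are hypothetical names standing in for `__v2di`, `__v2du`, and the suggested function, and the intermediate `sign` variable is introduced only to make the two steps visible.

```cpp
// Hypothetical stand-ins for __v2di and __v2du, declared with the
// GCC/Clang vector extension instead of including <x86intrin.h>.
typedef long long v2di __attribute__((vector_size(16)));
typedef unsigned long long v2du __attribute__((vector_size(16)));

v2di abs_shift(v2di x) {
    // A logical right shift on the unsigned view isolates the sign bit:
    // 1 for negative lanes, 0 for non-negative lanes.
    v2di sign = (v2di)((v2du)x >> 63);
    // Negating 0/1 gives the 0/-1 mask that a 64-bit comparison would
    // produce, without needing pcmpgtq.
    v2di neg = -sign;
    // Subtracting the mask adds 1 in negative lanes, which matches the
    // final paddq with the sign bit in the assembly listing.
    return (x ^ neg) - neg;
}
```

As with the comparison-based form, lanes holding the most negative 64-bit value are outside what this sketch covers.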


related issue for scalars: #67510
