[PATCH] [x86] Avoid builtins for SSE/AVX2 immediate logical shifts
Allan Sandfeld Jensen
allan@carewolf.com
Mon Apr 24 08:54:00 GMT 2017
On Monday 24 April 2017, Allan Sandfeld Jensen wrote:
> On Monday 24 April 2017, Jakub Jelinek wrote:
> > On Mon, Apr 24, 2017 at 10:02:40AM +0200, Allan Sandfeld Jensen wrote:
> > > > That said, both the options I've mentioned above provide the same
> > > > advantages and don't have the disadvantages of pessimizing normal
> > > > code.
> > >
> > > What pessimizing? This produces the same or better code for all legal
> > > arguments. The only difference besides better generated code is that it
> > > allows
> >
> > No. Have you really tried that?
> >
> > > the intrinsics to be used incorrectly with non-literal arguments
> > > because we lack a C extension for constexpr to prevent that.
> >
> > Consider e.g. -O2 -mavx2 -mtune=intel:
> > #include <x86intrin.h>
> >
> > __m256i
> > foo (__m256i x, int s)
> > {
> >
> > return (__m256i)__builtin_ia32_psllwi256 ((__v16hi)x, s);
> >
> > }
> >
> > __m256i
> > bar (__m256i x, int s)
> > {
> >
> > return ((s & 0xff) < 16) ? (__m256i)((__v16hi)x << (s & 0xff)) :
> > _mm256_setzero_si256 ();
> > }
> >
> > The first one generates
> >
> > movl %edi, %edi
> > vmovq %rdi, %xmm1
> > vpsllw %xmm1, %ymm0, %ymm0
> > ret
> >
> > (because that is actually what the instruction does), the second one
>
> That is a different instruction. That is vpsllw, not vpsllwi.
>
> The intrinsics I changed are the immediate versions; I didn't change the
> non-immediate versions. It is probably a bug if you can give non-immediate
> values to the immediate-only intrinsics. At least both versions handle it,
> if in different ways, but these are illegal arguments.
>
Though now that I think about it, this means my change to the existing
sse-psslw-1.c test and friends is wrong, because they use variable input.
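
To make the semantics under discussion concrete, here is a minimal sketch of
the range-checked generic vector shift, modelled on the bar function quoted
above but written with the GCC/Clang generic vector extension on eight 16-bit
lanes so that it does not depend on x86intrin.h. The names v8hu,
shift_left_lanes, shift_left_scalar and lanes_agree are hypothetical and only
for illustration; this is not the code from the patch.

```c
#include <stdint.h>

/* Eight unsigned 16-bit lanes (GCC/Clang generic vector extension).  */
typedef uint16_t v8hu __attribute__ ((vector_size (16)));

/* Hypothetical helper: shift every lane left by COUNT.  Only the low
   8 bits of COUNT are used, mirroring the (s & 0xff) in the quoted bar
   function.  A count of 16 or more yields an all-zero vector.  */
static v8hu
shift_left_lanes (v8hu x, unsigned int count)
{
  count &= 0xff;
  if (count >= 16)
    return (v8hu) { 0, 0, 0, 0, 0, 0, 0, 0 };

  /* Use an explicit per-lane count vector for the element-wise shift.  */
  uint16_t c = (uint16_t) count;
  v8hu counts = { c, c, c, c, c, c, c, c };
  return x << counts;
}

/* Scalar reference model for a single lane.  */
static uint16_t
shift_left_scalar (uint16_t lane, unsigned int count)
{
  count &= 0xff;
  if (count >= 16)
    return 0;
  return (uint16_t) ((unsigned int) lane << count);
}

/* Returns 1 when every lane of the vector result agrees with the scalar
   model for COUNT, 0 otherwise.  The sample values are arbitrary.  */
static int
lanes_agree (unsigned int count)
{
  v8hu x = { 1, 2, 3, 0x8000, 0xffff, 0x1234, 0, 7 };
  v8hu r = shift_left_lanes (x, count);

  for (int i = 0; i < 8; i++)
    if (r[i] != shift_left_scalar (x[i], count))
      return 0;
  return 1;
}
```

With a literal count the range check folds away at compile time, which is the
case the immediate intrinsics are meant for. With a variable count the check
stays in the generated code, which is the situation the variable-input tests
run into.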
`Allan