This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] [x86] Avoid builtins for SSE/AVX2 immidiate logical shifts
- From: Allan Sandfeld Jensen <linux at carewolf dot com>
- To: gcc-patches at gcc dot gnu dot org, Jakub Jelinek <jakub at redhat dot com>
- Cc: Uros Bizjak <ubizjak at gmail dot com>
- Date: Mon, 24 Apr 2017 10:02:40 +0200
- Subject: Re: [PATCH] [x86] Avoid builtins for SSE/AVX2 immidiate logical shifts
- References: <201704221338.46300.linux@carewolf.com> <201704240951.29932.linux@carewolf.com> <20170424075627.GI1809@tucnak>
On Monday 24 April 2017, Jakub Jelinek wrote:
> On Mon, Apr 24, 2017 at 09:51:29AM +0200, Allan Sandfeld Jensen wrote:
> > On Monday 24 April 2017, Jakub Jelinek wrote:
> > > On Mon, Apr 24, 2017 at 09:33:09AM +0200, Allan Sandfeld Jensen wrote:
> > > > --- a/gcc/config/i386/avx2intrin.h
> > > > +++ b/gcc/config/i386/avx2intrin.h
> > > > @@ -667,7 +667,7 @@ extern __inline __m256i
> > > >  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> > > >  _mm256_slli_epi16 (__m256i __A, int __B)
> > > >  {
> > > > -  return (__m256i)__builtin_ia32_psllwi256 ((__v16hi)__A, __B);
> > > > +  return ((__B & 0xff) < 16) ? (__m256i)((__v16hi)__A << (__B & 0xff)) : _mm256_setzero_si256();
> > > >  }
> > >
> > > What is the advantage of doing that when you replace one operation with
> > > several (&, <, ?:, <<)?
> > > I'd say instead we should fold the builtins if in the gimple fold
> > > target hook we see the shift count constant and can decide based on
> > > that. Or we could use __builtin_constant_p (__B) to decide whether to
> > > use the generic vector shifts or builtin, but that means larger IL.
> >
> > The advantage is that in this builtin, the __B is always a literal (or
> > constexpr), so the if statement is resolved at compile time.
>
> Do we really want to support all the thousands _mm* intrinsics in constexpr
> contexts? People can just use generic vectors instead.
>
I would love to support that, but first we would need a C extension attribute
matching constexpr, and I consider that a separate issue.
> That said, both the options I've mentioned above provide the same
> advantages and don't have the disadvantages of pessimizing normal code.
>
What pessimizing? This produces the same or better code for all legal
arguments. The only difference, besides the better generated code, is that it
allows the intrinsics to be used incorrectly with non-literal arguments,
because we lack a C extension for constexpr to prevent that.
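
To make the two variants concrete, here is a minimal sketch using GCC's
generic vector extensions on a 128-bit vector of 16-bit lanes (the SSE-sized
analogue of the quoted AVX2 hunk). The names v8hu, slli_epi16_generic,
slli_epi16_fallback and slli_epi16_dispatch are hypothetical and do not
appear in the intrinsic headers, and the scalar loop is only an assumed
stand-in for the target builtin. It is an illustration, not the patch itself.

```c
/* Sketch only. v8hu and the slli_epi16_* names are hypothetical, not
   identifiers from GCC's intrinsic headers.  */
typedef unsigned short v8hu __attribute__ ((vector_size (16)));

/* Generic-vector form, mirroring the expression in the quoted patch.  */
static inline v8hu
slli_epi16_generic (v8hu a, int count)
{
  const v8hu zero = { 0 };
  int n = count & 0xff;   /* only the low 8 bits are examined */

  /* A count of 16 or more must clear every lane.  The plain vector shift
     is not guaranteed to yield zero for such counts, hence the test.  */
  return n < 16 ? a << n : zero;
}

/* Lane-by-lane loop standing in for the target builtin (assumption).  */
static inline v8hu
slli_epi16_fallback (v8hu a, int count)
{
  int n = count & 0xff;

  for (int i = 0; i < 8; i++)
    a[i] = n < 16 ? (unsigned short) (a[i] << n) : 0;
  return a;
}

/* Choose with __builtin_constant_p: generic shift when the compiler can
   see a constant count, the fallback otherwise.  */
static inline v8hu
slli_epi16_dispatch (v8hu a, int count)
{
  if (__builtin_constant_p (count))
    return slli_epi16_generic (a, count);
  return slli_epi16_fallback (a, count);
}
```

With a literal count the comparison in slli_epi16_generic folds away after
inlining, which is the point made above; the dispatch variant gives the same
results either way and differs only in what the compiler has to carry
around until the count is known.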
Allan