This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: [PATCH] [x86] Avoid builtins for SSE/AVX2 immidiate logical shifts

From: Jakub Jelinek <jakub at redhat dot com>
To: Allan Sandfeld Jensen <linux at carewolf dot com>
Cc: Uros Bizjak <ubizjak at gmail dot com>, gcc-patches at gcc dot gnu dot org
Date: Mon, 24 Apr 2017 09:56:27 +0200
Subject: Re: [PATCH] [x86] Avoid builtins for SSE/AVX2 immidiate logical shifts
References: <201704221338.46300.linux@carewolf.com> <201704240933.09704.linux@carewolf.com> <20170424074349.GG1809@tucnak> <201704240951.29932.linux@carewolf.com>
Reply-to: Jakub Jelinek <jakub at redhat dot com>

On Mon, Apr 24, 2017 at 09:51:29AM +0200, Allan Sandfeld Jensen wrote:
> On Monday 24 April 2017, Jakub Jelinek wrote:
> > On Mon, Apr 24, 2017 at 09:33:09AM +0200, Allan Sandfeld Jensen wrote:
> > > --- a/gcc/config/i386/avx2intrin.h
> > > +++ b/gcc/config/i386/avx2intrin.h
> > > @@ -667,7 +667,7 @@ extern __inline __m256i
> > >  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> > >  _mm256_slli_epi16 (__m256i __A, int __B)
> > >  {
> > > -  return (__m256i)__builtin_ia32_psllwi256 ((__v16hi)__A, __B);
> > > +  return ((__B & 0xff) < 16) ? (__m256i)((__v16hi)__A << (__B & 0xff)) : _mm256_setzero_si256();
> > >  }
> > 
> > What is the advantage of doing that when you replace one operation with
> > several (&, <, ?:, <<)?
> > I'd say instead we should fold the builtins if in the gimple fold target
> > hook we see the shift count constant and can decide based on that.
> > Or we could use __builtin_constant_p (__B) to decide whether to use
> > the generic vector shifts or builtin, but that means larger IL.
> 
> The advantage is that in this builtin, the __B is always a literal (or 
> constexpr), so the if statement is resolved at compile time.

Do we really want to support all the thousands of _mm* intrinsics in constexpr
contexts?  People can just use generic vectors instead.

That said, both the options I've mentioned above provide the same advantages
and don't have the disadvantages of pessimizing normal code.
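
For illustration only, the __builtin_constant_p option mentioned above might
look roughly like the sketch below. It is not the header code from the patch
or from GCC: the names v8hu_sketch, builtin_psllwi_sketch and
slli_epi16_sketch are hypothetical, a plain C loop stands in for the real
target builtin, and a 128-bit generic vector is used so that it compiles
without -mavx2. The (count & 0xff) masking and the "count of 16 or more gives
zero" rule are taken from the quoted patch, not verified against the builtin.

```c
/* Sketch only: every *_sketch name is hypothetical, not part of GCC.  */

/* Eight unsigned 16-bit lanes as a GCC generic vector (128 bits).  */
typedef unsigned short v8hu_sketch __attribute__ ((vector_size (16)));

/* Stand-in for the target builtin.  Assumed semantics, following the
   quoted patch: the count is masked to 8 bits and any count above 15
   zeroes every lane.  */
static v8hu_sketch
builtin_psllwi_sketch (v8hu_sketch a, int count)
{
  v8hu_sketch r = { 0 };
  count &= 0xff;
  if (count > 15)
    return r;
  for (int i = 0; i < 8; i++)
    r[i] = (unsigned short) (a[i] << count);
  return r;
}

/* Use the generic vector shift only when the count is known at compile
   time and in range; otherwise fall back to the (stand-in) builtin.  */
static inline __attribute__ ((__always_inline__)) v8hu_sketch
slli_epi16_sketch (v8hu_sketch a, int b)
{
  if (__builtin_constant_p (b) && (b & 0xff) < 16)
    return a << (b & 0xff);
  return builtin_psllwi_sketch (a, b);
}
```

Both paths are meant to produce the same lanes, so the result does not depend
on whether the compiler proves the count constant (at -O0 it typically does
not, and the fallback is taken). The extra branch is the "larger IL" cost
noted earlier in the thread.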

	Jakub
