This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [i386] Replace builtins with vector extensions
- From: Ulrich Drepper <drepper at gmail dot com>
- To: Marc Glisse <marc dot glisse at inria dot fr>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Sun, 29 Jun 2014 05:40:56 -0400
- Subject: Re: [i386] Replace builtins with vector extensions
On Sat, Jun 28, 2014 at 6:53 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> There is always a risk, but then even with builtins I think there was a
> small risk that an RTL optimization would mess things up. It is indeed
> higher if we expose the operation to the optimizers earlier, but it would be
> a bug if an "optimization" replaced a vector operation by something worse.
> Also, I am only proposing to handle the most trivial operations this way,
> not more complicated ones (like v[0]+=s) where we would be likely to fail
> generating the right instruction. And the pragma should ensure that the
> function will always be compiled in a mode where the vector instruction is
> available.
>
> ARM did the same and I don't think I have seen a bug reporting a regression
> about it (I haven't really looked though).
I think the ARM definitions come from a different angle. They are new,
so there are no assumed semantics. For the x86 intrinsics, Intel defines
that _mm_xxx() generates one of a given set of opcodes if there is a match.
If I want to generate a specific code sequence, I use the intrinsics.
Otherwise I could already use the vector type semantics myself today.
Don't get me wrong, I like the idea of having the intrinsics optimized.
But perhaps not unconditionally, or at least not without a way to
prevent it.
I know this will look ugly, but how about a macro
__GCC_X86_HONOR_INTRINSICS to enable the current code, with your
proposed use of vector arithmetic in place by default? This
wouldn't allow removing support for the built-ins, but it would
open the door to enabling some riskier optimizations by default.
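
To make the macro idea concrete, here is a rough sketch of how a header might switch between the two forms. It is only an illustration under stated assumptions: __GCC_X86_HONOR_INTRINSICS is the macro proposed above and is not known to exist in GCC, and the names v2df_sketch, sketch_add_pd and sketch_selfcheck are made up for this example (the real header would define _mm_add_pd on __m128d). The built-in branch assumes an x86 GCC that still provides __builtin_ia32_addpd, as emmintrin.h used at the time of this thread.

```c
#include <assert.h>

/* Hypothetical stand-in for the header's two-double vector type,
   declared with the GCC vector extension. */
typedef double v2df_sketch __attribute__((vector_size(16)));

/* Hypothetical stand-in for _mm_add_pd. */
static inline v2df_sketch sketch_add_pd(v2df_sketch a, v2df_sketch b)
{
#if defined(__GCC_X86_HONOR_INTRINSICS)
    /* Assumed opt-in path: keep the target built-in, so the operation
       stays opaque to the generic optimizers until late.
       Only meaningful on x86 compilers that provide this built-in. */
    return __builtin_ia32_addpd(a, b);
#else
    /* Default path: plain vector arithmetic, which the compiler is
       free to fold, combine, or reassociate like any other addition. */
    return a + b;
#endif
}

/* Small check that the wrapper adds lane by lane. */
int sketch_selfcheck(void)
{
    v2df_sketch a = {1.0, 2.0};
    v2df_sketch b = {3.0, 4.0};
    v2df_sketch r = sketch_add_pd(a, b);
    assert(r[0] == 4.0);
    assert(r[1] == 6.0);
    return 0;
}
```

With this layout, a user who wants the intrinsic to map to the built-in would compile with -D__GCC_X86_HONOR_INTRINSICS (or define the macro before including the header); everyone else gets the vector-arithmetic form. Both branches compute the same lane-wise sum, so the difference is only in how much the optimizers are allowed to see.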