Bug 57202 - Please make the intrinsics headers like immintrin.h be usable without compiler flags
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.8.1
Importance: P3 enhancement
Target Milestone: 4.9.0
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-05-08 04:27 UTC by Thiago Macieira
Modified: 2017-04-17 16:04 UTC

See Also:
Host:
Target: x86_64-*-* i?86-*-*
Build:
Known to work: 4.9.0
Known to fail:
Last reconfirmed:


Description Thiago Macieira 2013-05-08 04:27:51 UTC
Please make all headers for intrinsics be includable without special compiler flags.

In other words, I want the following to work:

$ gcc -fsyntax-only -include smmintrin.h -xc /dev/null
In file included from <command-line>:0:0:
/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include/smmintrin.h:31:3: error: #error "SSE4.1 instruction set not enabled"

Note it works with ICC:
$ icc -fsyntax-only -include smmintrin.h -xc /dev/null && echo works
works


Not only that, please make all the intrinsics functions be defined and ready to be used.

This is necessary so that the following source file could compile even if -msse4.1 is not passed on the command-line (adapted from http://gcc.gnu.org/gcc-4.8/changes.html):

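#include <smmintrin.h>

/* Fallback version, chosen when SSE4.2 is not available at run time. */
__attribute__ ((target ("default")))
int foo(void)
{
  return 1;
}

/* SSE4.2 version; the SSE4.1 intrinsic below must be usable here. */
__attribute__ ((target ("sse4.2")))
int foo(void)
{
  __m128i v = _mm_setzero_si128();
  _mm_blendv_epi8(v, v, v);
  return 2;
}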

There are several reasons for that, number one among them that it makes the GCC 4.8 feature above actually useful for non-trivial code. Also, passing extra options on the command line is simply not an option for C++ code (where the feature above is useful) if that code is moderately complex and uses inline functions, and those options cannot be used if LTO is to be used (bug 54231).
Comment 1 Thiago Macieira 2013-05-08 07:03:26 UTC
This also applies to arm_neon.h.
Comment 2 Andrew Pinski 2013-05-08 07:35:52 UTC
(In reply to comment #1)
> This also applies to arm_neon.h.

Please file a separate bug for ARM.
Comment 3 Marc Glisse 2013-05-08 08:02:23 UTC
http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00740.html
That patch (or maybe a later iteration) is waiting for review, but I think it is what this PR is asking for.
Comment 4 Marc Glisse 2014-02-12 13:16:54 UTC
Can this be closed?
Comment 5 Thiago Macieira 2014-02-12 15:24:55 UTC
(In reply to Marc Glisse from comment #4)
> Can this be closed?

Oh, yeah, this is working fine in GCC 4.9.
Comment 6 Marc Glisse 2014-02-12 15:45:00 UTC
Thanks.
Comment 7 Jeffrey Walton 2017-04-17 13:45:49 UTC
Please forgive my ignorance... What was fixed?

The problem statement is/was "Please make all headers for intrinsics be includable without special compiler flags." But it appears the intrinsics are not available.

I'm working with Ubuntu 16/GCC 5.4 on an old VIA C7 (SSE and SSE2, and some other extensions):

flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge cmov pat clflush acpi mmx fxsr sse sse2 tm nx pni est tm2 xtpr rng rng_en ace ace_en ace2 ace2_en phe phe_en pmm pmm_en

The SSSE3 intrinsics are causing a compile error:

g++ -g2 -O2 -march=native -pipe -c test.cpp
In file included from /usr/lib/gcc/i686-linux-gnu/5/include/x86intrin.h:37:0,
                 from cpu.h:40,
                 from aria.cpp:8:
/usr/lib/gcc/i686-linux-gnu/5/include/tmmintrin.h: In member function void ‘test_ssse3()’:
/usr/lib/gcc/i686-linux-gnu/5/include/tmmintrin.h:136:1: error: inlining failed in call to always_inline ‘__m128i _mm_shuffle_epi8(__m128i, __m128i)’: target specific option mismatch
 _mm_shuffle_epi8 (__m128i __X, __m128i __Y)


It appears the intrinsics are not available.
Comment 8 Marc Glisse 2017-04-17 15:04:26 UTC
(In reply to Jeffrey Walton from comment #7)
> It appears the intrinsics are not available.

They are available for functions compiled for a suitable target, for instance because of -march or thanks to the target attribute (see the original report). It does not make sense to make them always available.
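
To illustrate the point about the target attribute, here is a minimal sketch, assuming GCC 4.9 or later on x86. It uses an SSSE3 intrinsic from a function that carries the attribute, while the rest of the file is built without -mssse3. The names reverse16, reverse16_ssse3 and reverse16_scalar are hypothetical; the run-time check uses GCC's __builtin_cpu_supports.

```c
#include <stddef.h>
#include <stdint.h>
#include <tmmintrin.h>  /* includable without -mssse3 since GCC 4.9 */

/* Plain C fallback: reverse the order of 16 bytes. */
static void reverse16_scalar(const uint8_t *in, uint8_t *out)
{
    for (size_t i = 0; i < 16; i++)
        out[i] = in[15 - i];
}

/* Only this function is compiled for SSSE3, so _mm_shuffle_epi8 can be
   inlined here even though the translation unit targets a baseline CPU. */
__attribute__((target("ssse3")))
static void reverse16_ssse3(const uint8_t *in, uint8_t *out)
{
    const __m128i idx = _mm_setr_epi8(15, 14, 13, 12, 11, 10, 9, 8,
                                      7, 6, 5, 4, 3, 2, 1, 0);
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)out, _mm_shuffle_epi8(v, idx));
}

/* Undecorated dispatcher: checks the CPU at run time and never touches
   the SSSE3 intrinsic itself. */
void reverse16(const uint8_t *in, uint8_t *out)
{
    if (__builtin_cpu_supports("ssse3"))
        reverse16_ssse3(in, out);
    else
        reverse16_scalar(in, out);
}
```

Calling _mm_shuffle_epi8 directly inside reverse16 would produce the "target specific option mismatch" error from comment 7, because reverse16 itself is not compiled for SSSE3.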
Comment 9 Jeffrey Walton 2017-04-17 15:34:09 UTC
On Mon, Apr 17, 2017 at 11:04 AM, glisse at gcc dot gnu.org
<gcc-bugzilla@gcc.gnu.org> wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57202
>
> --- Comment #8 from Marc Glisse <glisse at gcc dot gnu.org> ---
> (In reply to Jeffrey Walton from comment #7)
>> It appears the intrinsics are not available.
>
> They are available for functions compiled for a suitable target, for instance
> because of -march or thanks to the target attribute (see the original report).
> It does not make sense to make them always available.

But that's what this bug report is for - to make the intrinsics always available.

This code still does not work:

if (HasAVX())
{
    ...
}
else if (HasSSSE3())
{
    // Use _mm_shuffle_epi8()
}
else if (HasSSE2())
{
    // Use _mm_load_si128()
}
else
{
    // Use C/C++
}

When a distro compiles with just -march=i686 or -march=x86-64, the
intrinsics would allow us to easily provide the features for modern
CPUs. Because the intrinsics are not available, we're back to that
cursed inline assembly (and its wonderful error messages).

Jeff
Comment 10 Thiago Macieira 2017-04-17 16:04:02 UTC
> But that's what this bug report is for - to make the intrinsics always available.

I never asked for them to be available in undecorated functions. Yes, that's how both the Intel and Microsoft compilers behave, but I actually find that GCC and Clang's behaviour makes sense too. This allows a clear demarcation of where different instructions may be used by the compiler, so the CPU check code can be sure of no leakage. What's more, it allows the compiler to use other instructions that you didn't specifically use.

It's not perfect, but neither is unrestricted use. I've seen code generated by either ICC or MSVC (I don't remember which) in which an AVX2 instruction like VPMOVZXBW was surrounded by non-VEX-encoded SSE2 instructions, because we never told the compiler it was OK to use VEX.
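
As a sketch of the "other instructions" point, assuming GCC or Clang on x86: a function decorated with the target attribute needs no intrinsics at all for the compiler to be allowed to use the wider instruction set inside it. The names add_u8, add_u8_avx2 and add_u8_generic are hypothetical, and whether the loop is actually vectorised depends on the optimisation level.

```c
#include <stddef.h>
#include <stdint.h>

/* Plain loop, but compiled for AVX2: the compiler may auto-vectorise it
   with VEX-encoded instructions, and only inside this function. */
__attribute__((target("avx2")))
static void add_u8_avx2(uint8_t *dst, const uint8_t *a,
                        const uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)(a[i] + b[i]);
}

/* Same loop compiled for the baseline target of the translation unit. */
static void add_u8_generic(uint8_t *dst, const uint8_t *a,
                           const uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)(a[i] + b[i]);
}

/* The CPU check is the only place where the two worlds meet, so no
   AVX2 code can leak into the path taken on older processors. */
void add_u8(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    if (__builtin_cpu_supports("avx2"))
        add_u8_avx2(dst, a, b, n);
    else
        add_u8_generic(dst, a, b, n);
}
```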