GCC does not support *mmintrin.h with function specific opts

Sriraman Tallam tmsriram@google.com
Wed Apr 17 08:39:00 GMT 2013


Hi,

I have attached an updated patch that  addresses all the comments raised.

On Fri, Apr 12, 2013 at 1:58 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Thu, Apr 11, 2013 at 12:05:41PM -0700, Sriraman Tallam wrote:
>> I have attached a patch that fixes this. I have added an option
>> "-mgenerate-builtins" that will do two things.  It will define a macro
>> "__ALL_ISA__" which will expose the *intrin.h functions. It will also
>> expose all the target specific builtins.  -mgenerate-builtins will not
>> affect code generation.
>
> 1) this shouldn't be an option, either it can be made to work reliably,
>    then it should be done always, or it can't, then it shouldn't be done

Ok, it is on by default now.  There is a way to turn it off, with
-mno-generate-builtins.

> 2) have you verified that if you always generate all builtins, that the
>    builtins not supported by the ISA selected from the command line are
>    created with the right vector modes?

This issue does not arise.  When the target builtin is expanded, it is
checked if the ISA support is there, either via function specific
target opts or global target opts. If not, an error is raised. Test
case added for this, please see intrinsic_4.c in patch.

> 3) the *intrin.h headers in the case where the guarding macro isn't defined
>    should be surrounded by something like
>    #ifndef __FMA4__
>    #pragma GCC push options
>    #pragma GCC target("fma4")
>    #endif
>    ...
>    #ifndef __FMA4__
>    #pragma GCC pop options
>    #endif
>    so that everything that is in the headers is compiled with the ISA
>    in question

I do not think this should be done because it will break the inlining
ability of the header function and cause issues if the caller does not
specify the required ISA. The fact that the header functions are
marked extern __inline, with gnu_inline guarantees that a body will
not be generated and they will be inlined.  If the caller does not
have the required ISA, appropriate errors will be raised. Test cases
added, see intrinsics_1.c, intrinsics_2.c

> 4) what happens if you use the various vector types typedefed in the
>    *intrin.h headers in code that doesn't support those ISAs?  As TYPE_MODE
>    for VECTOR_TYPE is a function call, perhaps it will just be handled as
>    generic BLKmode vectors, which is desirable I think

I checked some tests here.  With -mno-sse for instance, vector types
are not permitted in function arguments and return values and gcc
raises a warning/error in each case.  With return values, gcc always
gives an error if a SSE register is required in a return value.  I
even fixed this message to not do it for functions marked as extern
inline, with "gnu_inline" keyword as a body for them will not be
generated.


> 5) what happens if you use a target builtin in a function not supporting
>    the corresponding ISA, do you get proper error explaining what you are
>    doing wrong?

Yes, please sse intrinsic_4.c test in patch.

> 6) what happens if you use some intrinsics in a function not supporting
>    the corresponding ISA?  Dunno if the inliner chooses not to inline it
>    and error out because it is always_inline, or what exactly will happen
>    then

Same deal here. The intrinsic function will, guaranteed, to be inlined
into the caller which will be a corresponding builtin call. That
builtin call will trigger an error if the ISA is not supported.

Thanks
Sri

>
> For all this you certainly need testcases.
>
>         Jakub
-------------- next part --------------
	* config/i386/i386.c (construct_container): Do not issue SSE
	return error for extern gnu_inline functions.
	(def_builtin): Do not generate builtins when -mno-generate-builtins
	is used.
	* config/i386/i386.opt (mgenerate-builtins): New target option.
	* config/i386/i386-c.c (ix86_target_macros_internal): Define macro
	__ALL_ISA__ when generate_target_builtins is true.
	* testsuite/gcc.target/i386/intrinsics_1.c: New test.
	* testsuite/gcc.target/i386/intrinsics_2.c: Ditto.
	* testsuite/gcc.target/i386/intrinsics_3.c: Ditto.
	* testsuite/gcc.target/i386/intrinsics_4.c: Ditto.
	* testsuite/gcc.target/i386/intrinsics_5.c: Ditto.
	* config/i386/lzcntintrin.h: Expose header when __ALL_ISA__ is defined.
	* config/i386/lwpintrin.h: Ditto.
	* config/i386/xopintrin.h: Ditto.
	* config/i386/fmaintrin.h: Ditto.
	* config/i386/bmiintrin.h: Ditto.
	* config/i386/fma4intrin.h: Ditto.
	* config/i386/nmmintrin.h: Ditto.
	* config/i386/tbmintrin.h: Ditto.
	* config/i386/smmintrin.h: Ditto.
	* config/i386/wmmintrin.h: Ditto.
	* config/i386/popcntintrin.h: Ditto.
	* config/i386/f16cintrin.h: Ditto.
	* config/i386/pmmintrin.h: Ditto.
	* config/i386/bmi2intrin.h: Ditto.
	* config/i386/tmmintrin.h: Ditto.
	* config/i386/xmmintrin.h: Ditto.
	* config/i386/mmintrin.h: Ditto.
	* config/i386/ammintrin.h: Ditto.
	* config/i386/emmintrin.h: Ditto.

Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 197691)
+++ config/i386/i386.c	(working copy)
@@ -6370,8 +6370,13 @@ construct_container (enum machine_mode mode, enum
     return NULL;
 
   /* We allowed the user to turn off SSE for kernel mode.  Don't crash if
-     some less clueful developer tries to use floating-point anyway.  */
-  if (needed_sseregs && !TARGET_SSE)
+     some less clueful developer tries to use floating-point anyway.  It is
+     alright if this is in a extern "gnu_inline" function, as it is the
+     caller that matters in this case.  */
+  if (needed_sseregs && !TARGET_SSE
+      && !(DECL_EXTERNAL (current_function_decl)
+           && lookup_attribute ("gnu_inline",
+		DECL_ATTRIBUTES (current_function_decl)) != NULL))
     {
       if (in_return)
 	{
@@ -26813,7 +26818,8 @@ def_builtin (HOST_WIDE_INT mask, const char *name,
       ix86_builtins_isa[(int) code].isa = mask;
 
       mask &= ~OPTION_MASK_ISA_64BIT;
-      if (mask == 0
+      if (generate_target_builtins
+	  || mask == 0
 	  || (mask & ix86_isa_flags) != 0
 	  || (lang_hooks.builtin_function
 	      == lang_hooks.builtin_function_ext_scope))
Index: config/i386/i386.opt
===================================================================
--- config/i386/i386.opt	(revision 197691)
+++ config/i386/i386.opt	(working copy)
@@ -626,3 +626,7 @@ Split 32-byte AVX unaligned store
 mrtm
 Target Report Mask(ISA_RTM) Var(ix86_isa_flags) Save
 Support RTM built-in functions and code generation
+
+mgenerate-builtins
+Target Report Var(generate_target_builtins) Save Init(1)
+Generate all target builtins that are otherwise only generated when the approrpriate ISA is turned on.
Index: config/i386/i386-c.c
===================================================================
--- config/i386/i386-c.c	(revision 197691)
+++ config/i386/i386-c.c	(working copy)
@@ -54,6 +54,9 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_fla
   int last_arch_char = ix86_arch_string[arch_len - 1];
   int last_tune_char = ix86_tune_string[tune_len - 1];
 
+  if (generate_target_builtins)
+    def_or_undef (parse_in, "__ALL_ISA__");
+
   /* Built-ins based on -march=.  */
   switch (arch)
     {
Index: testsuite/gcc.target/i386/intrinsics_4.c
===================================================================
--- testsuite/gcc.target/i386/intrinsics_4.c	(revision 0)
+++ testsuite/gcc.target/i386/intrinsics_4.c	(revision 0)
@@ -0,0 +1,11 @@
+/* Test to check if a target specific builtin used in a function without the
+   appropriate ISA support generates an error.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -mno-sse4.1" } */
+
+#include <smmintrin.h>
+__m128i foo(__m128i *V)
+{
+    return __builtin_ia32_movntdqa (V); /* { dg-error "'__builtin_ia32_movntdqa' needs isa option -m32 -msse4.1" } */
+}
Index: testsuite/gcc.target/i386/intrinsics_1.c
===================================================================
--- testsuite/gcc.target/i386/intrinsics_1.c	(revision 0)
+++ testsuite/gcc.target/i386/intrinsics_1.c	(revision 0)
@@ -0,0 +1,13 @@
+/* Test case to check if intrinsics and function specific target
+   optimizations work together.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse -mno-sse4.1" } */
+
+#include <smmintrin.h>
+
+__attribute__((target("sse4.1")))
+__m128i foo(__m128i *V)
+{
+    return _mm_stream_load_si128(V);
+}
Index: testsuite/gcc.target/i386/intrinsics_2.c
===================================================================
--- testsuite/gcc.target/i386/intrinsics_2.c	(revision 0)
+++ testsuite/gcc.target/i386/intrinsics_2.c	(revision 0)
@@ -0,0 +1,19 @@
+/* Ok, to have SSE return in non-SSE functions marked as
+   extern, "gnu_inline".  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse -mno-sse4.1" } */
+
+#include <smmintrin.h>
+
+extern __inline __attribute__ ((__gnu_inline__))
+__m128i bar (__m128i *V)
+{
+  return _mm_stream_load_si128(V);
+}
+
+__attribute__((target("sse4.1")))
+__m128i foo(__m128i *V)
+{
+  return bar (V);
+}
Index: testsuite/gcc.target/i386/intrinsics_3.c
===================================================================
--- testsuite/gcc.target/i386/intrinsics_3.c	(revision 0)
+++ testsuite/gcc.target/i386/intrinsics_3.c	(revision 0)
@@ -0,0 +1,11 @@
+/* Using vector types without SSE enabled should generate an error.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -mno-sse" } */
+
+typedef long long  _m128i  __attribute__((vector_size(16),__may_alias__));
+
+int foo (_m128i V) /* { dg-warning "SSE vector argument without SSE enabled changes the ABI" } */
+{
+  return 0;
+}
Index: testsuite/gcc.target/i386/intrinsics_5.c
===================================================================
--- testsuite/gcc.target/i386/intrinsics_5.c	(revision 0)
+++ testsuite/gcc.target/i386/intrinsics_5.c	(revision 0)
@@ -0,0 +1,13 @@
+/* Test case to check if -mno-generate-builtins will break use of intrinsics
+   when the appropriate ISA is not specified.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -mno-generate-builtins -mno-sse4.1" } */
+
+#include <smmintrin.h>
+__m128i foo(__m128i *V) /* { dg-error "unknown type name" } */
+{
+    return V;
+}
+
+/* { dg-excess-errors "\"SSE4.1 instruction set not enabled\"" } */
Index: config/i386/lzcntintrin.h
===================================================================
--- config/i386/lzcntintrin.h	(revision 197691)
+++ config/i386/lzcntintrin.h	(working copy)
@@ -25,7 +25,7 @@
 # error "Never use <lzcntintrin.h> directly; include <x86intrin.h> instead."
 #endif
 
-#ifndef __LZCNT__
+#if !defined (__LZCNT__) && !defined (__ALL_ISA__)
 # error "LZCNT instruction is not enabled"
 #endif /* __LZCNT__ */
 
Index: config/i386/lwpintrin.h
===================================================================
--- config/i386/lwpintrin.h	(revision 197691)
+++ config/i386/lwpintrin.h	(working copy)
@@ -28,7 +28,7 @@
 #ifndef _LWPINTRIN_H_INCLUDED
 #define _LWPINTRIN_H_INCLUDED
 
-#ifndef __LWP__
+#if !defined (__LWP__) && !defined (__ALL_ISA__)
 # error "LWP instruction set not enabled"
 #else
 
Index: config/i386/xopintrin.h
===================================================================
--- config/i386/xopintrin.h	(revision 197691)
+++ config/i386/xopintrin.h	(working copy)
@@ -28,7 +28,7 @@
 #ifndef _XOPMMINTRIN_H_INCLUDED
 #define _XOPMMINTRIN_H_INCLUDED
 
-#ifndef __XOP__
+#if !defined (__XOP__) && !defined (__ALL_ISA__)
 # error "XOP instruction set not enabled"
 #else
 
Index: config/i386/fmaintrin.h
===================================================================
--- config/i386/fmaintrin.h	(revision 197691)
+++ config/i386/fmaintrin.h	(working copy)
@@ -28,7 +28,7 @@
 #ifndef _FMAINTRIN_H_INCLUDED
 #define _FMAINTRIN_H_INCLUDED
 
-#ifndef __FMA__
+#if !defined (__FMA__) && !defined (__ALL_ISA__)
 # error "FMA instruction set not enabled"
 #else
 
Index: config/i386/bmiintrin.h
===================================================================
--- config/i386/bmiintrin.h	(revision 197691)
+++ config/i386/bmiintrin.h	(working copy)
@@ -25,7 +25,7 @@
 # error "Never use <bmiintrin.h> directly; include <x86intrin.h> instead."
 #endif
 
-#ifndef __BMI__
+#if !defined (__BMI__) && !defined (__ALL_ISA__)
 # error "BMI instruction set not enabled"
 #endif /* __BMI__ */
 
Index: config/i386/fma4intrin.h
===================================================================
--- config/i386/fma4intrin.h	(revision 197691)
+++ config/i386/fma4intrin.h	(working copy)
@@ -28,7 +28,7 @@
 #ifndef _FMA4INTRIN_H_INCLUDED
 #define _FMA4INTRIN_H_INCLUDED
 
-#ifndef __FMA4__
+#if !defined (__FMA4__) && !defined (__ALL_ISA__)
 # error "FMA4 instruction set not enabled"
 #else
 
Index: config/i386/nmmintrin.h
===================================================================
--- config/i386/nmmintrin.h	(revision 197691)
+++ config/i386/nmmintrin.h	(working copy)
@@ -27,7 +27,7 @@
 #ifndef _NMMINTRIN_H_INCLUDED
 #define _NMMINTRIN_H_INCLUDED
 
-#ifndef __SSE4_2__
+#if !defined (__SSE4_2__) && !defined (__ALL_ISA__)
 # error "SSE4.2 instruction set not enabled"
 #else
 /* We just include SSE4.1 header file.  */
Index: config/i386/tbmintrin.h
===================================================================
--- config/i386/tbmintrin.h	(revision 197691)
+++ config/i386/tbmintrin.h	(working copy)
@@ -25,7 +25,7 @@
 # error "Never use <tbmintrin.h> directly; include <x86intrin.h> instead."
 #endif
 
-#ifndef __TBM__
+#if !defined (__TBM__) && !defined (__ALL_ISA__)
 # error "TBM instruction set not enabled"
 #endif /* __TBM__ */
 
Index: config/i386/smmintrin.h
===================================================================
--- config/i386/smmintrin.h	(revision 197691)
+++ config/i386/smmintrin.h	(working copy)
@@ -27,7 +27,7 @@
 #ifndef _SMMINTRIN_H_INCLUDED
 #define _SMMINTRIN_H_INCLUDED
 
-#ifndef __SSE4_1__
+#if !defined (__SSE4_1__) && !defined (__ALL_ISA__)
 # error "SSE4.1 instruction set not enabled"
 #else
 
Index: config/i386/wmmintrin.h
===================================================================
--- config/i386/wmmintrin.h	(revision 197691)
+++ config/i386/wmmintrin.h	(working copy)
@@ -30,7 +30,7 @@
 /* We need definitions from the SSE2 header file.  */
 #include <emmintrin.h>
 
-#if !defined (__AES__) && !defined (__PCLMUL__)
+#if !defined (__AES__) && !defined (__PCLMUL__) && !defined (__ALL_ISA__)
 # error "AES/PCLMUL instructions not enabled"
 #else
 
Index: config/i386/popcntintrin.h
===================================================================
--- config/i386/popcntintrin.h	(revision 197691)
+++ config/i386/popcntintrin.h	(working copy)
@@ -21,7 +21,7 @@
    see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
    <http://www.gnu.org/licenses/>.  */
 
-#ifndef __POPCNT__
+#if !defined (__POPCNT__) && !defined (__ALL_ISA__)
 # error "POPCNT instruction set not enabled"
 #endif /* __POPCNT__ */
 
Index: config/i386/f16cintrin.h
===================================================================
--- config/i386/f16cintrin.h	(revision 197691)
+++ config/i386/f16cintrin.h	(working copy)
@@ -25,7 +25,7 @@
 # error "Never use <f16intrin.h> directly; include <x86intrin.h> or <immintrin.h> instead."
 #endif
 
-#ifndef __F16C__
+#if !defined (__F16C__) && !defined (__ALL_ISA__)
 # error "F16C instruction set not enabled"
 #else
 
Index: config/i386/pmmintrin.h
===================================================================
--- config/i386/pmmintrin.h	(revision 197691)
+++ config/i386/pmmintrin.h	(working copy)
@@ -27,7 +27,7 @@
 #ifndef _PMMINTRIN_H_INCLUDED
 #define _PMMINTRIN_H_INCLUDED
 
-#ifndef __SSE3__
+#if !defined (__SSE3__) && !defined (__ALL_ISA__)
 # error "SSE3 instruction set not enabled"
 #else
 
Index: config/i386/bmi2intrin.h
===================================================================
--- config/i386/bmi2intrin.h	(revision 197691)
+++ config/i386/bmi2intrin.h	(working copy)
@@ -25,7 +25,7 @@
 # error "Never use <bmi2intrin.h> directly; include <x86intrin.h> instead."
 #endif
 
-#ifndef __BMI2__
+#if !defined (__BMI2__) && !defined (__ALL_ISA__)
 # error "BMI2 instruction set not enabled"
 #endif /* __BMI2__ */
 
Index: config/i386/tmmintrin.h
===================================================================
--- config/i386/tmmintrin.h	(revision 197691)
+++ config/i386/tmmintrin.h	(working copy)
@@ -27,7 +27,7 @@
 #ifndef _TMMINTRIN_H_INCLUDED
 #define _TMMINTRIN_H_INCLUDED
 
-#ifndef __SSSE3__
+#if !defined (__SSSE3__) && !defined (__ALL_ISA__)
 # error "SSSE3 instruction set not enabled"
 #else
 
Index: config/i386/xmmintrin.h
===================================================================
--- config/i386/xmmintrin.h	(revision 197691)
+++ config/i386/xmmintrin.h	(working copy)
@@ -27,7 +27,7 @@
 #ifndef _XMMINTRIN_H_INCLUDED
 #define _XMMINTRIN_H_INCLUDED
 
-#ifndef __SSE__
+#if !defined (__SSE__) && !defined (__ALL_ISA__)
 # error "SSE instruction set not enabled"
 #else
 
Index: config/i386/mmintrin.h
===================================================================
--- config/i386/mmintrin.h	(revision 197691)
+++ config/i386/mmintrin.h	(working copy)
@@ -27,7 +27,7 @@
 #ifndef _MMINTRIN_H_INCLUDED
 #define _MMINTRIN_H_INCLUDED
 
-#ifndef __MMX__
+#if !defined (__MMX__) && !defined (__ALL_ISA__)
 # error "MMX instruction set not enabled"
 #else
 /* The Intel API is flexible enough that we must allow aliasing with other
Index: config/i386/ammintrin.h
===================================================================
--- config/i386/ammintrin.h	(revision 197691)
+++ config/i386/ammintrin.h	(working copy)
@@ -27,7 +27,7 @@
 #ifndef _AMMINTRIN_H_INCLUDED
 #define _AMMINTRIN_H_INCLUDED
 
-#ifndef __SSE4A__
+#if !defined (__SSE4A__) && !defined (__ALL_ISA__)
 # error "SSE4A instruction set not enabled"
 #else
 
Index: config/i386/emmintrin.h
===================================================================
--- config/i386/emmintrin.h	(revision 197691)
+++ config/i386/emmintrin.h	(working copy)
@@ -27,7 +27,7 @@
 #ifndef _EMMINTRIN_H_INCLUDED
 #define _EMMINTRIN_H_INCLUDED
 
-#ifndef __SSE2__
+#if !defined (__SSE2__) && !defined (__ALL_ISA__)
 # error "SSE2 instruction set not enabled"
 #else
 


More information about the Gcc-patches mailing list