Bug 41201 - #pragma GCC target ("sse2") doesn't alter __SSE2__ in C++ (as it does in C)
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: c++
Version: 4.4.1
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks: 48026
Reported: 2009-08-31 20:19 UTC by David Baron
Modified: 2012-02-02 06:24 UTC
CC List: 7 users

See Also:
Host: x86_64-unknown-linux-gnu
Target: x86_64-unknown-linux-gnu
Build: x86_64-unknown-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2010-12-08 14:10:43


Description David Baron 2009-08-31 20:19:04 UTC
I'm really not sure what component this goes in; apologies if it's wrong.

The following testcase:

=== BEGIN pragma.c ===
#ifdef __SSE2__
#error "SSE2 should not be defined"
#endif 

#pragma GCC push_options

#pragma GCC target ("sse2")

#ifndef __SSE2__
#error "SSE2 should be defined"
#endif 

#pragma GCC pop_options

#ifdef __SSE2__
#error "SSE2 should not be defined"
#endif 
=== END pragma.c ===

hits the middle #error in C++, but compiles correctly in C.  I'm using GCC 4.4.1 with -mno-sse2:

$ /usr/local/gcc-4.4.1/bin/gcc -x c -c -mno-sse2 pragma.c
[returns 0]
$ /usr/local/gcc-4.4.1/bin/gcc -x c++ -c -mno-sse2 pragma.c
pragma.c:10:2: error: #error "SSE2 should be defined"
[returns 1]

This matters because the real case I'm worried about (for https://bugzilla.mozilla.org/show_bug.cgi?id=513422 ) is an example that replaces the middle #ifdef-#error-#endif in the testcase with a #include of <emmintrin.h>, and emmintrin.h has an #error in exactly that case.

I think C++ should behave the same as C does. A number of C testcases already in the GCC testsuite appear to depend on that behavior: sse-22.c, sse-23.c and funcspec-9.c in gcc/testsuite/gcc.target/i386.
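
For illustration, the real-world shape of the problem might look roughly like the sketch below. It is not taken from the Mozilla code: the helper name add4_i32 is hypothetical, and the sketch assumes GCC on x86 with a compiler where the pragma does update __SSE2__ before <emmintrin.h> is read. On other targets it falls back to a plain loop.

```c
/* Sketch only: add4_i32 is a hypothetical helper, not part of any
   real project.  Assumes GCC on x86; elsewhere a scalar loop is used. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))

#pragma GCC push_options
#pragma GCC target ("sse2")

/* With the behavior described in this report, a C++ compile with
   -mno-sse2 would hit the #error inside this header, because
   __SSE2__ is still undefined at this point. */
#include <emmintrin.h>

/* Add four 32-bit integers lane by lane using SSE2. */
void add4_i32(const int *a, const int *b, int *out)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));
}

#pragma GCC pop_options

#else

/* Portable fallback for non-x86 targets. */
void add4_i32(const int *a, const int *b, int *out)
{
    int i;
    for (i = 0; i < 4; i++)
        out[i] = a[i] + b[i];
}

#endif
```

The point of the pattern is to keep the rest of the file compiled without SSE2, so that only the functions between push_options and pop_options may use the wider instruction set.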
Comment 1 Siarhei Siamashka 2010-04-27 22:44:14 UTC
"#pragma GCC <target|optimize>" just does not seem to work with C++. I stumbled on this while trying to narrow down something that looks like a wrong-code generation bug in gcc 4.5.0 when compiling qt4.

Prepending "__attribute__((optimize("-O0")))" to each function still works, so no real need to go through the trouble of splitting source files into parts to bisect the issue.
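
A minimal sketch of that workaround, assuming GCC (the function name suspect_sum is made up for the example):

```c
/* Sketch only: suspect_sum is a hypothetical function standing in for
   whatever code is being bisected.  The attribute is GCC-specific and
   asks for this one function to be compiled as if -O0 were given,
   whatever optimization level the rest of the file uses. */
__attribute__((optimize("-O0")))
int suspect_sum(const int *values, int count)
{
    int total = 0;
    int i;
    for (i = 0; i < count; i++)
        total += values[i];
    return total;
}
```

Moving the attribute from function to function narrows down which one is miscompiled without having to split the source file.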
Comment 2 Justin Lebar 2010-07-12 22:09:22 UTC
cc'ing Harsha Jagasia, who wrote sse-22.c and funcspec-9.c.

I might be willing to put together a patch for this, but I'm totally unfamiliar with the codebase, so I'd almost surely need some help.
Comment 3 Justin Lebar 2010-07-12 22:22:05 UTC
Also cc'ing H.J. Lu, who wrote sse-23.c
Comment 4 H.J. Lu 2010-07-12 22:43:05 UTC
I think the whole "pragma GCC target" is incomplete/broken.
Comment 5 Justin Lebar 2010-07-26 16:20:48 UTC
FWIW, it seems that MSVC is perfectly happy to compile code with intrinsics inside a file which doesn't have any special flags.  It would be awesome if there were some way to do the same with GCC, even if it involves #pragma hackery.  The current limitation makes using intrinsics pretty painful, as it requires a separate file for each targeted instruction set.
Comment 6 Dave Korn 2010-12-08 14:10:43 UTC
I stumbled into the underlying cause of this bug while working on dllimport/dllexport attributes.  It turns out that the TARGET_INSERT_ATTRIBUTES hook is broken in C++, because of a bit of premature optimisation in the routine gcc/cp/decl2.c#cplus_decl_attributes().  This is a C++-specific wrapper round the core compiler's gcc/attribs.c#decl_attributes() which takes care of deferred attribute handling for templates:

/* Like decl_attributes, but handle C++ complexity.  */

void
cplus_decl_attributes (tree *decl, tree attributes, int flags)
{
  if (*decl == NULL_TREE || *decl == void_type_node
      || *decl == error_mark_node
      || attributes == NULL_TREE)
    return;

  if (processing_template_decl)
    {
      if (check_for_bare_parameter_packs (attributes))
	return;

      save_template_attributes (&attributes, decl);
      if (attributes == NULL_TREE)
	return;
    }

  if (TREE_CODE (*decl) == TEMPLATE_DECL)
    decl = &DECL_TEMPLATE_RESULT (*decl);

  decl_attributes (decl, attributes, flags);

  if (TREE_CODE (*decl) == TYPE_DECL)
    SET_IDENTIFIER_TYPE_VALUE (DECL_NAME (*decl), TREE_TYPE (*decl));
}

  The early exits when attributes == NULL_TREE are (I suppose) intended to save the time taken calling decl_attributes when there aren't any attributes to be added to the newly-created decl, but they have the side effect of bypassing the call that decl_attributes makes to TARGET_INSERT_ATTRIBUTES.  The consequence is that the default/target/pragma-derived attributes don't get inserted onto any decls in C++ - except for decls that have some explicit attributes specified!
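
  The control-flow problem can be modelled with a toy example.  Everything in it is a made-up stand-in (toy_decl, toy_decl_attributes, the two wrappers, pragma_target_active); none of it is GCC code, it only mimics the shape of the early exit:

```c
#include <stddef.h>

/* Toy stand-in for a declaration; not a GCC tree node. */
struct toy_decl {
    int has_explicit_attr;  /* set when the source spelled an attribute */
    int has_target_attr;    /* set when the pragma-derived attribute landed */
};

/* Pretend a "#pragma GCC target" is currently in effect. */
static int pragma_target_active = 1;

/* Plays the role of decl_attributes: it always gives the
   "insert attributes" hook a chance to run, then applies any
   explicit attributes. */
static void toy_decl_attributes(struct toy_decl *decl, const char *attrs)
{
    if (pragma_target_active)
        decl->has_target_attr = 1;   /* hook analogue */
    if (attrs != NULL)
        decl->has_explicit_attr = 1;
}

/* Wrapper with the early exit: no explicit attributes means the
   hook analogue is never reached. */
void toy_wrapper_buggy(struct toy_decl *decl, const char *attrs)
{
    if (decl == NULL || attrs == NULL)
        return;
    toy_decl_attributes(decl, attrs);
}

/* Wrapper without the early exit on attrs: the hook analogue
   runs even when attrs is NULL. */
void toy_wrapper_fixed(struct toy_decl *decl, const char *attrs)
{
    if (decl == NULL)
        return;
    toy_decl_attributes(decl, attrs);
}
```

  In this model a decl passed to toy_wrapper_buggy with attrs == NULL never gets has_target_attr set, while the same decl with any explicit attribute does, which matches the symptom described above.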

  I'm going to be testing this fix as part of a larger patch over the next couple of days; perhaps others would like to give it a try on their platforms:


Index: gcc/cp/decl2.c
===================================================================
--- gcc/cp/decl2.c	(revision 167484)
+++ gcc/cp/decl2.c	(working copy)
@@ -1269,8 +1269,7 @@ void
 cplus_decl_attributes (tree *decl, tree attributes, int flags)
 {
   if (*decl == NULL_TREE || *decl == void_type_node
-      || *decl == error_mark_node
-      || attributes == NULL_TREE)
+      || *decl == error_mark_node)
     return;
 
   if (processing_template_decl)
@@ -1279,13 +1278,13 @@ cplus_decl_attributes (tree *decl, tree attributes
 	return;
 
       save_template_attributes (&attributes, decl);
-      if (attributes == NULL_TREE)
-	return;
     }
 
   if (TREE_CODE (*decl) == TEMPLATE_DECL)
     decl = &DECL_TEMPLATE_RESULT (*decl);
 
+  /* Even if ATTRIBUTES is null, we must call this in order to
+     give the TARGET_INSERT_ATTRIBUTES hook a chance to run.  */
   decl_attributes (decl, attributes, flags);
 
   if (TREE_CODE (*decl) == TYPE_DECL)
Comment 7 Dave Korn 2010-12-08 14:19:40 UTC
(In reply to comment #6)
> I stumbled into the underlying cause of this bug 

I should clarify: I'm referring to the brokenness mentioned in comments 1 and 4.  My patch probably won't make any difference to the original testcase, because it deals with declarations, not preprocessor macros.