This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: GCC does not support *mmintrin.h with function specific opts
- From: Sriraman Tallam <tmsriram at google dot com>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: Uros Bizjak <ubizjak at gmail dot com>, "H.J. Lu" <hjl dot tools at gmail dot com>, "Joseph S. Myers" <joseph at codesourcery dot com>, Diego Novillo <dnovillo at google dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, David Li <davidxl at google dot com>
- Date: Wed, 15 May 2013 19:25:02 -0700
- Subject: Re: GCC does not support *mmintrin.h with function specific opts
- References: <CAAs8Hmxybjtoo0fUTPYWNqF8+SUTLQQAb4TmQdycB1WkccAgkg at mail dot gmail dot com> <Pine dot LNX dot 4 dot 64 dot 1304251941170 dot 12832 at digraph dot polyomino dot org dot uk> <CAAs8HmyBKhjMKyu7aFqn2AU4RGXoCvLP=AwKV+hVS31Wb=FuRg at mail dot gmail dot com> <CAAs8HmzMfRvw19nxu5khXHKoM+Oz63KJSnzy=5WfZHdODF5izw at mail dot gmail dot com> <CAAs8Hmx7YVLJoLNVKy=52SykLB+hqOYWvB9iQUd+g1ZH6mKOCw at mail dot gmail dot com> <CAAs8Hmy34GY0c1PNyvtGW3vtHZ_cVY+PRy0tLuAuvvP8=FaREA at mail dot gmail dot com> <CAAs8HmzO4Mx1W-+RbmhhOH6nNLRQ5TtNeaqpxR-AS0ZR_9vuzw at mail dot gmail dot com> <CAMe9rOomjVGV5rpqRkLqPB530uEPYMcVGU6hAScbjs2eeW-etg at mail dot gmail dot com> <CAFULd4YNGAfhwao+ZKVuHTE1uvqiB04qrdX5FeZxBcVw0Yr1TQ at mail dot gmail dot com> <20130514083913 dot GJ1377 at tucnak dot redhat dot com> <20130514100419 dot GM1377 at tucnak dot redhat dot com>
On Tue, May 14, 2013 at 3:04 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, May 14, 2013 at 10:39:13AM +0200, Jakub Jelinek wrote:
>> When trying with -O2 -mno-avx:
>> #ifndef __AVX__
>> #pragma GCC push_options
>> #pragma GCC target("avx")
>> #define __DISABLE_AVX__
>> #endif
>> typedef float __v8sf __attribute__ ((__vector_size__ (32)));
>> typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__));
>> extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
>> _mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); }
>> #ifdef __DISABLE_AVX__
>> #pragma GCC pop_options
>> #undef __DISABLE_AVX__
>> #endif
>> __m256 a, b, c;
>> void __attribute__((target ("avx")))
>> foo (void)
>> {
>> a = _mm256_and_ps (b, c);
>> }
>> we get bogus errors and ICE:
>> tty2.c: In function '_mm256_and_ps':
>> tty2.c:9:1: note: The ABI for passing parameters with 32-byte alignment has changed in GCC 4.6
>> tty2.c: In function 'foo':
>> tty2.c:9:82: error: '__builtin_ia32_andps256' needs isa option -m32
>> tty2.c:9:82: internal compiler error: in emit_move_insn, at expr.c:3486
>> 0x77a3d2 emit_move_insn(rtx_def*, rtx_def*)
>> ../../gcc/expr.c:3485
>> (I have added "1 ||" instead of your generate_builtins into i386.c
>> (def_builtin)), that just shows that target attribute/pragma support still
>> has very severe issues that need to be fixed, instead of papered around.
>>
>> Note, we ICE on:
>> #pragma GCC target ("mavx")
>> That should be fixed too.
>
> Ok, I had a brief look at the above two issues.
>
> The first testcase has the problem that the ix86_previous_fndecl cache
> gets out of date. When set_cfun is called on _mm256_and_ps (with the
> implicit avx attribute), then ix86_previous_fndecl is set to _mm256_and_ps,
> TARGET_AVX is set to true, target reinited. Then set_cfun is called
> with NULL, we don't do anything. Later on #pragma GCC pop_options appears,
> sets !TARGET_AVX (as that is the new target_option_current_node).
> Next foo is being parsed, avx attribute is noticed, the same target node
> is used for it, but when set_cfun is called for foo, ix86_previous_fndecl's
> target node is the same as foo's and so we don't do cl_target_restore_option
> at all, so !TARGET_AVX remains, while it should be set. That is the reason
> for the bogus inform etc. Fixed by resetting the ix86_previous_fndecl cache
> on any #pragma GCC target below. The #pragma GCC target ("mavx") is also
> fixed below. The patch also includes the "1 ||" to enable building all
> builtins. We still ICE with:
> #0 fancy_abort (file=0x11d8fad "../../gcc/expr.c", line=316, function=0x11dada3 "convert_move") at ../../gcc/diagnostic.c:1180
> #1 0x0000000000771c39 in convert_move (to=0x7ffff1b2df00, from=0x7ffff1b314e0, unsignedp=0) at ../../gcc/expr.c:316
> #2 0x000000000078009f in store_expr (exp=0x7ffff19ab390, target=0x7ffff1b2df00, call_param_p=0, nontemporal=false) at ../../gcc/expr.c:5300
> #3 0x000000000077eba1 in expand_assignment (to=0x7ffff1b35090, from=0x7ffff19ab390, nontemporal=false) at ../../gcc/expr.c:5025
> on the first testcase. We don't ICE say on:
> #ifndef __AVX__
> #pragma GCC push_options
> #pragma GCC target("avx")
> #define __DISABLE_AVX__
> #endif
> typedef float __v8sf __attribute__ ((__vector_size__ (32)));
> typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__));
> extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> _mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); }
> #ifdef __DISABLE_AVX__
> #pragma GCC pop_options
> #undef __DISABLE_AVX__
> #endif
> __m256 a[10], b[10], c[10];
> void __attribute__((target ("avx")))
> foo (void)
> {
> a[0] = _mm256_and_ps (b[0], c[0]);
> }
> The problem is that in the first testcase, the VAR_DECL c (guess also b and
> a) have TYPE_MODE (TREE_TYPE (c)) == V8SFmode (this is dynamic, for vector
> types TYPE_MODE is a function call), but DECL_MODE (c) is BLKmode
> (it has been laid out while -mno-avx has been the current) and also
> DECL_RTL which is a mem:BLK. Guess expr.c would need to special case
> TREE_STATIC or DECL_EXTERNAL VAR_DECLs with vector type, if they have
> DECL_MODE BLKmode, but TYPE_MODE some vector type, just adjust the MEM
> to the desired mode?
>
> --- gcc/config/i386/i386-c.c.jj 2013-01-15 17:20:37.000000000 +0100
> +++ gcc/config/i386/i386-c.c 2013-05-14 11:46:50.773806894 +0200
> @@ -369,20 +369,23 @@ ix86_pragma_target_parse (tree args, tre
>
> if (! args)
> {
> - cur_tree = ((pop_target)
> - ? pop_target
> - : target_option_default_node);
> + cur_tree = (pop_target ? pop_target : target_option_default_node);
> cl_target_option_restore (&global_options,
> TREE_TARGET_OPTION (cur_tree));
> }
> else
> {
> cur_tree = ix86_valid_target_attribute_tree (args);
> - if (!cur_tree)
> - return false;
> + if (!cur_tree || cur_tree == error_mark_node)
> + {
> + cl_target_option_restore (&global_options,
> + TREE_TARGET_OPTION (prev_tree));
> + return false;
> + }
> }
>
> target_option_current_node = cur_tree;
> + ix86_reset_previous_fndecl ();
>
> /* Figure out the previous/current isa, arch, tune and the differences. */
> prev_opt = TREE_TARGET_OPTION (prev_tree);
> --- gcc/config/i386/i386-protos.h.jj 2013-04-03 08:28:50.000000000 +0200
> +++ gcc/config/i386/i386-protos.h 2013-05-14 11:41:35.389638299 +0200
> @@ -40,6 +40,8 @@ extern void ix86_output_addr_diff_elt (F
> extern enum calling_abi ix86_cfun_abi (void);
> extern enum calling_abi ix86_function_type_abi (const_tree);
>
> +extern void ix86_reset_previous_fndecl (void);
> +
> #ifdef RTX_CODE
> extern int standard_80387_constant_p (rtx);
> extern const char *standard_80387_constant_opcode (rtx);
> --- gcc/config/i386/i386.c.jj 2013-05-14 08:23:31.000000000 +0200
> +++ gcc/config/i386/i386.c 2013-05-14 11:40:54.996867411 +0200
> @@ -4559,6 +4559,13 @@ ix86_can_inline_p (tree caller, tree cal
> /* Remember the last target of ix86_set_current_function. */
> static GTY(()) tree ix86_previous_fndecl;
>
> +/* Invalidate ix86_previous_fndecl cache. */
> +void
> +ix86_reset_previous_fndecl (void)
> +{
> + ix86_previous_fndecl = NULL_TREE;
> +}
> +
> /* Establish appropriate back-end context for processing the function
> FNDECL. The argument might be NULL to indicate processing at top
> level, outside of any function scope. */
> @@ -26829,7 +26836,7 @@ def_builtin (HOST_WIDE_INT mask, const c
> ix86_builtins_isa[(int) code].isa = mask;
>
> mask &= ~OPTION_MASK_ISA_64BIT;
> - if (mask == 0
> + if (/* HACK */ 1 || mask == 0
> || (mask & ix86_isa_flags) != 0
> || (lang_hooks.builtin_function
> == lang_hooks.builtin_function_ext_scope))
This does not seem necessary now once the #pragma GCC target is used
in the header. The target-specific builtins seem to be generated when
the target is specified using pragmas.
I will consolidate your patches and send one shortly.
Thanks for clearing this,
Sri
>
>
> Jakub