This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Fix PR81921
On Wed, 23 Aug 2017, Richard Biener wrote:
> On Tue, 22 Aug 2017, Richard Biener wrote:
>
> > On August 22, 2017 6:38:55 PM GMT+02:00, "Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com> wrote:
> > >On 22/08/17 13:55, Kyrill Tkachov wrote:
> > >> Hi Richard,
> > >> [roping in more aarch64 maintainers]
> > >>
> > >> On 22/08/17 13:27, Richard Biener wrote:
> > >>> On Tue, 22 Aug 2017, Uros Bizjak wrote:
> > >>>
> > >>>> On Tue, Aug 22, 2017 at 12:15 PM, Richard Biener
> > ><rguenther@suse.de>
> > >>>> wrote:
> > >>>>> The following patch fixes PR81921 (and LTO build of libgo) which I
> > >ran
> > >>>>> into when trying to enable free-lang-data for non-LTO compiles.
> > >>>>>
> > >>>>> free-lang-data forces a DECL_FUNCTION_SPECIFIC_TARGET for all
> > >functions
> > >>>>> so we have them ending up with target_option_default_node
> > >eventually
> > >>>>> which is something ix86_can_inline_p doesn't expect (I tried
> > >forcing
> > >>>>> a compare of the actual options but that fails as well as we get
> > >>>>> spurious differences in use-fpmath, the default node with -m32
> > >>>>> -march=x86_64 doesn't have it while non-default nodes have it).
> > >>>>>
> > >>>>> The patch is what I consider safe for branches, we might want to
> > >work
> > >>>>> on sth better (actually comparing always and fixing the fpmath
> > >issue)
> > >>>>> on trunk as followup.
> > >>>>>
> > >>>>> Bootstrap & regtest running on x86_64-unknown-linux-gnu, ok for
> > >trunk
> > >>>>> and active branches?
> > >>>>>
> > >>>>> Note the change to the ref = false conditional isn't strictly
> > >necessary
> > >>>>> but it makes -flto and non-flto behave consistently.
...
> > >> The aarch64 changes looks ok to me FWIW (since I wrote that function)
> > >>
> > >
> > >I *think* this is probably OK for AArch64, but I don't think it's safe
> > >in general.
> > >
> > >Consider, for example, the arm port where we can have functions built
> > >for ARM and functions built for Thumb. You cannot, in general, inline
> > >either into the other, (think inline asm, or isa-specific intrinsic
> > >operations).
> > >
> > >I think this approach only works at all because it's generally
> > >meaningless to have some ISA that is the default and then construct
> > >other functions later in the compilation that deliberately target a
> > >smaller subset of the ISA. But such an assumption doesn't apply when
> > >you construct functions in completely separate ISAs which are
> > >call-compatible but otherwise distinct.
> >
> > Note the patch doesn't change anything for NON-LTO, where you never see
> > target_option_default_node. Just LTO forces those in place of NULL to
> > make cross-TU behavior more like you suggest.
>
> So thinking about this some more we lack the ability to distinguish
> between no explicit target attribute and an explicit target attribute
> that matches the global defaults with LTO. Without LTO it is the
> target that decides whether to put target_option_default_node into
> DECL_FUNCTION_SPECIFIC_TARGET but LTO treats the global options
> as explicit target attribute for all functions of a TU.
>
> That said, the
>
> /* If callee has no option attributes, then it is ok to inline */
> if (!callee_opts)
> ret = true;
>
> tries to allow inlining of functions with no explicit attribute
> into any other function which IMHO is reasonable?
>
> That argues for sth like DECL_FUNCTION_SPECIFIC_TARGET_USER_P and
> DECL_FUNCTION_SPECIFIC_OPTIMIZATION_USER_P.
>
> Does that make any sense? The question is whether, in LTO context,
> differing global options are USER_P or not ...
>
> OTOH as you say usually function specific opts are a superset of
> global opts which means if the target properly checks two option
> sets against each other the effect should be similar.
Which means we can also do the following (only the default and the x86
hook changed), always compare and properly treat a not present
DECL_FUNCTION_SPECIFIC_TARGET as target_option_default_node.
To fix PR81921 we need to make sure to process target("sse2") when
it might change fpmath use (for -m32). Not sure if that captures
all cases so I wonder whether skipping the processing is not
premature optimization -- we should do this at most once per function?
Bootstrap & regtest running on x86_64-unknown-linux-gnu, ok for trunk
for the x86 part? Is the ix86_valid_target_attribute_tree change ok
for branches?
Note that as there is the chance this now rejects cases that it didn't
before even w/o -flto such change isn't appropriate for the branches
(the ix86_valid_target_attribute_tree change is). Note I arrived
here because enabling free-lang-data unconditionally broke bootstrap,
so the changes to the other archs where a similar thing may happen --
you can check with
Index: gcc/tree.c
===================================================================
--- gcc/tree.c (revision 251274)
+++ gcc/tree.c (working copy)
@@ -5660,8 +5660,7 @@ free_lang_data (void)
unsigned i;
/* If we are the LTO frontend we have freed lang-specific data already.
*/
- if (in_lto_p
- || (!flag_generate_lto && !flag_generate_offload))
+ if (in_lto_p)
return 0;
/* Provide a dummy TRANSLATION_UNIT_DECL if the FE failed to provide
one. */
Richard, can you add a testcase where it is invalid to inline
-mthumb/arm code into the other? I suppose that requires some
inline asm to break. Or is there already one?
Thanks,
Richard.
2017-08-23 Richard Biener <rguenther@suse.de>
PR target/81921
* targhooks.c (default_target_can_inline_p): Properly
use target_option_default_node when no DECL_FUNCTION_SPECIFIC_TARGET
is present and always compare.
* config/i386/i386.c (ix86_valid_target_attribute_tree): Make
sure to process the attribute when fpmath may change.
(ix86_can_inline_p): Properly use target_option_default_node when
no DECL_FUNCTION_SPECIFIC_TARGET is present and always compare.
Index: gcc/targhooks.c
===================================================================
--- gcc/targhooks.c (revision 251275)
+++ gcc/targhooks.c (working copy)
@@ -1442,27 +1442,18 @@ default_target_option_pragma_parse (tree
bool
default_target_can_inline_p (tree caller, tree callee)
{
- bool ret = false;
tree callee_opts = DECL_FUNCTION_SPECIFIC_TARGET (callee);
tree caller_opts = DECL_FUNCTION_SPECIFIC_TARGET (caller);
-
- /* If callee has no option attributes, then it is ok to inline */
- if (!callee_opts)
- ret = true;
-
- /* If caller has no option attributes, but callee does then it is not ok to
- inline */
- else if (!caller_opts)
- ret = false;
+ if (! callee_opts)
+ callee_opts = target_option_default_node;
+ if (! caller_opts)
+ caller_opts = target_option_default_node;
/* If both caller and callee have attributes, assume that if the
pointer is different, the two functions have different target
options since build_target_option_node uses a hash table for the
options. */
- else
- ret = (callee_opts == caller_opts);
-
- return ret;
+ return callee_opts == caller_opts;
}
/* If the machine does not have a case insn that compares the bounds,
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c (revision 251275)
+++ gcc/config/i386/i386.c (working copy)
@@ -7369,7 +7369,8 @@ ix86_valid_target_attribute_tree (tree a
|| opts->x_target_flags != def->x_target_flags
|| option_strings[IX86_FUNCTION_SPECIFIC_ARCH]
|| option_strings[IX86_FUNCTION_SPECIFIC_TUNE]
- || enum_opts_set.x_ix86_fpmath)
+ || enum_opts_set.x_ix86_fpmath
+ || !TARGET_64BIT_P (opts->x_ix86_isa_flags))
{
/* If we are using the default tune= or arch=, undo the string assigned,
and use the default. */
@@ -7502,53 +7503,47 @@ ix86_valid_target_attribute_p (tree fnde
static bool
ix86_can_inline_p (tree caller, tree callee)
{
- bool ret = false;
tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
-
- /* If callee has no option attributes, then it is ok to inline. */
if (!callee_tree)
- ret = true;
+ callee_tree = target_option_default_node;
+ if (!caller_tree)
+ caller_tree = target_option_default_node;
+ if (callee_tree == caller_tree)
+ return true;
- /* If caller has no option attributes, but callee does then it is not ok to
- inline. */
- else if (!caller_tree)
+ struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree);
+ struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
+ bool ret = false;
+
+ /* Callee's isa options should be a subset of the caller's, i.e. a SSE4
+ function can inline a SSE2 function but a SSE2 function can't inline
+ a SSE4 function. */
+ if (((caller_opts->x_ix86_isa_flags & callee_opts->x_ix86_isa_flags)
+ != callee_opts->x_ix86_isa_flags)
+ || ((caller_opts->x_ix86_isa_flags2 & callee_opts->x_ix86_isa_flags2)
+ != callee_opts->x_ix86_isa_flags2))
ret = false;
- else
- {
- struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree);
- struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
+ /* See if we have the same non-isa options. */
+ else if (caller_opts->x_target_flags != callee_opts->x_target_flags)
+ ret = false;
- /* Callee's isa options should be a subset of the caller's, i.e. a SSE4
- function can inline a SSE2 function but a SSE2 function can't inline
- a SSE4 function. */
- if (((caller_opts->x_ix86_isa_flags & callee_opts->x_ix86_isa_flags)
- != callee_opts->x_ix86_isa_flags)
- || ((caller_opts->x_ix86_isa_flags2 & callee_opts->x_ix86_isa_flags2)
- != callee_opts->x_ix86_isa_flags2))
- ret = false;
-
- /* See if we have the same non-isa options. */
- else if (caller_opts->x_target_flags != callee_opts->x_target_flags)
- ret = false;
-
- /* See if arch, tune, etc. are the same. */
- else if (caller_opts->arch != callee_opts->arch)
- ret = false;
-
- else if (caller_opts->tune != callee_opts->tune)
- ret = false;
+ /* See if arch, tune, etc. are the same. */
+ else if (caller_opts->arch != callee_opts->arch)
+ ret = false;
- else if (caller_opts->x_ix86_fpmath != callee_opts->x_ix86_fpmath)
- ret = false;
+ else if (caller_opts->tune != callee_opts->tune)
+ ret = false;
- else if (caller_opts->branch_cost != callee_opts->branch_cost)
- ret = false;
+ else if (caller_opts->x_ix86_fpmath != callee_opts->x_ix86_fpmath)
+ ret = false;
- else
- ret = true;
- }
+ else if (caller_opts->branch_cost != callee_opts->branch_cost)
+ ret = false;
+
+ else
+ ret = true;
return ret;
}
Index: gcc/testsuite/gcc.target/i386/pr81921.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr81921.c (nonexistent)
+++ gcc/testsuite/gcc.target/i386/pr81921.c (working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target lto } */
+/* { dg-options "-flto -march=x86-64" } */
+
+extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__, target("sse2")))
+_mm_loadu_si128 (int const *__P)
+{
+ return *__P;
+}
+
+void __attribute__((target("ssse3"))) foo (void *p)
+{
+ volatile int x = _mm_loadu_si128 (p);
+}