This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix PR81921


On Wed, Aug 23, 2017 at 10:04 AM, Richard Biener <rguenther@suse.de> wrote:
> On Wed, 23 Aug 2017, Richard Biener wrote:
>
>> On Tue, 22 Aug 2017, Richard Biener wrote:
>>
>> > On August 22, 2017 6:38:55 PM GMT+02:00, "Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com> wrote:
>> > >On 22/08/17 13:55, Kyrill Tkachov wrote:
>> > >> Hi Richard,
>> > >> [roping in more aarch64 maintainers]
>> > >>
>> > >> On 22/08/17 13:27, Richard Biener wrote:
>> > >>> On Tue, 22 Aug 2017, Uros Bizjak wrote:
>> > >>>
>> > >>>> On Tue, Aug 22, 2017 at 12:15 PM, Richard Biener
>> > ><rguenther@suse.de>
>> > >>>> wrote:
>> > >>>>> The following patch fixes PR81921 (and LTO build of libgo) which I
>> > >ran
>> > >>>>> into when trying to enable free-lang-data for non-LTO compiles.
>> > >>>>>
>> > >>>>> free-lang-data forces a DECL_FUNCTION_SPECIFIC_TARGET for all
>> > >functions
>> > >>>>> so we have them ending up with target_option_default_node
>> > >eventually
>> > >>>>> which is something ix86_can_inline_p doesn't expect (I tried
>> > >forcing
>> > >>>>> a compare of the actual options but that fails as well as we get
>> > >>>>> spurious differences in use-fpmath, the default node with -m32
>> > >>>>> -march=x86_64 doesn't have it while non-default nodes have it).
>> > >>>>>
>> > >>>>> The patch is what I consider safe for branches, we might want to
>> > >work
>> > >>>>> on sth better (actually comparing always and fixing the fpmath
>> > >issue)
>> > >>>>> on trunk as followup.
>> > >>>>>
>> > >>>>> Bootstrap & regtest running on x86_64-unknown-linux-gnu, ok for
>> > >trunk
>> > >>>>> and active branches?
>> > >>>>>
>> > >>>>> Note the change to the ref = false conditional isn't strictly
>> > >necessary
>> > >>>>> but it makes -flto and non-flto behave consistently.
>
> ...
>
>> > >> The aarch64 changes looks ok to me FWIW (since I wrote that function)
>> > >>
>> > >
>> > >I *think* this is probably OK for AArch64, but I don't think it's safe
>> > >in general.
>> > >
>> > >Consider, for example, the arm port where we can have functions built
>> > >for ARM and functions built for Thumb.  You cannot, in general, inline
>> > >either into the other, (think inline asm, or isa-specific intrinsic
>> > >operations).
>> > >
>> > >I think this approach only works at all because it's generally
>> > >meaningless to have some ISA that is the default and then construct
>> > >other functions later in the compilation that deliberately target a
>> > >smaller subset of the ISA.  But such an assumption doesn't apply when
>> > >you construct functions in completely separate ISAs which are
>> > >call-compatible but otherwise distinct.
>> >
>> > Note the patch doesn't change anything for NON-LTO, where you never see
>> > target_option_default_node. Just LTO forces those in place of NULL to
>> > make cross-TU behavior more like you suggest.
>>
>> So thinking about this some more we lack the ability to distinguish
>> between no explicit target attribute and an explicit target attribute
>> that matches the global defaults with LTO.  Without LTO it is the
>> target that decides whether to put target_option_default_node into
>> DECL_FUNCTION_SPECIFIC_TARGET but LTO treats the global options
>> as explicit target attribute for all functions of a TU.
>>
>> That said, the
>>
>>   /* If callee has no option attributes, then it is ok to inline */
>>   if (!callee_opts)
>>     ret = true;
>>
>> tries to allow inlining of functions with no explicit attribute
>> into any other function which IMHO is reasonable?
>>
>> That argues for sth like DECL_FUNCTION_SPECIFIC_TARGET_USER_P and
>> DECL_FUNCTION_SPECIFIC_OPTIMIZATION_USER_P.
>>
>> Does that make any sense?  The question is whether, in LTO context,
>> differing global options are USER_P or not ...
>>
>> OTOH as you say usually function specific opts are a superset of
>> global opts which means if the target properly checks two option
>> sets against each other the effect should be similar.
>
> Which means we can also do the following (only the default and the x86
> hook changed), always compare and properly treat a not present
> DECL_FUNCTION_SPECIFIC_TARGET as target_option_default_node.
>
> To fix PR81921 we need to make sure to process target("sse2") when
> it might change fpmath use (for -m32).  Not sure if that captures
> all cases so I wonder whether skipping the processing is not
> premature optimization -- we should do this at most once per function?
>
> Bootstrap & regtest running on x86_64-unknown-linux-gnu, ok for trunk
> for the x86 part?  Is the ix86_valid_target_attribute_tree change ok
> for branches?
>
> Note that as there is the chance this now rejects cases that it didn't
> before even w/o -flto such change isn't appropriate for the branches
> (the ix86_valid_target_attribute_tree change is).  Note I arrived
> here because enabling free-lang-data unconditionally broke bootstrap,
> so the changes to the other archs where a similar thing may happen --
> you can check with
>
> Index: gcc/tree.c
> ===================================================================
> --- gcc/tree.c  (revision 251274)
> +++ gcc/tree.c  (working copy)
> @@ -5660,8 +5660,7 @@ free_lang_data (void)
>    unsigned i;
>
>    /* If we are the LTO frontend we have freed lang-specific data already.
> */
> -  if (in_lto_p
> -      || (!flag_generate_lto && !flag_generate_offload))
> +  if (in_lto_p)
>      return 0;
>
>    /* Provide a dummy TRANSLATION_UNIT_DECL if the FE failed to provide
> one.  */
>
> Richard, can you add a testcase where it is invalid to inline
> -mthumb/arm code into the other?  I suppose that requires some
> inline asm to break.  Or is there already one?
>
> Thanks,
> Richard.
>
> 2017-08-23  Richard Biener  <rguenther@suse.de>
>
>         PR target/81921
>         * targhooks.c (default_target_can_inline_p): Properly
>         use target_option_default_node when no DECL_FUNCTION_SPECIFIC_TARGET
>         is present and always compare.
>         * config/i386/i386.c (ix86_valid_target_attribute_tree): Make
>         sure to process the attribute when fpmath may change.
>         (ix86_can_inline_p): Properly use target_option_default_node when
>         no DECL_FUNCTION_SPECIFIC_TARGET is present and always compare.
>
> Index: gcc/targhooks.c
> ===================================================================
> --- gcc/targhooks.c     (revision 251275)
> +++ gcc/targhooks.c     (working copy)
> @@ -1442,27 +1442,18 @@ default_target_option_pragma_parse (tree
>  bool
>  default_target_can_inline_p (tree caller, tree callee)
>  {
> -  bool ret = false;
>    tree callee_opts = DECL_FUNCTION_SPECIFIC_TARGET (callee);
>    tree caller_opts = DECL_FUNCTION_SPECIFIC_TARGET (caller);
> -
> -  /* If callee has no option attributes, then it is ok to inline */
> -  if (!callee_opts)
> -    ret = true;
> -
> -  /* If caller has no option attributes, but callee does then it is not ok to
> -     inline */
> -  else if (!caller_opts)
> -    ret = false;
> +  if (! callee_opts)
> +    callee_opts = target_option_default_node;
> +  if (! caller_opts)
> +    caller_opts = target_option_default_node;
>
>    /* If both caller and callee have attributes, assume that if the
>       pointer is different, the two functions have different target
>       options since build_target_option_node uses a hash table for the
>       options.  */
> -  else
> -    ret = (callee_opts == caller_opts);
> -
> -  return ret;
> +  return callee_opts == caller_opts;
>  }
>
>  /* If the machine does not have a case insn that compares the bounds,
> Index: gcc/config/i386/i386.c
> ===================================================================
> --- gcc/config/i386/i386.c      (revision 251275)
> +++ gcc/config/i386/i386.c      (working copy)
> @@ -7369,7 +7369,8 @@ ix86_valid_target_attribute_tree (tree a
>        || opts->x_target_flags != def->x_target_flags
>        || option_strings[IX86_FUNCTION_SPECIFIC_ARCH]
>        || option_strings[IX86_FUNCTION_SPECIFIC_TUNE]
> -      || enum_opts_set.x_ix86_fpmath)
> +      || enum_opts_set.x_ix86_fpmath
> +      || !TARGET_64BIT_P (opts->x_ix86_isa_flags))

You should use (!TARGET_64BIT_P ... && TARGET_SSE_P ...) to match the
condition in the special fpmath processing part. (BTW: The comment in
that part is wrong, we need SSE, not sse2 on 32-bit targets to use SSE
math.)

Other x86 changes LGTM.

Uros.

>      {
>        /* If we are using the default tune= or arch=, undo the string assigned,
>          and use the default.  */
> @@ -7502,53 +7503,47 @@ ix86_valid_target_attribute_p (tree fnde
>  static bool
>  ix86_can_inline_p (tree caller, tree callee)
>  {
> -  bool ret = false;
>    tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
>    tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
> -
> -  /* If callee has no option attributes, then it is ok to inline.  */
>    if (!callee_tree)
> -    ret = true;
> +    callee_tree = target_option_default_node;
> +  if (!caller_tree)
> +    caller_tree = target_option_default_node;
> +  if (callee_tree == caller_tree)
> +    return true;
>
> -  /* If caller has no option attributes, but callee does then it is not ok to
> -     inline.  */
> -  else if (!caller_tree)
> +  struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree);
> +  struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
> +  bool ret = false;
> +
> +  /* Callee's isa options should be a subset of the caller's, i.e. a SSE4
> +     function can inline a SSE2 function but a SSE2 function can't inline
> +     a SSE4 function.  */
> +  if (((caller_opts->x_ix86_isa_flags & callee_opts->x_ix86_isa_flags)
> +       != callee_opts->x_ix86_isa_flags)
> +      || ((caller_opts->x_ix86_isa_flags2 & callee_opts->x_ix86_isa_flags2)
> +         != callee_opts->x_ix86_isa_flags2))
>      ret = false;
>
> -  else
> -    {
> -      struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree);
> -      struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
> +  /* See if we have the same non-isa options.  */
> +  else if (caller_opts->x_target_flags != callee_opts->x_target_flags)
> +    ret = false;
>
> -      /* Callee's isa options should be a subset of the caller's, i.e. a SSE4
> -        function can inline a SSE2 function but a SSE2 function can't inline
> -        a SSE4 function.  */
> -      if (((caller_opts->x_ix86_isa_flags & callee_opts->x_ix86_isa_flags)
> -          != callee_opts->x_ix86_isa_flags)
> -         || ((caller_opts->x_ix86_isa_flags2 & callee_opts->x_ix86_isa_flags2)
> -             != callee_opts->x_ix86_isa_flags2))
> -       ret = false;
> -
> -      /* See if we have the same non-isa options.  */
> -      else if (caller_opts->x_target_flags != callee_opts->x_target_flags)
> -       ret = false;
> -
> -      /* See if arch, tune, etc. are the same.  */
> -      else if (caller_opts->arch != callee_opts->arch)
> -       ret = false;
> -
> -      else if (caller_opts->tune != callee_opts->tune)
> -       ret = false;
> +  /* See if arch, tune, etc. are the same.  */
> +  else if (caller_opts->arch != callee_opts->arch)
> +    ret = false;
>
> -      else if (caller_opts->x_ix86_fpmath != callee_opts->x_ix86_fpmath)
> -       ret = false;
> +  else if (caller_opts->tune != callee_opts->tune)
> +    ret = false;
>
> -      else if (caller_opts->branch_cost != callee_opts->branch_cost)
> -       ret = false;
> +  else if (caller_opts->x_ix86_fpmath != callee_opts->x_ix86_fpmath)
> +    ret = false;
>
> -      else
> -       ret = true;
> -    }
> +  else if (caller_opts->branch_cost != callee_opts->branch_cost)
> +    ret = false;
> +
> +  else
> +    ret = true;
>
>    return ret;
>  }
> Index: gcc/testsuite/gcc.target/i386/pr81921.c
> ===================================================================
> --- gcc/testsuite/gcc.target/i386/pr81921.c     (nonexistent)
> +++ gcc/testsuite/gcc.target/i386/pr81921.c     (working copy)
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target lto } */
> +/* { dg-options "-flto -march=x86-64" } */
> +
> +extern __inline int  __attribute__((__gnu_inline__, __always_inline__, __artificial__, target("sse2")))
> +_mm_loadu_si128 (int const *__P)
> +{
> +    return *__P;
> +}
> +
> +void __attribute__((target("ssse3"))) foo (void *p)
> +{
> +  volatile int x = _mm_loadu_si128 (p);
> +}


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]