This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Work around PR65873
- From: Jan Hubicka <hubicka at ucw dot cz>
- To: Richard Biener <richard dot guenther at gmail dot com>
- Cc: Jan Hubicka <hubicka at ucw dot cz>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 13 May 2015 16:16:51 +0200
- Subject: Re: Work around PR65873
- Authentication-results: sourceware.org; auth=none
- References: <20150512221234 dot GA54352 at kam dot mff dot cuni dot cz> <CAFiYyc1pZH5FqVchrOPy6S2Z2N-_CwZTyM=f8LGdc6rZYmt=sg at mail dot gmail dot com>
> On Wed, May 13, 2015 at 12:12 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> >
> > Hi,
> > this patch works around PR where we refuse to inline always_inline memcpy
> > into function with explicit Ofast optimization attribute. This is because
> > we think we can not promote -fno-fast-math code to -fast-math code.
> > This is of course safe for memcpy because it contains to fast-math code,
> > but we don't really do the analysis for each of the flags we check.
> >
> > Earlier compilers was happily producing wrong code here and it seems practical
> > to do that on GCC 5 to deal with common falout. I will implement the more
> > fine grained check incrementally.
> >
> > Bootstrapped/regtested x86_64-linux. Will commit it to mainline shortly and
> > to release branch later this week.
>
> Hmm, the changelog or the description above doesn't cover the
>
> @@ -481,6 +490,17 @@ can_inline_edge_p (struct cgraph_edge *e
> e->inline_failed = CIF_OPTIMIZATION_MISMATCH;
> inlinable = false;
> }
> + /* If explicit optimize attribute are not used, the mismatch is caused
> + by different command line options used to build different units.
> + Do not care about COMDAT functions - those are intended to be
> + optimized with the optimization flags of module they are used in.
> + Also do not care about mixing up size/speed optimization when
> + DECL_DISREGARD_INLINE_LIMITS is set. */
> + else if ((callee->merged
> + && !lookup_attribute ("optimize",
> + DECL_ATTRIBUTES (caller->decl)))
> + || DECL_DISREGARD_INLINE_LIMITS (callee->decl))
> + ;
> /* If mismatch is caused by merging two LTO units with different
> optimizationflags we want to be bit nicer. However never inline
> if one of functions is not optimized at all. */
>
> hunk.
Ah, sorry. The change was intended, but I forgot to write about it.
Yep, as comment says this makes us a bit more relaxed about inlining COMDATs
when the prevailing variant is -Os and caller body is -O3 or vice versa.
Honza
>
> > PR ipa/65873
> > * ipa-inline.c (can_inline_edge_p): Allow early inlining of always
> > inlines across optimization boundary.
> > * testsuite/gcc.c-torture/compile/pr65873.c: New testcase.
> > Index: ipa-inline.c
> > ===================================================================
> > --- ipa-inline.c (revision 223093)
> > +++ ipa-inline.c (working copy)
> > @@ -427,46 +427,55 @@ can_inline_edge_p (struct cgraph_edge *e
> > && lookup_attribute ("always_inline",
> > DECL_ATTRIBUTES (callee->decl)));
> >
> > + /* Until GCC 4.9 we did not check the semantics alterning flags
> > + bellow and inline across optimization boundry.
> > + Enabling checks bellow breaks several packages by refusing
> > + to inline library always_inline functions. See PR65873.
> > + Disable the check for early inlining for now until better solution
> > + is found. */
> > + if (always_inline && early)
> > + ;
> > /* There are some options that change IL semantics which means
> > we cannot inline in these cases for correctness reason.
> > Not even for always_inline declared functions. */
> > /* Strictly speaking only when the callee contains signed integer
> > math where overflow is undefined. */
> > - if ((check_maybe_up (flag_strict_overflow)
> > - /* this flag is set by optimize. Allow inlining across
> > - optimize boundary. */
> > - && (!opt_for_fn (caller->decl, optimize)
> > - == !opt_for_fn (callee->decl, optimize) || !always_inline))
> > - || check_match (flag_wrapv)
> > - || check_match (flag_trapv)
> > - /* Strictly speaking only when the callee uses FP math. */
> > - || check_maybe_up (flag_rounding_math)
> > - || check_maybe_up (flag_trapping_math)
> > - || check_maybe_down (flag_unsafe_math_optimizations)
> > - || check_maybe_down (flag_finite_math_only)
> > - || check_maybe_up (flag_signaling_nans)
> > - || check_maybe_down (flag_cx_limited_range)
> > - || check_maybe_up (flag_signed_zeros)
> > - || check_maybe_down (flag_associative_math)
> > - || check_maybe_down (flag_reciprocal_math)
> > - /* We do not want to make code compiled with exceptions to be brought
> > - into a non-EH function unless we know that the callee does not
> > - throw. This is tracked by DECL_FUNCTION_PERSONALITY. */
> > - || (check_match (flag_non_call_exceptions)
> > - /* TODO: We also may allow bringing !flag_non_call_exceptions
> > - to flag_non_call_exceptions function, but that may need
> > - extra work in tree-inline to add the extra EH edges. */
> > - && (!opt_for_fn (callee->decl, flag_non_call_exceptions)
> > - || DECL_FUNCTION_PERSONALITY (callee->decl)))
> > - || (check_maybe_up (flag_exceptions)
> > - && DECL_FUNCTION_PERSONALITY (callee->decl))
> > - /* Strictly speaking only when the callee contains function
> > - calls that may end up setting errno. */
> > - || check_maybe_up (flag_errno_math)
> > - /* When devirtualization is diabled for callee, it is not safe
> > - to inline it as we possibly mangled the type info.
> > - Allow early inlining of always inlines. */
> > - || (!early && check_maybe_down (flag_devirtualize)))
> > + else if ((check_maybe_up (flag_strict_overflow)
> > + /* this flag is set by optimize. Allow inlining across
> > + optimize boundary. */
> > + && (!opt_for_fn (caller->decl, optimize)
> > + == !opt_for_fn (callee->decl, optimize) || !always_inline))
> > + || check_match (flag_wrapv)
> > + || check_match (flag_trapv)
> > + /* Strictly speaking only when the callee uses FP math. */
> > + || check_maybe_up (flag_rounding_math)
> > + || check_maybe_up (flag_trapping_math)
> > + || check_maybe_down (flag_unsafe_math_optimizations)
> > + || check_maybe_down (flag_finite_math_only)
> > + || check_maybe_up (flag_signaling_nans)
> > + || check_maybe_down (flag_cx_limited_range)
> > + || check_maybe_up (flag_signed_zeros)
> > + || check_maybe_down (flag_associative_math)
> > + || check_maybe_down (flag_reciprocal_math)
> > + /* We do not want to make code compiled with exceptions to be
> > + brought into a non-EH function unless we know that the callee
> > + does not throw.
> > + This is tracked by DECL_FUNCTION_PERSONALITY. */
> > + || (check_match (flag_non_call_exceptions)
> > + /* TODO: We also may allow bringing !flag_non_call_exceptions
> > + to flag_non_call_exceptions function, but that may need
> > + extra work in tree-inline to add the extra EH edges. */
> > + && (!opt_for_fn (callee->decl, flag_non_call_exceptions)
> > + || DECL_FUNCTION_PERSONALITY (callee->decl)))
> > + || (check_maybe_up (flag_exceptions)
> > + && DECL_FUNCTION_PERSONALITY (callee->decl))
> > + /* Strictly speaking only when the callee contains function
> > + calls that may end up setting errno. */
> > + || check_maybe_up (flag_errno_math)
> > + /* When devirtualization is diabled for callee, it is not safe
> > + to inline it as we possibly mangled the type info.
> > + Allow early inlining of always inlines. */
> > + || (!early && check_maybe_down (flag_devirtualize)))
> > {
> > e->inline_failed = CIF_OPTIMIZATION_MISMATCH;
> > inlinable = false;
> > @@ -481,6 +490,17 @@ can_inline_edge_p (struct cgraph_edge *e
> > e->inline_failed = CIF_OPTIMIZATION_MISMATCH;
> > inlinable = false;
> > }
> > + /* If explicit optimize attribute are not used, the mismatch is caused
> > + by different command line options used to build different units.
> > + Do not care about COMDAT functions - those are intended to be
> > + optimized with the optimization flags of module they are used in.
> > + Also do not care about mixing up size/speed optimization when
> > + DECL_DISREGARD_INLINE_LIMITS is set. */
> > + else if ((callee->merged
> > + && !lookup_attribute ("optimize",
> > + DECL_ATTRIBUTES (caller->decl)))
> > + || DECL_DISREGARD_INLINE_LIMITS (callee->decl))
> > + ;
> > /* If mismatch is caused by merging two LTO units with different
> > optimizationflags we want to be bit nicer. However never inline
> > if one of functions is not optimized at all. */
> > Index: testsuite/gcc.c-torture/compile/pr65873.c
> > ===================================================================
> > --- testsuite/gcc.c-torture/compile/pr65873.c (revision 0)
> > +++ testsuite/gcc.c-torture/compile/pr65873.c (revision 0)
> > @@ -0,0 +1,14 @@
> > +typedef __SIZE_TYPE__ size_t;
> > +
> > +extern inline __attribute__ ((__always_inline__, __gnu_inline__, __artificial__, __nothrow__, __leaf__)) void *
> > +memcpy (void *__restrict __dest, const void *__restrict __src, size_t __len)
> > +{
> > + return __builtin___memcpy_chk (__dest, __src, __len, __builtin_object_size (__dest, 0));
> > +}
> > +
> > +__attribute__((optimize ("Ofast"))) void
> > +bar (void *d, void *s, size_t l)
> > +{
> > + memcpy (d, s, l);
> > +}
> > +