This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Work around PR65873
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Jan Hubicka <hubicka at ucw dot cz>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 13 May 2015 15:12:50 +0200
- Subject: Re: Work around PR65873
- Authentication-results: sourceware.org; auth=none
- References: <20150512221234 dot GA54352 at kam dot mff dot cuni dot cz>
On Wed, May 13, 2015 at 12:12 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
>
> Hi,
> this patch works around a PR where we refuse to inline an always_inline memcpy
> into a function with an explicit Ofast optimization attribute. This is because
> we think we cannot promote -fno-fast-math code to -ffast-math code.
> This is of course safe for memcpy because it contains no fast-math code,
> but we don't really do that analysis for each of the flags we check.
>
> Earlier compilers happily produced wrong code here, and it seems practical
> to do the same on GCC 5 to deal with the common fallout. I will implement the
> more fine-grained check incrementally.
>
> Bootstrapped/regtested on x86_64-linux. Will commit it to mainline shortly and
> to the release branch later this week.
Hmm, the changelog or the description above doesn't cover the
@@ -481,6 +490,17 @@ can_inline_edge_p (struct cgraph_edge *e
e->inline_failed = CIF_OPTIMIZATION_MISMATCH;
inlinable = false;
}
+ /* If explicit optimize attributes are not used, the mismatch is caused
+ by different command line options used to build different units.
+ Do not care about COMDAT functions - those are intended to be
+ optimized with the optimization flags of the module they are used in.
+ Also do not care about mixing up size/speed optimization when
+ DECL_DISREGARD_INLINE_LIMITS is set. */
+ else if ((callee->merged
+ && !lookup_attribute ("optimize",
+ DECL_ATTRIBUTES (caller->decl)))
+ || DECL_DISREGARD_INLINE_LIMITS (callee->decl))
+ ;
/* If the mismatch is caused by merging two LTO units with different
optimization flags we want to be a bit nicer. However, never inline
if one of the functions is not optimized at all. */
hunk.
> PR ipa/65873
> * ipa-inline.c (can_inline_edge_p): Allow early inlining of always
> inlines across optimization boundary.
> * testsuite/gcc.c-torture/compile/pr65873.c: New testcase.
> Index: ipa-inline.c
> ===================================================================
> --- ipa-inline.c (revision 223093)
> +++ ipa-inline.c (working copy)
> @@ -427,46 +427,55 @@ can_inline_edge_p (struct cgraph_edge *e
> && lookup_attribute ("always_inline",
> DECL_ATTRIBUTES (callee->decl)));
>
> + /* Until GCC 4.9 we did not check the semantics-altering flags
> + below and inlined across the optimization boundary.
> + Enabling the checks below breaks several packages by refusing
> + to inline library always_inline functions. See PR65873.
> + Disable the check for early inlining for now until a better solution
> + is found. */
> + if (always_inline && early)
> + ;
> /* There are some options that change IL semantics which means
> we cannot inline in these cases for correctness reasons.
> Not even for always_inline declared functions. */
> /* Strictly speaking only when the callee contains signed integer
> math where overflow is undefined. */
> - if ((check_maybe_up (flag_strict_overflow)
> - /* this flag is set by optimize. Allow inlining across
> - optimize boundary. */
> - && (!opt_for_fn (caller->decl, optimize)
> - == !opt_for_fn (callee->decl, optimize) || !always_inline))
> - || check_match (flag_wrapv)
> - || check_match (flag_trapv)
> - /* Strictly speaking only when the callee uses FP math. */
> - || check_maybe_up (flag_rounding_math)
> - || check_maybe_up (flag_trapping_math)
> - || check_maybe_down (flag_unsafe_math_optimizations)
> - || check_maybe_down (flag_finite_math_only)
> - || check_maybe_up (flag_signaling_nans)
> - || check_maybe_down (flag_cx_limited_range)
> - || check_maybe_up (flag_signed_zeros)
> - || check_maybe_down (flag_associative_math)
> - || check_maybe_down (flag_reciprocal_math)
> - /* We do not want to make code compiled with exceptions to be brought
> - into a non-EH function unless we know that the callee does not
> - throw. This is tracked by DECL_FUNCTION_PERSONALITY. */
> - || (check_match (flag_non_call_exceptions)
> - /* TODO: We also may allow bringing !flag_non_call_exceptions
> - to flag_non_call_exceptions function, but that may need
> - extra work in tree-inline to add the extra EH edges. */
> - && (!opt_for_fn (callee->decl, flag_non_call_exceptions)
> - || DECL_FUNCTION_PERSONALITY (callee->decl)))
> - || (check_maybe_up (flag_exceptions)
> - && DECL_FUNCTION_PERSONALITY (callee->decl))
> - /* Strictly speaking only when the callee contains function
> - calls that may end up setting errno. */
> - || check_maybe_up (flag_errno_math)
> - /* When devirtualization is diabled for callee, it is not safe
> - to inline it as we possibly mangled the type info.
> - Allow early inlining of always inlines. */
> - || (!early && check_maybe_down (flag_devirtualize)))
> + else if ((check_maybe_up (flag_strict_overflow)
> + /* this flag is set by optimize. Allow inlining across
> + optimize boundary. */
> + && (!opt_for_fn (caller->decl, optimize)
> + == !opt_for_fn (callee->decl, optimize) || !always_inline))
> + || check_match (flag_wrapv)
> + || check_match (flag_trapv)
> + /* Strictly speaking only when the callee uses FP math. */
> + || check_maybe_up (flag_rounding_math)
> + || check_maybe_up (flag_trapping_math)
> + || check_maybe_down (flag_unsafe_math_optimizations)
> + || check_maybe_down (flag_finite_math_only)
> + || check_maybe_up (flag_signaling_nans)
> + || check_maybe_down (flag_cx_limited_range)
> + || check_maybe_up (flag_signed_zeros)
> + || check_maybe_down (flag_associative_math)
> + || check_maybe_down (flag_reciprocal_math)
> + /* We do not want code compiled with exceptions to be
> + brought into a non-EH function unless we know that the callee
> + does not throw.
> + This is tracked by DECL_FUNCTION_PERSONALITY. */
> + || (check_match (flag_non_call_exceptions)
> + /* TODO: We also may allow bringing !flag_non_call_exceptions
> + to flag_non_call_exceptions function, but that may need
> + extra work in tree-inline to add the extra EH edges. */
> + && (!opt_for_fn (callee->decl, flag_non_call_exceptions)
> + || DECL_FUNCTION_PERSONALITY (callee->decl)))
> + || (check_maybe_up (flag_exceptions)
> + && DECL_FUNCTION_PERSONALITY (callee->decl))
> + /* Strictly speaking only when the callee contains function
> + calls that may end up setting errno. */
> + || check_maybe_up (flag_errno_math)
> + /* When devirtualization is disabled for the callee, it is not safe
> + to inline it as we possibly mangled the type info.
> + Allow early inlining of always inlines. */
> + || (!early && check_maybe_down (flag_devirtualize)))
> {
> e->inline_failed = CIF_OPTIMIZATION_MISMATCH;
> inlinable = false;
> @@ -481,6 +490,17 @@ can_inline_edge_p (struct cgraph_edge *e
> e->inline_failed = CIF_OPTIMIZATION_MISMATCH;
> inlinable = false;
> }
> + /* If explicit optimize attributes are not used, the mismatch is caused
> + by different command line options used to build different units.
> + Do not care about COMDAT functions - those are intended to be
> + optimized with the optimization flags of the module they are used in.
> + Also do not care about mixing up size/speed optimization when
> + DECL_DISREGARD_INLINE_LIMITS is set. */
> + else if ((callee->merged
> + && !lookup_attribute ("optimize",
> + DECL_ATTRIBUTES (caller->decl)))
> + || DECL_DISREGARD_INLINE_LIMITS (callee->decl))
> + ;
> /* If the mismatch is caused by merging two LTO units with different
> optimization flags we want to be a bit nicer. However, never inline
> if one of the functions is not optimized at all. */
> Index: testsuite/gcc.c-torture/compile/pr65873.c
> ===================================================================
> --- testsuite/gcc.c-torture/compile/pr65873.c (revision 0)
> +++ testsuite/gcc.c-torture/compile/pr65873.c (revision 0)
> @@ -0,0 +1,14 @@
> +typedef __SIZE_TYPE__ size_t;
> +
> +extern inline __attribute__ ((__always_inline__, __gnu_inline__, __artificial__, __nothrow__, __leaf__)) void *
> +memcpy (void *__restrict __dest, const void *__restrict __src, size_t __len)
> +{
> + return __builtin___memcpy_chk (__dest, __src, __len, __builtin_object_size (__dest, 0));
> +}
> +
> +__attribute__((optimize ("Ofast"))) void
> +bar (void *d, void *s, size_t l)
> +{
> + memcpy (d, s, l);
> +}
> +