This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [4.9.1 RFA] [tree-optimization/60902] Invalidate outputs of GIMPLE_ASMs when threading around loops
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Jeff Law <law at redhat dot com>
- Cc: gcc-patches <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 24 Apr 2014 10:22:24 +0200
- Subject: Re: [4.9.1 RFA] [tree-optimization/60902] Invalidate outputs of GIMPLE_ASMs when threading around loops
- Authentication-results: sourceware.org; auth=none
- References: <5358000B dot 4070302 at redhat dot com>
On Wed, Apr 23, 2014 at 8:01 PM, Jeff Law <law@redhat.com> wrote:
>
> The more aggressive threading across loop backedges requires invalidating
> equivalences that do not hold across all iterations of a loop.
>
> At first glance, invaliding at PHI nodes should be sufficient as any
> statement which potentially generated a new equivalence would be reprocessed
> as we come across the backedge. However, there is one important case where
> that does not hold.
>
> Specifically we might have derived a value from a conditional and the
> conditional might have been fed by a statement that doesn't produce useful
> equivalences (such as a GIMPLE_ASM). Thus the equivalence from the
> conditional is still visible because no new equivalence will be recorded for
> the GIMPLE_ASM.
>
> So if the result of the GIMPLE_ASM that gets used in the conditional varies
> from one loop iteration to the next, we could use a stale value from a prior
> iteration to thread the current iteration. That's exactly what happens in
> the ffmpeg code.
>
> Bootstrapped and regression tested on x86_64-unknown-linux-gnu. Also
> verified that the sample audio in the referenced BZs no longer chops off
> after ~2 seconds.
>
> Installed on the trunk. OK for 4.9.1 after a suitable soak period on the
> trunk?
Ok, but ...
>
>
>
> commit 02269351ce3a81b5470b8137fb3c34bca27011da
> Author: Jeff Law <law@redhat.com>
> Date: Wed Apr 23 00:25:47 2014 -0600
>
> PR tree-optimization/60902
> * tree-ssa-threadedge.c
> (record_temporary_equivalences_from_stmts_at_dest): Make sure to
> invalidate outputs from statements that do not produce useful
> outputs for threading.
>
> PR tree-optimization/60902
> * gcc.target/i386/pr60902.c: New test.
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 638c0da..ddebba7 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,11 @@
> +2014-04-23 Jeff Law <law@redhat.com>
> +
> + PR tree-optimization/60902
> + * tree-ssa-threadedge.c
> + (record_temporary_equivalences_from_stmts_at_dest): Make sure to
> + invalidate outputs from statements that do not produce useful
> + outputs for threading.
> +
> 2014-04-23 Venkataramanan Kumar <venkataramanan.kumar@linaro.org>
>
> * config/aarch64/aarch64.md (stack_protect_set, stack_protect_test)
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 126ad08..62b07f4 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2014-04-23 Jeff Law <law@redhat.com>
> +
> + PR tree-optimization/60902
> + * gcc.target/i386/pr60902.c: New test.
> +
> 2014-04-23 Alex Velenko <Alex.Velenko@arm.com>
>
> * gcc.target/aarch64/vdup_lane_1.c: New testcase.
> diff --git a/gcc/testsuite/gcc.target/i386/pr60902.c
> b/gcc/testsuite/gcc.target/i386/pr60902.c
> new file mode 100644
> index 0000000..b81dcd7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr60902.c
> @@ -0,0 +1,32 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +extern void abort ();
> +extern void exit (int);
> +
> +int x;
> +
> +foo()
> +{
> + static int count;
> + count++;
> + if (count > 1)
> + abort ();
> +}
> +
> +static inline int
> +frob ()
> +{
> + int a;
> + __asm__ ("mov %1, %0\n\t" : "=r" (a) : "m" (x));
> + x++;
> + return a;
> +}
> +
> +int
> +main ()
> +{
> + int i;
> + for (i = 0; i < 10 && frob () == 0; i++)
> + foo();
> + exit (0);
> +}
> diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
> index c447b72..8a0103b 100644
> --- a/gcc/tree-ssa-threadedge.c
> +++ b/gcc/tree-ssa-threadedge.c
> @@ -387,7 +387,34 @@ record_temporary_equivalences_from_stmts_at_dest (edge
> e,
> && (gimple_code (stmt) != GIMPLE_CALL
> || gimple_call_lhs (stmt) == NULL_TREE
> || TREE_CODE (gimple_call_lhs (stmt)) != SSA_NAME))
> - continue;
> + {
> + /* STMT might still have DEFS and we need to invalidate any known
> + equivalences for them.
> +
> + Consider if STMT is a GIMPLE_ASM with one or more outputs that
> + feeds a conditional inside a loop. We might derive an
> equivalence
> + due to the conditional. */
> + tree op;
> + ssa_op_iter iter;
> +
> + if (backedge_seen)
> + FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_ALL_DEFS)
You only need SSA_OP_DEF here, no need to process virtual
operands.
> + {
> + /* This call only invalidates equivalences created by
> + PHI nodes. This is by design to keep the cost of
> + of invalidation reasonable. */
> + invalidate_equivalences (op, stack, src_map, dst_map);
> +
> + /* However, conditionals can imply values for real
> + operands as well. And those won't be recorded in the
> + maps. In fact, those equivalences may be recorded
> totally
> + outside the threading code. We can just create a new
> + temporary NULL equivalence here. */
> + record_temporary_equivalence (op, NULL_TREE, stack);
> + }
> +
> + continue;
> + }
>
> /* The result of __builtin_object_size depends on all the arguments
> of a phi node. Temporarily using only one edge produces invalid
>