This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [4.9.1 RFA] [tree-optimization/60902] Invalidate outputs of GIMPLE_ASMs when threading around loops


On Wed, Apr 23, 2014 at 8:01 PM, Jeff Law <law@redhat.com> wrote:
>
> The more aggressive threading across loop backedges requires invalidating
> equivalences that do not hold across all iterations of a loop.
>
> At first glance, invaliding at PHI nodes should be sufficient as any
> statement which potentially generated a new equivalence would be reprocessed
> as we come across the backedge.  However, there is one important case where
> that does not hold.
>
> Specifically we might have derived a value from a conditional and the
> conditional might have been fed by a statement that doesn't produce useful
> equivalences (such as a GIMPLE_ASM).  Thus the equivalence from the
> conditional is still visible because no new equivalence will be recorded for
> the GIMPLE_ASM.
>
> So if the result of the GIMPLE_ASM that gets used in the conditional varies
> from one loop iteration to the next, we could use a stale value from a prior
> iteration to thread the current iteration.  That's exactly what happens in
> the ffmpeg code.
>
> Bootstrapped and regression tested on x86_64-unknown-linux-gnu.  Also
> verified that the sample audio in the referenced BZs no longer chops off
> after ~2 seconds.
>
> Installed on the trunk.  OK for 4.9.1 after a suitable soak period on the
> trunk?

Ok, but ...

>
>
>
> commit 02269351ce3a81b5470b8137fb3c34bca27011da
> Author: Jeff Law <law@redhat.com>
> Date:   Wed Apr 23 00:25:47 2014 -0600
>
>         PR tree-optimization/60902
>         * tree-ssa-threadedge.c
>         (record_temporary_equivalences_from_stmts_at_dest): Make sure to
>         invalidate outputs from statements that do not produce useful
>         outputs for threading.
>
>         PR tree-optimization/60902
>         * gcc.target/i386/pr60902.c: New test.
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 638c0da..ddebba7 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,11 @@
> +2014-04-23  Jeff Law  <law@redhat.com>
> +
> +       PR tree-optimization/60902
> +       * tree-ssa-threadedge.c
> +       (record_temporary_equivalences_from_stmts_at_dest): Make sure to
> +       invalidate outputs from statements that do not produce useful
> +       outputs for threading.
> +
>  2014-04-23 Venkataramanan Kumar  <venkataramanan.kumar@linaro.org>
>
>         * config/aarch64/aarch64.md (stack_protect_set, stack_protect_test)
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 126ad08..62b07f4 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2014-04-23  Jeff Law  <law@redhat.com>
> +
> +       PR tree-optimization/60902
> +       * gcc.target/i386/pr60902.c: New test.
> +
>  2014-04-23  Alex Velenko  <Alex.Velenko@arm.com>
>
>         * gcc.target/aarch64/vdup_lane_1.c: New testcase.
> diff --git a/gcc/testsuite/gcc.target/i386/pr60902.c
> b/gcc/testsuite/gcc.target/i386/pr60902.c
> new file mode 100644
> index 0000000..b81dcd7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr60902.c
> @@ -0,0 +1,32 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +extern void abort ();
> +extern void exit (int);
> +
> +int x;
> +
> +foo()
> +{
> +  static int count;
> +  count++;
> +  if (count > 1)
> +    abort ();
> +}
> +
> +static inline int
> +frob ()
> +{
> +  int a;
> +  __asm__ ("mov %1, %0\n\t" : "=r" (a) : "m" (x));
> +  x++;
> +  return a;
> +}
> +
> +int
> +main ()
> +{
> +  int i;
> +  for (i = 0; i < 10 && frob () == 0; i++)
> +    foo();
> +  exit (0);
> +}
> diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
> index c447b72..8a0103b 100644
> --- a/gcc/tree-ssa-threadedge.c
> +++ b/gcc/tree-ssa-threadedge.c
> @@ -387,7 +387,34 @@ record_temporary_equivalences_from_stmts_at_dest (edge
> e,
>            && (gimple_code (stmt) != GIMPLE_CALL
>                || gimple_call_lhs (stmt) == NULL_TREE
>                || TREE_CODE (gimple_call_lhs (stmt)) != SSA_NAME))
> -       continue;
> +       {
> +         /* STMT might still have DEFS and we need to invalidate any known
> +            equivalences for them.
> +
> +            Consider if STMT is a GIMPLE_ASM with one or more outputs that
> +            feeds a conditional inside a loop.  We might derive an
> equivalence
> +            due to the conditional.  */
> +         tree op;
> +         ssa_op_iter iter;
> +
> +         if (backedge_seen)
> +           FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_ALL_DEFS)

You only need SSA_OP_DEF here, no need to process virtual
operands.

> +             {
> +               /* This call only invalidates equivalences created by
> +                  PHI nodes.  This is by design to keep the cost of
> +                  of invalidation reasonable.  */
> +               invalidate_equivalences (op, stack, src_map, dst_map);
> +
> +               /* However, conditionals can imply values for real
> +                  operands as well.  And those won't be recorded in the
> +                  maps.  In fact, those equivalences may be recorded
> totally
> +                  outside the threading code.  We can just create a new
> +                  temporary NULL equivalence here.  */
> +               record_temporary_equivalence (op, NULL_TREE, stack);
> +             }
> +
> +         continue;
> +       }
>
>        /* The result of __builtin_object_size depends on all the arguments
>          of a phi node. Temporarily using only one edge produces invalid
>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]