[PR tree-optimization/71691] Fix unswitching in presence of maybe-undef SSA_NAMEs (take 2)

Aldy Hernandez aldyh@redhat.com
Tue Jan 3 17:37:00 GMT 2017


On 12/20/2016 09:16 AM, Richard Biener wrote:
> On Fri, Dec 16, 2016 at 3:41 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>> Hi folks.
>>
>> This is a follow-up on Jeff and Richi's interaction on the aforementioned PR
>> here:
>>
>>         https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01397.html
>>
>> I decided to explore the idea of analyzing may-undefness on-demand, which
>> actually looks rather cheap.
>>
>> BTW, I don't understand why we don't have auto_bitmap's, as we already have
>> auto_sbitmap's.  I've implemented the former based on auto_sbitmap's code we
>> already have.
>>
>> The attached patch fixes the bug without introducing any regressions.
>>
>> I also tested the patch by compiling 242 .ii files with -O3.  These were
>> gathered from a stage1 build with -save-temps.  There is a slight time
>> degradation of 4 seconds within 27 minutes of user time:
>>
>>         tainted:        26:52
>>         orig:           26:48
>>
>> This was the average aggregate time of two runs compiling all 242 .ii files.
>> IMO, this looks reasonable.  It is after all, -O3.    Is it acceptable?
>
> +  while (!worklist.is_empty ())
> +    {
> +      name = worklist.pop ();
> +      gcc_assert (TREE_CODE (name) == SSA_NAME);
> +
> +      if (ssa_undefined_value_p (name, true))
> +       return true;
> +
> +      bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (name));
>
> it should be already set as we use visited_ssa as "was it ever on the worklist",
> so maybe renaming it would be a good thing as well.

I don't understand what you would prefer here.

>
> +             if (TREE_CODE (name) == SSA_NAME)
> +               {
> +                 /* If an SSA has already been seen, this may be a loop.
> +                    Fail conservatively.  */
> +                 if (bitmap_bit_p (visited_ssa, SSA_NAME_VERSION (name)))
> +                   return false;
>
> so to me "conservative" is returning true, not false.

OK

>
> +                 bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (name));
> +                 worklist.safe_push (name);
>
> but for loops we can just continue and ignore this use.  And bitmap_set_bit
> returns whether it set a bit, thus
>
>                 if (bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (name)))
>                   worklist.safe_push (name);
>
> should work?

Fixed.

>
> +      /* Check that any SSA names used to define NAME is also fully
> +        defined.  */
> +      use_operand_p use_p;
> +      ssa_op_iter iter;
> +      FOR_EACH_SSA_USE_OPERAND (use_p, def, iter, SSA_OP_USE)
> +       {
> +         name = USE_FROM_PTR (use_p);
> +         if (TREE_CODE (name) == SSA_NAME)
>
> always true.
>
> +           {
> +             /* If an SSA has already been seen, this may be a loop.
> +                Fail conservatively.  */
> +             if (bitmap_bit_p (visited_ssa, SSA_NAME_VERSION (name)))
> +               return false;
> +             bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (name));
> +             worklist.safe_push (name);
>
> See above.
>
> In principle the thing is sound but I'd like to be able to pass in a bitmap of
> known maybe-undefined/must-defined SSA names to have a cache for
> multiple invocations from within a pass (like this unswitching case).

Done, though bitmaps are now calculated as part of the instantiation.

>
> Also once you hit defs that are in a post-dominated region of the loop entry
> you can treat them as not undefined (as their use invokes undefined
> behavior).  This is also how you treat function parameters (well,
> ssa_undefined_value_p does), where the call site invokes undefined behavior
> when passing in undefined values.  So we need an extra parameter specifying
> the post-dominance region.

Done.

>
> You do not handle memory or calls conservatively which means the existing
> testcase only needs some obfuscation to become a problem again.  To fix
> that before /* Check that any SSA names used to define NAME is also fully
> defined.  */ bail out conservatively, like
>
>    if (! is_gimple_assign (def)
>       || gimple_assign_single_p (def))
>     return true;

As I asked previously, I understand the !is_gimple_assign, which will 
skip over GIMPLE_CALLs setting a value, but the 
"gimple_assign_single_p(def) == true"??

Won't this last one bail on things like e.3_7 = e, or x_77 = y_88? Don't 
we want to follow up the def chain precisely on these?

The attached implementation uses a cache, and a pre-computed 
post-dominance region.  A previous incantation of this patch (sans the 
post-dominance stuff) had performance within the noise of the previous 
implementation.

I am testing again, and will do some performance comparisons in a bit, 
but for now-- are we on the same page here?  Is this what you wanted?

Aldy

p.s. I could turn the post-dominance region into a bitmap for faster 
access if preferred.
-------------- next part --------------
commit 47d0c1b3144d4d56405d72c3ad55183d8632d0a7
Author: Aldy Hernandez <aldyh@redhat.com>
Date:   Fri Dec 16 03:44:52 2016 -0500

            PR tree-optimization/71691
            * bitmap.h (class auto_bitmap): New.
            * tree-ssa-defined-or-undefined.c: New file.
            * tree-ssa-defined-or-undefined.h: New file.
            * Makefile.in (tree-ssa-defined-or-undefined.o): New.
            * tree-ssa-loop-unswitch.c (tree_ssa_unswitch_loops): Calculate
            post dominance info.
            (tree_may_unswitch_on): Add defined_or_undefined parameter.
            Use defined_or_undefined instead of ssa_undefined_value_p.
            (tree_unswitch_single_loop): Instantiate defined_or_undefined
            variable.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 2aae684..beff302 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1511,6 +1511,7 @@ OBJS = \
 	tree-ssa-coalesce.o \
 	tree-ssa-copy.o \
 	tree-ssa-dce.o \
+	tree-ssa-defined-or-undefined.o \
 	tree-ssa-dom.o \
 	tree-ssa-dse.o \
 	tree-ssa-forwprop.o \
diff --git a/gcc/bitmap.h b/gcc/bitmap.h
index 1b2c8e0..bc32a21 100644
--- a/gcc/bitmap.h
+++ b/gcc/bitmap.h
@@ -802,4 +802,25 @@ bmp_iter_and_compl (bitmap_iterator *bi, unsigned *bit_no)
        bmp_iter_and_compl (&(ITER), &(BITNUM));				\
        bmp_iter_next (&(ITER), &(BITNUM)))
 
+/* A class that ties the lifetime of a bitmap to its scope.  */
+class auto_bitmap
+{
+ public:
+  auto_bitmap () { bits = BITMAP_ALLOC (NULL); }
+  ~auto_bitmap () { BITMAP_FREE (bits); }
+  // Allow calling bitmap functions on our bitmap.
+  operator bitmap () { return bits; }
+
+ private:
+  // Prevent making a copy that references our bitmap.
+  auto_bitmap (const auto_bitmap &);
+  auto_bitmap &operator = (const auto_bitmap &);
+#if __cplusplus >= 201103L
+  auto_bitmap (auto_bitmap &&);
+  auto_bitmap &operator = (auto_bitmap &&);
+#endif
+
+  bitmap bits;
+};
+
 #endif /* GCC_BITMAP_H */
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr71691.c b/gcc/testsuite/gcc.c-torture/execute/pr71691.c
new file mode 100644
index 0000000..1ffd1a2
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr71691.c
@@ -0,0 +1,47 @@
+/* { dg-options "-fno-tree-vrp" } */
+
+char b;
+short f;
+unsigned e;
+int g = 20;
+
+void
+foo ()
+{
+  int l, h;
+  for (l = 0; l <= 7; l++)
+    {
+      int j = 38;
+      if (g)
+	h = 0;
+      for (; h <= 7; h++)
+	{
+	  int i, k = b % (j % 4);
+	  g = f;
+	  for (;;)
+	    {
+	      j = 6 || b;
+	      if (e)
+		{
+		  for (; j; --j)
+		    if (k)
+		      __builtin_printf ("%d", 9);
+		  if (i)
+		    __builtin_printf ("%d", j);
+		}
+	      if (l)
+		continue;
+	      break;
+	    }
+	  i = f || b;
+	}
+    }
+}
+
+int
+main ()
+{
+  foo ();
+  return 0;
+}
+
diff --git a/gcc/tree-ssa-defined-or-undefined.c b/gcc/tree-ssa-defined-or-undefined.c
new file mode 100644
index 0000000..f845eba
--- /dev/null
+++ b/gcc/tree-ssa-defined-or-undefined.c
@@ -0,0 +1,134 @@
+/* Simple class for classifying SSA_NAMEs that are either fully
+   defined or possibly undefined.
+
+   This is meant to support conservative analysis for optimization
+   purposes, not for generating warnings.  The analysis for generating
+   warnings is deeper to avoid generation of false positives.
+
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   Contributed by Aldy Hernandez <aldyh@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "tree.h"
+#include "gimple.h"
+#include "ssa.h"
+#include "tree-ssa.h"
+#include "tree-ssa-defined-or-undefined.h"
+
+/* Return TRUE if an SSA_NAME maybe undefined.  */
+
+bool
+defined_or_undefined::is_maybe_undefined (const tree name)
+{
+  if (in_cache (name))
+    return bitmap_bit_p (b_maybe_undef, SSA_NAME_VERSION (name));
+
+  auto_bitmap visited_ssa;
+  auto_vec<tree> worklist;
+  worklist.safe_push (name);
+  while (!worklist.is_empty ())
+    {
+      tree t = worklist.pop ();
+
+      /* If it's obviously undefined, avoid further computations.  */
+      if (ssa_undefined_value_p (t, true))
+	{
+	  set_maybe_undefined (t);
+	  if (t != name)
+	    set_maybe_undefined (name);
+	  return true;
+	}
+
+      bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (t));
+
+      /* A PARM_DECL will not have an SSA_NAME_DEF_STMT.  Parameters
+	 get their initial value from function entry.  */
+      if (SSA_NAME_VAR (t) && TREE_CODE (SSA_NAME_VAR (t)) == PARM_DECL)
+	{
+	  set_defined (t);
+	  continue;
+	}
+
+      gimple *def = SSA_NAME_DEF_STMT (t);
+
+      /* DEFs in the post-dominated region of the loop entry invoke
+	 undefined behavior.  Adding any use won't make things any
+	 worse.  */
+      for (unsigned i = 0; i < postdom.length (); ++i)
+	if (def->bb == postdom[i])
+	  {
+	    set_defined (t);
+	    return false;
+	  }
+
+      /* Check that all the PHI args are fully defined.  */
+      if (gphi *phi = dyn_cast <gphi *> (def))
+	{
+	  if (virtual_operand_p (PHI_RESULT (phi)))
+	    continue;
+	  for (unsigned i = 0; i < gimple_phi_num_args (phi); ++i)
+	    {
+	      tree t = gimple_phi_arg_def (phi, i);
+	      /* If an SSA has already been seen, it may be a loop,
+		 but we can continue and ignore this use.  Otherwise,
+		 add the SSA_NAME to the queue and visit it later.  */
+	      if (TREE_CODE (t) == SSA_NAME
+		  && bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (t)))
+		worklist.safe_push (t);
+	    }
+	  continue;
+	}
+
+      /* Handle memory and calls conservatively.  */
+      if (!is_gimple_assign (def)
+	  /* ?? Do we really want this?  The code below will return
+	     TRUE for something like "e.3_7 = e", which is probably
+	     not what we want.
+	  || gimple_assign_single_p (def)
+	  */
+	  )
+	{
+	  set_maybe_undefined (name);
+	  return true;
+	}
+
+      /* Check that any SSA names used to define NAME is also fully
+	 defined.  */
+      use_operand_p use_p;
+      ssa_op_iter iter;
+      FOR_EACH_SSA_USE_OPERAND (use_p, def, iter, SSA_OP_USE)
+	{
+	  tree t = USE_FROM_PTR (use_p);
+	  /* If an SSA has already been seen, it may be a loop,
+	     but we can continue and ignore this use.  Otherwise,
+	     add the SSA_NAME to the queue and visit it later.  */
+	  if (bitmap_set_bit (visited_ssa, SSA_NAME_VERSION (t)))
+	    worklist.safe_push (t);
+	}
+    }
+  set_defined (name);
+  return false;
+}
diff --git a/gcc/tree-ssa-defined-or-undefined.h b/gcc/tree-ssa-defined-or-undefined.h
new file mode 100644
index 0000000..23e306a
--- /dev/null
+++ b/gcc/tree-ssa-defined-or-undefined.h
@@ -0,0 +1,73 @@
+/* Simple class for identifying SSA_NAMEs that are either fully defined
+   or maybe undefined.
+
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   Contributed by Aldy Hernandez <aldyh@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_TREE_SSA_DEFINED_OR_UNDEFINED_H
+#define GCC_TREE_SSA_DEFINED_OR_UNDEFINED_H
+
+/* Simple class for identifying SSA_NAMEs that are either fully
+   defined or maybe undefined.
+
+   This is meant to support conservative analysis for optimization
+   purposes, not for generating warnings.  The analysis for generating
+   warnings is deeper to avoid generation of false positives.
+
+   Instantiation of this class triggers analysis which is not kept
+   up-to-date as changes in the IL or CFG are made.
+
+   Queries will return the conservative result (maybe undefined) for
+   SSA_NAMEs that were not encountered during the initial
+   analysis.  */
+
+class defined_or_undefined
+{
+ private:
+  // Bitmap specifying which SSAs are in the cache.
+  auto_bitmap b_seen;
+  // Bit is set if a given SSA is maybe undefined, otherwise the given
+  // SSA is definitely defined.
+  auto_bitmap b_maybe_undef;
+  // The post-dominance region of the current loop's entry.
+  vec<basic_block> postdom;
+
+  // Return TRUE if SSA has been previously cached.
+  bool in_cache (tree ssa)
+  { return bitmap_bit_p (b_seen, SSA_NAME_VERSION (ssa)); }
+  // Cache SSA to be maybe undefined.
+  void set_maybe_undefined (tree ssa)
+  { bitmap_set_bit (b_seen, SSA_NAME_VERSION (ssa));
+    bitmap_set_bit (b_maybe_undef, SSA_NAME_VERSION (ssa)); }
+  // Cache SSA to be definitely defined.
+  void set_defined (tree ssa)
+  { bitmap_set_bit (b_seen, SSA_NAME_VERSION (ssa));
+    bitmap_clear_bit (b_maybe_undef, SSA_NAME_VERSION (ssa)); }
+ public:
+  defined_or_undefined ();
+  // ENTRY is the current loop's entry block.
+  defined_or_undefined (basic_block entry)
+    { postdom = get_dominated_by_region (CDI_POST_DOMINATORS, &entry, 1); }
+  // Return TRUE if SSA is maybe undefined.
+  bool is_maybe_undefined (tree ssa);
+  // Return TRUE if SSA is definitely defined.
+  bool is_defined (tree ssa) { return !is_maybe_undefined (ssa); }
+};
+
+#endif // GCC_TREE_SSA_DEFINED_OR_UNDEFINED_H
diff --git a/gcc/tree-ssa-loop-unswitch.c b/gcc/tree-ssa-loop-unswitch.c
index 40905af..ea4b68d 100644
--- a/gcc/tree-ssa-loop-unswitch.c
+++ b/gcc/tree-ssa-loop-unswitch.c
@@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-iterator.h"
 #include "cfghooks.h"
 #include "tree-ssa-loop-manip.h"
+#include "tree-ssa-defined-or-undefined.h"
 
 /* This file implements the loop unswitching, i.e. transformation of loops like
 
@@ -77,7 +78,8 @@ along with GCC; see the file COPYING3.  If not see
 
 static struct loop *tree_unswitch_loop (struct loop *, basic_block, tree);
 static bool tree_unswitch_single_loop (struct loop *, int);
-static tree tree_may_unswitch_on (basic_block, struct loop *);
+static tree tree_may_unswitch_on (basic_block, struct loop *,
+				  defined_or_undefined *);
 static bool tree_unswitch_outer_loop (struct loop *);
 static edge find_loop_guard (struct loop *);
 static bool empty_bb_without_guard_p (struct loop *, basic_block);
@@ -94,6 +96,9 @@ tree_ssa_unswitch_loops (void)
   struct loop *loop;
   bool changed = false;
 
+  /* Used for defined_or_undefined use later.  */
+  calculate_dominance_info (CDI_POST_DOMINATORS);
+
   /* Go through all loops starting from innermost.  */
   FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
     {
@@ -104,6 +109,8 @@ tree_ssa_unswitch_loops (void)
 	changed |= tree_unswitch_outer_loop (loop);
     }
 
+  free_dominance_info (CDI_POST_DOMINATORS);
+
   if (changed)
     return TODO_cleanup_cfg;
   return 0;
@@ -113,7 +120,8 @@ tree_ssa_unswitch_loops (void)
    basic blocks (for what it means see comments below).  */
 
 static tree
-tree_may_unswitch_on (basic_block bb, struct loop *loop)
+tree_may_unswitch_on (basic_block bb, struct loop *loop,
+		      defined_or_undefined *defined_or_undefined)
 {
   gimple *last, *def;
   gcond *stmt;
@@ -138,7 +146,7 @@ tree_may_unswitch_on (basic_block bb, struct loop *loop)
     {
       /* Unswitching on undefined values would introduce undefined
 	 behavior that the original program might never exercise.  */
-      if (ssa_undefined_value_p (use, true))
+      if (defined_or_undefined->is_maybe_undefined (use))
 	return NULL_TREE;
       def = SSA_NAME_DEF_STMT (use);
       def_bb = gimple_bb (def);
@@ -200,6 +208,7 @@ tree_unswitch_single_loop (struct loop *loop, int num)
   gimple *stmt;
   bool changed = false;
   HOST_WIDE_INT iterations;
+  defined_or_undefined defined_or_undefined (loop->header);
 
   /* Perform initial tests if unswitch is eligible.  */
   if (num == 0)
@@ -243,7 +252,7 @@ tree_unswitch_single_loop (struct loop *loop, int num)
     {
       /* Find a bb to unswitch on.  */
       for (; i < loop->num_nodes; i++)
-	if ((cond = tree_may_unswitch_on (bbs[i], loop)))
+	if ((cond = tree_may_unswitch_on (bbs[i], loop, &defined_or_undefined)))
 	  break;
 
       if (i == loop->num_nodes)
@@ -363,7 +372,8 @@ tree_unswitch_single_loop (struct loop *loop, int num)
       /* Find a bb to unswitch on.  */
       for (; found < loop->num_nodes; found++)
 	if ((bbs[found]->flags & BB_REACHABLE)
-	    && (cond = tree_may_unswitch_on (bbs[found], loop)))
+	    && (cond = tree_may_unswitch_on (bbs[found], loop,
+					     &defined_or_undefined)))
 	  break;
 
       if (found == loop->num_nodes)


More information about the Gcc-patches mailing list