[PATCH, devirtualization] Intraprocedural devirtualization pass

Martin Jambor mjambor@suse.cz
Tue Nov 1 22:52:00 GMT 2011


Hi,

the patch below is the second (and last) revived type-based
devirtualization patch that did not make it to 4.6.  It deals with
virtual calls from the function in which the there is also the object
declaration:

void foo()
{
  A a;

  a.foo ();
}

Normally, the front-end would do the devirtualization on its own,
however, early-inlining can create these situations too.  Since there
is nothing interprocedural going on, the current inlining and IPA-CP
devirtualization bits are of no help.  We do not do type-based
devirtualization in OBJ_TYPE_REF folding either, because the necessary
type-changing checks might make it quite expensive.  Thus, this patch
introduces a new pass to do that.

The patch basically piggy-tails on the intraprocedural
devirtualization mechanism, trying to construct a known-type jump
function for all objects in OBJ_TYPE_REF calls and then immediately
using it like we do in IPA-CP.

The original patch was doing this as a part of
pass_rebuild_cgraph_edges.  Honza did not like this idea so I made it
a separate pass.  First, I scheduled it after
pass_rebuild_cgraph_edges and was only traversing indirect edges,
avoiding a sweep over all of the IL.  Unfortunately, this does not
work in one scenario.  If the newly-known destination of a virtual
call is known not to throw, we may have to purge some EH CFG edges and
potentially basic blocks.  If these basic blocks contain calls
(typically calls to object destructors), we may end up having stale
call edges in the call graph... and our current approach to that
problem is to call pass_rebuild_cgraph_edges.  I think that I was not
running into this problem when the mechanism was a part of that pass
just because of pure luck.  Anyway, this is why I eventually opted for
a sweep over the statements.

My best guess is that it is probably worth it, but only because the
overhead should be still fairly low.  The pass triggers quite a number
of times when building libstdc++ and it can speed up an artificial
testcase which I will attach from over 20 seconds to 7s on my older
desktop - it is very similar to the one I posted with the previous
patch but this time the object constructors must not get early inlined
but the function multiply_matrices has to.  Currently I have problems
compiling Firefox even without LTO so I don't have any numbers from it
either.  IIRC, Honza did not see this patch trigger there when he
tried the ancient version almost a year go.  On the other hand, Maxim
claimed that the impact can be noticeable on some code base he is
concerned about.

I have successfully bootstrapped and tested the patch on x86_64-linux.
What do you think, should we include this in trunk?

Thanks,

Martin


2011-10-31  Martin Jambor  <mjambor@suse.cz>

	* ipa-cp.c (ipa_value_from_known_type_jfunc): Moved to...
	* ipa-prop.c (ipa_binfo_from_known_type_jfunc): ...here, exported and
	updated all callers.
	(intraprocedural_devirtualization): New function.
	(gate_intra_devirtualization): Likewise.
	(pass_intra_devirt): New pass.
	* ipa-prop.h (ipa_binfo_from_known_type_jfunc): Declared.
	* passes.c (init_optimization_passes): Schedule pass_intra_devirt.
	* tree-pass.h (pass_intra_devirt): Declare.

	* testsuite/g++.dg/ipa/imm-devirt-1.C: New test.
	* testsuite/g++.dg/ipa/imm-devirt-2.C: Likewise.


Index: src/gcc/testsuite/g++.dg/ipa/imm-devirt-1.C
===================================================================
--- /dev/null
+++ src/gcc/testsuite/g++.dg/ipa/imm-devirt-1.C
@@ -0,0 +1,62 @@
+/* Verify that virtual calls are folded even when a typecast to an
+   ancestor is involved along the way.  */
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-devirt"  } */
+
+extern "C" void abort (void);
+
+class A
+{
+public:
+  int data;
+  virtual int foo (int i);
+};
+
+
+class B : public A
+{
+public:
+  __attribute__ ((noinline)) B();
+  virtual int foo (int i);
+};
+
+int __attribute__ ((noinline)) A::foo (int i)
+{
+  return i + 1;
+}
+
+int __attribute__ ((noinline)) B::foo (int i)
+{
+  return i + 2;
+}
+
+int __attribute__ ((noinline,noclone)) get_input(void)
+{
+  return 1;
+}
+
+__attribute__ ((noinline)) B::B()
+{
+}
+
+static inline int middleman_1 (class A *obj, int i)
+{
+  return obj->foo (i);
+}
+
+static inline int middleman_2 (class B *obj, int i)
+{
+  return middleman_1 (obj, i);
+}
+
+int main (int argc, char *argv[])
+{
+  class B b;
+
+  if (middleman_2 (&b, get_input ()) != 3)
+    abort ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "Immediately devirtualizing call.*into.*B::foo"  "devirt"  } } */
+/* { dg-final { cleanup-tree-dump "devirt" } } */
Index: src/gcc/testsuite/g++.dg/ipa/imm-devirt-2.C
===================================================================
--- /dev/null
+++ src/gcc/testsuite/g++.dg/ipa/imm-devirt-2.C
@@ -0,0 +1,95 @@
+/* Verify that virtual calls are folded even when a typecast to an
+   ancestor is involved along the way.  */
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-devirt-slim"  } */
+
+extern "C" void abort (void);
+
+class Distraction
+{
+public:
+  float f;
+  double d;
+  Distraction ()
+  {
+    f = 8.3;
+    d = 10.2;
+  }
+  virtual float bar (float z);
+};
+
+class A
+{
+public:
+  int data;
+  virtual int foo (int i);
+};
+
+class B : public A
+{
+public:
+  int data_2;
+  virtual int foo (int i);
+  virtual int baz (int i);
+};
+
+
+class C : public Distraction, public B
+{
+public:
+  __attribute__ ((noinline)) C();
+  virtual int foo (int i);
+};
+
+float __attribute__ ((noinline)) Distraction::bar (float z)
+{
+  f += z;
+  return f/2;
+}
+
+int __attribute__ ((noinline)) A::foo (int i)
+{
+  return i + 1;
+}
+
+int __attribute__ ((noinline)) B::foo (int i)
+{
+  return i + 2;
+}
+
+int __attribute__ ((noinline)) B::baz (int i)
+{
+  return i * 15;
+}
+
+int __attribute__ ((noinline)) C::foo (int i)
+{
+  return i + 3;
+}
+
+int __attribute__ ((noinline,noclone)) get_input(void)
+{
+  return 1;
+}
+
+static inline int middleman (class A *obj, int i)
+{
+  return obj->foo (i);
+}
+
+__attribute__ ((noinline)) C::C()
+{
+}
+
+int main (int argc, char *argv[])
+{
+  class C c;
+
+  if (middleman (&c, get_input ()) != 4)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "Immediately devirtualizing call.*into.*C::.*foo"  "devirt"  } } */
+/* { dg-final { cleanup-tree-dump "devirt" } } */
Index: src/gcc/ipa-cp.c
===================================================================
--- src.orig/gcc/ipa-cp.c
+++ src/gcc/ipa-cp.c
@@ -674,20 +674,6 @@ ipa_get_jf_ancestor_result (struct ipa_j
     return NULL_TREE;
 }
 
-/* Extract the acual BINFO being described by JFUNC which must be a known type
-   jump function.  */
-
-static tree
-ipa_value_from_known_type_jfunc (struct ipa_jump_func *jfunc)
-{
-  tree base_binfo = TYPE_BINFO (jfunc->value.known_type.base_type);
-  if (!base_binfo)
-    return NULL_TREE;
-  return get_binfo_at_offset (base_binfo,
-			      jfunc->value.known_type.offset,
-			      jfunc->value.known_type.component_type);
-}
-
 /* Determine whether JFUNC evaluates to a known value (that is either a
    constant or a binfo) and if so, return it.  Otherwise return NULL. INFO
    describes the caller node so that pass-through jump functions can be
@@ -699,7 +685,7 @@ ipa_value_from_jfunc (struct ipa_node_pa
   if (jfunc->type == IPA_JF_CONST)
     return jfunc->value.constant;
   else if (jfunc->type == IPA_JF_KNOWN_TYPE)
-    return ipa_value_from_known_type_jfunc (jfunc);
+    return ipa_binfo_from_known_type_jfunc (jfunc);
   else if (jfunc->type == IPA_JF_PASS_THROUGH
 	   || jfunc->type == IPA_JF_ANCESTOR)
     {
@@ -1006,7 +992,7 @@ propagate_accross_jump_function (struct
 
       if (jfunc->type == IPA_JF_KNOWN_TYPE)
 	{
-	  val = ipa_value_from_known_type_jfunc (jfunc);
+	  val = ipa_binfo_from_known_type_jfunc (jfunc);
 	  if (!val)
 	    return set_lattice_contains_variable (dest_lat);
 	}
Index: src/gcc/ipa-prop.c
===================================================================
--- src.orig/gcc/ipa-prop.c
+++ src/gcc/ipa-prop.c
@@ -3069,3 +3069,107 @@ ipa_update_after_lto_read (void)
     if (node->analyzed)
       ipa_initialize_node_params (node);
 }
+
+/* Extract the acual BINFO being described by JFUNC which must be a known type
+   jump function.  */
+
+tree
+ipa_binfo_from_known_type_jfunc (struct ipa_jump_func *jfunc)
+{
+  tree base_binfo = TYPE_BINFO (jfunc->value.known_type.base_type);
+  if (!base_binfo)
+    return NULL_TREE;
+  return get_binfo_at_offset (base_binfo,
+			      jfunc->value.known_type.offset,
+			      jfunc->value.known_type.component_type);
+}
+
+/* Intraprocedural (early) type-based devirtualization pass.  Looks at all call
+   statements and attempts type-base devirtualization on those calling an
+   OBJ_TYPE_REF.  */
+
+static unsigned int
+intraprocedural_devirtualization (void)
+{
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  bool cfg_changed = false;
+
+  FOR_EACH_BB (bb)
+    for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+      if (is_gimple_call (gsi_stmt (gsi)))
+	{
+	  tree binfo, token, fndecl;
+	  gimple call = gsi_stmt (gsi);
+	  tree target = gimple_call_fn (call);
+	  struct ipa_jump_func jfunc;
+
+	  if (TREE_CODE (target) != OBJ_TYPE_REF)
+	    continue;
+
+	  jfunc.type = IPA_JF_UNKNOWN;
+	  compute_known_type_jump_func (OBJ_TYPE_REF_OBJECT (target), &jfunc,
+					call);
+	  if (jfunc.type != IPA_JF_KNOWN_TYPE)
+	    continue;
+	  binfo = ipa_binfo_from_known_type_jfunc (&jfunc);
+	  if (!binfo)
+	    continue;
+	  token = OBJ_TYPE_REF_TOKEN (target);
+	  fndecl = gimple_get_virt_method_for_binfo (tree_low_cst (token, 1),
+						     binfo);
+	  if (!fndecl)
+	    continue;
+
+	  if (dump_file)
+	    {
+	      fprintf (dump_file, "ipa-prop: Immediately devirtualizing call ");
+	      print_gimple_expr (dump_file, call, 0, 0);
+	    }
+
+	  gimple_call_set_fndecl (call, fndecl);
+	  update_stmt (call);
+	  if (maybe_clean_eh_stmt (call)
+	      && gimple_purge_dead_eh_edges (bb))
+	    cfg_changed = true;
+
+	  if (dump_file)
+	    {
+	      fprintf (dump_file, " into ");
+	      print_gimple_expr (dump_file, call, 0, 0);
+	      fprintf (dump_file, "\n");
+	    }
+	}
+
+
+ if (cfg_changed)
+    return TODO_cleanup_cfg;
+  else
+    return 0;
+}
+
+static bool
+gate_intra_devirtualization (void)
+{
+  return flag_devirtualize != 0;
+}
+
+
+struct gimple_opt_pass pass_intra_devirt =
+  {
+    {
+      GIMPLE_PASS,
+      "devirt",					/* name */
+      gate_intra_devirtualization,		/* gate */
+      intraprocedural_devirtualization,    	/* execute */
+      NULL,					/* sub */
+      NULL,					/* next */
+      0,					/* static_pass_number */
+      TV_CGRAPH,				/* tv_id */
+      PROP_cfg,					/* properties_required */
+      0,					/* properties_provided */
+      0,					/* properties_destroyed */
+      0,					/* todo_flags_start */
+      0,					/* todo_flags_finish */
+    }
+  };
Index: src/gcc/ipa-prop.h
===================================================================
--- src.orig/gcc/ipa-prop.h
+++ src/gcc/ipa-prop.h
@@ -347,6 +347,7 @@ bool ipa_propagate_indirect_call_infos (
 
 /* Indirect edge and binfo processing.  */
 struct cgraph_edge *ipa_make_edge_direct_to_target (struct cgraph_edge *, tree);
+tree ipa_binfo_from_known_type_jfunc (struct ipa_jump_func *);
 
 /* Functions related to both.  */
 void ipa_analyze_node (struct cgraph_node *);
Index: src/gcc/passes.c
===================================================================
--- src.orig/gcc/passes.c
+++ src/gcc/passes.c
@@ -1225,6 +1225,7 @@ init_optimization_passes (void)
           NEXT_PASS (pass_cleanup_eh);
           NEXT_PASS (pass_profile);
           NEXT_PASS (pass_local_pure_const);
+	  NEXT_PASS (pass_intra_devirt);
 	  /* Split functions creates parts that are not run through
 	     early optimizations again.  It is thus good idea to do this
 	     late.  */
Index: src/gcc/tree-pass.h
===================================================================
--- src.orig/gcc/tree-pass.h
+++ src/gcc/tree-pass.h
@@ -449,6 +449,7 @@ extern struct gimple_opt_pass pass_trace
 extern struct gimple_opt_pass pass_warn_unused_result;
 extern struct gimple_opt_pass pass_split_functions;
 extern struct gimple_opt_pass pass_feedback_split_functions;
+extern struct gimple_opt_pass pass_intra_devirt;
 
 /* IPA Passes */
 extern struct simple_ipa_opt_pass pass_ipa_lower_emutls;
-------------- next part --------------
A non-text attachment was scrubbed...
Name: intra_devirt_example.C
Type: text/x-c++src
Size: 1732 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20111101/665f84cc/attachment.bin>


More information about the Gcc-patches mailing list