This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[RFC PATCH] Improved folding of OBJ_TYPE_REF calls


Hi,

this email might be of interest to people working on the gimple level
and those dealing with the C++ front-end.

I have been looking at how to do devirtualization in gcc on the gimple
level.  Eventually I want to do it intraprocedurally through folding,
in indirect inlining and also propagate types in IPA-CP.  In order to
learn how to find the target of a virtual call if the base type is
known, I have had a look at the current state of folding of
OBJ_TYPE_REF calls, which I thought was already working, only to find
out it really wasn't.  We did not see errors just because the sole
user of gimple_fold_obj_type_ref() in tree-ssa-ccp.c called it in only
the simplest of circumstances, namely when OBJ_TYPE_REF_OBJECT was an
address of a declaration, even though often it is an address of a
series of COMPONENT_REFs of a declaration.

First I have simply attempted to fold OBJ_TYPE_REFs with such object
references as well but I have run into ICEs when the object had
multiple ancestors and the called virtual function came from the
second one (or any but the first one).  The problem is that
fold_obj_type_ref, as it is now, simply takes OBJ_TYPE_REF_TOKEN as an
index i into the virtual table of the given type and returns ith
method in the list of virtual methods of that type it acquires from
its BINFO.  The problem is that the first virtual method of the second
ancestor has index zero in the ancestor (and thus OBJ_TYPE_REF_TOKEN
is zero) but not zero in the big containing object.  Thus
fold_obj_type_ref returns a wrong method which eventually leads to
some type checking error or miscompilation.

So my second attempt was to require that all the fields in the
OBJ_TYPE_REF_OBJECT chain of COMPONENT_REFs are all first ancestors
through base_binfos.  This worked as expected and for example raised
the number of folded calls in the g++ testsuite to 2906 from just 22.
However, I still was not able to fold my multiple inheritance example.

Examining that example (which is a testcase of the patch below) I have
realized that BASE_BINFOs in it have lists of virtual methods of their
own which point to the correct methods (i.e. base_binfo representing A
in B had B::foo(), not A:foo() in the list).  Therefore my third
attempt was to traverse the base_binfo tree while traversing the
component_refs and use the corresponding list of virtual methods.
Unfortunately, I have found that some base_binfos have wrong lists of
virtual methods with the methods corresponding to the ancestor in
them.  However, I have also found out that such binfos have
BINFO_FLAG_2 (which is TREE_FLAG_2) unset and BINFO_FLAG_5 set and so
I could simply refuse to fold upon encountering them.  That worked and
I could still fold my example but not much more, the number of folded
expressions in the testcase was back to 22.

So my fourth and so far the last attempt to tackle this issue was to
combine the previous two approaches, I use the binfo of the true type
of the whole object as long as the fields in COMPONENT_REFs refer to
first ancestors and resort to traversing the base_binfos if multiple
inheritance is detected.  This way the number of folded virtual calls
in the testcases is back to 2906 and my example is successfully
devirtualized too.

Of course, this is quite awkward.  If I have missed anything and the
process of locating the correct method can be simplified somehow, I'd
like to know.  I would especially like to ask the C++ people to tell
me if the general idea is sensible and I would appreciate their
comments on the binfo flags 2 and 5 as well.

As I have written in the beginning, this is just the first step. I
need to employ a similar mechanism in devirtualization in indirect
inlining and IPA-CP which inevitably mandates some quite fun semantics
of jump-functions.  Therefore if you can comment on the general idea,
please try to do so soon.

The patch is an RFC one intended for 4.6.  Nevertheless, it passes
bootstrap and testing (including Ada) on x86_64-linux (rev 153007).
Some of the interfaces will probably change a bit as I will generalize
the binfo search so that it can be used by indirect inlining and
IPA-CP but at the moment I do not intend to do any big changes.

To give some more numbers, the patch folds all 8 OBJ_TYPE_REFs that
have an ADDR_EXPR of a part of a declaration as their object in the
set of libstdc++ benchmarks.  The runtime of the ultra-artificial
testcase attached to this email (when compiled with -O3) dropped from
22 seconds to 7 seconds on my desktop.

Thanks for any comments,

Martin


2009-10-21  Martin Jambor  <mjambor@suse.cz>

	* gimple.c (get_base_binfo_for_type): New function.
	(get_relevant_ref_binfo_single_inh): New function.
	(get_relevant_ref_binfo_multi_inh): New function.
	(gimple_fold_obj_type_ref_known_binfo): New function.
	(gimple_fold_obj_type_ref): Get binfo from
	get_relevant_ref_binfo_single_inh and
	get_relevant_ref_binfo_multi_inh and use
	gimple_fold_obj_type_ref_known_binfo.
	* gimple.h (gimple_fold_obj_type_ref): Declare.
	(gimple_fold_obj_type_ref_known_binfo): Declare.
	* tree-ssa-ccp.c (fold_gimple_call): Simplify condition for
	folding virtual calls and call gimple_fold_obj_type_ref.


Index: icln/gcc/gimple.c
===================================================================
--- icln.orig/gcc/gimple.c
+++ icln/gcc/gimple.c
@@ -4563,24 +4563,135 @@ gimple_decl_printable_name (tree decl, i
   return IDENTIFIER_POINTER (DECL_NAME (decl));
 }
 
+/* Search for a base binfo of BINFO that corresponds to TYPE and return it if
+   it is found or NULL_TREE if it is not. */
 
-/* Fold a OBJ_TYPE_REF expression to the address of a function.
-   KNOWN_TYPE carries the true type of OBJ_TYPE_REF_OBJECT(REF).  Adapted
-   from cp_fold_obj_type_ref, but it tolerates types with no binfo
-   data.  */
+static tree
+get_base_binfo_for_type (tree binfo, tree type)
+{
+  int i;
+  tree base_binfo;
+  tree res = NULL_TREE;
+
+  for (i = 0; BINFO_BASE_ITERATE (binfo, i, base_binfo); i++)
+    if (TREE_TYPE (base_binfo) == type)
+      {
+	gcc_assert (!res);
+	res = base_binfo;
+      }
 
-tree
-gimple_fold_obj_type_ref (tree ref, tree known_type)
+  return res;
+}
+
+/* Return a binfo describing the true type of object referenced by expression
+   REF if all component references access first ancestors.  REF can consist of
+   a series of COMPONENT_REFs of with a declaration or an INDIREC_REF.  It can
+   also be just a simple declaration, indirect reference or an SSA_NAME.  If
+   the function discoveres an INIDRECT_REF or an SSA_NAME, it will assume that
+   the encapsulating type is KNOWN_TYPE or return NULL_TREE if KNOWN_TYPE is
+   NULL_TREE.  Otherwise the first nonartifical field declaration or the base
+   declaration will be examined to get the encapsulating type.  If a
+   COMPONENT_REF referencing an ancestor which is not the first one, this
+   function returns NULL_TREE and sets *try_multi to true.  */
+
+static tree
+get_relevant_ref_binfo_single_inh (tree ref, tree known_type, bool *try_multi)
 {
-  HOST_WIDE_INT index;
-  HOST_WIDE_INT i;
-  tree v;
-  tree fndecl;
+  while (true)
+    {
+      if (TREE_CODE (ref) == COMPONENT_REF)
+	{
+	  tree par_type;
+	  tree binfo, base_binfo;
+	  tree field = TREE_OPERAND (ref, 1);
+
+	  if (!DECL_ARTIFICIAL (field))
+	    return TYPE_BINFO (TREE_TYPE (field));
+
+	  par_type = TREE_TYPE (TREE_OPERAND (ref, 0));
+	  binfo = TYPE_BINFO (par_type);
+	  if (!binfo
+	      || BINFO_N_BASE_BINFOS (binfo) == 0)
+	    {
+	      if (try_multi)
+		*try_multi = 1;
+	      return NULL_TREE;
+	    }
+	  base_binfo = BINFO_BASE_BINFO (binfo, 0);
+	  if (TREE_TYPE (base_binfo) != TREE_TYPE (field))
+	    {
+	      if (try_multi)
+		*try_multi = 1;
+	      return NULL_TREE;
+	    }
+
+	  ref = TREE_OPERAND (ref, 0);
+	}
+      else if (DECL_P (ref))
+	return TYPE_BINFO (TREE_TYPE (ref));
+      else if (known_type
+	       && (TREE_CODE (ref) == SSA_NAME
+		   || TREE_CODE (ref) == INDIRECT_REF))
+	return TYPE_BINFO (known_type);
+      else
+	return NULL_TREE;
+    }
+}
+
 
-  if (TYPE_BINFO (known_type) == NULL_TREE)
+/* Return a binfo describing the part of object referenced by expression REF.
+   This can and often is a base_binfo of a descendatn binfo. REF can consist of
+   a series of COMPONENT_REFs of with a declaration or an INDIREC_REF.  It can
+   also be just a simple declaration, indirect reference or an SSA_NAME.  If
+   the function discoveres an INIDRECT_REF or an SSA_NAME, it will assume that
+   the encapsulating type is KNOWN_TYPE or return NULL_TREE if KNOWN_TYPE is
+   NULL_TREE.  Otherwise the first nonartifical field declaration or the base
+   declaration will be examined to get the encapsulating type.  */
+
+static tree
+get_relevant_ref_binfo_multi_inh (tree ref, tree known_type)
+{
+  if (DECL_P (ref))
+    return TYPE_BINFO (TREE_TYPE (ref));
+  else if (TREE_CODE (ref) == COMPONENT_REF)
+    {
+      tree desc_binfo;
+      tree field = TREE_OPERAND (ref, 1);
+
+      if (!DECL_ARTIFICIAL (field))
+	return TYPE_BINFO (TREE_TYPE (field));
+      desc_binfo = get_relevant_ref_binfo_multi_inh (TREE_OPERAND (ref, 0),
+						     known_type);
+      if (!desc_binfo)
+	return NULL_TREE;
+      return get_base_binfo_for_type (desc_binfo, TREE_TYPE (field));
+    }
+  else if (known_type
+	   && (TREE_CODE (ref) == SSA_NAME
+	       || TREE_CODE (ref) == INDIRECT_REF))
+    return TYPE_BINFO (known_type);
+  else
     return NULL_TREE;
+}
+
 
-  v = BINFO_VIRTUALS (TYPE_BINFO (known_type));
+/* Fold a OBJ_TYPE_REF expression to the address of a function.  KNOWN_BINFO
+   carries the binfo describing the true type of OBJ_TYPE_REF_OBJECT(REF).  */
+
+tree
+gimple_fold_obj_type_ref_known_binfo (tree ref, tree known_binfo)
+{
+  HOST_WIDE_INT index, i;
+  tree v, fndecl;
+
+  /* If binfo flag 2 is not set, this binfo does not "have its own virtual
+     table" (according to cp/cp-tree.h) and cannot be safely used for
+     devirtualization.  (Purely empirical experience also shows that we can
+     also bail out if flag 5 _is_ set but I have no reason to prefer one
+     way over the other.) */
+  if (!BINFO_FLAG_2 (known_binfo))
+    return NULL_TREE;
+  v = BINFO_VIRTUALS (known_binfo);
   index = tree_low_cst (OBJ_TYPE_REF_TOKEN (ref), 1);
   i = 0;
   while (i != index)
@@ -4591,15 +4702,33 @@ gimple_fold_obj_type_ref (tree ref, tree
     }
 
   fndecl = TREE_VALUE (v);
+  return build_fold_addr_expr (fndecl);
+}
 
-#ifdef ENABLE_CHECKING
-  gcc_assert (tree_int_cst_equal (OBJ_TYPE_REF_TOKEN (ref),
-				  DECL_VINDEX (fndecl)));
-#endif
 
-  cgraph_node (fndecl)->local.vtable_method = true;
+/* Fold a OBJ_TYPE_REF expression to the address of a function.  If KNOWN_TYPE
+   is not NULL_TREE, it is the true type of the outmost encapsulating object if
+   that comes from a pointer SSA_NAME.  If the true outmost encapsulating type
+   can be determined from a declaration OBJ_TYPE_REF_OBJECT(REF), it is used
+   regardless of KNOWN_TYPE (which thuc can be NULL_TREE).  */
 
-  return build_fold_addr_expr (fndecl);
+tree
+gimple_fold_obj_type_ref (tree ref, tree known_type)
+{
+  tree obj = OBJ_TYPE_REF_OBJECT (ref);
+  bool try_multi = false;
+  tree binfo;
+
+  if (TREE_CODE (obj) == ADDR_EXPR)
+    obj = TREE_OPERAND (obj, 0);
+
+  binfo = get_relevant_ref_binfo_single_inh (obj, known_type, &try_multi);
+  if (!binfo && try_multi)
+    binfo = get_relevant_ref_binfo_multi_inh (obj, known_type);
+  if (binfo)
+    return gimple_fold_obj_type_ref_known_binfo (ref, binfo);
+  else
+    return NULL_TREE;
 }
 
 #include "gt-gimple.h"
Index: icln/gcc/tree-ssa-ccp.c
===================================================================
--- icln.orig/gcc/tree-ssa-ccp.c
+++ icln/gcc/tree-ssa-ccp.c
@@ -2924,9 +2924,6 @@ fold_gimple_call (gimple_stmt_iterator *
     }
   else
     {
-      /* Check for resolvable OBJ_TYPE_REF.  The only sorts we can resolve
-         here are when we've propagated the address of a decl into the
-         object slot.  */
       /* ??? Should perhaps do this in fold proper.  However, doing it
          there requires that we create a new CALL_EXPR, and that requires
          copying EH region info to the new node.  Easier to just do it
@@ -2934,19 +2931,11 @@ fold_gimple_call (gimple_stmt_iterator *
       /* ??? Is there a good reason not to do this in fold_stmt_inplace?  */
       callee = gimple_call_fn (stmt);
       if (TREE_CODE (callee) == OBJ_TYPE_REF
-          && lang_hooks.fold_obj_type_ref
-          && TREE_CODE (OBJ_TYPE_REF_OBJECT (callee)) == ADDR_EXPR
-          && DECL_P (TREE_OPERAND
-                     (OBJ_TYPE_REF_OBJECT (callee), 0)))
+          && TREE_CODE (OBJ_TYPE_REF_OBJECT (callee)) == ADDR_EXPR)
         {
           tree t;
 
-          /* ??? Caution: Broken ADDR_EXPR semantics means that
-             looking at the type of the operand of the addr_expr
-             can yield an array type.  See silly exception in
-             check_pointer_types_r.  */
-          t = TREE_TYPE (TREE_TYPE (OBJ_TYPE_REF_OBJECT (callee)));
-          t = lang_hooks.fold_obj_type_ref (callee, t);
+          t = gimple_fold_obj_type_ref (callee, NULL_TREE);
           if (t)
             {
               gimple_call_set_fn (stmt, t);
Index: icln/gcc/gimple.h
===================================================================
--- icln.orig/gcc/gimple.h
+++ icln/gcc/gimple.h
@@ -859,6 +859,7 @@ unsigned get_gimple_rhs_num_ops (enum tr
 gimple gimple_alloc_stat (enum gimple_code, unsigned MEM_STAT_DECL);
 const char *gimple_decl_printable_name (tree, int);
 tree gimple_fold_obj_type_ref (tree, tree);
+tree gimple_fold_obj_type_ref_known_binfo (tree, tree);
 
 /* Returns true iff T is a valid GIMPLE statement.  */
 extern bool is_gimple_stmt (tree);
Index: icln/gcc/testsuite/g++.dg/otr-fold-1.C
===================================================================
--- /dev/null
+++ icln/gcc/testsuite/g++.dg/otr-fold-1.C
@@ -0,0 +1,77 @@
+/* Verify that simple virtual calls are inlined even without early
+   inlining, even when a typecast to an ancestor is involved along the
+   way.  */
+/* { dg-do run } */
+/* { dg-options "-O -fdump-tree-optimized-slim"  } */
+
+extern "C" void abort (void);
+
+class Distraction
+{
+public:
+  float f;
+  double d;
+  Distraction ()
+  {
+    f = 8.3;
+    d = 10.2;
+  }
+  virtual float bar (float z);
+};
+
+class A
+{
+public:
+  int data;
+  virtual int foo (int i);
+};
+
+
+class B : public Distraction, public A
+{
+public:
+  virtual int foo (int i);
+};
+
+float Distraction::bar (float z)
+{
+  f += z;
+  return f/2;
+}
+
+int A::foo (int i)
+{
+  return i + 1;
+}
+
+int B::foo (int i)
+{
+  return i + 2;
+}
+
+int __attribute__ ((noinline,noclone)) get_input(void)
+{
+  return 1;
+}
+
+static int middleman_1 (class A *obj, int i)
+{
+  return obj->foo (i);
+}
+
+static int middleman_2 (class B *obj, int i)
+{
+  return middleman_1 (obj, i);
+}
+
+int main (int argc, char *argv[])
+{
+  class B b;
+
+  if (middleman_2 (&b, get_input ()) != 3)
+    abort ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "= B::foo"  "optimized"  } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
#include <stdio.h>

class SquareMatrix
{
  int len;

public:
  SquareMatrix (int new_len)
  {
    len = new_len;
  }
  int getLen () const
  {
    return len;
  }
  virtual double getElement (int i, int j) const = 0;
  virtual void setElement (int i, int j, double d) = 0;
};

class DenseMatrix : public SquareMatrix
{
  double *data;

public:
  DenseMatrix (int new_len) : SquareMatrix (new_len)
  {
    data = new double[new_len * new_len];
  }

  double getElement (int i, int j) const;
  void setElement (int i, int j, double d);
  void fill (double d);

  virtual ~DenseMatrix ()
  {
    delete[] data;
  }


};

double DenseMatrix::getElement (int i, int j) const
{
  int len = getLen ();
  return data[i*len+j];
}

void DenseMatrix::setElement (int i, int j, double d)
{
  int len = getLen ();
  data[i*len+j] = d;
}

void DenseMatrix::fill (double d)
{
  int len = getLen ();
  int i, j;

  for (i = 0; i < len; i++)
    {
      for (j = 0; j < len; j++)
	setElement (i, j, d);
      d += 0.5;
    }
}

static void multiply_matrices (SquareMatrix *res, SquareMatrix *a, SquareMatrix *b)
{
  int i, j, len = res->getLen ();

  for (i = 0; i < len; i++)
    for (j = 0; j < len; j++)
      {
	int k;
	double d = 0;

	for (k = 0; k < len; k++)
	  d += a->getElement (i, k) * b->getElement (k, j);

	res->setElement (i, j, d);
      }
}

#define N 1000

int main (int argc, char *argv[])
{
  DenseMatrix r(N), a(N), b(N);

  a.fill(10);
  b.fill(-4);

  multiply_matrices (&r, &a, &b);

  printf ("Some value: %e\n", r.getElement (N/2, N/3));
  return 0;
}

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]