This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH][RFC] ARRAY_REF on pointers and more


This patch tries to address the fact that our optimization capabilities
are weaker for array like accesses through pointers than through real
array accesses.  The problem and suggestions for solutions came up in
the past as "array-ref on pointers" or also in the context of designing
new memory reference trees on the mem-ref branch.

This patch tries to leverage existing support to make array-refs on
pointers happen.  Basically what we can do now (and the fortran
frontend does extensively) is to create array accesses on pointers
by first converting the pointer to a pointer-to-array type like so:

int test1(int *p, int i)
{
  int (*q)[] = (int(*)[])p;
  return (*q)[i];
}

now this isn't an array-ref based on p but on the converted q.  This
patch tries to change that by making this particular conversion
(T* -> T[]*) useless so the IL looks like

  q_2 = p_1(D);
  D.1595_4 = (*q_2)[i_3(D)];

and p_1(D) can be copy propagated to the use of q_2.

I put in code to transform POINTER_PLUS_EXPR to ARRAY_REFs into
tree forwprop (it's a hack, and not unconditionally a good idea, so
I do that only if the result is dereferenced).  It looks for

   off_1 = index_2 * element-size;
   ptr_3 = base_4 p+ off_1;

and transforms it to

   ptr_3 = &(*base_4)[index_2];

if possible.

If you want this to happen on usual C code like

int bar(int *p, int i)
{
  return p[i];
}

then you hit the issue that the C FE prematurely folds the pointer
offset computation into a form preventing i from being used as array
index.  It builds p p+ (sizetype)(i * element-size) as
p p+ ((sizetype)i * element-size) which we cannot transform back for
overflow reasons.  Thus this patch makes sure to retain the sign
for the index * element-size computation and _not_ fold the sizetype
conversion into the multiplication (fold doesn't do this as well
in general).  So instead of getting an unsigned index we now get
a widening (which we could also strip)

  D.1610_2 = (long int) i_1(D);
  D.1609_7 = (*p_5(D))[D.1610_2];


There are complications with the type system and this approach
re-using ARRAY_REF for this (obviously).  On the mem-ref branch
a replacement of INDIRECT_REF allowed an offset operand (but that
still didn't have the element size multiplication absorbed - instead
that was made explicit with IDX_EXPR, and ARRAY_REFs were removed).


Another issue that showed up is that while *&x is valid gimple
(if &x is invariant) we do currently expect that such construct
is always folded (wrongly so IMHO).  Of course the cases where
we not do so are either missed optimizations, wrong types in
the IL or problems with the type system consistency (this patch
exposes such - you can convert T* to T[]* but not T to T[]).


In the end I am again questioning the C frontend behavior of
decomposing each (but variable length or multi-dimensional) array
access (well, [] operator usage) to pointer arithmetic.  With
using intermediate explicit conversions to array types it could
even produce ARRAY_REFs for pointer indexing - removing the need
of the forwprop and type system hacks in most of the cases.

Oh, and I don't expect this patch to bootstrap.

Any comments?

Thanks,
Richard.


2009-05-04  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/29751
	* tree-ssa.c (useless_type_conversion_p_1): Conversions from
	T* to T[]* are useless.
	* tree-ssa-ccp.c (maybe_fold_stmt_addition): Look through
	widening conversions.
	(maybe_fold_reference): *&a is valid gimple - do not create
	wrong types just because we think we have no other choice.
	* tree-ssa-forwprop.c
	(forward_propagate_addr_into_variable_array_index): Look through
	widening conversions.
	(forward_propagate_pointer_plus_expr): New function promoting
	POINTER_PLUS_EXPRs to addresses of ARRAY_REFs.
	(tree_ssa_forward_propagate_single_use_vars): Call it.
	* c-common.c (pointer_int_sum): Avoid premature optimization
	that loses knowledge about overflow and required precision.
	Do element size multiplication in the original index sign.
	* tree-inline.c (remap_gimple_op_r): Use the original type
	of an INDIRECT_REF.
	* tree-cfg.c (one_pointer_from_useless_type_conversion_p): New.
	(verify_types_in_gimple_min_lval): Use it to verify INDIRECT_REF
	types.

	* gcc.dg/tree-ssa/forwprop-13.c: New testcase.
	* gcc.dg/tree-ssa/ssa-fre-24.c: Likewise.

Index: gcc/tree-ssa.c
===================================================================
*** gcc/tree-ssa.c.orig	2009-05-04 11:09:18.000000000 +0200
--- gcc/tree-ssa.c	2009-05-04 11:55:18.000000000 +0200
*************** useless_type_conversion_p_1 (tree outer_
*** 966,971 ****
--- 966,978 ----
        if (get_deref_alias_set (inner_type) != get_deref_alias_set (outer_type))
  	return false;
  
+       /* Conversions from T* to T[]* are useless.  */
+       if (TREE_CODE (TREE_TYPE (outer_type)) == ARRAY_TYPE
+ 	  && !TYPE_DOMAIN (TREE_TYPE (outer_type))
+ 	  && useless_type_conversion_p_1 (TREE_TYPE (TREE_TYPE (outer_type)),
+ 					  TREE_TYPE (inner_type)))
+ 	return true;
+ 
        /* We do not care for const qualification of the pointed-to types
  	 as const qualification has no semantic value to the middle-end.  */
  
Index: gcc/tree-ssa-ccp.c
===================================================================
*** gcc/tree-ssa-ccp.c.orig	2009-04-24 15:33:33.000000000 +0200
--- gcc/tree-ssa-ccp.c	2009-05-04 11:43:53.000000000 +0200
*************** maybe_fold_stmt_addition (tree res_type,
*** 2136,2141 ****
--- 2136,2147 ----
  	  && host_integerp (TYPE_SIZE_UNIT (TREE_TYPE (op0)), 1))
  	{
  	  gimple offset_def = SSA_NAME_DEF_STMT (op1);
+ 	  while (is_gimple_assign (offset_def)
+ 		 && CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (offset_def))
+ 		 && TREE_CODE (gimple_assign_rhs1 (offset_def)) == SSA_NAME
+ 		 && (TYPE_PRECISION (gimple_expr_type (offset_def))
+ 		     >= TYPE_PRECISION (TREE_TYPE (gimple_assign_rhs1 (offset_def)))))
+ 	    offset_def = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (offset_def));
  	  if (!is_gimple_assign (offset_def))
  	    return NULL_TREE;
  
*************** maybe_fold_reference (tree expr, bool is
*** 2262,2270 ****
  	tem = NULL_TREE;
        if (!tem
  	  && TREE_CODE (TREE_OPERAND (*t, 0)) == ADDR_EXPR)
! 	/* If we had a good reason for propagating the address here,
! 	   make sure we end up with valid gimple.  See PR34989.  */
! 	tem = TREE_OPERAND (TREE_OPERAND (*t, 0), 0);
  
        if (tem)
  	{
--- 2268,2280 ----
  	tem = NULL_TREE;
        if (!tem
  	  && TREE_CODE (TREE_OPERAND (*t, 0)) == ADDR_EXPR)
! 	{
! 	  /* If we had a good reason for propagating the address here,
! 	     make sure we end up with valid gimple.  See PR34989.
! 	     ???  Other places in the compiler assume we never see *&x.  */
! 	  gcc_assert (is_gimple_min_invariant (TREE_OPERAND (*t, 0)));
! 	  tem = NULL_TREE;
! 	}
  
        if (tem)
  	{
Index: gcc/tree-ssa-forwprop.c
===================================================================
*** gcc/tree-ssa-forwprop.c.orig	2009-04-29 16:34:51.000000000 +0200
--- gcc/tree-ssa-forwprop.c	2009-05-04 11:43:53.000000000 +0200
*************** forward_propagate_addr_into_variable_arr
*** 623,630 ****
    if (!host_integerp (tunit, 1))
      return false;
  
!   /* Get the offset's defining statement.  */
    offset_def = SSA_NAME_DEF_STMT (offset);
  
    /* Try to find an expression for a proper index.  This is either a
       multiplication expression by the element size or just the ssa name we came
--- 623,636 ----
    if (!host_integerp (tunit, 1))
      return false;
  
!   /* Get the offset's defining statement looking through extensions.  */
    offset_def = SSA_NAME_DEF_STMT (offset);
+   while (is_gimple_assign (offset_def)
+ 	 && CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (offset_def))
+ 	 && TREE_CODE (gimple_assign_rhs1 (offset_def)) == SSA_NAME
+ 	 && (TYPE_PRECISION (gimple_expr_type (offset_def))
+ 	     >= TYPE_PRECISION (TREE_TYPE (gimple_assign_rhs1 (offset_def)))))
+     offset_def = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (offset_def));
  
    /* Try to find an expression for a proper index.  This is either a
       multiplication expression by the element size or just the ssa name we came
*************** simplify_bitwise_and (gimple_stmt_iterat
*** 1213,1218 ****
--- 1219,1271 ----
    return;
  }
  
+ static void
+ forward_propagate_pointer_plus_expr (tree name, gimple stmt)
+ {
+   gimple use_stmt;
+   use_operand_p use_p;
+   tree rhs, lhs;
+   gimple_stmt_iterator gsi;
+ 
+   if (!single_imm_use (name, &use_p, &use_stmt)
+       || !gimple_assign_single_p (use_stmt))
+     return;
+ 
+   rhs = gimple_assign_rhs1 (use_stmt);
+   while (handled_component_p (rhs))
+     rhs = TREE_OPERAND (rhs, 0);
+ 
+   lhs = gimple_assign_lhs (use_stmt);
+   while (handled_component_p (lhs))
+     lhs = TREE_OPERAND (lhs, 0);
+ 
+   if ((TREE_CODE (rhs) == INDIRECT_REF
+        && TREE_OPERAND (rhs, 0) == name)
+       || (TREE_CODE (lhs) == INDIRECT_REF
+ 	  && TREE_OPERAND (lhs, 0) == name))
+     {
+       tree ptrtype = TREE_TYPE (gimple_assign_rhs1 (stmt));
+       tree rhs2 = gimple_assign_rhs2 (stmt);
+       tree arr;
+       gsi = gsi_for_stmt (stmt);
+       arr = build1 (INDIRECT_REF,
+ 		    build_array_type (TREE_TYPE (ptrtype), NULL_TREE),
+ 		    gimple_assign_rhs1 (stmt));
+       arr = build4 (ARRAY_REF, TREE_TYPE (ptrtype),
+ 		    arr, integer_zero_node, NULL_TREE, NULL_TREE);
+       arr = build1 (ADDR_EXPR, ptrtype, arr);
+       if (TREE_CODE (rhs2) == INTEGER_CST)
+ 	{
+ 	  tree new_rhs = maybe_fold_stmt_addition (ptrtype, arr, rhs2);
+ 	  if (new_rhs)
+ 	    gimple_assign_set_rhs_from_tree (&gsi, new_rhs);
+ 	}
+       else
+ 	forward_propagate_addr_into_variable_array_index (rhs2, arr, &gsi);
+       update_stmt (gsi_stmt (gsi));
+     }
+ }
+ 
  /* Main entry point for the forward propagation optimizer.  */
  
  static unsigned int
*************** tree_ssa_forward_propagate_single_use_va
*** 1272,1277 ****
--- 1325,1337 ----
  		  if (gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR)
  		    gsi_next (&gsi);
  		}
+ 	      else if (gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR
+ 		       && TREE_CODE (rhs) == SSA_NAME)
+ 		{
+ 		  forward_propagate_pointer_plus_expr (lhs, stmt);
+ 		  if (gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR)
+ 		    gsi_next (&gsi);
+ 		}
  	      else if ((gimple_assign_rhs_code (stmt) == BIT_NOT_EXPR
  		        || gimple_assign_rhs_code (stmt) == NEGATE_EXPR)
  		       && TREE_CODE (rhs) == SSA_NAME)
Index: gcc/c-common.c
===================================================================
*** gcc/c-common.c.orig	2009-05-04 11:09:18.000000000 +0200
--- gcc/c-common.c	2009-05-04 11:43:53.000000000 +0200
*************** pointer_int_sum (enum tree_code resultco
*** 3806,3848 ****
       they are really pointers.  */
    fold_defer_overflow_warnings ();
  
-   /* If what we are about to multiply by the size of the elements
-      contains a constant term, apply distributive law
-      and multiply that constant term separately.
-      This helps produce common subexpressions.  */
-   if ((TREE_CODE (intop) == PLUS_EXPR || TREE_CODE (intop) == MINUS_EXPR)
-       && !TREE_CONSTANT (intop)
-       && TREE_CONSTANT (TREE_OPERAND (intop, 1))
-       && TREE_CONSTANT (size_exp)
-       /* If the constant comes from pointer subtraction,
- 	 skip this optimization--it would cause an error.  */
-       && TREE_CODE (TREE_TYPE (TREE_OPERAND (intop, 0))) == INTEGER_TYPE
-       /* If the constant is unsigned, and smaller than the pointer size,
- 	 then we must skip this optimization.  This is because it could cause
- 	 an overflow error if the constant is negative but INTOP is not.  */
-       && (!TYPE_UNSIGNED (TREE_TYPE (intop))
- 	  || (TYPE_PRECISION (TREE_TYPE (intop))
- 	      == TYPE_PRECISION (TREE_TYPE (ptrop)))))
-     {
-       enum tree_code subcode = resultcode;
-       tree int_type = TREE_TYPE (intop);
-       if (TREE_CODE (intop) == MINUS_EXPR)
- 	subcode = (subcode == PLUS_EXPR ? MINUS_EXPR : PLUS_EXPR);
-       /* Convert both subexpression types to the type of intop,
- 	 because weird cases involving pointer arithmetic
- 	 can result in a sum or difference with different type args.  */
-       ptrop = build_binary_op (EXPR_LOCATION (TREE_OPERAND (intop, 1)),
- 			       subcode, ptrop,
- 			       convert (int_type, TREE_OPERAND (intop, 1)), 1);
-       intop = convert (int_type, TREE_OPERAND (intop, 0));
-     }
- 
    /* Convert the integer argument to a type the same size as sizetype
       so the multiply won't overflow spuriously.  */
!   if (TYPE_PRECISION (TREE_TYPE (intop)) != TYPE_PRECISION (sizetype)
!       || TYPE_UNSIGNED (TREE_TYPE (intop)) != TYPE_UNSIGNED (sizetype))
      intop = convert (c_common_type_for_size (TYPE_PRECISION (sizetype),
! 					     TYPE_UNSIGNED (sizetype)), intop);
  
    /* Replace the integer argument with a suitable product by the object size.
       Do this multiplication as signed, then convert to the appropriate
--- 3806,3816 ----
       they are really pointers.  */
    fold_defer_overflow_warnings ();
  
    /* Convert the integer argument to a type the same size as sizetype
       so the multiply won't overflow spuriously.  */
!   if (TYPE_PRECISION (TREE_TYPE (intop)) != TYPE_PRECISION (sizetype))
      intop = convert (c_common_type_for_size (TYPE_PRECISION (sizetype),
! 					     TYPE_UNSIGNED (TREE_TYPE (intop))), intop);
  
    /* Replace the integer argument with a suitable product by the object size.
       Do this multiplication as signed, then convert to the appropriate
Index: gcc/testsuite/gcc.dg/tree-ssa/forwprop-13.c
===================================================================
*** /dev/null	1970-01-01 00:00:00.000000000 +0000
--- gcc/testsuite/gcc.dg/tree-ssa/forwprop-13.c	2009-05-04 12:02:50.000000000 +0200
***************
*** 0 ****
--- 1,28 ----
+ /* { dg-do compile } */
+ /* { dg-options "-O -fdump-tree-copyprop1" } */
+ 
+ int test1(int *p, int i)
+ {
+   int (*q)[] = (int(*)[])p;
+   return (*q)[i];
+ }
+ 
+ int test2(int *p, int i)
+ {
+   int (*q)[] = (int(*)[]) (p + i);
+   return (*q)[0];
+ }
+ 
+ int bar(int *p, int i)
+ {
+   return p[i];
+ }
+ 
+ /* We should get array accesses based on p for all cases.  For test 2 we
+    for now miss propagation of q_6 in
+      q_6 = &(*p_5(D))[D.1601_2];
+      D.1604_7 = (*q_6)[0];  */
+ 
+ /* { dg-final { scan-tree-dump-times "= \\\(\\\*p_.\\\(D\\\)\\\)\\\[" 2 "copyprop1" } } */
+ /* { dg-final { scan-tree-dump-times "= \\\(\\\*p_.\\\(D\\\)\\\)\\\[" 3 "copyprop1" { xfail *-*-* } } } */
+ /* { dg-final { cleanup-tree-dump "copyprop1" } } */
Index: gcc/tree-inline.c
===================================================================
*** gcc/tree-inline.c.orig	2009-04-27 17:31:10.000000000 +0200
--- gcc/tree-inline.c	2009-05-04 11:43:53.000000000 +0200
*************** remap_gimple_op_r (tree *tp, int *walk_s
*** 768,774 ****
  		 INDIRECT_REF, but we absolutely rely on that.  As
  		 fold_indirect_ref does other useful transformations,
  		 try that first, though.  */
! 	      type = TREE_TYPE (TREE_TYPE (*n));
  	      new_tree = unshare_expr (*n);
  	      old = *tp;
  	      *tp = gimple_fold_indirect_ref (new_tree);
--- 768,774 ----
  		 INDIRECT_REF, but we absolutely rely on that.  As
  		 fold_indirect_ref does other useful transformations,
  		 try that first, though.  */
! 	      type = TREE_TYPE (*tp);
  	      new_tree = unshare_expr (*n);
  	      old = *tp;
  	      *tp = gimple_fold_indirect_ref (new_tree);
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-24.c
===================================================================
*** /dev/null	1970-01-01 00:00:00.000000000 +0000
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-24.c	2009-05-04 11:43:53.000000000 +0200
***************
*** 0 ****
--- 1,19 ----
+ /* { dg-do compile } */
+ /* { dg-options "-O -fdump-tree-fre-stats" } */
+ 
+ int f(int *r)
+ {
+   r[0] = 0;
+   r[1] = 1;
+   return r[0];
+ }
+ 
+ int g(int *r)
+ {
+   r[1] = 1;
+   r[0] = 0;
+   return r[1];
+ }
+ 
+ /* { dg-final { scan-tree-dump-times "Eliminated: 1" 2 "fre" } } */
+ /* { dg-final { cleanup-tree-dump "fre" } } */
Index: gcc/tree-cfg.c
===================================================================
*** gcc/tree-cfg.c.orig	2009-04-29 17:01:45.000000000 +0200
--- gcc/tree-cfg.c	2009-05-04 11:57:38.000000000 +0200
*************** verify_expr (tree *tp, int *walk_subtree
*** 3094,3099 ****
--- 3094,3117 ----
  }
  
  
+ /* Returns true if there is one pointer type in TYPE_POINTER_TO (SRC_OBJ)
+    list of pointer-to types that is trivially convertible to DEST.  */
+ 
+ static bool
+ one_pointer_from_useless_type_conversion_p (tree dest_obj, tree src)
+ {
+   tree dest;
+ 
+   if (!TYPE_POINTER_TO (dest_obj))
+     return true;
+ 
+   for (dest = TYPE_POINTER_TO (dest_obj); dest; dest = TYPE_NEXT_PTR_TO (dest))
+     if (useless_type_conversion_p (dest, src))
+       return true;
+ 
+   return false;
+ }
+ 
  /* Verify if EXPR is either a GIMPLE ID or a GIMPLE indirect reference.
     Returns true if there is an error, otherwise false.  */
  
*************** verify_types_in_gimple_min_lval (tree ex
*** 3123,3130 ****
        debug_generic_stmt (op);
        return true;
      }
!   if (!useless_type_conversion_p (TREE_TYPE (expr),
! 				  TREE_TYPE (TREE_TYPE (op))))
      {
        error ("type mismatch in indirect reference");
        debug_generic_stmt (TREE_TYPE (expr));
--- 3141,3148 ----
        debug_generic_stmt (op);
        return true;
      }
!   if (!one_pointer_from_useless_type_conversion_p (TREE_TYPE (expr),
! 						   TREE_TYPE (op)))
      {
        error ("type mismatch in indirect reference");
        debug_generic_stmt (TREE_TYPE (expr));


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]