This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Committed] Reduce memory usage of DOM


Hopefully, no-one will object to patches that reduce GCC's memory
usage during stage3, as it's clear our memory footprint has regressed.

I recently instrumented the middle-end's tree building routines to
get some measures of how frequently common tree-node idioms (local
subtrees) occur in the compiler.  The aim was to identify commonly
missed constant folding opportunities, and/or prioritize the many
missed-optimization PRs in bugzilla.  For example, it should come as
no surprise that my recent attack on CONJ_EXPR has absolutely no
improvement on GCC's own code (as it doesn't use complex numbers!).

>From an analysis of 19,171,768 samples captured by the code during
a bootstrap of GCC, the 10 most commonly constructed trees were:

 920027 nop_expr(ssa_name)
 733021 goto_expr(label_decl)
 678831 ordered_expr(ssa_name,integer_cst)
 662869 le_expr(ssa_name,integer_cst)
 647228 mult_expr(nop_expr,integer_cst)
 620024 ne_expr(ssa_name,integer_cst)
 618547 ge_expr(ssa_name,integer_cst)
 586704 nop_expr(var_decl)
 573462 component_ref(indirect_ref,field_decl,NULL)
 536813 label_expr(label_decl)
 ...

And so on for a total of 4528 unique subtrees.

Clearly, such an analysis has been worthwhile, as already by
position 3 in the list (representing 3.5% of the tree nodes
constructed by GCC) is the curious idiom ORDERED_EXPR of an
SSA_NAME and an INTEGER_CST.  What's curious is that the
middle-end tree-code ORDERED_EXPR represents a floating point
comparison of two operands to test whether either is NaN,
i.e. this corresponds to libm's !isunordered(x,y).  But
the smoking gun is the INTEGER_CST operand, indicating that
this is being applied to integral types.  It's not immediately
clear whether the "unordered" comparison codes are even valid
for non-floating point types, or whether fold-const should
immediately simplify them to true or false.

It turns out that the source of this huge number of strange
(if not invalid) trees is the function record_cond in
tree-ssa-dom.c.  This routine, when recording that an expression
such as "x == y" is true across an edge, creates an additional
annotation that "ordered(x,y)" must also be true.  This is
correct and reasonable for floating point equality, but at
best useless for the extremely common case of integer comparisons.

The patch below avoids the creation of these unnecessary tree
nodes in DOM, but only adding the FP ordered/unordered annotations
to floating point comparison edges.  The change is relatively small
(and obvious once explained), but as shown above saves a significant
number of memory allocations and may even have an observable effect
on compile-times.


The following patch has been tested on x86_64-unknown-linux-gnu
with a full "make bootstrap", all default languages, and regression
tested with a top-level "make -k check" with no new failures.

Committed to mainline as revision 114489.  If someone can measure
an observable improvement, we may consider this for 4.1.  For 4.3,
I've a strategy for avoiding the creation of any these trees in
record_conditions, but thats what I said about 4.2 when 4.1 entered
stage3: http://gcc.gnu.org/ml/gcc-patches/2004-09/msg00542.html
Hehe :-}


2006-06-08  Roger Sayle  <roger@eyesopen.com>

	* tree-ssa-dom.c (record_conditions): Only record "unordered"
	conditions from floating point comparisons.


Index: tree-ssa-dom.c
===================================================================
*** tree-ssa-dom.c	(revision 114381)
--- tree-ssa-dom.c	(working copy)
*************** record_conditions (struct edge_info *edg
*** 950,985 ****
      {
      case LT_EXPR:
      case GT_EXPR:
!       edge_info->max_cond_equivalences = 12;
!       edge_info->cond_equivalences = XNEWVEC (tree, 12);
        build_and_record_new_cond ((TREE_CODE (cond) == LT_EXPR
  				  ? LE_EXPR : GE_EXPR),
  				 op0, op1, &edge_info->cond_equivalences[4]);
-       build_and_record_new_cond (ORDERED_EXPR, op0, op1,
- 				 &edge_info->cond_equivalences[6]);
        build_and_record_new_cond (NE_EXPR, op0, op1,
! 				 &edge_info->cond_equivalences[8]);
!       build_and_record_new_cond (LTGT_EXPR, op0, op1,
! 				 &edge_info->cond_equivalences[10]);
        break;

      case GE_EXPR:
      case LE_EXPR:
!       edge_info->max_cond_equivalences = 6;
!       edge_info->cond_equivalences = XNEWVEC (tree, 6);
!       build_and_record_new_cond (ORDERED_EXPR, op0, op1,
! 				 &edge_info->cond_equivalences[4]);
        break;

      case EQ_EXPR:
!       edge_info->max_cond_equivalences = 10;
!       edge_info->cond_equivalences = XNEWVEC (tree, 10);
!       build_and_record_new_cond (ORDERED_EXPR, op0, op1,
! 				 &edge_info->cond_equivalences[4]);
        build_and_record_new_cond (LE_EXPR, op0, op1,
! 				 &edge_info->cond_equivalences[6]);
        build_and_record_new_cond (GE_EXPR, op0, op1,
! 				 &edge_info->cond_equivalences[8]);
        break;

      case UNORDERED_EXPR:
--- 950,1010 ----
      {
      case LT_EXPR:
      case GT_EXPR:
!       if (FLOAT_TYPE_P (TREE_TYPE (op0)))
! 	{
! 	  edge_info->max_cond_equivalences = 12;
! 	  edge_info->cond_equivalences = XNEWVEC (tree, 12);
! 	  build_and_record_new_cond (ORDERED_EXPR, op0, op1,
! 				     &edge_info->cond_equivalences[8]);
! 	  build_and_record_new_cond (LTGT_EXPR, op0, op1,
! 				     &edge_info->cond_equivalences[10]);
! 	}
!       else
! 	{
! 	  edge_info->max_cond_equivalences = 8;
! 	  edge_info->cond_equivalences = XNEWVEC (tree, 8);
! 	}
!
        build_and_record_new_cond ((TREE_CODE (cond) == LT_EXPR
  				  ? LE_EXPR : GE_EXPR),
  				 op0, op1, &edge_info->cond_equivalences[4]);
        build_and_record_new_cond (NE_EXPR, op0, op1,
! 				 &edge_info->cond_equivalences[6]);
        break;

      case GE_EXPR:
      case LE_EXPR:
!       if (FLOAT_TYPE_P (TREE_TYPE (op0)))
! 	{
! 	  edge_info->max_cond_equivalences = 6;
! 	  edge_info->cond_equivalences = XNEWVEC (tree, 6);
! 	  build_and_record_new_cond (ORDERED_EXPR, op0, op1,
! 				     &edge_info->cond_equivalences[4]);
! 	}
!       else
! 	{
! 	  edge_info->max_cond_equivalences = 4;
! 	  edge_info->cond_equivalences = XNEWVEC (tree, 4);
! 	}
        break;

      case EQ_EXPR:
!       if (FLOAT_TYPE_P (TREE_TYPE (op0)))
! 	{
! 	  edge_info->max_cond_equivalences = 10;
! 	  edge_info->cond_equivalences = XNEWVEC (tree, 10);
! 	  build_and_record_new_cond (ORDERED_EXPR, op0, op1,
! 				     &edge_info->cond_equivalences[8]);
! 	}
!       else
! 	{
! 	  edge_info->max_cond_equivalences = 8;
! 	  edge_info->cond_equivalences = XNEWVEC (tree, 8);
! 	}
        build_and_record_new_cond (LE_EXPR, op0, op1,
! 				 &edge_info->cond_equivalences[4]);
        build_and_record_new_cond (GE_EXPR, op0, op1,
! 				 &edge_info->cond_equivalences[6]);
        break;

      case UNORDERED_EXPR:


Roger
--


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]