This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[tree-ssa] DCE with control dependence again (with numbers, for a change)


Hi Jeff, all,

Here are the numbers for CD-DCE with computed goto factoring disabled.
I've added two timevars to see if/where we lose.

current:
Results for interpret.ii, flags=-O2, 3 runs:
tree DCE		0.54	0.53	0.53
TOTAL			30.29	30.28	30.27

current + patch:
Results for interpret.ii, flags=-O2, 3 runs:
tree conservative DCE	0.32	0.33	0.32
tree aggressive DCE	1.01	1.02	1.02
control dependence	0.03	0.03	0.03
TOTAL			30.86	30.87	30.86

So in aggressive mode with computed gotos unfactored, yes we are
slower for interpret.ii, but not a lot slower.  For this file the
timing difference disappears when computed goto factoring is enabled.
I don't know how common an insane CFG like this one is in "real
world" code without computed gotos.


Next, I tested 8361 at -O2 and -O3:

current:
Results for 8361.ii, flags=-O2, 3 runs:
tree DCE		0.61	0.59	0.63
TOTAL			57.66	57.53	57.59
.text size:	320954

Results for 8361.ii, flags= -O3, 3 runs:
tree DCE		0.56	0.63	0.65
TOTAL			58.98	58.96	58.96
.text size:	315823

current + patch:
Results for 8361.ii, flags=-O2, 3 runs:
tree conservative DCE	0.41	0.49	0.47
tree aggressive DCE	0.26	0.32	0.29
control dependence	0.04	0.05	0.01
TOTAL			57.15	57.16	57.21
.text size:	317702

Results for 8361.ii, flags=-O3, 3 runs:
tree conservative DCE	0.39	0.45	0.48
tree aggressive DCE	0.26	0.29	0.24
control dependence	0.02	0.07	0.04
TOTAL			58.46	58.45	58.43
.text size:	312524

So we're slightly faster _with_ the patch (!), and the generated
code is smaller.


Then I timed the preprocessed GCC files that Diego kindly provided
to me, and again we're slightly faster:

current:
Results for cc1-i-files, flags=-O2, 3 runs:
Total time:		7m43.476s	7m43.341s	7m43.379s
.text size of *.o:	6621218

current + patch:
Results for cc1-i-files, flags=-O2, 3 runs:
Total time:		7m40.380s	7m40.392s	7m40.341s
.text size of *.o:	6620014


Finally, I looked at how big cc1 and cc1plus are with and without
the patch:

current:
.text size of GCC binaries after bootstrap:
cc1	3904480
cc1plus	4367792

current + patch:
.text size of GCC binaries after bootstrap:
cc1	3903181
cc1plus	4366413

So again we produce smaller binaries.  This is quite remarkable IMO
because the new tree-ssa-dce code is much larger than the old one
(.text size 4842 with current, .text size 8910 with the patch).

I cannot give you benchmark performance numbers right now, I'm still
trying to get SPEC numbers with Andreas Jaeger. 
(I doubt it will be a huge win -- but based on the numbers above and
on the code size reductions and on some toy tests I did with gzip, I
think we can win a point or two for some tests...)

I got these results with the attached patch and new tree-ssa-dce.c.
Full testing is still ongoing (Java, sigh).  If you think these numbers
are already unacceptable, lemme know asap so I can press ctrl-C and go
on to something more useful ;-)

Gr.
Steven
Index: timevar.def
===================================================================
RCS file: /cvs/gcc/gcc/gcc/timevar.def,v
retrieving revision 1.14.2.29
diff -c -3 -p -r1.14.2.29 timevar.def
*** timevar.def	7 Jan 2004 23:44:14 -0000	1.14.2.29
--- timevar.def	9 Jan 2004 16:53:29 -0000
*************** DEFTIMEVAR (TV_TREE_SSA_DOMINATOR_OPTS  
*** 73,79 ****
  DEFTIMEVAR (TV_TREE_SRA              , "tree SRA")
  DEFTIMEVAR (TV_TREE_CCP		     , "tree CCP")
  DEFTIMEVAR (TV_TREE_PRE		     , "tree PRE")
! DEFTIMEVAR (TV_TREE_DCE		     , "tree DCE")
  DEFTIMEVAR (TV_TREE_LOOP	     , "tree loop optimization")
  DEFTIMEVAR (TV_TREE_SSA_TO_NORMAL    , "tree SSA to normal")
  DEFTIMEVAR (TV_TREE_SSA_VERIFY       , "tree SSA verifier")
--- 73,80 ----
  DEFTIMEVAR (TV_TREE_SRA              , "tree SRA")
  DEFTIMEVAR (TV_TREE_CCP		     , "tree CCP")
  DEFTIMEVAR (TV_TREE_PRE		     , "tree PRE")
! DEFTIMEVAR (TV_TREE_DCE		     , "tree conservative DCE")
! DEFTIMEVAR (TV_TREE_CD_DCE	     , "tree aggressive DCE")
  DEFTIMEVAR (TV_TREE_LOOP	     , "tree loop optimization")
  DEFTIMEVAR (TV_TREE_SSA_TO_NORMAL    , "tree SSA to normal")
  DEFTIMEVAR (TV_TREE_SSA_VERIFY       , "tree SSA verifier")
*************** DEFTIMEVAR (TV_TREE_STMT_VERIFY      , "
*** 81,86 ****
--- 82,88 ----
  DEFTIMEVAR (TV_CFG_VERIFY            , "CFG verifier")
  DEFTIMEVAR (TV_CGRAPH_VERIFY         , "callgraph verifier")
  DEFTIMEVAR (TV_DOM_FRONTIERS         , "dominance frontiers")
+ DEFTIMEVAR (TV_CONTROL_DEPENDENCES   , "control dependences")
  DEFTIMEVAR (TV_OVERLOAD              , "overload resolution")
  DEFTIMEVAR (TV_TEMPLATE_INSTANTIATION, "template instantiation")
  DEFTIMEVAR (TV_EXPAND		     , "expand")
Index: tree-flow.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Attic/tree-flow.h,v
retrieving revision 1.1.4.179
diff -c -3 -p -r1.1.4.179 tree-flow.h
*** tree-flow.h	7 Jan 2004 23:44:15 -0000	1.1.4.179
--- tree-flow.h	9 Jan 2004 16:53:31 -0000
*************** extern void debug_dominator_optimization
*** 523,529 ****
  extern void propagate_copy (tree *, tree);
  
  /* In tree-ssa-dce.c  */
! void tree_ssa_dce (tree, enum tree_dump_index);
  
  /* In tree-ssa-loop.c  */
  void tree_ssa_loop_opt (tree, enum tree_dump_index);
--- 523,529 ----
  extern void propagate_copy (tree *, tree);
  
  /* In tree-ssa-dce.c  */
! void tree_ssa_dce (tree, bool, enum tree_dump_index);
  
  /* In tree-ssa-loop.c  */
  void tree_ssa_loop_opt (tree, enum tree_dump_index);
Index: tree-optimize.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-optimize.c,v
retrieving revision 1.1.4.104
diff -c -3 -p -r1.1.4.104 tree-optimize.c
*** tree-optimize.c	8 Jan 2004 20:08:57 -0000	1.1.4.104
--- tree-optimize.c	9 Jan 2004 16:53:31 -0000
*************** optimize_function_tree (tree fndecl, tre
*** 134,140 ****
        /* Do a first DCE pass to remove dead pointer assignments taking the
  	 address of local variables.  */
        if (flag_tree_dce)
! 	tree_ssa_dce (fndecl, TDI_dce_1);
  
        ggc_collect ();
  
--- 134,140 ----
        /* Do a first DCE pass to remove dead pointer assignments taking the
  	 address of local variables.  */
        if (flag_tree_dce)
! 	tree_ssa_dce (fndecl, /*aggressive=*/ false, TDI_dce_1);
  
        ggc_collect ();
  
*************** optimize_function_tree (tree fndecl, tre
*** 191,197 ****
        /* Do a second DCE pass.  */
        if (flag_tree_dce)
  	{
! 	  tree_ssa_dce (fndecl, TDI_dce_2);
  	  ggc_collect ();
  
  #ifdef ENABLE_CHECKING
--- 191,197 ----
        /* Do a second DCE pass.  */
        if (flag_tree_dce)
  	{
! 	  tree_ssa_dce (fndecl, /*aggressive=*/ false, TDI_dce_2);
  	  ggc_collect ();
  
  #ifdef ENABLE_CHECKING
*************** optimize_function_tree (tree fndecl, tre
*** 260,269 ****
  #endif
  	}
  
!       /* Do a third DCE pass.  */
        if (flag_tree_dce)
  	{
! 	  tree_ssa_dce (fndecl, TDI_dce_3);
  	  ggc_collect ();
  
  #ifdef ENABLE_CHECKING
--- 260,270 ----
  #endif
  	}
  
!       /* Do a third DCE pass.  Do more aggressive DCE using control
! 	 dependence at -O2 or better.  */
        if (flag_tree_dce)
  	{
! 	  tree_ssa_dce (fndecl, /*aggressive=*/ optimize >= 2, TDI_dce_3);
  	  ggc_collect ();
  
  #ifdef ENABLE_CHECKING
/* Dead code elimination pass for the GNU compiler.
   Copyright (C) 2002, 2003, 2004 Free Software Foundation, Inc.
   Contributed by Ben Elliston <bje@redhat.com>
   and Andrew MacLeod <amacleod@redhat.com>
 
This file is part of GCC.
   
GCC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2, or (at your option) any
later version.
   
GCC is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for more details.
   
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING.  If not, write to the Free
Software Foundation, 59 Temple Place - Suite 330, Boston, MA
02111-1307, USA.  */

/* Dead code elimination.

   References:

     Building an Optimizing Compiler,
     Robert Morgan, Butterworth-Heinemann, 1998, Section 8.9.

     Advanced Compiler Design and Implementation,
     Steven Muchnick, Morgan Kaufmann, 1997, Section 18.10.

   Dead-code elimination is the removal of statements which have no
   impact on the program's output.  "Dead statements" have no impact
   on the program's output, while "necessary statements" may have
   impact on the output.

   The algorithm consists of three phases:
   1. Marking as necessary all statements known to be necessary,
      e.g. most function calls, writing a value to memory, etc;
   2. Propagating necessary statements, e.g., the statements
      giving values to operands in necessary statements; and
   3. Removing dead statements.  */

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "tm.h"
#include "errors.h"
#include "ggc.h"

/* These RTL headers are needed for basic-block.h.  */
#include "rtl.h"
#include "tm_p.h"
#include "hard-reg-set.h"
#include "basic-block.h"

#include "tree.h"
#include "bitmap.h"
#include "diagnostic.h"
#include "tree-flow.h"
#include "tree-simple.h"
#include "tree-dump.h"
#include "timevar.h"

/* Debugging dumps.  */
static FILE *tree_dump_file;
static int tree_dump_flags;

static struct stmt_stats
{
  int total;
  int total_phis;
  int removed;
  int removed_phis;
} stats;

static varray_type worklist;

/* Vector indicating an SSA name has already been processed and marked
   as necessary.  */
static sbitmap processed;

/* Vector indicating that last_stmt of a basic block has already been
   marked as necessary.  */
static sbitmap last_stmt_necessary;

/* Before we can determine whether a control branch is dead, we need to
   compute which blocks are control dependent on which edges.

   We expect each block to be control dependent on very few edges so we
   use a bitmap for each block recording its edges.  An array holds the
   bitmap.  The Ith bit in the bitmap is set if that block is dependent
   on the Ith edge.  */
bitmap *control_dependence_map;

/* Execute CODE for each edge (given number EDGE_NUMBER within the CODE)
   on which the block with index N is control dependent.  */
#define EXECUTE_IF_CONTROL_DEPENDENT(N, EDGE_NUMBER, CODE)		      \
  EXECUTE_IF_SET_IN_BITMAP (control_dependence_map[N], 0, EDGE_NUMBER, CODE)

/* Local function prototypes.  */
static inline void set_control_dependence_map_bit (basic_block, int);
static inline void clear_control_dependence_bitmap (basic_block);
static void find_all_control_dependences (struct edge_list *);
static void find_control_dependence (struct edge_list *, int);
static inline basic_block find_pdom (basic_block);

static inline bool necessary_p (tree);
static inline void clear_necessary (tree);
static inline void mark_stmt_necessary (tree, bool);
static inline void mark_operand_necessary (tree);

static bool need_to_preserve_store (tree);
static void mark_stmt_if_obviously_necessary (tree, bool);
static void find_obviously_necessary_stmts (bool);

static void mark_control_dependent_edges_necessary (basic_block, struct edge_list *);
static void propagate_necessity (struct edge_list *);

static void eliminate_unnecessary_stmts (void);
static void remove_dead_phis (basic_block);
static void remove_dead_stmt (block_stmt_iterator *, basic_block);

static void print_stats (void);
static void tree_dce_init (bool);
static void tree_dce_done (bool);

/* Indicate block BB is control dependent on an edge with index EDGE_INDEX.  */
static inline void
set_control_dependence_map_bit (basic_block bb, int edge_index)
{
  if (bb == ENTRY_BLOCK_PTR)
    return;
  if (bb == EXIT_BLOCK_PTR)
    abort ();
  bitmap_set_bit (control_dependence_map[bb->index], edge_index);
}

/* Clear all control dependences for block BB.  */
static inline void
clear_control_dependence_bitmap (basic_block bb)
{
  bitmap_clear (control_dependence_map[bb->index]);
}

/* Record all blocks' control dependences on all edges in the edge
   list EL, ala Morgan, Section 3.6.  */

static void
find_all_control_dependences (struct edge_list *el)
{
  int i;

  for (i = 0; i < NUM_EDGES (el); ++i)
    find_control_dependence (el, i);
}

/* Determine all blocks' control dependences on the given edge with edge_list
   EL index EDGE_INDEX, ala Morgan, Section 3.6.  */

static void
find_control_dependence (struct edge_list *el, int edge_index)
{
  basic_block current_block;
  basic_block ending_block;

#ifdef ENABLE_CHECKING
  if (INDEX_EDGE_PRED_BB (el, edge_index) == EXIT_BLOCK_PTR)
    abort ();
#endif

  if (INDEX_EDGE_PRED_BB (el, edge_index) == ENTRY_BLOCK_PTR)
    ending_block = ENTRY_BLOCK_PTR->next_bb;
  else
    ending_block = find_pdom (INDEX_EDGE_PRED_BB (el, edge_index));

  for (current_block = INDEX_EDGE_SUCC_BB (el, edge_index);
       current_block != ending_block && current_block != EXIT_BLOCK_PTR;
       current_block = find_pdom (current_block))
    {
      edge e = INDEX_EDGE (el, edge_index);

      /* For abnormal edges, we don't make current_block control
	 dependent because instructions that throw are always necessary
	 anyway.  */
      if (e->flags & EDGE_ABNORMAL)
	continue;

      set_control_dependence_map_bit (current_block, edge_index);
    }
}

/* Find the immediate postdominator PDOM of the specified basic block BLOCK.
   This function is necessary because some blocks have negative numbers.  */

static inline basic_block
find_pdom (basic_block block)
{
  if (block == ENTRY_BLOCK_PTR)
    abort ();
  else if (block == EXIT_BLOCK_PTR)
    return EXIT_BLOCK_PTR;
  else
    {
      basic_block bb = get_immediate_dominator (CDI_POST_DOMINATORS, block);
      if (!bb)
	return EXIT_BLOCK_PTR;
      return bb;
    }
}

#define NECESSARY(stmt)		stmt->common.asm_written_flag

/* Return true if T is marked necessary.  */
static inline bool
necessary_p (tree t)
{
  return NECESSARY (t);
}

/* Clear the necessary mark for T.  */
static inline void
clear_necessary (tree t)
{
  NECESSARY (t) = 0;
}

/* If STMT is not already marked necessary, mark it, and add it to the
   worklist if ADD_TO_WORKLIST is true.  */
static inline void
mark_stmt_necessary (tree stmt, bool add_to_worklist)
{
#ifdef ENABLE_CHECKING
  if (stmt == NULL
      || stmt == error_mark_node
      || (stmt && DECL_P (stmt)))
    abort ();
#endif

  if (necessary_p (stmt))
    return;

  if (tree_dump_file && (tree_dump_flags & TDF_DETAILS))
    {
      fprintf (tree_dump_file, "Marking useful stmt: ");
      print_generic_stmt (tree_dump_file, stmt, TDF_SLIM);
      fprintf (tree_dump_file, "\n");
    }

  NECESSARY (stmt) = 1;
  if (add_to_worklist)
    VARRAY_PUSH_TREE (worklist, stmt);
}

/* Mark the statement defining operand OP as necessary.  */

static inline void
mark_operand_necessary (tree op)
{
  tree stmt;
  int ver;

#ifdef ENABLE_CHECKING
  if (op == NULL)
    abort ();
#endif

  ver = SSA_NAME_VERSION (op);
  if (TEST_BIT (processed, ver))
    return;
  SET_BIT (processed, ver);

  stmt = SSA_NAME_DEF_STMT (op);
#ifdef ENABLE_CHECKING
  if (stmt == NULL)
    abort ();
#endif

  if (necessary_p (stmt))
    return;

  NECESSARY (stmt) = 1;
  VARRAY_PUSH_TREE (worklist, stmt);
}
#undef NECESSARY

/* Return true if a store to a variable needs to be preserved.  */

static bool
need_to_preserve_store (tree var)
{
  tree base_symbol;
  tree sym;

  if (var == NULL)
    return false;

  sym = SSA_NAME_VAR (var);
  base_symbol = get_base_symbol (var);

  /* Store to global variables must be preserved.  */
  if (decl_function_context (base_symbol) != current_function_decl)
    return true;

  /* Static locals must be preserved as well.  */
  if (TREE_STATIC (base_symbol))
    return true;

  /* If SYM may alias global memory, we also need to preserve the store.  */
  if (may_alias_global_mem_p (sym))
    return true;

  return false;
}


/* Mark STMT as necessary if it obviously is.  Add it to the worklist if
   it can make other statements necessary.

   If AGGRESSIVE is false, control statements are conservatively marked as
   necessary.  */

static void
mark_stmt_if_obviously_necessary (tree stmt, bool aggressive)
{
  def_optype defs;
  vdef_optype vdefs;
  stmt_ann_t ann;
  size_t i;

  clear_necessary (stmt);

  /* Statements that are implicitly live.  Most function calls, asm and return
     statements are required.  Labels and BIND_EXPR nodes are kept because
     they are control flow, and we have no way of knowing whether they can be
     removed.  DCE can eliminate all the other statements in a block, and CFG
     can then remove the block and labels.  */
  switch (TREE_CODE (stmt))
    {
    case BIND_EXPR:
    case LABEL_EXPR:
    case CASE_LABEL_EXPR:
      mark_stmt_necessary (stmt, false);
      return;

    case ASM_EXPR:
    case RESX_EXPR:
    case RETURN_EXPR:
      mark_stmt_necessary (stmt, true);
      return;

    case CALL_EXPR:
      /* Most, but not all function calls are required.  Function calls that
	 produce no result and have no side effects (i.e. const pure
	 functions) are unnecessary.  */
      if (TREE_SIDE_EFFECTS (stmt))
	mark_stmt_necessary (stmt, true);
      return;

    case MODIFY_EXPR:
      if (TREE_CODE (TREE_OPERAND (stmt, 1)) == CALL_EXPR
	  && TREE_SIDE_EFFECTS (TREE_OPERAND (stmt, 1)))
	{
	  mark_stmt_necessary (stmt, true);
	  return;
	}

      /* These values are mildly magic bits of the EH runtime.  We can't
	 see the entire lifetime of these values until landing pads are
	 generated.  */
      if (TREE_CODE (TREE_OPERAND (stmt, 0)) == EXC_PTR_EXPR
	  || TREE_CODE (TREE_OPERAND (stmt, 0)) == FILTER_EXPR)
	{
	  mark_stmt_necessary (stmt, true);
	  return;
	}
      break;

    case GOTO_EXPR:
      mark_stmt_necessary (stmt, true);
      return;

    case COND_EXPR:
      if (GOTO_DESTINATION (COND_EXPR_THEN (stmt))
	  == GOTO_DESTINATION (COND_EXPR_ELSE (stmt)))
	{
	  /* A COND_EXPR is obviously dead if the target labels are the same.
	     We cannot kill the statement at this point, so to prevent the
	     statement from being marked necessary, we replace the condition
	     with a constant.  The stmt is killed later on in cfg_cleanup.  */
	  COND_EXPR_COND (stmt) = integer_zero_node;
	  modify_stmt (stmt);
	  return;
	}
      /* Fall through.  */

    case SWITCH_EXPR:
      if (!aggressive)
	mark_stmt_necessary (stmt, true);
      break;

    default:
      break;
    }

  ann = stmt_ann (stmt);
  /* If the statement has volatile operands, it needs to be preserved.  Same
     for statements that can alter control flow in unpredictable ways.  */
  if (ann->has_volatile_ops
      || is_ctrl_altering_stmt (stmt))
    {
      mark_stmt_necessary (stmt, true);
      return;
    }

  get_stmt_operands (stmt);

  defs = DEF_OPS (ann);
  for (i = 0; i < NUM_DEFS (defs); i++)
    {
      tree def = DEF_OP (defs, i);
      if (need_to_preserve_store (def))
	{
	  mark_stmt_necessary (stmt, true);
	  return;
        }
    }

  vdefs = VDEF_OPS (ann);
  for (i = 0; i < NUM_VDEFS (vdefs); i++)
    {
      tree vdef = VDEF_RESULT (vdefs, i);
      if (need_to_preserve_store (vdef))
	{
	  mark_stmt_necessary (stmt, true);
	  return;
        }
    }

  return;
}

/* Find obviously necessary statements.  These are things like most function
   calls, and stores to file level variables.

   If AGGRESSIVE is false, control statements are conservatively marked as
   necessary.  */

static void
find_obviously_necessary_stmts (bool aggressive)
{
  basic_block bb;
  block_stmt_iterator i;

  FOR_EACH_BB (bb)
    {
      tree phi;

      /* Check any PHI nodes in the block.  */
      for (phi = phi_nodes (bb); phi; phi = TREE_CHAIN (phi))
	{
	  clear_necessary (phi);

	  /* PHIs for virtual variables do not directly affect code
	     generation and need not be considered inherently necessary
	     regardless of the bits set in their decl.

	     Thus, we only need to mark PHIs for real variables which
	     need their result preserved as being inherently necessary.  */
	  if (is_gimple_reg (PHI_RESULT (phi))
	      && need_to_preserve_store (PHI_RESULT (phi)))
	    mark_stmt_necessary (phi, true);
        }

      /* Check all statements in the block.  */
      for (i = bsi_start (bb); !bsi_end_p (i); bsi_next (&i))
	mark_stmt_if_obviously_necessary (bsi_stmt (i), aggressive);
    }
}

/* Make corresponding control dependent edges necessary.  We only
   have to do this once for each basic block, so we clear the bitmap
   after we're done.  */
static void
mark_control_dependent_edges_necessary (basic_block bb, struct edge_list *el)
{
  int edge_number;

  EXECUTE_IF_CONTROL_DEPENDENT (bb->index, edge_number,
    {
      tree t;
      basic_block cd_bb = INDEX_EDGE_PRED_BB (el, edge_number);

      if (TEST_BIT (last_stmt_necessary, cd_bb->index))
	continue;
      SET_BIT (last_stmt_necessary, cd_bb->index);

      t = last_stmt (cd_bb);
      if (is_ctrl_stmt (t))
	mark_stmt_necessary (t, true);
    });
  clear_control_dependence_bitmap (bb);
}

/* Propagate necessity using the operands of necessary statements.  Process
   the uses on each statement in the worklist, and add all feeding statements
   which contribute to the calculation of this value to the worklist.

   In conservative mode, EL is NULL.  */

static void
propagate_necessity (struct edge_list *el)
{
  tree i;
  bool aggressive = (el ? true : false); 

  if (tree_dump_file && (tree_dump_flags & TDF_DETAILS))
    fprintf (tree_dump_file, "\nProcessing worklist:\n");

  while (VARRAY_ACTIVE_SIZE (worklist) > 0)
    {
      /* Take `i' from worklist.  */
      i = VARRAY_TOP_TREE (worklist);
      VARRAY_POP (worklist);

      if (tree_dump_file && (tree_dump_flags & TDF_DETAILS))
	{
	  fprintf (tree_dump_file, "processing: ");
	  print_generic_stmt (tree_dump_file, i, TDF_SLIM);
	  fprintf (tree_dump_file, "\n");
	}

      if (aggressive)
	mark_control_dependent_edges_necessary (bb_for_stmt (i), el);

      if (TREE_CODE (i) == PHI_NODE)
	{
	  /* PHI nodes are somewhat special in that each PHI alternative has
	     data and control dependencies.  All the statements feeding the
	     PHI node's arguments are always necessary.  In aggressive mode,
	     we also consider the control dependent edges leading to the
	     predecessor block associated with each PHI alternative as
	     necessary.  */
	  int k;
	  for (k = 0; k < PHI_NUM_ARGS (i); k++)
            {
	      tree arg = PHI_ARG_DEF (i, k);
	      if (TREE_CODE (arg) == SSA_NAME)
		mark_operand_necessary (arg);
	    }

	  if (aggressive)
	    {
	      for (k = 0; k < PHI_NUM_ARGS (i); k++)
		{
		  basic_block arg_bb = PHI_ARG_EDGE (i, k)->src;
		  mark_control_dependent_edges_necessary (arg_bb, el);
		}
	    }
	}
      else
	{
	  /* Propagate through the operands.  Examine all the USE, VUSE and
	     VDEF operands in this statement.  Mark all the statements which
	     feed this statement's uses as necessary.  */
	  vuse_optype vuses;
	  vdef_optype vdefs;
	  use_optype uses;
	  stmt_ann_t ann;
	  size_t k;

	  get_stmt_operands (i);
	  ann = stmt_ann (i);

	  uses = USE_OPS (ann);
	  for (k = 0; k < NUM_USES (uses); k++)
	    mark_operand_necessary (USE_OP (uses, k));

	  vuses = VUSE_OPS (ann);
	  for (k = 0; k < NUM_VUSES (vuses); k++)
	    mark_operand_necessary (VUSE_OP (vuses, k));

	  /* The operands of VDEF expressions are also needed as they
	     represent potential definitions that may reach this
	     statement (VDEF operands allow us to follow def-def links).  */
	  vdefs = VDEF_OPS (ann);
	  for (k = 0; k < NUM_VDEFS (vdefs); k++)
	    mark_operand_necessary (VDEF_OP (vdefs, k));
	}
    }
}

/* Eliminate unnecessary statements. Any instruction not marked as necessary
   contributes nothing to the program, and can be deleted.  */

static void
eliminate_unnecessary_stmts (void)
{
  basic_block bb;
  block_stmt_iterator i;

  if (tree_dump_file && (tree_dump_flags & TDF_DETAILS))
    fprintf (tree_dump_file, "\nEliminating unnecessary statements:\n");

  clear_special_calls ();
  FOR_EACH_BB (bb)
    {
      /* Remove dead PHI nodes.  */
      remove_dead_phis (bb);

      /* Remove dead statements.  */
      for (i = bsi_start (bb); !bsi_end_p (i) ; )
	{
	  tree t = bsi_stmt (i);

	  stats.total++;

	  /* If `i' is not necessary then remove it.  */
	  if (!necessary_p (t))
	    remove_dead_stmt (&i, bb);
	  else
	    {
	      if (TREE_CODE (t) == CALL_EXPR)
		notice_special_calls (t);
	      else if (TREE_CODE (t) == MODIFY_EXPR
		       && TREE_CODE (TREE_OPERAND (t, 1)) == CALL_EXPR)
		notice_special_calls (TREE_OPERAND (t, 1));
	      bsi_next (&i);
	    }
	}
    }
}

/* Remove dead PHI nodes from block BB.  */

static void
remove_dead_phis (basic_block bb)
{
  tree prev, phi;

  prev = NULL_TREE;
  phi = phi_nodes (bb);
  while (phi)
    {
      stats.total_phis++;

      if (!necessary_p (phi))
	{
	  tree next = TREE_CHAIN (phi);

	  if (tree_dump_file && (tree_dump_flags & TDF_DETAILS))
	    {
	      fprintf (tree_dump_file, "Deleting : ");
	      print_generic_stmt (tree_dump_file, phi, TDF_SLIM);
	      fprintf (tree_dump_file, "\n");
	    }

	  remove_phi_node (phi, prev, bb);
	  stats.removed_phis++;
	  phi = next;
	}
      else
	{
	  prev = phi;
	  phi = TREE_CHAIN (phi);
	}
    }
}

/* Remove dead statement pointed by iterator I.  Receives the basic block BB
   containing I so that we don't have to look it up.  */

static void
remove_dead_stmt (block_stmt_iterator *i, basic_block bb)
{
  tree t = bsi_stmt (*i);

  if (tree_dump_file && (tree_dump_flags & TDF_DETAILS))
    {
      fprintf (tree_dump_file, "Deleting : ");
      print_generic_stmt (tree_dump_file, t, TDF_SLIM);
      fprintf (tree_dump_file, "\n");
    }

  stats.removed++;

  /* If we have determined that a conditional branch statement contributes
     nothing to the program, then we not only remove it, but change the
     flowgraph so that the block points directly to the immediate
     post-dominator.  The flow graph will remove the blocks we are
     circumventing, and this block will then simply fall-thru to the
     post-dominator.  This prevents us from having to add any branch
     instructions to replace the conditional statement.  */
  if (is_ctrl_stmt (t))
    {
      basic_block post_dom_bb;
      edge e;
#ifdef ENABLE_CHECKING
      /* The post dominance info has to be up-to-date.  */
      if (dom_computed[CDI_POST_DOMINATORS] != DOM_OK)
	abort ();
#endif
      /* Get the immediate post dominator of bb.  */
      post_dom_bb = get_immediate_dominator (CDI_POST_DOMINATORS, bb);
      /* Some blocks don't have an immediate post dominator.  This can happen
	 for example with infinite loops.  Removing an infinite loop is an
	 inappropriate transformation anyway...  */
      if (!post_dom_bb)
	{
	  bsi_next (i);
	  return;
	}

      /* Remove all outgoing edges, and add an edge to the post dominator.  */
      for (e = bb->succ; e != NULL;)
	{
	  edge tmp = e;
	  e = e->succ_next;
	  remove_edge (tmp);
	}
      make_edge (bb, post_dom_bb,
		 (post_dom_bb == EXIT_BLOCK_PTR ? 0 : EDGE_FALLTHRU));
    }

  bsi_remove (i);
}

/* Print out removed statement statistics.  */

static void
print_stats (void)
{
  if (tree_dump_file && (tree_dump_flags & (TDF_STATS|TDF_DETAILS)))
    {
      float percg;

      percg = ((float) stats.removed / (float) stats.total) * 100;
      fprintf (tree_dump_file, "Removed %d of %d statements (%d%%)\n",
	       stats.removed, stats.total, (int) percg);

      if (stats.total_phis == 0)
	percg = 0;
      else
	percg = ((float) stats.removed_phis / (float) stats.total_phis) * 100;

      fprintf (tree_dump_file, "Removed %d of %d PHI nodes (%d%%)\n",
	       stats.removed_phis, stats.total_phis, (int) percg);
    }
}

/* Initialization for this pass.  Set up the used data structures.  */

static void
tree_dce_init (bool aggressive)
{
  if (aggressive)
    {
      int i;

      control_dependence_map 
	= xmalloc (last_basic_block * sizeof (bitmap));
      for (i = 0; i < last_basic_block; ++i)
	control_dependence_map[i] = BITMAP_XMALLOC ();

      last_stmt_necessary = sbitmap_alloc (last_basic_block);
      sbitmap_zero (last_stmt_necessary);
    }

  processed = sbitmap_alloc (highest_ssa_version + 1);
  sbitmap_zero (processed);

  VARRAY_TREE_INIT (worklist, 64, "work list");
}

/* Cleanup after this pass.  */

static void
tree_dce_done (bool aggressive)
{
  if (aggressive)
    {
      int i;

      for (i = 0; i < last_basic_block; ++i)
	BITMAP_XFREE (control_dependence_map[i]);
      free (control_dependence_map);

      sbitmap_free (last_stmt_necessary);
    }

  sbitmap_free (processed);
}

/* Main routine to eliminate dead code.

   AGGRESSIVE controls the aggressiveness of the algorithm.
   In conservative mode, we ignore control dependence and simply declare
   all but the most trivially dead branches necessary.  This mode is fast.
   In aggressive mode, control dependences are taken into account, which
   results in more dead code elimination, but at the cost of some time.

   PHASE indicates which dump file from the DUMP_FILES array to use when
   dumping debugging information.

   FIXME: Aggressive mode before PRE doesn't work currently because
	  the dominance info is not invalidated after DCE1.  */

void
tree_ssa_dce (tree fndecl, bool aggressive, enum tree_dump_index phase)
{
  struct edge_list *el = NULL;

  timevar_push (aggressive ? TV_TREE_CD_DCE : TV_TREE_DCE);
  tree_dce_init (aggressive);

  /* Initialize tree_dump_file for debugging dumps.  */
  tree_dump_file = dump_begin (phase, &tree_dump_flags);

  if (aggressive)
    {
      /* Compute control dependence.  */
      timevar_push (TV_CONTROL_DEPENDENCES);
      calculate_dominance_info (CDI_POST_DOMINATORS);
      el = create_edge_list ();
      find_all_control_dependences (el);
      timevar_pop (TV_CONTROL_DEPENDENCES);
    }

  find_obviously_necessary_stmts (aggressive);

  propagate_necessity (el);

  eliminate_unnecessary_stmts ();

  if (aggressive)
    free_dominance_info (CDI_POST_DOMINATORS);

  cleanup_tree_cfg ();

  /* Debugging dumps.  */
  if (tree_dump_file)
    {
      dump_function_to_file (fndecl, tree_dump_file, tree_dump_flags);
      print_stats ();
      dump_end (phase, tree_dump_file);
    }

  tree_dce_done (aggressive);
  timevar_pop (aggressive ? TV_TREE_CD_DCE : TV_TREE_DCE);
}
