[tree-ssa] Dead Store Elimination

law@redhat.com law@redhat.com
Mon Feb 16 14:37:00 GMT 2004



This adds dead store elimination to our list of supported optimizations.

So, given something like this:

foo (z, y, xx)
{ 
  # BLOCK 0
  # PRED: ENTRY [100.0%]  (fallthru)
  #   TMT.0_8 = VDEF <TMT.0_7>;
  *z_2 = 1;
  if (xx_3 != 0) goto <L0>; else goto <L2>;
  # SUCC: 2 [50.0%]  (false) 1 [50.0%]  (true)

  # BLOCK 1
  # PRED: 0 [50.0%]  (true)
<L0>:;
  # SUCC: 2 [100.0%]  (fallthru)

  # BLOCK 2
  # PRED: 0 [50.0%]  (false) 1 [100.0%]  (fallthru)
  # xx_1 = PHI <30(0), 20(1)>;
<L2>:;
  #   TMT.0_9 = VDEF <TMT.0_8>;
  *z_2 = 2;
  #   TMT.0_10 = VDEF <TMT.0_9>;
  *z_2 = 3;
  return xx_1;
  # SUCC: EXIT [100.0%] 

}

We can easily see there are two dead stores to *z.  And gee, the DSE pass
comes along and kills them :-)

foo (z, y, xx)
{
  # BLOCK 0
  # PRED: ENTRY [100.0%]  (fallthru)
  if (xx_3 != 0) goto <L0>; else goto <L2>;
  # SUCC: 2 [50.0%]  (false) 1 [50.0%]  (true)
  
  # BLOCK 1 
  # PRED: 0 [50.0%]  (true)
<L0>:;
  # SUCC: 2 [100.0%]  (fallthru)

  # BLOCK 2 
  # PRED: 0 [50.0%]  (false) 1 [100.0%]  (fallthru)
  # xx_1 = PHI <30(0), 20(1)>;
<L2>:;
  #   TMT.0_10 = VDEF <TMT.0_7>;
  *z_2 = 3;
  return xx_1;
  # SUCC: EXIT [100.0%]
  
} 


DSE works with a walk over the post-dominator tree, walking statements in
reverse order within blocks.  Basically we're looking for pairs of stores
in the walk in which the VDEF_RESULT of the earlier store is used once
and only once in a VDEF_OP of a later store.

There's a few things we do not currently handle optimally, but I believe
can be added within the existing framework:

  1. Optimization of store-load pairs.  The dominator optimizer gets
     _most_ of these already.  There's a class of them it tends to
     miss though and tree-ssa-dse may be the best place to catch them.

  2. Optimization of load-store pairs.

  3. Better ability to see through PHI nodes.  For DSE a PHI node is a
     non-event.  Right now the only time we can see through a PHI is
     when the store has a single VDEF.  For multiple VDEFs attached to
     a store we just need to verify that each VDEF feeds a PHI in the
     same block and that each PHI_RESULT is used once by a later store
     with the same address as the earlier store.



	* Makefile.in (OBJS-common): Add tree-ssa-dse.o
	(tree-ssa-dse.o): Add dependencies.
	* common.opt (ftree-dse): New option.
	* flags.h (flag_tree_dse): New.
	(flag_tree_dom): Fix comments.
	* opts.c (decode_options): Turn on flag_tree_dse.
	(common_handle_option): Handle  OPT_ftree_dse.
	* timevar.def (TV_TREE_PHIOPT): Update text.
	(TV_TREE_DSE): New timevar.
	* toplev.c (flag_tree_dse): New.
	(flag_tree_dom): Fix comments.
	(lang_independent_options): Add -ftree-dse.
	* tree-dfa.c (redirect_immediate_use): New function.
	(redirect_immediate_uses): New function.
	* tree-flow.h (stmt_ann_d): Add UID field.
	(redirect_immediate_uses): Declare.
	* tree-optimize.c (init_tree_optimization_passes): Link in DSE pass.
	* tree-pass.h (pass_dse): Declare.
	* tree-ssa-dse.c: New file implementing DSE.
	* doc/invoke.texi: Document new option.


Index: Makefile.in
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Makefile.in,v
retrieving revision 1.903.2.180
diff -c -p -r1.903.2.180 Makefile.in
*** Makefile.in	16 Feb 2004 12:35:39 -0000	1.903.2.180
--- Makefile.in	16 Feb 2004 13:55:18 -0000
*************** OBJS-common = \
*** 867,873 ****
   tree-alias-type.o gimplify.o tree-pretty-print.o                          \
   tree-alias-common.o tree-ssa-ccp.o tree-browser.o @ANDER@ tree-ssa-dce.o  \
   tree-ssa-pre.o tree-ssa-live.o tree-ssa-operands.o tree-ssa-alias.o       \
!  tree-ssa-phiopt.o tree-ssa-forwprop.o tree-nested.o			   \
   tree-ssa-dom.o domwalk.o tree-tailcall.o gimple-low.o tree-iterator.o	   \
   tree-phinodes.o tree-ssanames.o tree-sra.o tree-complex.o tree-ssa-loop.o \
   alias.o bb-reorder.o bitmap.o builtins.o caller-save.o calls.o	  	   \
--- 867,873 ----
   tree-alias-type.o gimplify.o tree-pretty-print.o                          \
   tree-alias-common.o tree-ssa-ccp.o tree-browser.o @ANDER@ tree-ssa-dce.o  \
   tree-ssa-pre.o tree-ssa-live.o tree-ssa-operands.o tree-ssa-alias.o       \
!  tree-ssa-phiopt.o tree-ssa-forwprop.o tree-nested.o tree-ssa-dse.o	   \
   tree-ssa-dom.o domwalk.o tree-tailcall.o gimple-low.o tree-iterator.o	   \
   tree-phinodes.o tree-ssanames.o tree-sra.o tree-complex.o tree-ssa-loop.o \
   alias.o bb-reorder.o bitmap.o builtins.o caller-save.o calls.o	  	   \
*************** tree-ssa.o : tree-ssa.c $(TREE_FLOW_H) $
*** 1553,1558 ****
--- 1553,1561 ----
     errors.h toplev.h function.h $(TIMEVAR_H) tree-alias-common.h \
     $(TM_H) coretypes.h $(TREE_DUMP_H) tree-ssa-live.h langhooks.h \
     cfgloop.h domwalk.h tree-pass.h
+ tree-ssa-dse.o : tree-ssa-dse.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
+    $(TM_H) errors.h $(GGC_H) $(TREE_H) $(RTL_H) $(TM_P_H) $(BASIC_BLOCK_H) \
+    $(TREE_FLOW_H) tree-pass.h $(TREE_DUMP_H) domwalk.h flags.h
  tree-ssa-forwprop.o : tree-ssa-phiopt.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
     $(TM_H) errors.h $(GGC_H) $(TREE_H) $(RTL_H) $(TM_P_H) $(BASIC_BLOCK_H) \
     $(TREE_FLOW_H) tree-pass.h $(TREE_DUMP_H)
Index: common.opt
===================================================================
RCS file: /cvs/gcc/gcc/gcc/common.opt,v
retrieving revision 1.14.2.20
diff -c -p -r1.14.2.20 common.opt
*** common.opt	13 Feb 2004 13:11:06 -0000	1.14.2.20
--- common.opt	16 Feb 2004 13:55:19 -0000
*************** ftree-dominator-opts
*** 727,732 ****
--- 727,736 ----
  Common
  Enable dominator optimizations
  
+ ftree-dse
+ Common
+ Enable dead store elimination
+ 
  ftree-loop-optimize
  Common
  Enable loop optimizations on trees
Index: flags.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/flags.h,v
retrieving revision 1.86.2.46
diff -c -p -r1.86.2.46 flags.h
*** flags.h	13 Feb 2004 13:11:25 -0000	1.86.2.46
--- flags.h	16 Feb 2004 13:55:20 -0000
*************** extern int flag_tree_combine_temps;
*** 730,737 ****
  /* Enable SSA->normal pass expression replacement.  */
  extern int flag_tree_ter;
  
! /* Enable dominator optimizations while re-writing into SSA form.  */
  extern int flag_tree_dom;
  
  /* Enable loop optimization on tree-ssa.  */
  extern int flag_tree_loop;
--- 730,740 ----
  /* Enable SSA->normal pass expression replacement.  */
  extern int flag_tree_ter;
  
! /* Enable dominator optimizations.  */
  extern int flag_tree_dom;
+ 
+ /* Enable dead store and redundant load elimination */
+ extern int flag_tree_dse;
  
  /* Enable loop optimization on tree-ssa.  */
  extern int flag_tree_loop;
Index: opts.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/opts.c,v
retrieving revision 1.31.2.28
diff -c -p -r1.31.2.28 opts.c
*** opts.c	13 Feb 2004 13:11:40 -0000	1.31.2.28
--- opts.c	16 Feb 2004 13:55:23 -0000
*************** decode_options (unsigned int argc, const
*** 542,547 ****
--- 542,548 ----
        flag_tree_ccp = 1;
        flag_tree_dce = 1;
        flag_tree_dom = 1;
+       flag_tree_dse = 1;
        flag_tree_loop = 0;
        flag_tree_pre = 1;
        flag_tree_ter = 1;
*************** common_handle_option (size_t scode, cons
*** 1458,1463 ****
--- 1459,1468 ----
  
      case OPT_ftree_dominator_opts:
        flag_tree_dom = value;
+       break;
+ 
+     case OPT_ftree_dse:
+       flag_tree_dse = value;
        break;
  
      case OPT_ftree_loop_optimize:
Index: timevar.def
===================================================================
RCS file: /cvs/gcc/gcc/gcc/timevar.def,v
retrieving revision 1.14.2.35
diff -c -p -r1.14.2.35 timevar.def
*** timevar.def	13 Feb 2004 13:11:51 -0000	1.14.2.35
--- timevar.def	16 Feb 2004 13:55:23 -0000
*************** DEFTIMEVAR (TV_TREE_SRA              , "
*** 75,84 ****
  DEFTIMEVAR (TV_TREE_CCP		     , "tree CCP")
  DEFTIMEVAR (TV_TREE_SPLIT_EDGES      , "tree split crit edges")
  DEFTIMEVAR (TV_TREE_PRE		     , "tree PRE")
! DEFTIMEVAR (TV_TREE_PHIOPT	     , "tree optimize phi nodes ")
  DEFTIMEVAR (TV_TREE_FORWPROP	     , "tree forward propagate")
  DEFTIMEVAR (TV_TREE_DCE		     , "tree conservative DCE")
  DEFTIMEVAR (TV_TREE_CD_DCE	     , "tree aggressive DCE")
  DEFTIMEVAR (TV_TREE_LOOP	     , "tree loop optimization")
  DEFTIMEVAR (TV_TREE_SSA_TO_NORMAL    , "tree SSA to normal")
  DEFTIMEVAR (TV_TREE_SSA_VERIFY       , "tree SSA verifier")
--- 75,85 ----
  DEFTIMEVAR (TV_TREE_CCP		     , "tree CCP")
  DEFTIMEVAR (TV_TREE_SPLIT_EDGES      , "tree split crit edges")
  DEFTIMEVAR (TV_TREE_PRE		     , "tree PRE")
! DEFTIMEVAR (TV_TREE_PHIOPT	     , "tree linearize phis")
  DEFTIMEVAR (TV_TREE_FORWPROP	     , "tree forward propagate")
  DEFTIMEVAR (TV_TREE_DCE		     , "tree conservative DCE")
  DEFTIMEVAR (TV_TREE_CD_DCE	     , "tree aggressive DCE")
+ DEFTIMEVAR (TV_TREE_DSE		     , "tree DSE")
  DEFTIMEVAR (TV_TREE_LOOP	     , "tree loop optimization")
  DEFTIMEVAR (TV_TREE_SSA_TO_NORMAL    , "tree SSA to normal")
  DEFTIMEVAR (TV_TREE_SSA_VERIFY       , "tree SSA verifier")
Index: toplev.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/toplev.c,v
retrieving revision 1.654.2.97
diff -c -p -r1.654.2.97 toplev.c
*** toplev.c	13 Feb 2004 13:11:52 -0000	1.654.2.97
--- toplev.c	16 Feb 2004 13:55:29 -0000
*************** int flag_tree_combine_temps = 0;
*** 995,1003 ****
  /* Enable SSA->normal pass expression replacement.  */
  int flag_tree_ter = 0;
  
! /* Enable dominator optimizations while re-writing into SSA form.  */
  int flag_tree_dom = 0;
  
  /* Nonzero if we perform superblock formation.  */
  int flag_tracer = 0;
  
--- 995,1006 ----
  /* Enable SSA->normal pass expression replacement.  */
  int flag_tree_ter = 0;
  
! /* Enable dominator optimizations.  */
  int flag_tree_dom = 0;
  
+ /* Enable dead store elimination.  */
+ int flag_tree_dse = 0;
+ 
  /* Nonzero if we perform superblock formation.  */
  int flag_tracer = 0;
  
*************** static const lang_independent_options f_
*** 1200,1205 ****
--- 1203,1209 ----
    { "tree-ccp", &flag_tree_ccp, 1 },
    { "tree-dce", &flag_tree_dce, 1 },
    { "tree-dominator-opts", &flag_tree_dom, 1 },
+   { "tree-dse", &flag_tree_dse, 1 },
    { "tree-combine-temps", &flag_tree_combine_temps, 1 },
    { "tree-ter", &flag_tree_ter, 1 },
    { "tree-loop-optimize", &flag_tree_loop, 1 }
Index: tree-dfa.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Attic/tree-dfa.c,v
retrieving revision 1.1.4.217
diff -c -p -r1.1.4.217 tree-dfa.c
*** tree-dfa.c	10 Feb 2004 18:24:15 -0000	1.1.4.217
--- tree-dfa.c	16 Feb 2004 13:55:31 -0000
*************** add_immediate_use (tree stmt, tree use_s
*** 361,366 ****
--- 361,411 ----
    VARRAY_PUSH_TREE (ann->df->immediate_uses, use_stmt);
  }
  
+ /* If the any immediate use of USE points to OLD, then redirect it to NEW.  
*/
+  
+ static void
+ redirect_immediate_use (tree use, tree old, tree new)
+ {
+   tree imm_stmt = SSA_NAME_DEF_STMT (use);
+   struct dataflow_d *df = get_stmt_ann (imm_stmt)->df;
+   unsigned int num_uses = num_immediate_uses (df);
+   unsigned int i;
+ 
+   for (i = 0; i < num_uses; i++)
+     {
+       if (immediate_use (df, i) == old)
+ 	{
+ 	  if (i == 0 || i == 1)
+ 	    df->uses[i] = new;
+ 	  else
+ 	    VARRAY_TREE (df->immediate_uses, i - 2) = new;
+ 	}
+     }
+ }
+ 
+ /* Redirect all immediate uses for operands in OLD so that they point
+    to NEW.  This routine should have no knowledge of how immediate
+    uses are stored.  */
+ 
+ void
+ redirect_immediate_uses (tree old, tree new)
+ {
+   stmt_ann_t ann = get_stmt_ann (old);
+   use_optype uses = USE_OPS (ann);
+   vuse_optype vuses = VUSE_OPS (ann);
+   vdef_optype vdefs = VDEF_OPS (ann);
+   unsigned int i;
+ 
+   /* Look at USE_OPS or VUSE_OPS according to FLAGS.  */
+   for (i = 0; i < NUM_USES (uses); i++)
+     redirect_immediate_use (USE_OP (uses, i), old, new); 
+ 
+   for (i = 0; i < NUM_VUSES (vuses); i++)
+     redirect_immediate_use (VUSE_OP (vuses, i), old, new);
+ 
+   for (i = 0; i < NUM_VDEFS (vdefs); i++)
+     redirect_immediate_use (VDEF_OP (vdefs, i), old, new);
+ }
  
  /*---------------------------------------------------------------------------
  			    Manage annotations
Index: tree-flow.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Attic/tree-flow.h,v
retrieving revision 1.1.4.189
diff -c -p -r1.1.4.189 tree-flow.h
*** tree-flow.h	16 Feb 2004 12:35:58 -0000	1.1.4.189
--- tree-flow.h	16 Feb 2004 13:55:33 -0000
*************** struct stmt_ann_d GTY(())
*** 246,251 ****
--- 246,256 ----
  
    /* Set of variables that have had their address taken in the statement.  */
    bitmap addresses_taken;
+ 
+   /* Unique identifier for this statement.  These ID's are to be created
+      by each pass on an as-needed basis in any order convenient for the
+      pass which needs statement UIDs.  */
+   unsigned int uid;
  };
  
  
*************** extern tree get_virtual_var (tree);
*** 514,519 ****
--- 519,525 ----
  extern void add_referenced_tmp_var (tree var);
  extern void mark_new_vars_to_rename (tree, bitmap);
  extern void discover_nonconstant_array_refs (void);
+ extern void redirect_immediate_uses (tree, tree);
  
  /* Flags used when computing reaching definitions and reached uses.  */
  #define TDFA_USE_OPS		1 << 0
Index: tree-optimize.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-optimize.c,v
retrieving revision 1.1.4.123
diff -c -p -r1.1.4.123 tree-optimize.c
*** tree-optimize.c	16 Feb 2004 12:36:00 -0000	1.1.4.123
--- tree-optimize.c	16 Feb 2004 13:55:34 -0000
*************** init_tree_optimization_passes (void)
*** 295,300 ****
--- 295,301 ----
    NEXT_PASS (DUP_PASS (pass_dominator));
    NEXT_PASS (DUP_PASS (pass_redundant_phi));
    NEXT_PASS (DUP_PASS (pass_dce));
+   NEXT_PASS (pass_dse);
    NEXT_PASS (DUP_PASS (pass_forwprop));
    NEXT_PASS (DUP_PASS (pass_phiopt));
    NEXT_PASS (pass_tail_recursion);
*************** init_tree_optimization_passes (void)
*** 307,312 ****
--- 308,314 ----
    NEXT_PASS (DUP_PASS (pass_dominator));
    NEXT_PASS (DUP_PASS (pass_redundant_phi));
    NEXT_PASS (pass_cd_dce);
+   NEXT_PASS (DUP_PASS (pass_dse));
    NEXT_PASS (DUP_PASS (pass_forwprop));
    NEXT_PASS (DUP_PASS (pass_phiopt));
    NEXT_PASS (pass_tail_calls);
Index: tree-pass.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Attic/tree-pass.h,v
retrieving revision 1.1.2.12
diff -c -p -r1.1.2.12 tree-pass.h
*** tree-pass.h	13 Feb 2004 11:15:57 -0000	1.1.2.12
--- tree-pass.h	16 Feb 2004 13:55:34 -0000
*************** extern struct tree_opt_pass pass_warn_fu
*** 120,125 ****
--- 120,126 ----
  extern struct tree_opt_pass pass_phiopt;
  extern struct tree_opt_pass pass_forwprop;
  extern struct tree_opt_pass pass_redundant_phi;
+ extern struct tree_opt_pass pass_dse;
  
  
  #endif /* GCC_TREE_PASS_H */
Index: tree-ssa-dse.c
===================================================================
RCS file: tree-ssa-dse.c
diff -N tree-ssa-dse.c
*** /dev/null	1 Jan 1970 00:00:00 -0000
--- tree-ssa-dse.c	16 Feb 2004 13:55:35 -0000
***************
*** 0 ****
--- 1,407 ----
+ /* Dead store elimination
+    Copyright (C) 2004 Free Software Foundation, Inc.
+ 
+ This file is part of GCC.
+ 
+ GCC is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 2, or (at your option)
+ any later version.
+ 
+ GCC is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ GNU General Public License for more details.
+ 
+ You should have received a copy of the GNU General Public License
+ along with GCC; see the file COPYING.  If not, write to
+ the Free Software Foundation, 59 Temple Place - Suite 330,
+ Boston, MA 02111-1307, USA.  */
+ 
+ #include "config.h"
+ #include "system.h"
+ #include "coretypes.h"
+ #include "tm.h"
+ #include "errors.h"
+ #include "ggc.h"
+ #include "tree.h"
+ #include "rtl.h"
+ #include "tm_p.h"
+ #include "basic-block.h"
+ #include "timevar.h"
+ #include "diagnostic.h"
+ #include "tree-flow.h"
+ #include "tree-pass.h"
+ #include "tree-dump.h"
+ #include "domwalk.h"
+ #include "flags.h"
+ 
+ /* This is the global bitmap for store statements.
+ 
+    Each statement has a unique ID.  When we encounter a store statement
+    that we want to record, set the bit corresponding to the statement's
+    unique ID in this bitmap.  */
+ struct dse_global_data
+ {
+   bitmap stores;
+ };
+ 
+ /* We allocate a bitmap-per-block for stores which are encountered
+    during the scan of that block.  This allows us to restore the 
+    global bitmap of stores when we finish processing a block.  */
+ struct dse_block_local_data
+ {
+   bitmap stores;
+ };
+ 
+ static bool gate_dse (void);
+ static void tree_ssa_dse (void);
+ static void dse_initialize_block_local_data (struct dom_walk_data *,
+ 					     basic_block,
+ 					     bool);
+ static void dse_optimize_stmt (struct dom_walk_data *,
+ 			       basic_block,
+ 			       block_stmt_iterator);
+ static void dse_record_phis (struct dom_walk_data *, basic_block);
+ static void dse_finalize_block (struct dom_walk_data *, basic_block);
+ static void fix_phi_uses (tree, tree);
+ static void fix_stmt_vdefs (tree, tree);
+ static void record_voperand_set (bitmap, bitmap *, unsigned int);
+ 
+ /* Function indicating whether we ought to include information for 'var'
+    when calculating immediate uses.  For this pass we only want use
+    information for virtual variables.  */
+                                                                              

+ static bool
+ need_imm_uses_for (tree var)
+ {
+   return !is_gimple_reg (var);
+ }
+                                                                              

+ 
+ /* Replace uses in PHI which match VDEF_RESULTs in STMT with the 
+    corresponding VDEF_OP in STMT.  */
+ 
+ static void
+ fix_phi_uses (tree phi, tree stmt)
+ {
+   stmt_ann_t ann = stmt_ann (stmt);
+   vdef_optype vdefs;
+   unsigned int i;
+   int j;
+ 
+   get_stmt_operands (stmt);
+   vdefs = VDEF_OPS (ann);
+ 
+   /* Walk each VDEF in STMT.  */
+   for (i = 0; i < NUM_VDEFS (vdefs); i++)
+     {
+       tree vdef = VDEF_RESULT (vdefs, i);
+ 
+       /* Find any uses in the PHI which match VDEF and replace
+ 	 them with the appropriate VDEF_OP.  */
+       for (j = 0; j < PHI_NUM_ARGS (phi); j++)
+ 	if (vdef == PHI_ARG_DEF (phi, j))
+ 	  PHI_ARG_DEF (phi, j) = VDEF_OP (vdefs, i);
+     }
+ }
+ 
+ /* Replace the VDEF_OPs in STMT1 which match VDEF_RESULTs in STMT2 with
+    the appropriate VDEF_OPs from STMT2.  */
+ 
+ static void
+ fix_stmt_vdefs (tree stmt1, tree stmt2)
+ {
+   stmt_ann_t ann1 = stmt_ann (stmt1);
+   stmt_ann_t ann2 = stmt_ann (stmt2);
+   vdef_optype vdefs1;
+   vdef_optype vdefs2;
+   unsigned int i, j;
+ 
+   get_stmt_operands (stmt1);
+   get_stmt_operands (stmt2);
+   vdefs1 = VDEF_OPS (ann1);
+   vdefs2 = VDEF_OPS (ann2);
+ 
+   /* Walk each VDEF_OP in stmt1.  */
+   for (i = 0; i < NUM_VDEFS (vdefs1); i++)
+     {
+       tree vdef1 = VDEF_OP (vdefs1, i);
+ 
+       /* Find the appropriate VDEF_RESULT in STMT2.  */
+       for (j = 0; j < NUM_VDEFS (vdefs2); j++)
+ 	{
+ 	  if (vdef1 == VDEF_RESULT (vdefs2, j))
+ 	    {
+ 	      /* Update.  */
+ 	      *VDEF_OP_PTR (vdefs1, i) = VDEF_OP (vdefs2, j);
+ 	      break;
+ 	    }
+ 	}
+ 
+ #ifdef ENABLE_CHECKING
+       /* If we did not find a corresponding VDEF_RESULT, then something
+ 	 has gone terribly wrong.  */
+       if (j == NUM_VDEFS (vdefs2))
+ 	abort ();
+ #endif
+ 
+     }
+ }
+ 
+ 
+ /* Set bit UID in bitmaps GLOBAL and *LOCAL, creating *LOCAL as needed.  */
+ static void
+ record_voperand_set (bitmap global, bitmap *local, unsigned int uid)
+ {
+   /* Lazily allocate the bitmap.  Note that we do not get a notification
+      when the block local data structures die, so we allocate the local
+      bitmap backed by the GC system.  */
+   if (*local == NULL)
+     *local = BITMAP_GGC_ALLOC ();
+ 
+   /* Set the bit in the local and global bitmaps.  */
+   bitmap_set_bit (*local, uid);
+   bitmap_set_bit (global, uid);
+ }
+ /* Initialize block local data structures.  */
+ 
+ static void
+ dse_initialize_block_local_data (struct dom_walk_data *walk_data,
+ 				 basic_block bb ATTRIBUTE_UNUSED,
+ 				 bool recycled)
+ {
+   struct dse_block_local_data *bd
+     = VARRAY_TOP_GENERIC_PTR (walk_data->block_data_stack);
+ 
+   /* If we are given a recycled block local data structure, ensure any
+      bitmap associated with the block is cleared.  */
+   if (recycled)
+     {
+       if (bd->stores)
+ 	bitmap_clear (bd->stores);
+     }
+ }
+ 
+ /* Attempt to eliminate dead stores in the statement referenced by BSI.
+ 
+    A dead store is a store into a memory location which will later be
+    overwritten by another store without any intervening loads.  In this
+    case the earlier store can be deleted.
+ 
+    In our SSA + virtual operand world we use immediate uses of virtual
+    operands to detect dead stores.  If a store's virtual definition
+    is used precisely once by a later store to the same location which
+    post dominates the first store, then the first store is dead.  */
+ 
+ static void
+ dse_optimize_stmt (struct dom_walk_data *walk_data,
+ 		   basic_block bb ATTRIBUTE_UNUSED,
+ 		   block_stmt_iterator bsi)
+ {
+   struct dse_block_local_data *bd
+     = VARRAY_TOP_GENERIC_PTR (walk_data->block_data_stack);
+   struct dse_global_data *dse_gd = walk_data->global_data;
+   tree stmt = bsi_stmt (bsi);
+   stmt_ann_t ann = stmt_ann (stmt);
+   vdef_optype vdefs;
+ 
+   get_stmt_operands (stmt);
+   vdefs = VDEF_OPS (ann);
+ 
+   /* If this statement has no virtual uses, then there is nothing
+      to do.  */
+   if (NUM_VDEFS (vdefs) == 0)
+     return;
+ 
+   /* We know we have virtual definitions.  If this is a MODIFY_EXPR, then
+      record it into our table.  */
+   if (TREE_CODE (stmt) == MODIFY_EXPR
+       && TREE_CODE (TREE_OPERAND (stmt, 1)) != CALL_EXPR)
+     {
+       dataflow_t df = get_immediate_uses (stmt);
+       unsigned int num_uses = num_immediate_uses (df);
+       tree use;
+       tree skipped_phi;
+ 
+ 
+       /* If there are no uses then there is nothing left to do.  */
+       if (num_uses == 0)
+ 	{
+ 	  record_voperand_set (dse_gd->stores, &bd->stores, ann->uid);
+ 	  return;
+ 	}
+ 
+       use = immediate_use (df, 0);
+       skipped_phi = NULL;
+ 
+       /* Skip through any PHI nodes we have already seen if the PHI
+ 	 represents the only use of this store.
+ 
+ 	 Note this does not handle the case where the store has
+ 	 multiple VDEFs which all reach a set of PHI nodes in the
+ 	 same block.  */
+       while (num_uses == 1
+ 	     && TREE_CODE (use) == PHI_NODE
+ 	     && bitmap_bit_p (dse_gd->stores, stmt_ann (use)->uid))
+ 	{
+ 	  /* Record the first PHI we skip so that we can fix its
+ 	     uses if we find that STMT is a dead store.  */
+ 	  if (!skipped_phi)
+ 	    skipped_phi = use;
+ 
+ 	  /* Skip past this PHI and loop again in case we had a PHI
+ 	     chain.  */
+ 	  df = get_immediate_uses (use);
+ 	  num_uses = num_immediate_uses (df);
+ 	  use = immediate_use (df, 0);
+ 	}
+ 
+       /* If we have precisely one immediate use at this point, then we may
+ 	 have found redundant store.  */
+       if (num_uses == 1
+ 	  && bitmap_bit_p (dse_gd->stores, stmt_ann (use)->uid)
+ 	  && operand_equal_p (TREE_OPERAND (stmt, 0),
+ 			      TREE_OPERAND (use, 0), 0))
+ 	{
+ 	  /* We need to fix the operands if either the first PHI we
+ 	     skipped, or the store which we are not deleting if we did
+ 	     not skip any PHIs.  */
+ 	  if (skipped_phi)
+ 	    fix_phi_uses (skipped_phi, stmt);
+ 	  else
+ 	    fix_stmt_vdefs (use, stmt);
+ 
+ 	  /* Any immediate uses which reference STMT need to instead
+ 	     reference USE.  This allows us to cascade dead stores.  */
+ 	  redirect_immediate_uses (stmt, use);
+ 
+ 	  /* Finally remove the dead store.  */
+ 	  bsi_remove (&bsi);
+ 	}
+ 
+       record_voperand_set (dse_gd->stores, &bd->stores, ann->uid);
+     }
+ }
+ 
+ /* Record that we have seen the PHIs at the start of BB which correspond
+    to virtual operands.  */
+ static void
+ dse_record_phis (struct dom_walk_data *walk_data, basic_block bb)
+ {
+   struct dse_block_local_data *bd
+     = VARRAY_TOP_GENERIC_PTR (walk_data->block_data_stack);
+   struct dse_global_data *dse_gd = walk_data->global_data;
+   tree phi;
+ 
+   for (phi = phi_nodes (bb); phi; phi = TREE_CHAIN (phi))
+     if (need_imm_uses_for (PHI_RESULT (phi)))
+       record_voperand_set (dse_gd->stores,
+ 			   &bd->stores,
+ 			   get_stmt_ann (phi)->uid);
+ }
+ 
+ static void
+ dse_finalize_block (struct dom_walk_data *walk_data,
+ 		    basic_block bb ATTRIBUTE_UNUSED)
+ {
+   struct dse_block_local_data *bd
+     = VARRAY_TOP_GENERIC_PTR (walk_data->block_data_stack);
+   struct dse_global_data *dse_gd = walk_data->global_data;
+   bitmap stores = dse_gd->stores;
+   unsigned int i;
+ 
+   /* Unwind the stores noted in this basic block.  */
+   if (bd->stores)
+     EXECUTE_IF_SET_IN_BITMAP (bd->stores, 0, i, bitmap_clear_bit (stores, 
i););
+ }
+ 
+ static void
+ tree_ssa_dse (void)
+ {
+   struct dom_walk_data walk_data;
+   struct dse_global_data dse_gd;
+   unsigned int uid = 0;
+   basic_block bb;
+ 
+   /* Create a UID for each statement in the function.  Ordering of the
+      UIDs is not important for this pass.  */
+   FOR_EACH_BB (bb)
+     {
+       block_stmt_iterator bsi;
+       tree phi;
+ 
+       for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
+ 	stmt_ann (bsi_stmt (bsi))->uid = uid++;
+ 
+       for (phi = phi_nodes (bb); phi; phi = TREE_CHAIN (phi))
+ 	stmt_ann (phi)->uid = uid++;
+     }
+ 
+   /* We might consider making this a property of each pass so that it
+      can be [re]computed on an as-needed basis.  Particularly since
+      this pass could be seen as an extension of DCE which needs post
+      dominators.  */
+   calculate_dominance_info (CDI_POST_DOMINATORS);
+ 
+   /* We also need immediate use information for virtual operands.  */
+   compute_immediate_uses (TDFA_USE_VOPS, need_imm_uses_for);
+ 
+   /* Dead store elimination is fundamentally a walk of the post-dominator
+      tree and a backwards walk of statements within each block.  */
+   walk_data.walk_stmts_backward = true;
+   walk_data.dom_direction = CDI_POST_DOMINATORS;
+   walk_data.initialize_block_local_data = dse_initialize_block_local_data;
+   walk_data.before_dom_children_before_stmts = NULL;
+   walk_data.before_dom_children_walk_stmts = dse_optimize_stmt;
+   walk_data.before_dom_children_after_stmts = dse_record_phis;
+   walk_data.after_dom_children_before_stmts = NULL;
+   walk_data.after_dom_children_walk_stmts = NULL;
+   walk_data.after_dom_children_after_stmts = dse_finalize_block;
+ 
+   walk_data.block_local_data_size = sizeof (struct dse_block_local_data);
+ 
+   /* This is the main hash table for the dead store elimination pass.  */
+   dse_gd.stores = BITMAP_XMALLOC ();
+   walk_data.global_data = &dse_gd;
+ 
+   /* Initialize the dominator walker.  */
+   init_walk_dominator_tree (&walk_data);
+ 
+   /* Recursively walk the dominator tree.  */
+   walk_dominator_tree (&walk_data, EXIT_BLOCK_PTR);
+ 
+   /* Finalize the dominator walker.  */
+   fini_walk_dominator_tree (&walk_data);
+ 
+   /* Release the main bitmap.  */
+   BITMAP_XFREE (dse_gd.stores);
+ 
+   /* Free dataflow information.  It's probably out of date now anyway.  */
+   free_df ();
+ 
+   /* For now, just wipe the post-dominator information.  */
+   free_dominance_info (CDI_POST_DOMINATORS);
+ }
+ 
+ static bool
+ gate_dse (void)
+ {
+   return flag_tree_dse != 0;
+ }
+ 
+ struct tree_opt_pass pass_dse = {
+   "dse",			/* name */
+   gate_dse,			/* gate */
+   tree_ssa_dse,			/* execute */
+   NULL,				/* sub */
+   NULL,				/* next */
+   0,				/* static_pass_number */
+   TV_TREE_DSE,			/* tv_id */
+   PROP_cfg | PROP_ssa,		/* properties_required */
+   0,				/* properties_provided */
+   0,				/* properties_destroyed */
+   0,				/* todo_flags_start */
+   TODO_dump_func | TODO_ggc_collect	/* todo_flags_finish */
+   | TODO_verify_ssa
+ };
Index: doc/invoke.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/invoke.texi,v
retrieving revision 1.152.2.73
diff -c -p -r1.152.2.73 invoke.texi
*** doc/invoke.texi	13 Feb 2004 13:13:41 -0000	1.152.2.73
--- doc/invoke.texi	16 Feb 2004 13:56:11 -0000
*************** in the following sections.
*** 254,259 ****
--- 254,260 ----
  -fdump-tree-ccp@r{[}-@var{n}@r{]} -fdump-tree-dce@r{[}-@var{n}@r{]} @gol
  -fdump-tree-gimple@r{[}-raw@r{]} -fdump-tree-mudflap@r{[}-@var{n}@r{]} @gol
  -fdump-tree-dom@r{[}-@var{n}@r{]} @gol
+ -fdump-tree-dse@r{[}-@var{n}@r{]} @gol
  -fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol
  -fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol
  -fdump-tree-loop@r{[}-@var{n}@r{]} -fdump-tree-sra@r{[}-@var{n}@r{]} @gol
*************** in the following sections.
*** 307,313 ****
  -funroll-all-loops  -funroll-loops  -fpeel-loops @gol
  -funswitch-loops  -fold-unroll-loops  -fold-unroll-all-loops @gol
  -ftree-pre  -ftree-ccp  -ftree-dce  -ftree-copyprop  @gol
! -ftree-dominator-opts @gol
  -ftree-loop-optimize -ftree-sra @gol
  --param @var{name}=@var{value}
  -O  -O0  -O1  -O2  -O3  -Os}
--- 308,314 ----
  -funroll-all-loops  -funroll-loops  -fpeel-loops @gol
  -funswitch-loops  -fold-unroll-loops  -fold-unroll-all-loops @gol
  -ftree-pre  -ftree-ccp  -ftree-dce  -ftree-copyprop  @gol
! -ftree-dominator-opts -ftree-dse @gol
  -ftree-loop-optimize -ftree-sra @gol
  --param @var{name}=@var{value}
  -O  -O0  -O1  -O2  -O3  -Os}
*************** file name is made by appending @file{.sr
*** 3520,3525 ****
--- 3521,3531 ----
  @opindex fdump-tree-dom
  Dump each function after applying dominator tree optimizations.  The file
  name is made by appending @file{.dom} to the source file name.
+ 
+ @item dse
+ @opindex fdump-tree-dse
+ Dump each function after applying dead store elimination.  The file
+ name is made by appending @file{.dse} to the source file name.
  
  @item phiopt
  @opindex fdump-tree-phiopt





More information about the Gcc-patches mailing list