[PATCH] [PR tree-optimization/69347] Fix memory consumption in threader & minor speed improvement

Jeff Law law@redhat.com
Thu Jan 21 22:22:00 GMT 2016



BZ69347 shows excessive memory consumption in the FSM threading code. 
The underlying problem has been there since the introduction of the FSM 
threader, but the increased reliance on the FSM threader has brought the 
issue to the forefront of things we need to look at.

Essentially the FSM bits, in two places, allocate vectors large enough 
for every basic block in the current function and does so each call into 
the FSM threader.  The vectors in question are filled with safe_push 
style operations, so it's safe to initialize them to any constant size. 
  I selected "10" entries.  Doing that brings the memory consumption 
*way* down.

Second, I noticed that record_temporary_equivalences has gotten 
considerably more expensive with its immediate-use walking.  I'm 
pondering how to bring that cost down, but in the mean time we can 
simply remove roughly 1/3 of the calls into the routine that were 
totally redundant.

We would call it from dom_opt_walker::thread_across_edge to record 
temporary equivalences related to the given edge.  Then we'd call 
::thread_across_edge from tree-ssa-threadedge.c, which in turn calls 
thread_through_normal_block which calls record_temporary_equivalences 
again on the same edge.

Bootstrapped and regression tested on x86_64.  Also verified no code 
generation differences on my bucket of .i files.  Installed on the trunk.

Jeff
-------------- next part --------------
commit 3ef2e1cc4f4c6904642891a4f779230737d50613
Author: law <law@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Thu Jan 21 22:21:55 2016 +0000

    [PATCH] [PR tree-optimization/69347] Fix memory consumption in threader & minor speed improvement
    
    	PR middle-end/69347
    	* tree-ssa-dom.c (dom_opt_dom_walker::thread_across_edge): Avoid
    	useless call to record_temporary_equivalences.
    	* tree-ssa-threadbackward.c (find_jump_threads_backwards): Just
    	allocate 10 slots in the bb_path vector and let it grow as needed.
    	(fsm_find_control_statement_thread_paths): Similarly for the next_path
    	vector.
    
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@232711 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 5f0d7c0..c3908ea 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,13 @@
+2016-01-21  Jeff Law  <law@redhat.com>
+
+	PR middle-end/69347
+	* tree-ssa-dom.c (dom_opt_dom_walker::thread_across_edge): Avoid
+	useless call to record_temporary_equivalences.
+	* tree-ssa-threadbackward.c (find_jump_threads_backwards): Just
+	allocate 10 slots in the bb_path vector and let it grow as needed.
+	(fsm_find_control_statement_thread_paths): Similarly for the next_path
+	vector.
+
 2016-01-21  David Edelsohn  <dje.gcc@gmail.com>
 
 	* configure.ac (gcc_cv_as_powerpc_mfcrf, gcc_cv_as_machine_directive):
diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index 3eeaa9c..84c9a6a 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -935,9 +935,6 @@ dom_opt_dom_walker::thread_across_edge (edge e)
   m_avail_exprs_stack->push_marker ();
   m_const_and_copies->push_marker ();
 
-  /* Traversing E may result in equivalences we can utilize.  */
-  record_temporary_equivalences (e, m_const_and_copies, m_avail_exprs_stack);
-
   /* With all the edge equivalences in the tables, go ahead and attempt
      to thread through E->dest.  */
   ::thread_across_edge (m_dummy_cond, e, false,
diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index 8d8aa30..8be57a0 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -142,7 +142,7 @@ fsm_find_control_statement_thread_paths (tree name,
       int e_count = 0;
       edge_iterator ei;
       vec<basic_block, va_gc> *next_path;
-      vec_alloc (next_path, n_basic_blocks_for_fn (cfun));
+      vec_alloc (next_path, 10);
 
       /* When VAR_BB == LAST_BB_IN_PATH, then the first block in the path
 	 will already be in VISITED_BBS.  When they are not equal, then we
@@ -379,7 +379,7 @@ find_jump_threads_backwards (edge e)
     return;
 
   vec<basic_block, va_gc> *bb_path;
-  vec_alloc (bb_path, n_basic_blocks_for_fn (cfun));
+  vec_alloc (bb_path, 10);
   vec_safe_push (bb_path, e->dest);
   hash_set<basic_block> *visited_bbs = new hash_set<basic_block>;
 


More information about the Gcc-patches mailing list