This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH] Fix PR59802, LCM compile-time slowness
- From: Richard Biener <rguenther at suse dot de>
- To: gcc-patches at gcc dot gnu dot org
- Cc: stevenb dot gcc at gmail dot com
- Date: Tue, 14 Jan 2014 14:42:32 +0100 (CET)
- Subject: [PATCH] Fix PR59802, LCM compile-time slowness
- Authentication-results: sourceware.org; auth=none
This fixes the slowness seen in LCM compute_available accounted
to RTL cprop. Currently the dataflow problem uses a "random"
basic-block order to seed the initial worklist (it wants to
visit predecessors before successors) - the following patch
makes it use inverted postorder (similar to tree PRE antic
computation).
This reduces the compile-time for the testcase in PR59802 at -O3
from
CPROP : 54.53 (55%) usr 0.04 ( 6%) sys 54.57 (55%)
wall 4444 kB ( 2%) ggc
PRE : 4.47 ( 5%) usr 0.03 ( 5%) sys 4.48 ( 5%)
wall 1833 kB ( 1%) ggc
TOTAL : 98.51 0.63 99.13
220195 kB
to
CPROP : 2.13 ( 5%) usr 0.06 (10%) sys 2.20 ( 5%)
wall 4444 kB ( 2%) ggc
PRE : 0.52 ( 1%) usr 0.02 ( 3%) sys 0.54 ( 1%)
wall 1833 kB ( 1%) ggc
TOTAL : 42.22 0.60 42.81
220195 kB
which is nice. I checked for other compile-time hog PRs with
CPROP but didn't find one, have yet to check for PRE (three-letter,
likely too much noise).
Bootstrap and regtest running on x86_64-unknown-linux-gnu, ok for trunk
(and branch?)
Thanks,
Richard.
2014-01-14 Richard Biener <rguenther@suse.de>
PR rtl-optimization/59802
* lcm.c (compute_available): Use inverted postorder to seed
the initial worklist.
Index: gcc/lcm.c
===================================================================
*** gcc/lcm.c (revision 206599)
--- gcc/lcm.c (working copy)
*************** compute_available (sbitmap *avloc, sbitm
*** 496,507 ****
bitmap_vector_ones (avout, last_basic_block_for_fn (cfun));
/* Put every block on the worklist; this is necessary because of the
! optimistic initialization of AVOUT above. */
! FOR_EACH_BB_FN (bb, cfun)
{
*qin++ = bb;
bb->aux = bb;
}
qin = worklist;
qend = &worklist[n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS];
--- 496,515 ----
bitmap_vector_ones (avout, last_basic_block_for_fn (cfun));
/* Put every block on the worklist; this is necessary because of the
! optimistic initialization of AVOUT above. Use inverted postorder
! to make the dataflow problem require less iterations. */
! int *postorder = XNEWVEC (int, n_basic_blocks_for_fn (cfun));
! int postorder_num = inverted_post_order_compute (postorder);
! for (int i = 0; i < postorder_num; ++i)
{
+ bb = BASIC_BLOCK_FOR_FN (cfun, postorder[i]);
+ if (bb == EXIT_BLOCK_PTR_FOR_FN (cfun)
+ || bb == ENTRY_BLOCK_PTR_FOR_FN (cfun))
+ continue;
*qin++ = bb;
bb->aux = bb;
}
+ free (postorder);
qin = worklist;
qend = &worklist[n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS];