This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Add splay-tree "view" for bitmap
- From: Richard Sandiford <richard dot sandiford at arm dot com>
- To: Richard Biener <rguenther at suse dot de>
- Cc: gcc-patches at gcc dot gnu dot org, stevenb dot gcc at gmail dot com
- Date: Thu, 18 Oct 2018 14:58:10 +0100
- Subject: Re: [PATCH] Add splay-tree "view" for bitmap
- References: <alpine.LSU.2.20.1810181503430.4374@zhemvz.fhfr.qr>
Richard Biener <rguenther@suse.de> writes:
> PR63155 made me pick up this old work from Steven, it turns our
> linked-list implementation to a two-mode one with one being a
> splay tree featuring O(log N) complexity for find/remove.
>
> Over Stevens original patch I added a bitmap_tree_to_vec helper
> that I use from the debug/print methods to avoid changing view
> there. In theory the bitmap iterator could get a "stack"
> as well and we could at least support EXECUTE_IF_SET_IN_BITMAP.
>
> This can be used to fix the two biggest bottlenecks in the PRs
> testcase, namely SSA propagator worklist handling and out-of-SSA
> coalesce list building. perf shows the following data, first
> unpatched, second patched - also watch the thrid coulumn (samples)
> when comparing percentages.
>
> -O0
> - 18.19% 17.35% 407 cc1 cc1 [.] bitmap_set_b▒
> - bitmap_set_bit ▒
> + 8.77% create_coalesce_list_for_region ▒
> + 4.21% calculate_live_ranges ▒
> + 2.02% build_ssa_conflict_graph ▒
> + 1.66% insert_phi_nodes_for ▒
> + 0.86% coalesce_ssa_name
> patched:
> - 12.39% 10.48% 129 cc1 cc1 [.] bitmap_set_b▒
> - bitmap_set_bit ▒
> + 5.27% calculate_live_ranges ▒
> + 2.76% insert_phi_nodes_for ▒
> + 1.90% create_coalesce_list_for_region ▒
> + 1.63% build_ssa_conflict_graph ▒
> + 0.35% coalesce_ssa_name
>
> -O1
> - 17.53% 17.53% 842 cc1 cc1 [.] bitmap_set_b▒
> - bitmap_set_bit ▒
> + 12.39% add_ssa_edge ▒
> + 1.48% create_coalesce_list_for_region ▒
> + 0.82% solve_constraints ▒
> + 0.71% calculate_live_ranges ▒
> + 0.64% add_implicit_graph_edge ▒
> + 0.41% insert_phi_nodes_for ▒
> + 0.34% build_ssa_conflict_graph
> patched:
> - 5.79% 5.00% 167 cc1 cc1 [.] bitmap_set_b▒
> - bitmap_set_bit ▒
> + 1.41% add_ssa_edge ▒
> + 0.88% calculate_live_ranges ▒
> + 0.75% add_implicit_graph_edge ▒
> + 0.68% solve_constraints ▒
> + 0.48% insert_phi_nodes_for ▒
> + 0.45% build_ssa_conflict_graph
>
> -O3
> - 12.37% 12.34% 1145 cc1 cc1 [.] bitmap_set_b▒
> - bitmap_set_bit ▒
> + 9.14% add_ssa_edge ▒
> + 0.80% create_coalesce_list_for_region ▒
> + 0.69% add_implicit_graph_edge ▒
> + 0.54% solve_constraints ▒
> + 0.34% calculate_live_ranges ▒
> + 0.27% insert_phi_nodes_for ▒
> + 0.21% build_ssa_conflict_graph
> - 4.36% 3.86% 227 cc1 cc1 [.] bitmap_set_b▒
> - bitmap_set_bit ▒
> + 0.98% add_ssa_edge ▒
> + 0.86% add_implicit_graph_edge ▒
> + 0.64% solve_constraints ▒
> + 0.57% calculate_live_ranges ▒
> + 0.32% build_ssa_conflict_graph ▒
> + 0.29% mark_all_vars_used_1 ▒
> + 0.20% insert_phi_nodes_for ▒
> + 0.16% create_coalesce_list_for_region
>
>
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
What's the performance like for more reasonable testcases? E.g. is there
a measurable change either way in --enable-checking=release for some gcc
.iis at -g or -O2 -g?
Thanks,
Richard