This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Enable jump threading on paths meeting hot paths
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Jan Hubicka <hubicka at ucw dot cz>
- Cc: Jeff Law <law at redhat dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 2 Feb 2017 15:27:57 +0100
- Subject: Re: Enable jump threading on paths meeting hot paths
- References: <20170125181107.GB14317@kam.mff.cuni.cz> <CAFiYyc12K=xk_XLp2bQf5VE47WpcybN6DF_zSvWT22ecw03FZw@mail.gmail.com> <20170126100425.GA60443@kam.mff.cuni.cz> <CAFiYyc3kXUfmB09zJVVUX=O7wLmfPag-M_Tv3uOhh8nCfwwfWQ@mail.gmail.com> <20170126103916.GB60443@kam.mff.cuni.cz> <2e81a817-bda6-207d-bbee-16f05a5ee42f@redhat.com> <20170202134919.GA91024@kam.mff.cuni.cz>
On Thu, Feb 2, 2017 at 2:49 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
> Hi,
> it seems I forgot to send the updated patch. Here it is.
> We now dump info like:
> Checking profitability of path: 5 (16 insns) 3 (2 insns) 34 (2 insns) 33 (4 insns) 32 (1 insns) 10 (3 insns) 6
> Control statement insns: 16
> Overall: 12 insns
> Registering FSM jump thread: (6, 10) incoming edge; (10, 32) (32, 33) (33, 34) (34, 3) (3, 5) (5, 16) nocopy; (5, 16)
>
> The path is printed backwards; that is the order in which the loop processes it.
>
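For anyone comparing the two dump lines: the "Checking profitability" line
lists blocks from the end of the thread back to its entry, while the
"Registering FSM jump thread" line lists edges in execution order. Below is a
minimal standalone sketch that turns the first ordering into the second. The
helper name is made up for illustration and is not part of GCC. The final
edge out of the path, (5, 16) in the dump above, cannot be derived from the
block list and is left out.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, not part of GCC.  PATH holds block indices in the
   order the dump prints them (end of the thread first).  Write the edges
   in execution order, e.g. "(6, 10) (10, 32)", into BUF.  Return the
   number of characters written, or -1 if BUF is too small.  */
static int
format_forward_edges (const int *path, size_t path_length,
                      char *buf, size_t buf_size)
{
  size_t used = 0;

  if (buf_size == 0)
    return -1;
  buf[0] = '\0';

  /* Walk from the last array entry (the entry block of the thread)
     down to the first one, emitting one edge per adjacent pair.  */
  for (size_t j = path_length; j > 1; j--)
    {
      int n = snprintf (buf + used, buf_size - used, "%s(%d, %d)",
                        used ? " " : "", path[j - 1], path[j - 2]);
      if (n < 0 || (size_t) n >= buf_size - used)
        return -1;
      used += (size_t) n;
    }
  return (int) used;
}
```

Called with the blocks 5 3 34 33 32 10 6 from the dump above, this produces
"(6, 10) (10, 32) (32, 33) (33, 34) (34, 3) (3, 5)".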
> Bootstrapped/regtested x86_64-linux, OK?
>
> Honza
>
> PR middle-end/77445
> * gcc.dg/tree-ssa/pr77445-2.c: Update testcase to check that all
> threading is done.
> * tree-ssa-threadbackward.c (profitable_jump_thread_path): Dump
> statistics of the analyzed path; allow threading for speed when
> any of BBs along the path are optimized for speed.
>
> Index: testsuite/gcc.dg/tree-ssa/pr77445-2.c
> ===================================================================
> --- testsuite/gcc.dg/tree-ssa/pr77445-2.c (revision 245124)
> +++ testsuite/gcc.dg/tree-ssa/pr77445-2.c (working copy)
> @@ -1,5 +1,5 @@
> /* { dg-do compile } */
> -/* { dg-options "-O2 -fdump-tree-thread1-details-blocks-stats" } */
> +/* { dg-options "-O2 -fdump-tree-thread1-details-blocks-stats -fdump-tree-thread2-details-blocks-stats -fdump-tree-thread3-details-blocks-stats -fdump-tree-thread4-details-blocks-stats" } */
You can use -fdump-tree-thread-details-blocks-stats to get all thread dumps.
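As a sketch of that suggestion, the options line would shrink to the
following. This assumes the unnumbered pass name selects every instance of
the thread pass, so the per-pass scan directives still find their own dump
files.

```c
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-thread-details-blocks-stats" } */
```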
> typedef enum STATES {
> START=0,
> INVALID,
> @@ -121,3 +121,7 @@ enum STATES FMS( u8 **in , u32 *transiti
> increase much. */
> /* { dg-final { scan-tree-dump "Jumps threaded: 1\[1-9\]" "thread1" } } */
> /* { dg-final { scan-tree-dump-times "Invalid sum" 2 "thread1" } } */
> +/* { dg-final { scan-tree-dump-not "not considered" "thread1" } } */
> +/* { dg-final { scan-tree-dump-not "not considered" "thread2" } } */
> +/* { dg-final { scan-tree-dump-not "not considered" "thread3" } } */
> +/* { dg-final { scan-tree-dump-not "not considered" "thread4" } } */
> Index: tree-ssa-threadbackward.c
> ===================================================================
> --- tree-ssa-threadbackward.c (revision 245124)
> +++ tree-ssa-threadbackward.c (working copy)
> @@ -159,6 +159,10 @@ profitable_jump_thread_path (vec<basic_b
> bool threaded_through_latch = false;
> bool multiway_branch_in_path = false;
> bool threaded_multiway_branch = false;
> + bool contains_hot_bb = false;
> +
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + fprintf (dump_file, "Checking profitability of path: ");
>
> /* Count the number of instructions on the path: as these instructions
> will have to be duplicated, we will not record the path if there
> @@ -168,6 +172,8 @@ profitable_jump_thread_path (vec<basic_b
> {
> basic_block bb = (*path)[j];
>
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + fprintf (dump_file, " %i", bb->index);
> /* Remember, blocks in the path are stored in opposite order
> in the PATH array. The last entry in the array represents
> the block with an outgoing edge that we will redirect to the
> @@ -177,6 +183,7 @@ profitable_jump_thread_path (vec<basic_b
> branch. */
> if (j < path_length - 1)
> {
> + int orig_n_insns = n_insns;
> if (bb->loop_father != loop)
> {
> path_crosses_loops = true;
> @@ -219,6 +226,9 @@ profitable_jump_thread_path (vec<basic_b
> }
> }
>
> +
Please do not add excess vertical space.
> + if (!contains_hot_bb && speed_p && j < path_length - 1)
Isn't j < path_length - 1 already checked above?
Otherwise this looks OK, provided it fixes the regression. Does it?
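To make that concrete, here is a minimal standalone sketch. The struct and
function names are made up for illustration and are not the GCC code, and it
assumes the new statement sits inside the existing "if (j < path_length - 1)"
block. Under that assumption the flag can be computed without repeating the
bound check:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for basic_block; only the hotness bit matters.  */
struct block
{
  int index;
  bool optimize_for_speed;
};

/* Return true if SPEED_P is set and any block other than the last array
   entry is optimized for speed.  */
static bool
path_contains_hot_bb (const struct block *path, size_t path_length,
                      bool speed_p)
{
  bool contains_hot_bb = false;

  assert (path_length > 0);
  for (size_t j = 0; j < path_length; j++)
    {
      if (j < path_length - 1)
        {
          /* ... the existing per-block checks would go here ...  */

          /* The enclosing guard already excludes the last entry, so no
             second "j < path_length - 1" test is needed.  */
          if (!contains_hot_bb && speed_p)
            contains_hot_bb = path[j].optimize_for_speed;
        }
    }
  return contains_hot_bb;
}
```

A hot block in the last array slot is deliberately ignored here, matching the
guard in the patch.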
Richard.
> + contains_hot_bb |= optimize_bb_for_speed_p (bb);
> for (gsi = gsi_after_labels (bb);
> !gsi_end_p (gsi);
> gsi_next_nondebug (&gsi))
> @@ -229,8 +239,10 @@ profitable_jump_thread_path (vec<basic_b
> && !(gimple_code (stmt) == GIMPLE_ASSIGN
> && gimple_assign_rhs_code (stmt) == ASSERT_EXPR)
> && !is_gimple_debug (stmt))
> - n_insns += estimate_num_insns (stmt, &eni_size_weights);
> + n_insns += estimate_num_insns (stmt, &eni_size_weights);
> }
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + fprintf (dump_file, " (%i insns)", n_insns-orig_n_insns);
>
> /* We do not look at the block with the threaded branch
> in this loop. So if any block with a last statement that
> @@ -264,7 +276,13 @@ profitable_jump_thread_path (vec<basic_b
> last block in the threading path. So don't count it against our
> statement count. */
>
> - n_insns-= estimate_num_insns (stmt, &eni_size_weights);
> + int stmt_insns = estimate_num_insns (stmt, &eni_size_weights);
> + n_insns-= stmt_insns;
> +
> + if (dump_file && (dump_flags & TDF_DETAILS))
> + fprintf (dump_file, "\n Control statement insns: %i\n"
> + " Overall: %i insns\n",
> + stmt_insns, n_insns);
>
> /* We have found a constant value for ARG. For GIMPLE_SWITCH
> and GIMPLE_GOTO, we use it as-is. However, for a GIMPLE_COND
> @@ -311,7 +329,11 @@ profitable_jump_thread_path (vec<basic_b
> return NULL;
> }
>
> - if (speed_p && optimize_edge_for_speed_p (taken_edge))
> + /* Threading is profitable if the duplicated path is hot, but also
> + when it separates a cold path from a hot path and permits optimization
> + of the hot path later. Be on the aggressive side here. In some testcases,
> + as in PR 78407, this leads to noticeable improvements. */
> + if (speed_p && (optimize_edge_for_speed_p (taken_edge) || contains_hot_bb))
> {
> if (n_insns >= PARAM_VALUE (PARAM_MAX_FSM_THREAD_PATH_INSNS))
> {