This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Hoist reciprocal invariant out of loop
- From: Zdenek Dvorak <rakdver at atrey dot karlin dot mff dot cuni dot cz>
- To: David Edelsohn <dje at watson dot ibm dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Sun, 3 Apr 2005 10:06:44 +0200
- Subject: Re: [PATCH] Hoist reciprocal invariant out of loop
- References: <200504022119.j32LJ6D30254@makai.watson.ibm.com>
Hello,
> When flag_unsafe_math_optimizations is true, GCC's expand phase
> currently splits divides into reciprocal and multiply expressions to
> create an opportunity for the reciprocal to be hoisted out of a loop if it
> is invariant. Combine is expected to reconstitute the divide if not
> invariant. As has been discussed many times, this transformation causes
> problems and expand is not the appropriate place to implement this
> optimizations.
>
> The following patch re-implements the optimization at the tree
> level as part of loop invariant motion. This patch depends on the earlier
> patch to fix tree_could_trap_p() to correcly determine whether the
> operation can be hoisted.
>
> The patch implements the optimization in two phases. The first
> phase modifies the existing movement_possibility() function to test if the
> divisor in an RDIV expression is invariant and return the new value
> MOVE_RECIPROCAL. The second phase modifies the existing
> determine_invariantness_stmt(), which is iterating over the stmts, to
> create the invariant reciprocal and allow it to be hoisted by the existing
> machinery.
having this piece of code in determine_invariantness_stmt seems a bit
hacky to me. Determine_invariantness and subroutines is ment as an
analysis phase and it should not modify the code.
I think it would be a bit cleaner to move the code that modifies the
statements to move_computations_stmt. It would be also nice if
the code of the transformation was in a separate function.
Zdenek
> After this patch is accepted and settles, I plan to follow up with
> another patch to remove the code to create a reciprocal in
> expand_expr_real_1().
>
> Bootstrapped on powerpc-ibm-aix5.2.0.0 with no regressions.
>
> Okay for mainline?
>
> Thanks, David
>
>
> * tree-flow.h (enum mov_pos): Add MOVE_RECIPROCAL.
> * tree-ssa-loop-im.c: Include real.h.
> (movement_possibility): If real division divisor is invariant and
> flag_unsafe_math_optimizations enabled, return MOVE_RECIPROCAL.
> (determine_invariantness_stmt): If MOVE_RECIPROCAL, generate
> invariant reciprocal for hoisting.
> * Makefile.in (tree-ssa-loop-im.o): Add real.h dependency.
>
> Index: tree-flow.h
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/tree-flow.h,v
> retrieving revision 2.85
> diff -c -p -r2.85 tree-flow.h
> *** tree-flow.h 10 Mar 2005 08:55:57 -0000 2.85
> --- tree-flow.h 2 Apr 2005 00:43:57 -0000
> *************** enum move_pos
> *** 699,705 ****
> MOVE_IMPOSSIBLE, /* No movement -- side effect expression. */
> MOVE_PRESERVE_EXECUTION, /* Must not cause the non-executed statement
> become executed -- memory accesses, ... */
> ! MOVE_POSSIBLE /* Unlimited movement. */
> };
> extern enum move_pos movement_possibility (tree);
>
> --- 699,706 ----
> MOVE_IMPOSSIBLE, /* No movement -- side effect expression. */
> MOVE_PRESERVE_EXECUTION, /* Must not cause the non-executed statement
> become executed -- memory accesses, ... */
> ! MOVE_POSSIBLE, /* Unlimited movement. */
> ! MOVE_RECIPROCAL /* Generate reciprocal and move. */
> };
> extern enum move_pos movement_possibility (tree);
>
> Index: tree-ssa-loop-im.c
> ===================================================================
> RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-im.c,v
> retrieving revision 2.33
> diff -c -p -r2.33 tree-ssa-loop-im.c
> *** tree-ssa-loop-im.c 31 Mar 2005 16:01:53 -0000 2.33
> --- tree-ssa-loop-im.c 2 Apr 2005 20:56:12 -0000
> *************** Software Foundation, 59 Temple Place - S
> *** 37,42 ****
> --- 37,43 ----
> #include "params.h"
> #include "tree-pass.h"
> #include "flags.h"
> + #include "real.h"
>
> /* TODO: Support for predicated code motion. I.e.
>
> *************** movement_possibility (tree stmt)
> *** 245,252 ****
> if (TREE_SIDE_EFFECTS (rhs))
> return MOVE_IMPOSSIBLE;
>
> ! if (TREE_CODE (lhs) != SSA_NAME
> ! || tree_could_trap_p (rhs))
> return MOVE_PRESERVE_EXECUTION;
>
> if (get_call_expr_in (stmt))
> --- 246,270 ----
> if (TREE_SIDE_EFFECTS (rhs))
> return MOVE_IMPOSSIBLE;
>
> ! if (TREE_CODE (lhs) != SSA_NAME)
> ! return MOVE_PRESERVE_EXECUTION;
> !
> ! /* If divisor is invariant, convert a/b to a*(1/b), allowing reciprocal
> ! to be hoisted out of loop, saving expensive divide. */
> ! if (TREE_CODE (rhs) == RDIV_EXPR && flag_unsafe_math_optimizations
> ! && !tree_could_trap_p (rhs))
> ! {
> ! /* Entire statement is invariant. */
> ! if (expr_invariant_in_loop_p (loop_containing_stmt (stmt), rhs))
> ! return MOVE_POSSIBLE;
> !
> ! /* Divisor invariant. */
> ! if (expr_invariant_in_loop_p (loop_containing_stmt (stmt),
> ! TREE_OPERAND (rhs, 1)))
> ! return MOVE_RECIPROCAL;
> ! }
> !
> ! if (tree_could_trap_p (rhs))
> return MOVE_PRESERVE_EXECUTION;
>
> if (get_call_expr_in (stmt))
> *************** determine_invariantness_stmt (struct dom
> *** 588,593 ****
> --- 606,643 ----
> continue;
> }
>
> + if (pos == MOVE_RECIPROCAL)
> + {
> + tree stmt1, stmt2, lhs, rhs, var, name;
> +
> + lhs = TREE_OPERAND (stmt, 0);
> + rhs = TREE_OPERAND (stmt, 1);
> +
> + /* stmt must be MODIFY_EXPR. */
> + var = create_tmp_var (TREE_TYPE (rhs), "reciptmp");
> + add_referenced_tmp_var (var);
> +
> + stmt1 = build2 (MODIFY_EXPR, void_type_node, var,
> + build2 (RDIV_EXPR, TREE_TYPE (rhs),
> + build_real (TREE_TYPE (rhs), dconst1),
> + TREE_OPERAND (rhs, 1)));
> + name = make_ssa_name (var, stmt1);
> + TREE_OPERAND (stmt1, 0) = name;
> + stmt2 = build2 (MODIFY_EXPR, void_type_node, lhs,
> + build2 (MULT_EXPR, TREE_TYPE (rhs),
> + name, TREE_OPERAND (rhs, 0)));
> +
> + /* Replace division stmt with reciprocal and multiply stmts.
> + The multiply stmt is not invariant, so update iterator
> + and avoid rescanning. */
> + bsi_replace (&bsi, stmt1, true);
> + bsi_insert_after (&bsi, stmt2, BSI_NEW_STMT);
> + SSA_NAME_DEF_STMT (lhs) = stmt2;
> +
> + /* Continue processing with invariant reciprocal statment. */
> + stmt = stmt1;
> + }
> +
> stmt_ann (stmt)->common.aux = xcalloc (1, sizeof (struct lim_aux_data));
> LIM_DATA (stmt)->always_executed_in = outermost;
>