This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: folding (vec_)cond_expr in a binary operation
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Marc Glisse <marc dot glisse at inria dot fr>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Fri, 30 Aug 2013 11:36:54 +0200
- Subject: Re: folding (vec_)cond_expr in a binary operation
- Authentication-results: sourceware.org; auth=none
- References: <alpine dot DEB dot 2 dot 02 dot 1307061712010 dot 5572 at stedding dot saclay dot inria dot fr>
On Sat, Jul 6, 2013 at 9:42 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> Hello,
>
> the first attached patch does not bootstrap and has at least 2 main issues.
> The second patch does pass bootstrap+testsuite, but I liked the first
> more...
>
> First, the fold-const bit causes an assertion failure (building libjava) in
> combine_cond_expr_cond, which calls:
>
> t = fold_binary_loc (gimple_location (stmt), code, type, op0, op1);
>
> and then checks:
>
> /* Require that we got a boolean type out if we put one in. */
> gcc_assert (TREE_CODE (TREE_TYPE (t)) == TREE_CODE (type));
>
> which makes sense... Note that the 2 relevant types are:
well, the check is probably old and can be relaxed to also allow all
compatible types.
> (gdb) call debug_tree((tree)0x7ffff5e78930)
> <integer_type 0x7ffff5e78930 jboolean asm_written public unsigned type_3 QI
> size <integer_cst 0x7ffff64dcfa0 type <integer_type 0x7ffff64f60a8
> bitsizetype> constant 8>
> unit size <integer_cst 0x7ffff64dcfc0 type <integer_type 0x7ffff64f6000
> sizetype> constant 1>
> align 8 symtab -169408800 alias set -1 canonical type 0x7ffff6635c78
> precision 1 min <integer_cst 0x7ffff66321e0 0> max <integer_cst
> 0x7ffff6632200 1>
> pointer_to_this <pointer_type 0x7ffff5a1e540> chain <integer_type
> 0x7ffff6635dc8>>
> (gdb) call debug_tree(type)
> <boolean_type 0x7ffff64f69d8 bool sizes-gimplified asm_written public
> unsigned QI
> size <integer_cst 0x7ffff64dcfa0 type <integer_type 0x7ffff64f60a8
> bitsizetype> constant 8>
> unit size <integer_cst 0x7ffff64dcfc0 type <integer_type 0x7ffff64f6000
> sizetype> constant 1>
> align 8 symtab -170348640 alias set 19 canonical type 0x7ffff64f69d8
> precision 1 min <integer_cst 0x7ffff64f92a0 0> max <integer_cst
> 0x7ffff64f92e0 1>
> pointer_to_this <pointer_type 0x7ffff5912dc8>>
>
> I need to relax the conditions on the vec?-1:0 optimization because with
> vectors we often end up with several types that are equivalent. Can you
> think of a good way to do it? In the second patch I limit the transformation
> to vectors and hope it doesn't cause as much trouble there...
>
>
>
> Second, the way the forwprop transformation is done, it can be necessary to
> run the forwprop pass several times in a row, which is not nice. It takes:
>
> stmt_cond
> stmt_binop
>
> and produces:
>
> stmt_folded
> stmt_cond2
>
> But one of the arguments of stmt_folded could be an ssa_name defined by a
> cond_expr, which could now be propagated into stmt_folded (there may be
> other new possible transformations). However, that cond_expr has already
> been visited and won't be again. The combine part of the pass uses a PLF to
> revisit statements, but the forwprop part doesn't have any specific
> mechanism.
forwprop is designed to re-process stmts it has folded to catch cascading
effects. Now, as it has both a forward and a backward run, you don't catch
2nd-order effects without iterating them. On my TODO is to do only one kind,
either forward or backward propagation.
> The workarounds I can think of are:
> - manually check if some of the arguments of stmt_folded are cond_expr and
> recursively call the function on them;
> - move the transformation to the combine part of the pass;
this is it.
> - setup some system to revisit all earlier statements?
>
> I went with the second option in the second patch, but this limitation of
> forward propagation seems strange to me.
>
>
> (s/combine_binop_with_condition/forward_propagate_condition/ for the first
> patch)
Btw, as for the patch I don't like that you basically feed everything into
fold. Yes, I know we do that for conditions because that's quite important
and nobody has re-written the condition folding as gimple pattern matcher.
I doubt that this transform is important enough to warrant another exception ;)
Richard.
> 2013-07-06 Marc Glisse <marc.glisse@inria.fr>
>
> PR tree-optimization/57755
> gcc/
> * tree-ssa-forwprop.c (combine_binop_with_condition): New function.
> (ssa_forward_propagate_and_combine): Call it.
> * fold-const.c (fold_ternary_loc): In gimple form, don't check for
> type equality.
>
> gcc/testsuite/
> * g++.dg/tree-ssa/pr57755.C: New testcase.
>
> (this message does not supersede
> http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01624.html )
>
> --
> Marc Glisse
> Index: fold-const.c
> ===================================================================
> --- fold-const.c (revision 200736)
> +++ fold-const.c (working copy)
> @@ -14124,21 +14124,23 @@ fold_ternary_loc (location_t loc, enum t
>
> /* Convert A ? 1 : 0 to simply A. */
> if ((code == VEC_COND_EXPR ? integer_all_onesp (op1)
> : (integer_onep (op1)
> && !VECTOR_TYPE_P (type)))
> && integer_zerop (op2)
> /* If we try to convert OP0 to our type, the
> call to fold will try to move the conversion inside
> a COND, which will recurse. In that case, the COND_EXPR
> is probably the best choice, so leave it alone. */
> - && type == TREE_TYPE (arg0))
> + && (type == TREE_TYPE (arg0)
> + || (in_gimple_form
> + && useless_type_conversion_p (type, TREE_TYPE (arg0)))))
> return pedantic_non_lvalue_loc (loc, arg0);
>
> /* Convert A ? 0 : 1 to !A. This prefers the use of NOT_EXPR
> over COND_EXPR in cases such as floating point comparisons. */
> if (integer_zerop (op1)
> && (code == VEC_COND_EXPR ? integer_all_onesp (op2)
> : (integer_onep (op2)
> && !VECTOR_TYPE_P (type)))
> && truth_value_p (TREE_CODE (arg0)))
> return pedantic_non_lvalue_loc (loc,
> Index: tree-ssa-forwprop.c
> ===================================================================
> --- tree-ssa-forwprop.c (revision 200736)
> +++ tree-ssa-forwprop.c (working copy)
> @@ -1135,20 +1135,135 @@ forward_propagate_comparison (gimple_stm
>
> /* Remove defining statements. */
> return remove_prop_source_from_use (name);
>
> bailout:
> gsi_next (defgsi);
> return false;
> }
>
>
> +/* Forward propagate the condition defined in *DEFGSI to uses in
> + binary operations.
> + Returns true if stmt is now unused. Advance DEFGSI to the next
> + statement. */
> +
> +static bool
> +forward_propagate_condition (gimple_stmt_iterator *defgsi)
> +{
> + gimple stmt = gsi_stmt (*defgsi);
> + tree name = gimple_assign_lhs (stmt);
> + enum tree_code defcode = gimple_assign_rhs_code (stmt);
> + gimple_stmt_iterator gsi;
> + enum tree_code code;
> + tree lhs, type;
> + gimple use_stmt;
> +
> + /* Don't propagate ssa names that occur in abnormal phis. */
> + if ((TREE_CODE (gimple_assign_rhs1 (stmt)) == SSA_NAME
> + && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_assign_rhs1 (stmt)))
> + || (TREE_CODE (gimple_assign_rhs2 (stmt)) == SSA_NAME
> + && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_assign_rhs2 (stmt)))
> + || (TREE_CODE (gimple_assign_rhs3 (stmt)) == SSA_NAME
> + && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_assign_rhs3 (stmt))))
> + goto bailout;
> +
> + /* Do not un-cse, but propagate through copies. */
> + use_stmt = get_prop_dest_stmt (name, &name);
> + if (!use_stmt
> + || !is_gimple_assign (use_stmt)
> + || gimple_has_side_effects (stmt)
> + || gimple_has_side_effects (use_stmt))
> + goto bailout;
> +
> + gsi = gsi_for_stmt (use_stmt);
> + code = gimple_assign_rhs_code (use_stmt);
> + lhs = gimple_assign_lhs (use_stmt);
> + type = TREE_TYPE (lhs);
> +
> + if (TREE_CODE_CLASS (code) == tcc_binary
> + || TREE_CODE_CLASS (code) == tcc_comparison)
> + {
> + bool cond_is_left = false;
> + bool trueok = false;
> + bool falseok = false;
> + if (name == gimple_assign_rhs1 (use_stmt))
> + cond_is_left = true;
> + else
> + gcc_assert (name == gimple_assign_rhs2 (use_stmt));
> + tree true_op, false_op;
> + if (cond_is_left)
> + {
> + true_op = fold_build2_loc (gimple_location (stmt), code, type,
> + gimple_assign_rhs2 (stmt),
> + gimple_assign_rhs2 (use_stmt));
> + false_op = fold_build2_loc (gimple_location (stmt), code, type,
> + gimple_assign_rhs3 (stmt),
> + gimple_assign_rhs2 (use_stmt));
> + }
> + else
> + {
> + true_op = fold_build2_loc (gimple_location (stmt), code, type,
> + gimple_assign_rhs1 (use_stmt),
> + gimple_assign_rhs2 (stmt));
> + false_op = fold_build2_loc (gimple_location (stmt), code, type,
> + gimple_assign_rhs1 (use_stmt),
> + gimple_assign_rhs3 (stmt));
> + }
> +
> + if (is_gimple_val (true_op))
> + trueok = true;
> + if (is_gimple_val (false_op))
> + falseok = true;
> + /* At least one of them has to simplify, or it isn't worth it. */
> + if (!trueok && !falseok)
> + goto bailout;
> + if (!trueok)
> + {
> + if (!valid_gimple_rhs_p (true_op))
> + goto bailout;
> + gimple g = gimple_build_assign (make_ssa_name (type, NULL), true_op);
> + true_op = gimple_assign_lhs (g);
> + gsi_insert_before (&gsi, g, GSI_SAME_STMT);
> + }
> + if (!falseok)
> + {
> + if (!valid_gimple_rhs_p (false_op))
> + goto bailout;
> + gimple g = gimple_build_assign (make_ssa_name (type, NULL), false_op);
> + false_op = gimple_assign_lhs (g);
> + gsi_insert_before (&gsi, g, GSI_SAME_STMT);
> + }
> +
> + gimple_assign_set_rhs_with_ops_1 (&gsi, defcode,
> + gimple_assign_rhs1 (stmt),
> + true_op, false_op);
> + }
> + else
> + goto bailout;
> +
> + fold_stmt (&gsi);
> + update_stmt (gsi_stmt (gsi));
> +
> + /* When we remove stmt now the iterator defgsi goes off its current
> + sequence, hence advance it now. */
> + gsi_next (defgsi);
> +
> + /* Remove defining statements. */
> + return remove_prop_source_from_use (name);
> +
> +bailout:
> + gsi_next (defgsi);
> + return false;
> +}
> +
> +
> /* GSI_P points to a statement which performs a narrowing integral
> conversion.
>
> Look for cases like:
>
> t = x & c;
> y = (T) t;
>
> Turn them into:
>
> @@ -3382,20 +3497,25 @@ ssa_forward_propagate_and_combine (void)
> gsi_next (&gsi);
> }
> else
> gsi_next (&gsi);
> }
> else if (TREE_CODE_CLASS (code) == tcc_comparison)
> {
> if (forward_propagate_comparison (&gsi))
> cfg_changed = true;
> }
> + else if (code == COND_EXPR || code == VEC_COND_EXPR)
> + {
> + if (forward_propagate_condition (&gsi))
> + cfg_changed = true;
> + }
> else
> gsi_next (&gsi);
> }
>
> /* Combine stmts with the stmts defining their operands.
> Note we update GSI within the loop as necessary. */
> for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
> {
> gimple stmt = gsi_stmt (gsi);
> bool changed = false;
> Index: testsuite/g++.dg/tree-ssa/pr57755.c
> ===================================================================
> --- testsuite/g++.dg/tree-ssa/pr57755.c (revision 0)
> +++ testsuite/g++.dg/tree-ssa/pr57755.c (revision 0)
> @@ -0,0 +1,23 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-forwprop1" } */
> +typedef int vec __attribute__((vector_size(4*sizeof(int))));
> +
> +vec f(vec a,vec b){
> + vec z=a!=a;
> + vec o=z+1;
> + vec c=(a>3)?o:z;
> + vec d=(b>2)?o:z;
> + vec e=c&d;
> + return e!=0;
> +}
> +
> +vec g(vec a,vec b){
> + vec z=a!=a;
> + vec c=(a<=42)?b:z;
> + return b&c;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "VEC_COND_EXPR" 2 "forwprop1" } } */
> +/* { dg-final { scan-tree-dump-not "!=" "forwprop1" } } */
> +/* { dg-final { scan-tree-dump-not "&" "forwprop1" } } */
> +/* { dg-final { cleanup-tree-dump "forwprop1" } } */
>
> Property changes on: testsuite/g++.dg/tree-ssa/pr57755.c
> ___________________________________________________________________
> Added: svn:eol-style
> + native
> Added: svn:keywords
> + Author Date Id Revision URL
>
>
> Index: testsuite/g++.dg/tree-ssa/pr57755.C
> ===================================================================
> --- testsuite/g++.dg/tree-ssa/pr57755.C (revision 0)
> +++ testsuite/g++.dg/tree-ssa/pr57755.C (revision 0)
> @@ -0,0 +1,23 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-forwprop1" } */
> +typedef int vec __attribute__((vector_size(4*sizeof(int))));
> +
> +vec f(vec a,vec b){
> + vec z=a!=a;
> + vec o=z+1;
> + vec c=(a>3)?o:z;
> + vec d=(b>2)?o:z;
> + vec e=c&d;
> + return e!=0;
> +}
> +
> +vec g(vec a,vec b){
> + vec z=a!=a;
> + vec c=(a<=42)?b:z;
> + return b&c;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "VEC_COND_EXPR" 2 "forwprop1" } } */
> +/* { dg-final { scan-tree-dump-not "!=" "forwprop1" } } */
> +/* { dg-final { scan-tree-dump-not "&" "forwprop1" } } */
> +/* { dg-final { cleanup-tree-dump "forwprop1" } } */
>
> Property changes on: testsuite/g++.dg/tree-ssa/pr57755.C
> ___________________________________________________________________
> Added: svn:keywords
> + Author Date Id Revision URL
> Added: svn:eol-style
> + native
>
> Index: fold-const.c
> ===================================================================
> --- fold-const.c (revision 200736)
> +++ fold-const.c (working copy)
> @@ -14124,21 +14124,23 @@ fold_ternary_loc (location_t loc, enum t
>
> /* Convert A ? 1 : 0 to simply A. */
> if ((code == VEC_COND_EXPR ? integer_all_onesp (op1)
> : (integer_onep (op1)
> && !VECTOR_TYPE_P (type)))
> && integer_zerop (op2)
> /* If we try to convert OP0 to our type, the
> call to fold will try to move the conversion inside
> a COND, which will recurse. In that case, the COND_EXPR
> is probably the best choice, so leave it alone. */
> - && type == TREE_TYPE (arg0))
> + && (type == TREE_TYPE (arg0)
> + || (in_gimple_form && code == VEC_COND_EXPR
> + && useless_type_conversion_p (type, TREE_TYPE (arg0)))))
> return pedantic_non_lvalue_loc (loc, arg0);
>
> /* Convert A ? 0 : 1 to !A. This prefers the use of NOT_EXPR
> over COND_EXPR in cases such as floating point comparisons. */
> if (integer_zerop (op1)
> && (code == VEC_COND_EXPR ? integer_all_onesp (op2)
> : (integer_onep (op2)
> && !VECTOR_TYPE_P (type)))
> && truth_value_p (TREE_CODE (arg0)))
> return pedantic_non_lvalue_loc (loc,
> Index: tree-ssa-forwprop.c
> ===================================================================
> --- tree-ssa-forwprop.c (revision 200736)
> +++ tree-ssa-forwprop.c (working copy)
> @@ -1135,20 +1135,131 @@ forward_propagate_comparison (gimple_stm
>
> /* Remove defining statements. */
> return remove_prop_source_from_use (name);
>
> bailout:
> gsi_next (defgsi);
> return false;
> }
>
>
> +/* Combine the binary operation defined in *GSI with one of its arguments
> + that is a condition.
> + Returns 1 if there were any changes made, 2 if cfg-cleanup needs to
> + run. Else it returns 0. */
> +
> +static int
> +combine_binop_with_condition (gimple_stmt_iterator *gsi)
> +{
> + gimple stmt = gsi_stmt (*gsi);
> + tree type = TREE_TYPE (gimple_assign_lhs (stmt));
> + enum tree_code code = gimple_assign_rhs_code (stmt);
> + tree rhs1 = gimple_assign_rhs1 (stmt);
> + tree rhs2 = gimple_assign_rhs2 (stmt);
> + gimple def_stmt;
> + enum tree_code defcode;
> + bool trueok = false;
> + bool falseok = false;
> + tree true_op, false_op;
> + location_t loc = gimple_location (stmt);
> +
> + if (TREE_CODE (rhs1) != SSA_NAME)
> + goto second_op;
> +
> + def_stmt = get_prop_source_stmt (rhs1, true, NULL);
> + if (!def_stmt || !can_propagate_from (def_stmt))
> + goto second_op;
> +
> + defcode = gimple_assign_rhs_code (def_stmt);
> + if (defcode != COND_EXPR && defcode != VEC_COND_EXPR)
> + goto second_op;
> +
> + true_op = fold_build2_loc (loc, code, type, gimple_assign_rhs2 (def_stmt),
> + gimple_assign_rhs2 (stmt));
> + false_op = fold_build2_loc (loc, code, type, gimple_assign_rhs3 (def_stmt),
> + gimple_assign_rhs2 (stmt));
> +
> + if (is_gimple_val (true_op))
> + trueok = true;
> + if (is_gimple_val (false_op))
> + falseok = true;
> + /* At least one of them has to simplify, or it isn't worth it. */
> + if (!trueok && !falseok)
> + goto second_op;
> + if (!trueok)
> + {
> + if (!valid_gimple_rhs_p (true_op))
> + goto second_op;
> + gimple g = gimple_build_assign (make_ssa_name (type, NULL), true_op);
> + true_op = gimple_assign_lhs (g);
> + gsi_insert_before (gsi, g, GSI_SAME_STMT);
> + }
> + if (!falseok)
> + {
> + if (!valid_gimple_rhs_p (false_op))
> + goto second_op;
> + gimple g = gimple_build_assign (make_ssa_name (type, NULL), false_op);
> + false_op = gimple_assign_lhs (g);
> + gsi_insert_before (gsi, g, GSI_SAME_STMT);
> + }
> + goto finish;
> +
> +second_op:
> + if (TREE_CODE (rhs2) != SSA_NAME)
> + return 0;
> +
> + def_stmt = get_prop_source_stmt (rhs2, true, NULL);
> + if (!def_stmt || !can_propagate_from (def_stmt))
> + return 0;
> +
> + defcode = gimple_assign_rhs_code (def_stmt);
> + if (defcode != COND_EXPR && defcode != VEC_COND_EXPR)
> + return 0;
> +
> + true_op = fold_build2_loc (loc, code, type, gimple_assign_rhs1 (stmt),
> + gimple_assign_rhs2 (def_stmt));
> + false_op = fold_build2_loc (loc, code, type, gimple_assign_rhs1 (stmt),
> + gimple_assign_rhs3 (def_stmt));
> +
> + trueok = is_gimple_val (true_op);
> + falseok = is_gimple_val (false_op);
> + /* At least one of them has to simplify, or it isn't worth it. */
> + if (!trueok && !falseok)
> + return 0;
> + if (!trueok)
> + {
> + if (!valid_gimple_rhs_p (true_op))
> + return 0;
> + gimple g = gimple_build_assign (make_ssa_name (type, NULL), true_op);
> + true_op = gimple_assign_lhs (g);
> + gsi_insert_before (gsi, g, GSI_SAME_STMT);
> + }
> + if (!falseok)
> + {
> + if (!valid_gimple_rhs_p (false_op))
> + return 0;
> + gimple g = gimple_build_assign (make_ssa_name (type, NULL), false_op);
> + false_op = gimple_assign_lhs (g);
> + gsi_insert_before (gsi, g, GSI_SAME_STMT);
> + }
> +finish:
> + gimple_assign_set_rhs_with_ops_1 (gsi, defcode, gimple_assign_rhs1 (def_stmt),
> + true_op, false_op);
> +
> + fold_stmt (gsi);
> + update_stmt (gsi_stmt (*gsi));
> +
> + /* Remove defining statements. */
> + return remove_prop_source_from_use (gimple_assign_lhs (def_stmt)) + 1;
> +}
> +
> +
> /* GSI_P points to a statement which performs a narrowing integral
> conversion.
>
> Look for cases like:
>
> t = x & c;
> y = (T) t;
>
> Turn them into:
>
> @@ -3403,21 +3514,35 @@ ssa_forward_propagate_and_combine (void)
> /* Mark stmt as potentially needing revisiting. */
> gimple_set_plf (stmt, GF_PLF_1, false);
>
> switch (gimple_code (stmt))
> {
> case GIMPLE_ASSIGN:
> {
> tree rhs1 = gimple_assign_rhs1 (stmt);
> enum tree_code code = gimple_assign_rhs_code (stmt);
>
> - if ((code == BIT_NOT_EXPR
> + /* Something (associate_plusminus?) doesn't set "changed"
> + properly, so we can't put this at the end with
> + if (!changed && ...). */
> + if (TREE_CODE_CLASS (code) == tcc_binary
> + || TREE_CODE_CLASS (code) == tcc_comparison)
> + {
> + int did_something;
> + did_something = combine_binop_with_condition (&gsi);
> + if (did_something == 2)
> + cfg_changed = true;
> + changed = did_something != 0;
> + }
> + if (changed)
> + ;
> + else if ((code == BIT_NOT_EXPR
> || code == NEGATE_EXPR)
> && TREE_CODE (rhs1) == SSA_NAME)
> changed = simplify_not_neg_expr (&gsi);
> else if (code == COND_EXPR
> || code == VEC_COND_EXPR)
> {
> /* In this case the entire COND_EXPR is in rhs1. */
> if (forward_propagate_into_cond (&gsi)
> || combine_cond_exprs (&gsi))
> {
>