This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] Fix PR66313
- From: Richard Sandiford <richard dot sandiford at linaro dot org>
- To: Richard Biener <rguenther at suse dot de>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Tue, 13 Jun 2017 10:57:00 +0100
- Subject: Re: [PATCH] Fix PR66313
- References: <alpine.LSU.2.20.1705311603540.20726@zhemvz.fhfr.qr>
Richard Biener <rguenther@suse.de> writes:
> So I've come back to PR66313 and found a solution to the tailrecursion
> missed optimization when fixing the factoring folding to use an unsigned
> type when we're not sure of overflow.
>
> The folding part is identical to my last try from 2015; the
> tailrecursion part makes us handle intervening stmts, introduced by
> foldings, that "clobber" our walk along the single-use chain of stmts
> between the call and the return (previously we failed at any stmt that
> is not part of said chain). A simple solution is to move those stmts
> that are not part of the chain, and that can safely be moved, to before
> the call. That handles the leaf conversions that now appear for
> tree-ssa/tailrecursion-6.c.
>
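[As a hedged, hypothetical sketch of the move described above (SSA names invented; the exact IL in tailrecursion-6.c may differ), in GIMPLE-style pseudocode:

```
  # Before: a conversion introduced by folding sits between the
  # recursive call and the return, interrupting the single-use chain.
  t_1 = f (n_2);            # recursive call
  u_3 = (unsigned int) x_4; # unrelated def from folding
  y_5 = (int) u_3;
  a_6 = t_1 + y_5;
  return a_6;

  # After: the movable defs are placed before the call, leaving an
  # uninterrupted call -> plus -> return chain that the pass can turn
  # into an accumulator.
  u_3 = (unsigned int) x_4;
  y_5 = (int) u_3;
  t_1 = f (n_2);
  a_6 = t_1 + y_5;
  return a_6;
```
]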
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
>
> Richard.
>
> 2017-05-31 Richard Biener <rguenther@suse.de>
>
> PR middle-end/66313
> * fold-const.c (fold_plusminus_mult_expr): If the factored
> factor may be zero use a wrapping type for the inner operation.
> * tree-tailcall.c (independent_of_stmt_p): Pass in to_move bitmap
> and handle moved defs.
> (process_assignment): Properly guard the unary op case. Return a
> tri-state indicating that moving the stmt before the call may allow
> the walk to continue. Pass through to_move.
> (find_tail_calls): Handle moving unrelated defs before
> the call.
>
> * c-c++-common/ubsan/pr66313.c: New testcase.
> * gcc.dg/tree-ssa/loop-15.c: Adjust.
>
> Index: gcc/fold-const.c
> ===================================================================
> *** gcc/fold-const.c.orig 2015-10-29 12:32:33.302782318 +0100
> --- gcc/fold-const.c 2015-10-29 14:08:39.936497739 +0100
> *************** fold_plusminus_mult_expr (location_t loc
> *** 6916,6925 ****
> }
> same = NULL_TREE;
>
> ! if (operand_equal_p (arg01, arg11, 0))
> ! same = arg01, alt0 = arg00, alt1 = arg10;
> ! else if (operand_equal_p (arg00, arg10, 0))
> same = arg00, alt0 = arg01, alt1 = arg11;
> else if (operand_equal_p (arg00, arg11, 0))
> same = arg00, alt0 = arg01, alt1 = arg10;
> else if (operand_equal_p (arg01, arg10, 0))
> --- 6916,6926 ----
> }
> same = NULL_TREE;
>
> ! /* Prefer factoring a common non-constant. */
> ! if (operand_equal_p (arg00, arg10, 0))
> same = arg00, alt0 = arg01, alt1 = arg11;
> + else if (operand_equal_p (arg01, arg11, 0))
> + same = arg01, alt0 = arg00, alt1 = arg10;
> else if (operand_equal_p (arg00, arg11, 0))
> same = arg00, alt0 = arg01, alt1 = arg10;
> else if (operand_equal_p (arg01, arg10, 0))
> *************** fold_plusminus_mult_expr (location_t loc
> *** 6974,6987 ****
> }
> }
>
> ! if (same)
> return fold_build2_loc (loc, MULT_EXPR, type,
> fold_build2_loc (loc, code, type,
> fold_convert_loc (loc, type, alt0),
> fold_convert_loc (loc, type, alt1)),
> fold_convert_loc (loc, type, same));
>
> ! return NULL_TREE;
> }
>
> /* Subroutine of native_encode_expr. Encode the INTEGER_CST
> --- 6975,7010 ----
> }
> }
>
> ! if (!same)
> ! return NULL_TREE;
> !
> ! if (! INTEGRAL_TYPE_P (type)
> ! || TYPE_OVERFLOW_WRAPS (type)
> ! /* We are neither factoring zero nor minus one. */
> ! || TREE_CODE (same) == INTEGER_CST)
> return fold_build2_loc (loc, MULT_EXPR, type,
> fold_build2_loc (loc, code, type,
> fold_convert_loc (loc, type, alt0),
> fold_convert_loc (loc, type, alt1)),
> fold_convert_loc (loc, type, same));
>
> ! /* Same may be zero and thus the operation 'code' may overflow. Likewise
> ! same may be minus one and thus the multiplication may overflow. Perform
> ! the operations in an unsigned type. */
> ! tree utype = unsigned_type_for (type);
> ! tree tem = fold_build2_loc (loc, code, utype,
> ! fold_convert_loc (loc, utype, alt0),
> ! fold_convert_loc (loc, utype, alt1));
> ! /* If the sum evaluated to a constant that is not -INF the multiplication
> ! cannot overflow. */
> ! if (TREE_CODE (tem) == INTEGER_CST
> ! && ! wi::eq_p (tem, wi::min_value (TYPE_PRECISION (utype), SIGNED)))
> ! return fold_build2_loc (loc, MULT_EXPR, type,
> ! fold_convert (type, tem), same);
> !
> ! return fold_convert_loc (loc, type,
> ! fold_build2_loc (loc, MULT_EXPR, utype, tem,
> ! fold_convert_loc (loc, utype, same)));
> }
>
> /* Subroutine of native_encode_expr. Encode the INTEGER_CST
Sorry if you already know, but this part means that we can no longer
vectorise:
int
f (int *x, int b1, int b2, int b3)
{
  int foo = 0;
  for (int i1 = 0; i1 < b1; ++i1)
    for (int i2 = 0; i2 < b2; ++i2)
      for (int i3 = 0; i3 < b3; ++i3)
        foo += x[i1 * b2 * b3 + i2 * b3 + (i3 - 1)];
  return foo;
}
We now convert all the arithmetic in the [...] to unsigned int
and reassociate it so that the "- 1" is applied last. We then have to
treat the wrapping of that subtraction as well-defined, and so must
assume that the &x[...] might not be linear for the inner loop.
Thanks,
Richard