This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug tree-optimization/68234] tree-vrp pass need to be improved when handling ASSERT/PLUS/MINUS/_EXPR and Phi node
- From: "rguenth at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 06 Nov 2015 11:08:35 +0000
- Subject: [Bug tree-optimization/68234] tree-vrp pass need to be improved when handling ASSERT/PLUS/MINUS/_EXPR and Phi node
- Auto-submitted: auto-generated
- References: <bug-68234-4 at http dot gcc dot gnu dot org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68234
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |missed-optimization
CC| |rguenth at gcc dot gnu.org
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jiong Wang from comment #0)
> Created attachment 36661 [details]
> experimental-patch
>
> r226850 will cause gcc generate more ASSERT_EXPR thus trigger some latent
> issues in gcc tree-vrp pass. For example, we are generating sub-optimal
> code for gcc.c-torture/compile/20121027-1.c since r226850.
>
> For ARM target:
>
> ./cc1 -O3 -nostdinc 20121027-1.c -march=armv8-a -mthumb
> -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard -nostdinc
>
> before r226850:
>
> "bl/64" turned into "bl >> 6"
> ==
> .L3:
> asrs r3, r4, #6
> add r2, sp, #1024
> adds r4, r4, #1
> add r3, r2, r3, lsl #3
> subw r3, r3, #1019
> vld1.64 {d16}, [r3]
> vmov r0, r1, d16 @ int
> bl ff
> ldr r3, [r5]
> cmp r3, r4
> bgt .L3
>
>
> after ("bl/64" is not turned into "bl >> 6")
> ===
> .L3:
> add r3, r4, #63
> add r2, sp, #1024
> ands r3, r3, r4, asr #32
> it cc
> movcc r3, r4
> adds r4, r4, #1
> asrs r3, r3, #6
> add r3, r2, r3, lsl #3
> subw r3, r3, #1019
> vld1.64 {d16}, [r3]
> vmov r0, r1, d16 @ int
> bl ff
> ldr r3, [r5]
> cmp r3, r4
> bgt .L3
>
> For mips target:
>
> ./cc1 -O2 -march=mips32r2 -mabi=32 20121027-1.c -nostdinc
>
> before
> ===
> $L8:
> addiu $16,$16,1
> sll $2,$2,3
> addu $2,$17,$2
> lwl $5,9($2)
> lwl $4,5($2)
> lwr $5,12($2)
> lwr $4,8($2)
> jal ff
> lw $2,0($18)
> slt $2,$16,$2
> bne $2,$0,$L8
> sra $2,$16,6
> after
> ===
> $L9:
> slt $2,$16,0
> movz $3,$16,$2
> addiu $16,$16,1
> sra $2,$3,6
> sll $2,$2,3
> addu $2,$17,$2
> lwl $5,9($2)
> lwl $4,5($2)
> lwr $5,12($2)
> lwr $4,8($2)
> jal ff
> lw $2,0($18)
> slt $2,$16,$2
> bne $2,$0,$L9
> addiu $3,$16,63
>
> This is because previously gcc can deduct the value range for the variable
> "bl", and conclude it will be positive, then later optimization can turn the
> signed division to right shift thus we can avoid runtime overhead for signed
> division. After r226850, gcc can't deduct this. The initial cause if r226850
> will introduce more ASSERT_EXPR which is fine but it caused problem for
> tree-vrp.
>
> From .vrp1 dump, tree-vrp pass is too conservative at three places:
>
> 1. When handling ASSERT_EXPR
>
> Visiting statement:
> c_13 = ASSERT_EXPR <c_1, c_1 < nc.0_4>;
> Intersecting
> [-INF, nc.0_4 + -1]
> and
> [0, 0]
> to
> [-INF, nc.0_4 + -1]
>
> the range info should be [0, nc.0_4 + -1].
No, the intersection is [0,0]. Basically the code fails to properly
intersect and can choose either original range. We at the moment
choose the first because always choosing a non-symbolic range
regresses symbolic range handling (AFAIR).
Note that [0, nc.0_4 + 1] is not correct as nc.0_4 + 1 could be
less than zero (in which case the intersection result would be
UNDEFINED). So eventually choosing [0, nc.0_4 + 1] is ok.
>
> my understanding is for ASSERT_EXPR <var, var < limit>, if var is SSA_NAME
> and
> have valid range info, then it's minimum should be honored.
>
> 2. When handling PLUS_EXPR with symbolic range
>
> After fixed issue 1, the range of c_13 will be [0, nc.0_4 + -1], but the
> following PLUS_EXPR range b1_11 to be VARYING which is wrong.
>
> Visiting statement:
> bl_11 = c_13 + 1;
> Found new range for bl_11: VARYING
>
> The range of bl_11 should be [1, nc.0_4].
If the type is a wrapping type then nc.0_4 -1 + 1 might wrap around
and the range become invalid (an anti-range).
Note that symbolic range handling is _very_ limited right now.
> looks to me the following code at the bottom of
> extract_range_from_binary_expr_1 is overkilling. At least for
> PLUS_EXPR/MINUS_EXPR. "cmp == -2" which means there is symbolic range,
> should be allowed for both, otherwise why there are lots of code in
> PLUS_EXPR/MINUS_EXPR hunk deliberately handling symbolic range?
>
> cmp = compare_values (min, max);
> if (cmp == -2 || cmp == 1)
> set_value_range_to_varying (vr);
>
> So I think we should relax the condition to not included
> PLUS_EXPR/MINUS_EXPR when cmp == -2.
See above for the overflow issues. For example with using
x = [1, nc.0_4] we'd conclude that x > 0 is true but if
nc.0_4 is zero this is not true.
> 3. When handling phi node
>
> Even after both 1 and 2 fixed, there still be another issue for phi node.
>
> Given, bl_11 now with the range [1, nc.0_4], I found it's range info is
> not honored when visiting PHI node, looks to me, the following code in
> vrp_visit_phi_node is overkilling, and I don't know how to relax it
> properly, if we simply remove this block of code, then gcc can finally get
> correct range info for the testcase 20121027-1.c and generate optimized
> instruction sequences.
This case is the hardest and I can't see an easy way out (we've tried).
> /* Do not allow equivalences or symbolic ranges to leak in from
> backedges. That creates invalid equivalencies.
> See PR53465 and PR54767. */
> if (e->flags & EDGE_DFS_BACK)
> {
> if (vr_arg.type == VR_RANGE
> || vr_arg.type == VR_ANTI_RANGE)
> {
> vr_arg.equiv = NULL;
> if (symbolic_range_p (&vr_arg))
> {
> vr_arg.type = VR_VARYING;
> vr_arg.min = NULL_TREE;
> vr_arg.max = NULL_TREE;
> }
>
>
> Visiting PHI node: c_1 = PHI <0(2), bl_11(3)>
> Argument #0 (2 -> 4 executable)
> 0: [0, 0]
> Argument #1 (3 -> 4 executable)
> bl_11: VARYING <--- bl_11 is with VR_RANGE, but forced to
> VARYING because of it's symbolic range
> Meeting
> [0, 0]
> and
> VARYING
> to
> VARYING
>
>
> attachment is experiment patch to show the bug places from my understanding.
>
> Toughts?
I think the issue is that we insert the assert in the first place or
that we intersect to a symbolic range this causes us to not use SCEV / loop
analysis to get at the range for c_1. That is, in vrp_visit_phi_node
the early outs to varying: shouldn't skip
/* If we dropped either bound to +-INF then if this is a loop
PHI node SCEV may known more about its value-range. */
if ((cmp_min > 0 || cmp_min < 0
|| cmp_max < 0 || cmp_max > 0)
&& (l = loop_containing_stmt (phi))
&& l->header == gimple_bb (phi))
adjust_range_with_scev (&vr_result, l, phi, lhs);
sth like
Index: gcc/tree-vrp.c
===================================================================
--- gcc/tree-vrp.c (revision 229804)
+++ gcc/tree-vrp.c (working copy)
@@ -8827,6 +8805,24 @@ update_range:
/* No match found. Set the LHS to VARYING. */
varying:
+
+ /* If we dropped either bound to +-INF then if this is a loop
+ PHI node SCEV may known more about its value-range. */
+ if ((l = loop_containing_stmt (phi))
+ && l->header == gimple_bb (phi))
+ {
+ adjust_range_with_scev (&vr_result, l, phi, lhs);
+
+ /* If we will end up with a (-INF, +INF) range, set it to
+ VARYING. Same if the previous max value was invalid for
+ the type and we end up with vr_result.min > vr_result.max. */
+ if (!((vrp_val_is_max (vr_result.max)
+ && vrp_val_is_min (vr_result.min))
+ || compare_values (vr_result.min,
+ vr_result.max) > 0))
+ goto update_range;
+ }
+
set_value_range_to_varying (lhs_vr);
return SSA_PROP_VARYING;
}
which ends up with
Value ranges after VRP:
c_1: [0, +INF]
as desired. Maybe you can take the above and put it to testing.
> --
> Regards,
> Jiong