This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug tree-optimization/68234] New: tree-vrp pass need to be improved when handling ASSERT/PLUS/MINUS/_EXPR and Phi node
- From: "jiwang at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 06 Nov 2015 10:26:28 +0000
- Subject: [Bug tree-optimization/68234] New: tree-vrp pass need to be improved when handling ASSERT/PLUS/MINUS/_EXPR and Phi node
- Auto-submitted: auto-generated
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68234
Bug ID: 68234
Summary: tree-vrp pass need to be improved when handling
ASSERT/PLUS/MINUS/_EXPR and Phi node
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: jiwang at gcc dot gnu.org
Target Milestone: ---
Created attachment 36661
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36661&action=edit
experimental-patch
r226850 will cause gcc generate more ASSERT_EXPR thus trigger some latent
issues in gcc tree-vrp pass. For example, we are generating sub-optimal
code for gcc.c-torture/compile/20121027-1.c since r226850.
For ARM target:
./cc1 -O3 -nostdinc 20121027-1.c -march=armv8-a -mthumb
-mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard -nostdinc
before r226850:
"bl/64" turned into "bl >> 6"
==
.L3:
asrs r3, r4, #6
add r2, sp, #1024
adds r4, r4, #1
add r3, r2, r3, lsl #3
subw r3, r3, #1019
vld1.64 {d16}, [r3]
vmov r0, r1, d16 @ int
bl ff
ldr r3, [r5]
cmp r3, r4
bgt .L3
after ("bl/64" is not turned into "bl >> 6")
===
.L3:
add r3, r4, #63
add r2, sp, #1024
ands r3, r3, r4, asr #32
it cc
movcc r3, r4
adds r4, r4, #1
asrs r3, r3, #6
add r3, r2, r3, lsl #3
subw r3, r3, #1019
vld1.64 {d16}, [r3]
vmov r0, r1, d16 @ int
bl ff
ldr r3, [r5]
cmp r3, r4
bgt .L3
For mips target:
./cc1 -O2 -march=mips32r2 -mabi=32 20121027-1.c -nostdinc
before
===
$L8:
addiu $16,$16,1
sll $2,$2,3
addu $2,$17,$2
lwl $5,9($2)
lwl $4,5($2)
lwr $5,12($2)
lwr $4,8($2)
jal ff
lw $2,0($18)
slt $2,$16,$2
bne $2,$0,$L8
sra $2,$16,6
after
===
$L9:
slt $2,$16,0
movz $3,$16,$2
addiu $16,$16,1
sra $2,$3,6
sll $2,$2,3
addu $2,$17,$2
lwl $5,9($2)
lwl $4,5($2)
lwr $5,12($2)
lwr $4,8($2)
jal ff
lw $2,0($18)
slt $2,$16,$2
bne $2,$0,$L9
addiu $3,$16,63
This is because previously gcc can deduct the value range for the variable
"bl", and conclude it will be positive, then later optimization can turn the
signed division to right shift thus we can avoid runtime overhead for signed
division. After r226850, gcc can't deduct this. The initial cause if r226850
will introduce more ASSERT_EXPR which is fine but it caused problem for
tree-vrp.
From .vrp1 dump, tree-vrp pass is too conservative at three places:
1. When handling ASSERT_EXPR
Visiting statement:
c_13 = ASSERT_EXPR <c_1, c_1 < nc.0_4>;
Intersecting
[-INF, nc.0_4 + -1]
and
[0, 0]
to
[-INF, nc.0_4 + -1]
the range info should be [0, nc.0_4 + -1].
my understanding is for ASSERT_EXPR <var, var < limit>, if var is SSA_NAME and
have valid range info, then it's minimum should be honored.
2. When handling PLUS_EXPR with symbolic range
After fixed issue 1, the range of c_13 will be [0, nc.0_4 + -1], but the
following PLUS_EXPR range b1_11 to be VARYING which is wrong.
Visiting statement:
bl_11 = c_13 + 1;
Found new range for bl_11: VARYING
The range of bl_11 should be [1, nc.0_4].
looks to me the following code at the bottom of
extract_range_from_binary_expr_1 is overkilling. At least for
PLUS_EXPR/MINUS_EXPR. "cmp == -2" which means there is symbolic range,
should be allowed for both, otherwise why there are lots of code in
PLUS_EXPR/MINUS_EXPR hunk deliberately handling symbolic range?
cmp = compare_values (min, max);
if (cmp == -2 || cmp == 1)
set_value_range_to_varying (vr);
So I think we should relax the condition to not included
PLUS_EXPR/MINUS_EXPR when cmp == -2.
3. When handling phi node
Even after both 1 and 2 fixed, there still be another issue for phi node.
Given, bl_11 now with the range [1, nc.0_4], I found it's range info is
not honored when visiting PHI node, looks to me, the following code in
vrp_visit_phi_node is overkilling, and I don't know how to relax it
properly, if we simply remove this block of code, then gcc can finally get
correct range info for the testcase 20121027-1.c and generate optimized
instruction sequences.
/* Do not allow equivalences or symbolic ranges to leak in from
backedges. That creates invalid equivalencies.
See PR53465 and PR54767. */
if (e->flags & EDGE_DFS_BACK)
{
if (vr_arg.type == VR_RANGE
|| vr_arg.type == VR_ANTI_RANGE)
{
vr_arg.equiv = NULL;
if (symbolic_range_p (&vr_arg))
{
vr_arg.type = VR_VARYING;
vr_arg.min = NULL_TREE;
vr_arg.max = NULL_TREE;
}
Visiting PHI node: c_1 = PHI <0(2), bl_11(3)>
Argument #0 (2 -> 4 executable)
0: [0, 0]
Argument #1 (3 -> 4 executable)
bl_11: VARYING <--- bl_11 is with VR_RANGE, but forced to
VARYING because of it's symbolic range
Meeting
[0, 0]
and
VARYING
to
VARYING
attachment is experiment patch to show the bug places from my understanding.
Toughts?
--
Regards,
Jiong