The following code sample exhibits a bug in a gcc -O2 optimization pass. Namely, having defined two() and six() with the obvious return values, the value of two() * 2 + six() * 5 gets an assembly of 1 shl 5 (i.e. 32, instead of the correct 34). Code: http://web.mit.edu/madars/Public/gcc-basic-arithmetic-bug.c gcc 4.9.1 and 4.9.2 with -O2 are both buggy, -O1 makes the bug go away; the bug does not seem to be present in gcc 4.8 and earlier.
I should clarify that the same behavior can be observed on Debian, Ubuntu and Fedora: gcc (Ubuntu 4.9.1-16ubuntu6) 4.9.1 gcc (Debian 4.9.1-19) 4.9.1 gcc (GCC) 4.9.2 20141101 (Red Hat 4.9.2-1) Moreover, Alex Chernyakhovsky reports that switching two() * 2 to two() + two() or two() << 1 produces correct results, so this seems to have to do with the handling of multiplication.
Confirmed, except that changing two() * 2 to two() + two() or two() << 1 does not make a difference for me. This looks more related to inlining: -O1: works -O2: fails -O1 -finline-small-functions: fails -O2 -fno-inline-small-functions: works And if I mark two() and six() as inline, then it fails even at -O1. (GCC 4.9.2-10ubuntu7 on Ubuntu vivid amd64)
Confirmed for 4.9. It's been fixed on trunk with r216728.
This is a duplicate of PR/65216. Underlying issue is in reassoc1-pass. This issue is fixed for 5.0 *** This bug has been marked as a duplicate of bug 65216 ***
Why do you think it is a duplicate? PR65216 was a 5 Regression, something that worked in 4.9; this one is an issue with 4.9 and not 5. PR65216 was also about -O3 only, this one fails even with -O2. PR65216 started after this one got fixed.
(In reply to Kai Tietz from comment #4) > This is a duplicate of PR/65216. Underlying issue is in reassoc1-pass. > This issue is fixed for 5.0 > > *** This bug has been marked as a duplicate of bug 65216 *** Is it fixed in 4.9.x? Because that PR 65216 is marked as fixed and does not mention 4.9.x, only 5.0.
Well, it looked like the same issue by inspection dumps, as folding issue happens in reassoc-pass. Of course it might be that forward-prop patch is the actual issue. I noticed for -O3 on 4.9.x that valid computation (dse1): ... g (const unsigned int from_f) { const unsigned int thirty_four; unsigned int _3; unsigned int _4; unsigned int global.0_8; unsigned int _9; unsigned int _10; <bb 2>: global.0_8 = global; _9 = global.0_8 * 2; _3 = _9 * 2; _10 = _9 * 3; _4 = _10 * 5; thirty_four_5 = _3 + _4; ... Shows after reassoc1 pass: ... g (const unsigned int from_f) { const unsigned int thirty_four; unsigned int _3; unsigned int _4; unsigned int global.0_8; unsigned int _9; unsigned int _10; <bb 2>: global.0_8 = global; _9 = global.0_8 * 2; _4 = 15; _3 = _4 + 2; _10 = _9 * _3; thirty_four_5 = _10; ... so it seems to be the forward-propagate patch. Sorry for the noise
A simpler testcase: static inline unsigned cp(unsigned x) { return x; } unsigned f(unsigned x) { return cp(x * 2) * 2 + cp(cp(x * 2) * 3) * 5; } $ cat ./test1.c.085t.phiopt2 ;; Function f (f, funcdef_no=1, decl_uid=1393, symbol_order=1) f (unsigned int x) { unsigned int _2; unsigned int _3; unsigned int _4; unsigned int _5; unsigned int _6; <bb 2>: _2 = x_1(D) * 2; _5 = 15; _3 = _5 + 2; _4 = _2 * _3; _6 = _4; return _6; } $ cat ./test1.c.087t.ccp3 ;; Function f (f, funcdef_no=1, decl_uid=1393, symbol_order=1) Pass statistics: ---------------- Immediate_uses: x_1(D) : --> single use. _2 = x_1(D) * 2; _2 : --> single use. _4 = _2 * _3; _3 : --> single use. _4 = _2 * _3; _4 : --> single use. _6 = _4; _5 : --> single use. _3 = _5 + 2; _6 : --> single use. return _6; .MEM_7(D) : --> single use. # VUSE <.MEM_7(D)> return _6; Adding Destination of edge (0 -> 2) to worklist Simulating block 2 Visiting statement: # RANGE [0, 4294967295] NONZERO 0x000000000fffffffe _2 = x_1(D) * 2; which is likely CONSTANT Lattice value changed to CONSTANT 0x00000000000000000 (0x000000000fffffffe). Adding SSA edges to worklist. Visiting statement: # RANGE [0, 4294967295] NONZERO 0x000000000fffffffe _5 = 15; which is likely CONSTANT Lattice value changed to CONSTANT 14. Adding SSA edges to worklist. Visiting statement: _3 = _5 + 2; which is likely CONSTANT Lattice value changed to CONSTANT 16. Adding SSA edges to worklist. Visiting statement: _4 = _2 * _3; which is likely CONSTANT Lattice value changed to CONSTANT 0x00000000000000000 (0x000000000ffffffe0). Adding SSA edges to worklist. Visiting statement: # RANGE [0, 4294967295] NONZERO 0x000000000fffffffe _6 = _4; Lattice value changed to CONSTANT 0x00000000000000000 (0x000000000ffffffe0). Adding SSA edges to worklist. Visiting statement: # VUSE <.MEM_7(D)> return _6; No interesting values produced. Marked VARYING. Substituting values and folding statements Folding statement: return _6; Not folded Folding statement: _6 = _4; Not folded Folding statement: _4 = _2 * _3; Folded into: _4 = _2 * 16; Removing dead stmt _3 = 16; Removing dead stmt _5 = 14; Folding statement: _2 = x_1(D) * 2; Not folded Pass statistics: ---------------- Constants propagated: 1 Statements deleted: 2 f (unsigned intD.4 xD.1392) { unsigned intD.4 _2; unsigned intD.4 _4; unsigned intD.4 _6; ;; basic block 2, loop depth 0, count 0, freq 10000, maybe hot ;; prev block 0, next block 1, flags: (NEW, REACHABLE) ;; pred: ENTRY [100.0%] (FALLTHRU,EXECUTABLE) # RANGE [0, 4294967295] NONZERO 0x000000000fffffffe _2 = x_1(D) * 2; # RANGE [0, 4294967295] NONZERO 0x000000000ffffffe0 _4 = _2 * 16; # RANGE [0, 4294967295] NONZERO 0x000000000ffffffe0 _6 = _4; # VUSE <.MEM_7(D)> return _6; ;; succ: EXIT [100.0%] }
Visiting statement: # RANGE [0, 4294967295] NONZERO 0x000000000fffffffe _5 = 15; which is likely CONSTANT Lattice value changed to CONSTANT 14. Adding SSA edges to worklist. is of course overly optimistic ;) (but yes, nonzero bits info is bogus) I'd say we should eventually assert instead of miscompiling things. That is, static prop_value_t evaluate_stmt (gimple stmt) { ... if (flag_tree_bit_ccp && ((is_constant && TREE_CODE (val.value) == INTEGER_CST) || (!is_constant && likelyvalue != UNDEFINED)) && gimple_get_lhs (stmt) && TREE_CODE (gimple_get_lhs (stmt)) == SSA_NAME) { if (!is_constant) { ... } else { double_int valv = tree_to_double_int (val.value); here assert gcc_assert ((valv & ~val.mask & ~nonzero_bits).is_zero ()); that is, known bits in valv should be consistent with nonzero_bits info. if (!(valv & ~nonzero_bits & mask).is_zero ()) val.value = double_int_to_tree (TREE_TYPE (lhs), valv & nonzero_bits); if (nonzero_bits.is_zero ()) val.mask = double_int_zero; else val.mask = val.mask & (nonzero_bits | ~mask); }
The simpler testcase reproduces on trunk for me. Trunk assert: Index: gcc/tree-ssa-ccp.c =================================================================== --- gcc/tree-ssa-ccp.c (revision 221174) +++ gcc/tree-ssa-ccp.c (working copy) @@ -1901,9 +1922,13 @@ evaluate_stmt (gimple stmt) } else { - if (wi::bit_and_not (val.value, nonzero_bits) != 0) - val.value = wide_int_to_tree (TREE_TYPE (lhs), - nonzero_bits & val.value); + wide_int tem = wi::bit_and_not (val.value, nonzero_bits); + if (tem != 0) + { + gcc_assert (extend_mask (tem).and_not (val.mask) == 0); + val.value = wide_int_to_tree (TREE_TYPE (lhs), + nonzero_bits & val.value); + } if (nonzero_bits == 0) val.mask = 0; else
Fixed 4.9 assert: gcc_assert ((valv & ~val.mask & ~nonzero_bits & mask).is_zero ()); fixed trunk assert: @@ -1901,9 +1922,14 @@ evaluate_stmt (gimple stmt) } else { - if (wi::bit_and_not (val.value, nonzero_bits) != 0) - val.value = wide_int_to_tree (TREE_TYPE (lhs), - nonzero_bits & val.value); + widest_int tem = wi::bit_and_not (wi::to_widest (val.value), + extend_mask (nonzero_bits)); + if (tem != 0) + { + gcc_assert (tem.and_not (val.mask) == 0); + val.value = wide_int_to_tree (TREE_TYPE (lhs), + nonzero_bits & val.value); + } if (nonzero_bits == 0) val.mask = 0; else That is, when CCP computes a constant it verifies that nonzero bits info is consistent with that. Such assert might not be ready for prime-time as for example in unreachable code-regions any inconsistency can happen, like in if ((i & 1) == 0) { if (i == 3) { .... } }
By the way, in g++ the bug can be triggered even with -O1 and without marking any functions inline (implicitly or explicitly): http://web.mit.edu/madars/Public/gcc-basic-arithmetic-bug-O1.cpp
GCC 4.9.3 has been released.
Fixed.
(In reply to Richard Biener from comment #14) > Fixed. By which commit was this fixed? Is the fix applicable to the now-closed 4.9 branch at all?