Folowing code (https://gist.github.com/fboranek/5c99f96a99f902f5cf9dc5d597ff8fe3) for 4 thread doesn't compute the correct result if it is compiled by gcc (Debian 6.3.0-6) 6.3.0 20170205. The code is corect compiled by gcc 4.9.2 and 5.4.1. It is also correct compiled if the variable counter is not static bat global. This issue can be related to bug 78778, but bug 78778 is appear in 5.1.0 while this issue is since version 6. static int counter {0}; static std::atomic<bool> flag {false}; void increment1(int cycles) { for (int i=0; i < cycles; ++i) { while (flag.exchange(true, std::memory_order_acquire)); ++counter; flag.store(false, std::memory_order_release); } } // asm for gcc 5.4.1 00000000004010f0 <_Z10increment1i>: 4010f0: 31 c9 xor %ecx,%ecx 4010f2: 85 ff test %edi,%edi 4010f4: ba 01 00 00 00 mov $0x1,%edx 4010f9: 7e 26 jle 401121 <_Z10increment1i+0x31> 4010fb: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) 401100: 89 d0 mov %edx,%eax 401102: 86 05 ec 20 20 00 xchg %al,0x2020ec(%rip) # 6031f4 <_ZL4flag> 401108: 84 c0 test %al,%al 40110a: 75 f4 jne 401100 <_Z10increment1i+0x10> 40110c: 83 c1 01 add $0x1,%ecx 40110f: 83 05 e2 20 20 00 01 addl $0x1,0x2020e2(%rip) # 6031f8 <_ZL7counter> 401116: 39 cf cmp %ecx,%edi 401118: c6 05 d5 20 20 00 00 movb $0x0,0x2020d5(%rip) # 6031f4 <_ZL4flag> 40111f: 75 df jne 401100 <_Z10increment1i+0x10> 401121: f3 c3 repz retq 401123: 0f 1f 00 nopl (%rax) 401126: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) 40112d: 00 00 00 // asm for gcc 6.3.0 0000000000001420 <_Z10increment1i>: 1420: 8b 0d d2 1d 20 00 mov 0x201dd2(%rip),%ecx # 2031f8 <_ZL7counter> 1426: 85 ff test %edi,%edi 1428: ba 01 00 00 00 mov $0x1,%edx 142d: 8d 34 0f lea (%rdi,%rcx,1),%esi 1430: 7e 26 jle 1458 <_Z10increment1i+0x38> 1432: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1) 1438: 89 d0 mov %edx,%eax 143a: 86 05 b4 1d 20 00 xchg %al,0x201db4(%rip) # 2031f4 <_ZL4flag> 1440: 84 c0 test %al,%al 1442: 75 f4 jne 1438 <_Z10increment1i+0x18> 1444: 83 c1 01 add $0x1,%ecx 1447: 39 ce cmp %ecx,%esi 1449: 89 0d a9 1d 20 00 mov %ecx,0x201da9(%rip) # 2031f8 <_ZL7counter> 144f: c6 05 9e 1d 20 00 00 movb $0x0,0x201d9e(%rip) # 2031f4 <_ZL4flag> 1456: 75 e0 jne 1438 <_Z10increment1i+0x18> 1458: f3 c3 repz retq 145a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
Confirmed. Somehow PRE messes up here, investigating.
The conservative handling in call_may_clobber_ref_p_1 doesn't work: /* Handle those builtin functions explicitly that do not act as escape points. See tree-ssa-structalias.c:find_func_aliases for the list of builtins we might need to handle here. */ if (callee != NULL_TREE && gimple_call_builtin_p (call, BUILT_IN_NORMAL)) switch (DECL_FUNCTION_CODE (callee)) { ... /* __sync_* builtins and some OpenMP builtins act as threading barriers. */ #undef DEF_SYNC_BUILTIN #define DEF_SYNC_BUILTIN(ENUM, NAME, TYPE, ATTRS) case ENUM: #include "sync-builtins.def" #undef DEF_SYNC_BUILTIN because gimple_call_builtin_p returns false. _10 = __atomic_exchange_1 (&MEM[(struct __atomic_base *)&flag]._M_i, 1, 2); not sure what fixed it on trunk... The '1' is of type int while the prototype seeks for a 'unsigned char'. Somebody should bisect what fixed this.
r235792
Testing backport.
Author: rguenth Date: Tue Mar 14 12:57:08 2017 New Revision: 246122 URL: https://gcc.gnu.org/viewcvs?rev=246122&root=gcc&view=rev Log: 2017-03-14 Richard Biener <rguenther@suse.de> Backport from mainline 2016-05-02 Jakub Jelinek <jakub@redhat.com> PR middle-end/80004 PR target/49244 * gimple.c (gimple_builtin_call_types_compatible_p): Allow char/short arguments promoted to int because of promote_prototypes. 2017-03-09 Richard Biener <rguenther@suse.de> PR tree-optimization/79977 * graphite-scop-detection.c (scop_detection::merge_sese): Handle the case of extra exits to blocks dominating the entry. * gcc.dg/graphite/pr79977.c: New testcase. 2017-03-09 Richard Biener <rguenther@suse.de> PR middle-end/79971 * gimple-expr.c (useless_type_conversion_p): Preserve TYPE_SATURATING for fixed-point types. * gcc.dg/fixed-point/pr79971.c: New testcase. 2017-03-06 Richard Biener <rguenther@suse.de> PR fortran/79894 * trans.c (gfc_add_modify_loc): Weaken assert. 2017-03-02 Richard Biener <rguenther@suse.de> PR c/79756 * c-common.c (c_common_mark_addressable_vec): Look through C_MAYBE_CONST_EXPR. * gcc.dg/vector-1.c: New testcase. 2017-02-22 Richard Biener <rguenther@suse.de> PR tree-optimization/79666 * tree-vrp.c (extract_range_from_binary_expr_1): Make sure to not symbolically negate if that may introduce undefined overflow. * gcc.dg/torture/pr79666.c: New testcase. 2017-02-17 Richard Biener <rguenther@suse.de> PR middle-end/79576 * params.def (max-ssa-name-query-depth): Limit to 10. Added: branches/gcc-6-branch/gcc/testsuite/gcc.dg/fixed-point/pr79971.c branches/gcc-6-branch/gcc/testsuite/gcc.dg/graphite/pr79977.c branches/gcc-6-branch/gcc/testsuite/gcc.dg/torture/pr79666.c branches/gcc-6-branch/gcc/testsuite/gcc.dg/vector-1.c Modified: branches/gcc-6-branch/gcc/ChangeLog branches/gcc-6-branch/gcc/c-family/ChangeLog branches/gcc-6-branch/gcc/c-family/c-common.c branches/gcc-6-branch/gcc/fortran/ChangeLog branches/gcc-6-branch/gcc/fortran/trans.c branches/gcc-6-branch/gcc/gimple-expr.c branches/gcc-6-branch/gcc/gimple.c branches/gcc-6-branch/gcc/graphite-scop-detection.c branches/gcc-6-branch/gcc/params.def branches/gcc-6-branch/gcc/testsuite/ChangeLog branches/gcc-6-branch/gcc/tree-vrp.c
Fixed.