/* { dg-do run } */ /* { dg-options "-fopenmp -O2" } */ extern int omp_get_num_threads (void); extern void abort (void); short e[64]; int num_threads; int g; _Complex double d, f; __attribute__((noinline)) void foo (int x, long long y) { #pragma omp parallel num_threads (4) { int i; #pragma omp barrier for (i = 0; i < 2400; i++) { if (i == 0) num_threads = omp_get_num_threads (); #pragma omp atomic e[0] += x; } } } int main (void) { int i; foo (3, 3LL); if (e[0] != 3 * 2400 * num_threads) abort (); return 0; } as well as atomic-3.c fails on ppc-linux. In *.optimized dump the atomic operation is: D.1288 = (short unsigned int) .omp_data_i->x; D.1290 = e[0]; <bb 6>: D.1294 = __sync_val_compare_and_swap_2 (&e, D.1290, (short int) ((short unsigned int) D.1290 + D.1288)); D.1295 = D.1290; D.1290 = D.1294; if (D.1294 != D.1295) goto <bb 6>; else goto <bb 7>; which looks correct.
Patch: 2008-03-07 Jakub Jelinek <jakub@redhat.com> PR target/35498 * config/rs6000/rs6000.c (rs6000_expand_compare_and_swapqhi): Shift wdst back after sync_compare_and_swapqhi_internal. --- gcc/config/rs6000/rs6000.c.jj 2008-02-29 09:11:54.000000000 +0100 +++ gcc/config/rs6000/rs6000.c 2008-03-07 17:22:27.000000000 +0100 @@ -13858,6 +13858,9 @@ rs6000_expand_compare_and_swapqhi (rtx d emit_insn (gen_sync_compare_and_swapqhi_internal (wdst, mask, oldval, newval, mem)); + /* Shift the result back. */ + emit_insn (gen_lshrsi3 (wdst, wdst, shift)); + emit_move_insn (dst, gen_lowpart (mode, wdst)); } will test it now and if successfull, post.
Subject: Bug 35498 Author: jakub Date: Sat Mar 8 07:30:55 2008 New Revision: 133024 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=133024 Log: PR target/35498 * config/rs6000/rs6000.c (rs6000_expand_compare_and_swapqhi): Shift wdst back after sync_compare_and_swapqhi_internal. Modified: trunk/gcc/ChangeLog trunk/gcc/config/rs6000/rs6000.c
Subject: Bug 35498 Author: jakub Date: Sat Mar 8 07:36:35 2008 New Revision: 133025 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=133025 Log: PR target/35498 * config/rs6000/rs6000.c (rs6000_expand_compare_and_swapqhi): Shift wdst back after sync_compare_and_swapqhi_internal. Modified: branches/gcc-4_3-branch/gcc/ChangeLog branches/gcc-4_3-branch/gcc/config/rs6000/rs6000.c
The reason why the old code without the right shift almost worked is that for the 4 byte aligned 16-bit vars each loop was executed usually twice. .L6: lha 0,0(27) lhz 8,2(26) .align 4 .L4: sync add 9,8,0 rlwinm 10,0,0,0xffff rlwinm 9,9,0,0xffff slw 11,10,31 slw 9,9,31 .L11: lwarx 7,0,29 and 0,7,28 cmpw 0,0,11 bne- 0,.L12 andc 7,7,28 or 7,7,9 stwcx. 7,0,29 bne- 0,.L11 isync .L12: ! srw 0,0,31 ! This insn was added by this patch rlwinm 0,0,0,0xffff cmpw 7,0,10 extsh 0,0 bne 7,.L4 The first time usually the atomic instruction succeeded, but r0 after rlwinm was 0, so most often different from r10. This means the code then jumped to .L4, with r0 = 0 as the expected value of e[0]. r10 then becomes 0 as new expected value, lwarx reads the new actual value of e[0], which will be different from the expected 0. So it jumps to .L12, r0 now contains the e[0] value in upper half and 0 in lower half and r10 is 0, so in the second big loop nothing is changed and the loop exits. This is what happens if there is no contention. If there is contention though, the first loop doesn't compare and swap anything and as shown above, the second loop iteration won't change anything unless e[0] is 0.