Bug 42869 - GOMP_critical_start wrong on Itanium due to __sync miscompilation
Summary: GOMP_critical_start wrong on Itanium due to __sync miscompilation
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 4.4.2
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: openmp, wrong-code
Depends on:
Blocks:
 
Reported: 2010-01-26 02:12 UTC by Hans Boehm
Modified: 2010-03-12 18:22 UTC (History)
3 users (show)

See Also:
Host:
Target: ia64-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Hans Boehm 2010-01-26 02:12:02 UTC
The Itanium code for GOMP_start_critical starts 

0x2000000000334900 <GOMP_critical_start>:
    [MMI]       alloc r16=ar.pfs,1,1,0
0x2000000000334901 <GOMP_critical_start+1>:                 addl r32=840,r1
0x2000000000334902 <GOMP_critical_start+2>:                 nop.i 0x0
0x2000000000334910 <GOMP_critical_start+16>:    [MMI]       mf;;
0x2000000000334911 <GOMP_critical_start+17>:                mov.m ar.ccv=0
0x2000000000334912 <GOMP_critical_start+18>:                mov r14=1;;
0x2000000000334920 <GOMP_critical_start+32>:    [MMI]       nop.m 0x0
0x2000000000334921 <GOMP_critical_start+33>:
                cmpxchg4.rel r14=[r32],r14,ar.ccv
0x2000000000334922 <GOMP_critical_start+34>:                nop.i 0x0;;
0x2000000000334930 <GOMP_critical_start+48>:    [MIB]       nop.m 0x0
0x2000000000334931 <GOMP_critical_start+49>:                cmp.eq p6,p7=0,r14
0x2000000000334932 <GOMP_critical_start+50>:
          (p06) br.ret.dptk.many b0;;

Note the mf followed by a cmxchg4.rel.  I don't believe this enforces sufficient memory ordering constraints.  A subsequent store from inside the critical section may become visible to other threads before the cmpxchg4.rel, which is only intended to prevent reordering in the OTHER direction.  Thus a store inside the critical section can become visible before the lock is really acquired, which is, at least theoretically, very bad.

I do not know whether current hardware may actually execute these out of order.  I observed this while trying to understand the GNU OpenMP support.  I also don't know whether this problem is limited to Itanium.  I expect it doesn't exist on X86.  It may exist onother weakly-ordered architectures.

I believe that this is due to incorrect code generated for the __sync_bool_compare_and_swap in gomp_mutex_lock().
Comment 1 Richard Biener 2010-01-26 12:06:03 UTC
Please check trunk.
Comment 2 Steve Ellcey 2010-03-09 23:25:22 UTC
I think the code is wrong.  Looking at ToT, config/ia64/sync.md will generate the code shown where the memory fence is in front of the cmpxchg4.rel instruction.

cmpxchg4.rel has release semantics which are defined in the Itanium manual as: "Release instructions guarentee that all previous orderable instructions are made visible prior to being made visible themselves."  This would imply to me that the memory fence before the cmpxchg4.rel is not needed but we do need one after it in order to prevent things being moved up ahead of the cmpxchg4.rel.  Alternatively, we could change to code to use cmpxchg4.acq instead of cmpxchg4.rel and keep the memory fence where it is.  cmpxchg4.acq has acquire semantics instead of release semantics.
Comment 3 Jakub Jelinek 2010-03-09 23:33:16 UTC
__sync_bool_compare_and_swap is documented to be a full barrier, so I think the bug must be on the ia64/sync.md side.
Comment 4 Steve Ellcey 2010-03-09 23:49:26 UTC
Yes, I think this is clearly a bug in the IA64 definition of sync_compare_and_swap.  I think the fix is swapping the two instructions being generated by the IA64 sync_compare_and_swap instruction.  (cmpxchg4.rel followed by a memory fence instead of a memory fence followed by cmpxchg4.rel).
Comment 5 Steve Ellcey 2010-03-12 18:19:26 UTC
Subject: Bug 42869

Author: sje
Date: Fri Mar 12 18:19:14 2010
New Revision: 157410

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=157410
Log:
2010-03-12  Steve Ellcey  <sje@cup.hp.com>

	PR target/42869
	* config/ia64/sync.md (sync_compare_and_swap): Move memory fence.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/ia64/sync.md

Comment 6 Steve Ellcey 2010-03-12 18:22:55 UTC
Fixed by moving the mf after the cmpxch.rel.  The cmpxchg.rel will keep memory operations from moving down and the mf will keep them from moving up.  Changing cmpxchg.rel to cmpxchg.acq would have achieved the same result.
Comment 7 Steve Ellcey 2010-07-21 22:38:10 UTC
Subject: Bug 42869

Author: sje
Date: Wed Jul 21 22:37:53 2010
New Revision: 162387

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=162387
Log:
2010-07-21  Steve Ellcey  <sje@cup.hp.com>

        PR target/42869
        * config/ia64/sync.md (sync_compare_and_swap): Move memory fence.

Modified:
    branches/gcc-4_4-branch/gcc/ChangeLog
    branches/gcc-4_4-branch/gcc/config/ia64/sync.md