This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Optimization of conditional access to globals: thread-unsafe?
- From: Tomash Brechko <tomash dot brechko at gmail dot com>
- To: gcc at gcc dot gnu dot org
- Date: Sun, 21 Oct 2007 18:55:13 +0400
- Subject: Optimization of conditional access to globals: thread-unsafe?
Hello,
I have a question regarding the thread safety of a particular GCC
optimization. I'm sorry if this was already discussed on the list; if
so, please point me to the previous discussion.
Consider this piece of code:
  extern int v;

  void
  f(int set_v)
  {
    if (set_v)
      v = 1;
  }
If f() is called concurrently from several threads, then calls to f(1)
should be protected by a mutex. But do we have to acquire the mutex
for f(0) calls? I'd say no, because there's no access to the global v
in that case. But GCC 3.3.4 through 4.3.0 on i686 with -O1 generates
the following:
  f:
          pushl   %ebp
          movl    %esp, %ebp
          cmpl    $0, 8(%ebp)
          movl    $1, %eax
          cmove   v, %eax        # load (maybe)
          movl    %eax, v        # store (always)
          popl    %ebp
          ret
Note the last, unconditional store to v. If another thread modifies v
between our load and our store (having acquired the mutex first), then
we will overwrite the new value with the old one, and we will do so in
a thread-unsafe manner, without acquiring the mutex.
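To make the interleaving concrete, here is a minimal single-threaded
sketch that replays it step by step. It is only a model of the assembly
above, not something GCC emits: the names f_load_half, f_store_half and
replay_lost_update, and the value 42 written by the "other thread", are
made up for illustration, and v is defined locally so that the sketch
links on its own.

```c
int v;  /* defined here only so the sketch is self-contained */

/* Models cmpl/movl/cmove: yields 1 if set_v is nonzero,
   otherwise the current value of v (the "maybe" load). */
int f_load_half(int set_v)
{
    return set_v ? 1 : v;
}

/* Models the unconditional "movl %eax, v". */
void f_store_half(int tmp)
{
    v = tmp;
}

/* Replays one unlucky interleaving in a single thread and
   returns the final value of v. */
int replay_lost_update(void)
{
    v = 0;
    int tmp = f_load_half(0);  /* thread A: f(0) loads v == 0           */
    v = 42;                    /* thread B: stores 42, holding the mutex */
    f_store_half(tmp);         /* thread A: writes the stale 0 back      */
    return v;
}
```

In this replay replay_lost_update() returns 0: the value stored by the
thread that did take the mutex is lost, even though the f(0) call never
assigns to v at the source level.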
So, do the calls to f(0) require the mutex, or is this a GCC bug?
This very bug was actually already reported for a slightly different case,
"Loop IM and other optimizations harmful for -fopenmp"
(http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31862 ; please ignore my
last comment there, as I am no longer sure myself). But the report was
closed with the "UNCONFIRMED" mark, and the reasons for that are not
quite clear to me. I tried to dig into the C99 standard and David
Butenhof's "Programming with POSIX Threads", and didn't find any
indication that the call f(0) should also be protected by the mutex.
Here are some pieces from C99:
Sec 3.1 par 3: NOTE 2 "Modify" includes the case where the new value
being stored is the same as the previous value.
Sec 3.1 par 4: NOTE 3 Expressions that are not evaluated do not access
objects.
Sec 5.1.2.3 par 3: In the abstract machine, all expressions are
evaluated as specified by the semantics.
Sec 5.1.2.3 par 5 basically says that the result of the program
execution wrt volatile objects, external files and terminal output
should be the same for all conforming implementations.
Sec 5.1.2.3 par 8: EXAMPLE 1 An implementation might define a
one-to-one correspondence between abstract and actual semantics: ...
Sec 5.1.2.3 par 9: Alternatively, an implementation might perform
various optimizations within each translation unit, such that the
actual semantics would agree with the abstract semantics only when
making function calls across translation unit boundaries. ...
I think the above says that even when the compiler chooses to do some
optimizations, the result of the _whole execution_ should be the same
as if the actual semantics were equal to the abstract semantics. Sec
5.1.2.3 par 9, cited last, is not a permission to do optimizations that
may change the end result. In our case, when threads are involved, the
result may change, because there's no access to v in the abstract
semantics for f(0), and thus no mutex is required from the abstract
POV.
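The difference between the two semantics for f(0) can be shown by
counting accesses. The following is a rough sketch under my own
assumptions: v_model, load_v, store_v and the counters are hypothetical
instrumentation standing in for the real global v, and f_actual is a
hand-written model of the generated code, not compiler output.

```c
/* Hypothetical stand-in for v: every load and store is counted. */
static int v_model;
static int loads, stores;

static int  load_v(void)   { loads++;  return v_model; }
static void store_v(int x) { stores++; v_model = x; }

/* Abstract semantics: the assignment is evaluated only if set_v != 0. */
void f_abstract(int set_v)
{
    if (set_v)
        store_v(1);
}

/* Model of the generated code: conditional load, unconditional store. */
void f_actual(int set_v)
{
    int tmp = 1;
    if (!set_v)
        tmp = load_v();
    store_v(tmp);
}

/* Resets the model, runs the given variant with set_v == 0 and
   returns how many accesses to v_model it performed. */
int accesses_for_f0(void (*f)(int))
{
    v_model = 7;
    loads = stores = 0;
    f(0);
    return loads + stores;
}
```

With this model, accesses_for_f0(f_abstract) is 0, while
accesses_for_f0(f_actual) is 2 (one load and one store). The value of
v_model is 7 afterwards in both cases, but by Sec 3.1 NOTE 2 storing
the same value back still counts as a modification of the object.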
So, could someone explain to me why this GCC optimization is valid, and,
if so, where the boundary lies below which I may safely assume that GCC
won't try to store to objects that aren't stored to explicitly on a
particular execution path? Or maybe the named bug report is valid
after all?
Thanks in advance,
--
Tomash Brechko