This is the mail archive of the mailing list for the GCC project.

Optimization of conditional access to globals: thread-unsafe?


I have a question regarding the thread safety of a particular GCC
optimization.  I'm sorry if this has already been discussed on the list;
if so, please point me to the previous discussion.

Consider this piece of code:

    extern int v;

    void f(int set_v)
    {
      if (set_v)
        v = 1;    /* the only store to v in the source */
    }

If f() is called concurrently from several threads, then calls to f(1)
must be protected by a mutex.  But do we have to acquire the mutex
for f(0) calls?  I'd say no: there is no access to the global v in
that case.  But GCC 3.3.4 through 4.3.0 on i686 with -O1 generates the
following code:

            pushl   %ebp
            movl    %esp, %ebp
            cmpl    $0, 8(%ebp)
            movl    $1, %eax
            cmove   v, %eax        ; load (maybe)
            movl    %eax, v        ; store (always)
            popl    %ebp
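
For readers who prefer C to assembly, here is a rough sketch of what the
generated code amounts to.  The names f_as_written, f_as_compiled, tmp and
check_single_threaded are made up for illustration, and v is defined locally
rather than declared extern so that the fragment stands on its own.  It only
shows that the two versions agree when a single thread runs them.

```c
#include <assert.h>

int v;  /* stands in for the `extern int v` of the example */

/* The function as written in the source: v is stored to only when
   set_v is nonzero. */
void f_as_written(int set_v)
{
    if (set_v)
        v = 1;
}

/* The generated code expressed in C: select the value with a
   conditional move, then store it unconditionally. */
void f_as_compiled(int set_v)
{
    int tmp = 1;        /* movl $1, %eax */
    if (set_v == 0)
        tmp = v;        /* cmove v, %eax: load only when set_v == 0 */
    v = tmp;            /* movl %eax, v: store on every path */
}

/* With a single thread, both versions leave v in the same state. */
void check_single_threaded(int initial, int set_v)
{
    v = initial;
    f_as_written(set_v);
    int expected = v;

    v = initial;
    f_as_compiled(set_v);
    assert(v == expected);
}
```

The difference is only visible to another thread: f_as_compiled(0) writes v
even though f_as_written(0) never touches it.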

Note the final unconditional store to v in the generated code.  Now, if
some other thread modifies v between our load and our store (having
acquired the mutex first), then we overwrite the new value with the old
one (and do so in a thread-unsafe manner, without acquiring the mutex).
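
The interleaving I have in mind can be replayed deterministically on a single
thread.  This is only a sketch: f_load, f_store, stale and lost_update_demo
are hypothetical names, the compiled f() is split by hand into its load and
store halves, and the write by "thread B" (the value 42 is arbitrary) is
simply placed between them.  No real threads or mutexes are involved.

```c
#include <assert.h>

int v;  /* the shared global from the example */

/* First half of the compiled f(): choose the value to be stored. */
static int f_load(int set_v)
{
    return set_v ? 1 : v;
}

/* Second half of the compiled f(): the unconditional store. */
static void f_store(int value)
{
    v = value;
}

/* Replays one bad interleaving and returns the final value of v. */
int lost_update_demo(void)
{
    v = 0;

    int stale = f_load(0);  /* thread A: f(0) reads v == 0, no mutex held */
    assert(stale == 0);

    v = 42;                 /* thread B: writes v while holding the mutex */

    f_store(stale);         /* thread A: writes the stale value back */

    return v;               /* thread B's update has been overwritten */
}
```

With the source-level semantics of f(0), which never stores to v, thread B's
value would have survived.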

So, do the calls to f(0) require the mutex, or is this a GCC bug?

This very bug was actually already reported for a slightly different case,
"Loop IM and other optimizations harmful for -fopenmp" (please ignore my
last comment there, as I am no longer sure myself).  But the report was
closed with an "UNCONFIRMED" mark, and the reasons for that are not quite
clear to me.  I tried to dig into the C99 standard and David
Butenhof's "Programming with POSIX Threads", and didn't find any
indication that the call f(0) should also be protected by the mutex.

Here are some pieces from C99:

Sec 3.1 par 3: NOTE 2 "Modify" includes the case where the new value
               being stored is the same as the previous value.

Sec 3.1 par 4: NOTE 3 Expressions that are not evaluated do not access
               objects.

Sec par 3: In the abstract machine, all expressions are
                   evaluated as specified by the semantics.

Sec par 5 basically says that the result of the program
execution with respect to volatile objects, external files and terminal
output should be the same for all conforming implementations.

Sec par 8: EXAMPLE 1 An implementation might define a
                   one-to-one correspondence between abstract and
                   actual semantics: ...

Sec par 9: Alternatively, an implementation might perform
                   various optimizations within each translation unit,
                   such that the actual semantics would agree with the
                   abstract semantics only when making function calls
                   across translation unit boundaries. ...

I think that the above says that even when the compiler chooses to do some
optimizations, the result of the _whole execution_ should be the same
as if the actual semantics were equal to the abstract semantics.  Sec par
9 cited last is not a permission to do optimizations that may change
the end result.  In our case, when threads are involved, the result may
change, because there's no access to v in the abstract semantics, and
thus no mutex is required from the abstract POV.

So, could someone explain to me why this GCC optimization is valid, and,
if so, where the boundary lies below which I may safely assume GCC
won't try to store to objects that aren't stored to explicitly on a
particular execution path?  Or maybe the named bug report is valid
after all?

Thanks in advance,

   Tomash Brechko
