This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix test-case on ppc64le (PR testsuite/79455).


On 04/26/17 12:07, Martin Liška wrote:
> On 04/26/2017 12:05 PM, Martin Liška wrote:
>> Received as HTML from Kelvin:
>>
>> I have reviewed some of the implementation details for pthread_mutex_init and pthread_mutex_lock, and have confirmed that ppc64 does use memset to initialize the entirety of its mutex lock structure before potentially overwriting a few of its fields with non-zero data.  So in this regard, your fix seems appropriate.
>>
>> However, I'm confused about how this program runs.  My understanding of the test case code is that the use of barrier_wait() calls is supposed to force Thread1 to run "to completion" before Thread2 even starts.  So if my understanding is correct, it is not sensible for Thread1's pthread_mutex_init call to be followed "immediately" by Thread2's pthread_mutex_lock() call.
>>
>> The comment in tsan_barrier.h describes barrier_init and barrier_wait as "TSAN-invisible barriers".  Can you help me understand how TSAN works in this regard?  I was expecting that TSAN runs the program with instrumentation on the TSAN-visible "thread services" but no instrumentation on the TSAN-invisible thread services.  Is this understanding correct?
>>
>> The questions I am trying to answer for myself are whether this test, as crafted, is really sufficiently "portable" and "meaningful".  It seems to be making non-portable assumptions about the implementation of pthread_mutex_init and pthread_mutex_lock and pthread_mutex_unlock, even with the proposed change.
>>
>> Just to clarify:
>>   Is it your understanding that the expected tsan diagnostic report is complaining that some data field of the mutex structure is being read in Thread 2 after having been written in Thread 1 without any "happens before" relationship (since tsan cannot see the barrier_wait calls?)  Since the "conflict" is going all the way back to Thread 1's pthread_mutex_init call, am I correct in my understanding that the "first conflict noticed" is dealing with some part of the lock structure that is not normally modified by the execution of pthread_mutex_lock and pthread_mutex_unlock.
>>
>> Maybe adding a comment or two to the test case to clarify what is being tested would be most helpful here...
>>
>
> Well, I'm not the author neither of original test-case or modifications that added tsan_barrier.h.
> I'm CCing Bernd that can hopefully answer your questions.
>

Hmm, that's a difficult story.

In the PR the problem is that the 1-byte access is expected by
pthread_mutex_init.

But in fact the access is artificially generated by the
instrumentation in tsan/tsan_rtl_mutex.cc (MutexCreate)
where we have this (after the mutex is already successfully
created):

     MemoryWrite(thr, pc, addr, kSizeLog1);

What the pthread_mutex_init normaly does is invisible to tsan, because
it is compiled without instrumentation, however in your glibc version
a memset is actually used, and that can be intercepted by tsan,
even if the .so was not built with tsan instrumentation.

That was probably an unexpected second access, and that made the test
case fail, because the earlier access was reported.

I think however, that only one extra call frame can ever
be seen, because the stack frames are only created by instrumentation
at instrumented function begin and end statements.

So probably the test expectations could be more strict than in the
proposed patch.

I probably don't understand your question right, but without the
barriers the test case could either report nothing, or even crash,
because pthread_mutex_lock in thread2 operates on uninitialized
data.  This is just to nail down the sequence of events in a way
that the expected test result happens with 100% likelihood.

It is in the nature of tsan, that it cannot report every problem
if you only run the program once, because the internal shadow ram
is inherently racy.  That is why in practice many test runs are
necessary, but that is not how our test suite works.


Bernd.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]