[PATCH] Make some asan builtins tm_pure (PR sanitizer/55508)

Mon Dec 17 09:52:00 GMT 2012

resend in plain text

On Mon, Dec 17, 2012 at 1:50 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Fri, Dec 14, 2012 at 5:43 PM, Torvald Riegel <triegel@redhat.com> wrote:
> > On Thu, 2012-12-13 at 10:02 +0100, Jakub Jelinek wrote:
> >> On Thu, Dec 13, 2012 at 10:38:13AM +0400, Dmitry Vyukov wrote:
> >> > On Wed, Dec 12, 2012 at 11:50 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> >> > > Various TM tests ICE when built with -fgnu-tm -fsanitizer=address.
> >> > > The problem is that asan.c pass adds calls to builtins that weren't there
> >> > > before and TM is upset about it.  The __asan_report* are all like
> >> > > abort, in correctly written program they shouldn't have a user visible
> >> > > effect, in bad program they will terminate the process, but in any case
> >> > > it doesn't matter how many times they are retried as part of a transaction,
> >> > > there is no state to roll back on transaction cancellation.
> >> > > __asan_handle_no_return, while not being noreturn, just marks the stack as
> >> > > unprotected, so again in correctly written application no effect, in bad app
> >> > > might result in some issues being undetected, but still, it can be done many
> >> > > times and isn't irreversible.
> >> >
> >> > I was only loosely following tm-languages discussions. Does gcc tm
> >> > model guarantees strong consistency for all memory accesses? I mean
> >> > there are tm implementations that allow transient inconsistencies,
> >>
> >> Will leave this to Torvald.
> >
> > This has two parts: (1) how TM fits into the C++11/C11 memory model, and
> > (2) which guarantees the compiler and the TM runtime library agree on at
> > the level of the TM ABI that we use in GCC.
> >
> > Regarding the first part, all transactions provide guarantees similar to
> > a global lock.  Specifically, there are virtual transaction start/end
> > events that take part in sequenced-before (similar to acquisition and
> > release of the global lock), whose are then guaranteed to execute
> > (without transactions interleaving with each other) in a global total
> > order (let's call this order TO).  TO then contributes to
> > happens-before.
> >
> > On the ABI level, the TM runtime is allowed to execute speculatively as
> > long as (1) it does not expose any speculative execution to
> > nontransactional code and (2) speculation doesn't violate the language's
> > as-if rules (i.e., no visible side effects other than the abstract
> > machine; no signals due to seg faults etc.).  This means that, for
> > example, the TM will not access data at a wider granularity than what
> > nontransactional code that conforms to the C++11 memory model does would
> > do.
> > Second, transactions can have a tentative position in TO, but they will
> > only expose final TO choices to nontxnal code.  So, each transaction
> > will execute code that would be valid when executed in isolation -- but
> > it is possible that several transactions noncommitted transactions are
> > in flight concurrently that may conflict with each other.  The TM
> > ensures that such speculative execution is safe and not visible to a
> > race-free program.  So, when a transaction commits at a certain position
> > in TO, it will make sure that all other active transactions reach
> > consensus on TO (up to this position) before it returns to the execution
> > of nontxnal code.  Those active transactions will either see that they
> > would be still valid at a new TO position (after the committing txn), or
> > they abort and then signal that they agree to this TO.  That means that
> > the nontxnal code will not be affected by any speculative execution,
> > even if the prior txn privatized some data (this is called privatization
> > safety in TM jargon).
> > The sort-of counterpart to privatization safety is publication safety
> > (ie, transactions can safely make some data shared).  While
> > privatization safety is handled by the TM runtime, publication safety is
> > ensured by the compiler by not allowing any dangerous load/load
> > reordering, basicallly.  Publication safety is not yet ensured always,
> > but Aldy is working on this.
> >
> >> > than are detected later and trx is restarted. Can't asan trigger false
> >> > positives in this case?
> >>
> >> I can't imagine any.
> >
> > I also don't see any, because all loads/stores of all transactions, even
> > those with a tentative TO position, would be a valid execution.  For
> > malloc/free in transactions, AFAIR, ASAN hooks into malloc/free
> > directly, so when libitm handles those including rollback of an
> > allocation, there shouldn't be false positives.  However, there might be
> > false negatives in case of free() because deallocations are deferred
> > until transaction commit (they can't be rolled back).  Could we tell
> > ASAN about free() directly using a call added by the TM instrumentation
> > pass?  That would at least help catch the false negatives for all
> > free()-like functions that the compiler knows about.
>
> I do not think that false positive is super important.  We have a similar false negative in case of racy use after free, when the free the just happen to happen after the use.
>
> However, it must be possible to support, because in case of asan we do can "rollback" free.
> But won't it lead to false positives in case of:
>
> __transaction {
>   free(p);
>   p = 0;
> }
>
> __transaction {
>   r = 0;
>   if (p)
>     r = p->x;
>   ...
>
> }
> ?
>
>
> > All of this is different for TSAN, of course, because there a memory
> > access does alter the state that TSAN keeps, so if we roll back the
> > accesses in the TM we would also have to roll back the TSAN state.
>
>
> I think the simplest way to solve it for now, it to use... well, single global lock.
> I.e. replace __txn_start() with global pthread_mutex_t acquisition, __txn_commit() with lock release. And no transactional instrumentation inside of the trx. It should be correct transformation, right. However, I am not sure what to do with __txn_abort()...
>
>
>
>
> >> > Also, what is the order of instrumentation in tm+asan setting? I mean
> >> > that neither tm must instrument asan instrumentation, nor asan must
> >> > instrument tm instrumentation. Is it the case? There also can be
> >> > conflicts related to ordering of instrumentation in the following
> >> > case:
> >> > asan_check();
> >> > speculative_load();
> >> > tm_check();
> >
> > The ordering of instrumentation is an interesting question, but I think
> > that putting the ASAN check before the TM load/store is the right thing
> > to do.  It will catch any illegal access before it's performed.  It
> > doesn't suffer from false positives because the TM will only allow valid
> > (even though speculative) executions.  For this ordering to not be valid
> > (ie, tm_check before asan_check), transactions would need to be able to
> > prevent access to a certain variable by writing to them (which doesn't
> > work).  Other cases in which the transaction would perform the access
> > because of values of other variables that it loaded previously are
> > covered by privatization safety.
> >
> >> I'm not aware of TM having speculative loads, libgtm certainly doesn't
> >> install a SIGSEGV handler (and testing whether some memory is
> >> readable/writable without actually dereferencing would be terribly slow).
> >> If a memory load or store segfaults, whether in a transaction or outside of
> >> it, it is a program bug and it is right if asan terminates the program.
> >
> > Right, memory accesses performed by the TM are only speculative to the
> > extent described above.
>
> There was a project called Transactional Locking II, if you are interested:
>
> https://github.com/daveboutcher/tl2-x86-mp/blob/master/tl2.c
>     /*
>      * Fetch tentative value
>      * Use either SPARC non-fault loads or complicit signal handlers.
>      * If the LD fails we'd like to call TxAbort()
>      * TL2 does not permit zombie/doomed txns to run
>      */
>     volatile vwLock* LockFor = PSLOCK(Addr);
>     vwLock rdv = LDLOCK(LockFor) & ~LOCKBIT;
>     MEMBARLDLD();
>     Valu = LDNF(Addr);
>     MEMBARLDLD();
>     if (rdv <= Self->rv && LDLOCK(LockFor) == rdv) {
>         if (!Self->IsRO) {
>             if (!TrackLoad(Self, LockFor)) {
>                 PROF_STM_READ_END();
>                 TxAbort(Self);
>             }
>         }
>         PROF_STM_READ_END();
>         return Valu;
>     }
>
>