[PR51752] publication safety violations in loop invariant motion pass

Sun Feb 26 17:39:00 GMT 2012

On Fri, Feb 24, 2012 at 2:10 PM, Torvald Riegel <triegel@redhat.com> wrote:
> On Fri, 2012-02-24 at 09:58 +0100, Richard Guenther wrote:
>> On Thu, Feb 23, 2012 at 10:11 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>> > On 02/23/12 12:19, Aldy Hernandez wrote:
>> >
>> >> about hit me. Instead now I save all loads in a function and iterate
>> >> through them in a brute force way. I'd like to rewrite this into a hash
>> >> of some sort, but before I go any further I'm interested to know if the
>> >> main idea is ok.
>> >
>> >
>> > For the record, it may be ideal to reuse some of the iterations we already
>> > do over the function's basic blocks, so we don't have to iterate yet again
>> > over the IL stream.  Though it may complicate the pass unnecessarily.
>>
>> It's definitely ugly that you need to do this.
>
> Indeed.  But that's simply the price we have to pay for making
> publication with transactions easier for the programmer yet still at
> acceptable performance.
>
> Also note that what we primarily have to care about for this PR is to
> not hoist loads _within_ transactions if they would violate publication
> safety.  I didn't have time to look at Aldy's patch yet, but a first
> safe and conservative way would be to treat transactions as full
> transformation barriers, and prevent publication-safety-violating
> transformations _within_ transactions.  Which I would prefer until we're
> confident that we understood all of it.
>
> For hoisting out of or across transactions, we have to reason about more
> than just publication safety.
>
>> And yes, you have to look at
>> very many passes I guess, PRE comes to my mind at least.
>>
>> I still do not understand the situation very well, at least why for
>>
>>  transaction { for () if (b) load; } load;
>>
>> it should be safe to hoist load out of the transaction
>
> This one is funny.  *Iff* this is an atomic txn, we can assume that the
> transaction does not wait for a concurrent event. If it is a relaxed
> txn, then we must not have a loop which terminates depending on an
> atomic load (which we don't in that example); otherwise, we cannot hoist
> load to before the txn (same reason as why we must not hoist to before
> an atomic load with memory_order_acquire).
> Now, if these things hold, then the load will always be executed after
> the txn.  Thus, we can assume that it will happen anyway and nothing can
> stop it (irrespective of which position the txn gets in the
> Transactional Synchronization Order (see the C++ TM spec)).  It is a
> nonatomic load, and we can rely on the program being data-race-free, so
> we cannot have other threads storing to the same location, so we can
> hoist it across because in that part of the program, the location is
> guaranteed to be thread-local data (or immutable).
>
> As I said, this isn't related to just pub safety anymore.  And this is
> tricky enough that I'd rather not try to optimize this currently but
> instead wait until we have more confidence in our understanding of the
> matter.
>
>> while for
>>
>>  load; transaction { for () if (b) load; }
>>
>> it is not.
>
> If the same assumptions as above hold, I think you can hoist it out,
> because again you can assume that it targets thread-local/immutable
> data: the nontransactional (+nonatomic!) load can happen at any time,
> essentially, irrespective of b's value or how/when other threads modify
> it.  Thus, it cannot have been changed between the two loads in a
> data-race-free program.
>
>> Neither do I understand why it's not ok for
>>
>>  transaction { for () if (b) load; }
>>
>> to hoist the load out of the transaction.
>
> You would violate publication safety.
>
> Also, you don't have any reason to believe that load targets
> thread-local/immutable data, so you must not make it nontransactional
> (otherwise, you could introduce a race condition).
>
>>
>> I assume all the issue is about hoisting things over the trasaction start.
>> So - why does the transaction start not simply clobber all global memory?
>> That would avoid the hoisting.  I assume that  transforming the above to
>>
>>  transaction { tem = load; for () if (b) = tem; }
>>
>> is ok?
>
> No, it is not.  Actually, this is precisely the transformation that we
> need to prevent from happening.
> As Aldy said, please see the explanations in the PR, or have a look at
> the C++ TM specification and the C++11 memory model alternatively.  We
> could also discuss this on IRC, if this would be easier.

Ok. I see.  So, I think what would be best is to have a way to check whether
a store/load is part of a transaction - do we have a way to do that right now?
(For example a flag on a gimple stmt?)  Then we can simply avoid the LIM
transform by making transaction load/store stmts behave the same as
potentially trapping stmts (thus, only optimize if the memory is accessed
unconditional somewhere else).  That would work for PRE as well.
[easiest would be to make *_could_trap_p return true for memory ops
inside a transaction]

Does that sound like it would make TM happy (well, apart from missed
optimizations)?  Thus, is it enough to avoid transforms that would be
invalid for loads/stores that might trap?

Thanks,
Richard.

> Torvald
>