[PATCH 0/3] Add __builtin_load_no_speculate

Jeff Law law@redhat.com
Wed Jan 10 23:48:00 GMT 2018


On 01/09/2018 03:47 AM, Richard Earnshaw (lists) wrote:
> On 05/01/18 13:08, Alexander Monakov wrote:
>> On Fri, 5 Jan 2018, Richard Earnshaw (lists) wrote:
>>> This is quite tricky.  For ARM we have to have a speculated load.
>>
>> Sorry, I don't follow. On ARM, it is surprising that a CSEL-CSDB-LDR sequence
>> wouldn't work (applying CSEL to the address rather than loaded value), and
>> if it wouldn't, then ARM-specific lowering of the builtin can handle that
>> anyhow, right? (by spilling the pointer)
> 
> The load has to feed /in/ to the csel/csdb sequence, not come after it.
> 
>>
>> (on x86, Intel's current recommendation is to emit LFENCE prior to the load)
> 
> That can be supported in the way you expand the builtin.  The builtin
> expander is given a (MEM (ptr)) , but it's up to the back-end where to
> put that in the expanded sequence to materialize the load, so you could
> write (sorry, don't know x86 asm very well, but I think this is how
> you'd put it)
> 
> 	lfence
> 	mov	(ptr), dest
> 
> with branches around that as appropriate to support the remainder of the
> builtin's behaviour.
I think the argument is going to be that they don't want the branches
around the load that would be needed to support the builtin's test +
failval semantics.  Essentially the same position as IBM has with PPC.

> 
>> Is the main issue expressing the CSEL condition in the source code? Perhaps it is
>> possible to introduce
>>
>>   int guard = __builtin_nontransparent(predicate);
>>
>>   if (predicate)
>>     foo = __builtin_load_no_speculate(&arr[addr], guard);
>>
>> ... or maybe even
>>
>>   if (predicate)
>>     foo = arr[__builtin_loadspecbarrier(addr, guard)];
>>
>> where internally __builtin_nontransparent is the same as
>>
>>   guard = predicate;
>>   asm volatile("" : "+g"(guard));
>>
>> although admittedly this is not perfect since it forces evaluation of 'guard'
>> before the branch.
> 
> As I explained to Bernd last night, I think this is likely to be unsafe.
> If there's some control path before __builtin_nontransparent that allows
> 'predicate' to be simplified (eg by value range propagation), then your
> guard doesn't protect against the speculation that you think it does.
> Changing all the optimizers to guarantee that wouldn't happen (and
> guaranteeing that all future optimizers won't introduce new problems of
> that nature) is, I suspect, very non-trivial.
Agreed.  Whatever PREDICATE happens to be, the compiler is going to go
to extreme lengths to try to collapse PREDICATE down to a
compile-time constant, including splitting paths to the point where
PREDICATE is used in the conditional so that on one side it's constant
and the other it's non-constant.  It seems like this approach is likely
to be compromised by the optimizers.
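
To make the concern concrete, here is a minimal C sketch of the pattern
under discussion.  It is not taken from the patch series: opaque_guard
is a hypothetical stand-in for the suggested __builtin_nontransparent
(using an "r" constraint rather than "g"), and arr, ARR_LEN and lookup
are made-up names.  The point is only that, inside the taken branch, the
optimizers are free to treat the predicate as the constant 1 before it
ever reaches the asm, so the guard no longer depends on idx.

```c
#include <stddef.h>

#define ARR_LEN 4u

static int arr[ARR_LEN] = { 10, 11, 12, 13 };

/* Hypothetical stand-in for __builtin_nontransparent: an empty asm
   that hides the value of v from passes that run after this point.
   It cannot undo simplifications made to v before this point. */
static inline int opaque_guard(int v)
{
    __asm__ volatile("" : "+r"(v));
    return v;
}

/* Returns arr[idx] when idx is in bounds, -1 otherwise. */
int lookup(size_t idx)
{
    int foo = -1;

    if (idx < ARR_LEN) {
        /* On this path value range propagation already knows that
           (idx < ARR_LEN) is true, so the compiler may compute the
           guard from the constant 1 instead of from idx.  A guard
           built that way carries no data dependency on the bounds
           check and so gives no protection under misspeculation. */
        int guard = opaque_guard(idx < ARR_LEN);

        if (guard)
            foo = arr[idx];
    }
    return foo;
}
```

Architecturally lookup() behaves the same whether or not the guard was
folded; the difference only shows up in the generated code, which is
what makes this hard to police across all current and future passes.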


Jeff


