[RFC] Do not consider volatile asms as optimization barriers #1

Richard Sandiford rsandifo@linux.vnet.ibm.com
Mon Mar 3 13:12:00 GMT 2014


Eric Botcazou <ebotcazou@adacore.com> writes:
>>> ...it's so loosely defined.  If non-local labels are the specific problem,
>> I think it'd be better to limit the flush to that.
>
> No, there was "e.g." written so non-local labels are not the only problem.

What are the others though?  As discussed in the other subthread,
I don't think prologue and epilogue barriers are quite the same.

>> I'm back to throwing examples around, sorry, but take the MIPS testcase:
>> 
>>     volatile int x = 1;
>> 
>>     void foo (void)
>>     {
>>       x = 1;
>>       __builtin_mips_set_fcsr (0);
>>       x = 2;
>>     }
>> 
>> where __builtin_mips_set_fcsr is a handy way of getting unspec_volatile.
>> (I'm not interested in what the function does here.)  Even at -O2,
>> the cse.c code successfully prevents %hi(x) from being shared,
>> as you'd expect:
>> 
>>         li      $3,1                    # 0x1
>>         lui     $2,%hi(x)
>>         sw      $3,%lo(x)($2)
>>         move    $2,$0
>>         ctc1    $2,$31
>>         li      $3,2                    # 0x2
>>         lui     $2,%hi(x)
>>         sw      $3,%lo(x)($2)
>>         j       $31
>>         nop
>> 
>> But put it in a loop:
>> 
>>     void frob (void)
>>     {
>>       for (;;)
>>         {
>>           x = 1;
>>           __builtin_mips_set_fcsr (0);
>>           x = 2;
>>         }
>>     }
>> 
>> and we get the rather bizarre code:
>> 
>>         lui     $2,%hi(x)
>>         li      $6,1                    # 0x1
>>         move    $5,$0
>>         move    $4,$2
>>         li      $3,2                    # 0x2
>>         .align  3
>> .L3:
>>         sw      $6,%lo(x)($2)
>>         ctc1    $5,$31
>>         sw      $3,%lo(x)($4)
>>         j       .L3
>>         lui     $2,%hi(x)
>> 
>> Here the _second_ %hi(x), the 1 and the 2 have been hoisted but the first
>> %hi(x) is reloaded each time.  So what's the correct behaviour here?
>> Should the hoisting of the second %hi(x) have been disabled because the
>> loop contains an unspec_volatile?  What about the 1 (from the first store)
>> and the 2?
>
> Well, I personally wouldn't spend much time on the code generated in a loop 
> containing an UNSPEC_VOLATILE.  If an instruction or a builtin is supposed to 
> be performance-sensitive, then don't use an UNSPEC_VOLATILE by all means and 
> properly model it instead!

That doesn't really answer the question though.  What's the correct
behaviour for an unspec volatile in a loop?  I don't think it's what
we did in the example above, since it doesn't seem self-consistent.
And "not spending too much time" is again a bit vague in terms of
saying what's right and what's wrong.

My point is that if the construct is well-defined enough to handle the
important things we want it to handle, the answer should be known to somebody,
even if it isn't to me. :-)

Thanks,
Richard



More information about the Gcc-patches mailing list