This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC] Do not consider volatile asms as optimization barriers #1
- From: Richard Sandiford <rdsandiford at googlemail dot com>
- To: Eric Botcazou <ebotcazou at adacore dot com>
- Cc: gcc-patches at gcc dot gnu dot org, Yury Gribov <y dot gribov at samsung dot com>
- Date: Mon, 03 Mar 2014 08:01:07 +0000
- Subject: Re: [RFC] Do not consider volatile asms as optimization barriers #1
- Authentication-results: sourceware.org; auth=none
- References: <2417129 dot VoWsWu5CJz at polaris> <874n3hdflb dot fsf at talisman dot default> <1501954 dot b67uvQkiBj at polaris>
Eric Botcazou <ebotcazou@adacore.com> writes:
>> Thanks for doing this. FWIW I agree it's probably the best stop-gap fix.
>> But the implication seems to be that unspec_volatile and volatile asms
>> are volatile in different ways. IMO they're volatile in the same way
>> and the problems for volatile asms apply to unspec_volatile too.
>
> I disagree, we need a simple way for the RTL middle-end as well as the back-
> ends to block most optimizations across a specific point (e.g. a non-local
> label as in HP's fix) and UNSPEC_VOLATILE is the best candidate, at least in
> the short term.
I don't agree it's the best candidate if...
>> E.g. although cse.c will flush the table for unspec_volatile,
>> it isn't the case that unspec_volatile forces a containing function
>> to save all call-saved registers. That would be excessive for a plain
>> blockage instruction. So again we seem to be assuming one thing in places
>> like cse.c and another in the register allocator. Code that uses the DF
>> framework will also assume that registers are not implicitly clobbered
>> by an unspec_volatile:
>> [...]
>> Also, ira-lives.c (which tracks the liveness of both pseudo and hard
>> registers) doesn't mention "volatile" at all.
>
> Yes, the definition of a blockage instruction is somewhat vague and I agree
> that it shoudn't cause registers to be spilled. But it needs to block most,
> if not all, optimizations.
...it's so loosely defined. If non-local labels are the specific problem,
I think it'd be better to limit the flush to that.
I'm back to throwing examples around, sorry, but take the MIPS testcase:
volatile int x = 1;
void foo (void)
{
x = 1;
__builtin_mips_set_fcsr (0);
x = 2;
}
where __builtin_mips_set_fcsr is a handy way of getting unspec_volatile.
(I'm not interested in what the function does here.) Even at -O2,
the cse.c code successfully prevents %hi(x) from being shared,
as you'd expect:
li $3,1 # 0x1
lui $2,%hi(x)
sw $3,%lo(x)($2)
move $2,$0
ctc1 $2,$31
li $3,2 # 0x2
lui $2,%hi(x)
sw $3,%lo(x)($2)
j $31
nop
But put it in a loop:
void frob (void)
{
for (;;)
{
x = 1;
__builtin_mips_set_fcsr (0);
x = 2;
}
}
and we get the rather bizarre code:
lui $2,%hi(x)
li $6,1 # 0x1
move $5,$0
move $4,$2
li $3,2 # 0x2
.align 3
.L3:
sw $6,%lo(x)($2)
ctc1 $5,$31
sw $3,%lo(x)($4)
j .L3
lui $2,%hi(x)
Here the _second_ %hi(x), the 1 and the 2 have been hoisted but the first
%hi(x) is reloaded each time. So what's the correct behaviour here?
Should the hoisting of the second %hi(x) have been disabled because the
loop contains an unspec_volatile? What about the 1 (from the first store)
and the 2?
If instead it was:
for (i = 0; i < 100; i++)
would converting to a hardware do-loop be acceptable?
>> So most passes assume that no pseudos or hard registers will be
>> implicitly clobbered by unspec_volatile, just like for a volatile asm.
>> And IMO that's right. I think the rule should be the same for volatile
>> asms and unspec_volatiles, and the same for registers as it already is for
>> memory: if the instruction clobbers something, it should say so explicitly.
>
> IMO that would buy us nothing and, on the contrary, would add complexity where
> there currently isn't. We really need a simple blockage instruction.
>
>> Volatile itself should:
>>
>> (a) prevent deletion or duplication of the operation
>> (b) prevent reordering wrt other volatiles
>> (c) prevent the operation from being considered equivalent to any other
>> operation (even if it's structurally identical and has the same inputs)
>>
>> but nothing beyond that.
>
> Maybe UNSPEC_VOLATILE is a misnomer then and we should allow volatile UNSPECs
> along the above lines.
That'd be fine with me, especially since with the patch it sounds like
using volatile asm would produce better code than a built-in function
that expands to an unspec_volatile, whereas IMO the opposite should
always be true.
But I still think we need a similar list for what unspec_volatile
means now, if we decide to keep something with the current meaning.
Thanks,
Richard