Bug 82602 - IRA considers volatile asm to be moveable
Summary: IRA considers volatile asm to be moveable
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: unknown
: P3 normal
Target Milestone: ---
Assignee: Segher Boessenkool
URL:
Keywords: ra
Depends on:
Blocks:
 
Reported: 2017-10-18 09:10 UTC by Segher Boessenkool
Modified: 2017-10-19 07:11 UTC (History)
1 user (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2017-10-18 00:00:00


Attachments
testcase (521 bytes, text/plain)
2017-10-18 09:10 UTC, Segher Boessenkool
Details
proposed patch (294 bytes, text/plain)
2017-10-18 09:36 UTC, Segher Boessenkool
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Segher Boessenkool 2017-10-18 09:10:45 UTC
Created attachment 42390 [details]
testcase

https://gcc.gnu.org/ml/gcc-help/2017-10/msg00061.html reports a problem
where volatile asm statements (without output) are moved around incorrectly.

Testcase attached, build with
  arm-eabi-gcc -Wall -W -O1 -mcpu=cortex-m0plus -mthumb
and see the asm statements reordered (on GCC 5 at least, it does
not seem to trigger the bug with trunk).

I have a patch.
Comment 1 Segher Boessenkool 2017-10-18 09:36:17 UTC
Created attachment 42391 [details]
proposed patch
Comment 2 David Brown 2017-10-18 10:17:02 UTC
For reference, the issue is also discussed here:

https://bugs.launchpad.net/gcc-arm-embedded/+bug/1722849

For this particular case, there are usable workarounds (with extra dependencies or memory clobbers).  But "asm volatile" statements should not be moveable across other asm volatile statements, volatile memory accesses, or other observable behaviour.  (It is fine that they can be moved across other code, just like other volatile accesses.)
Comment 3 Xi Ruoyao 2017-10-18 13:34:21 UTC
I'm still not convinced this is a bug.  For example, all kernel code
uses `asm volatile ("" ::: "memory")` as barrier to stop GCC to reorder code
through it, not `asm volatile ("" :::)`.  So the developers have been aware
a "asm volatile" does NOT mean a barrier. If we change this behaviour and
stop reordering through all "asm volatile"s, the generated code could be
slower.
Comment 4 Xi Ruoyao 2017-10-18 13:41:02 UTC
By the way, in kernel code (compiler-gcc.h) there is a comment:

/* The "volatile" is due to gcc bugs */
#define barrier() __asm__ __volatile__("": : :"memory")

So the developer(s) actually think "volatile" is unnecessary, instead of the
"memory" clobber.
Comment 5 Bernd Edlinger 2017-10-18 13:45:15 UTC
(In reply to Xi Ruoyao from comment #3)
> I'm still not convinced this is a bug.  For example, all kernel code
> uses `asm volatile ("" ::: "memory")` as barrier to stop GCC to reorder code
> through it, not `asm volatile ("" :::)`.  So the developers have been aware
> a "asm volatile" does NOT mean a barrier. If we change this behaviour and
> stop reordering through all "asm volatile"s, the generated code could be
> slower.

Yes, spin locks need to have a "memory" clobber.
I agree it will only work by chance without.
Comment 6 Segher Boessenkool 2017-10-18 13:58:11 UTC
(In reply to Xi Ruoyao from comment #3)
> I'm still not convinced this is a bug.  For example, all kernel code
> uses `asm volatile ("" ::: "memory")` as barrier to stop GCC to reorder code
> through it, not `asm volatile ("" :::)`.  So the developers have been aware
> a "asm volatile" does NOT mean a barrier. If we change this behaviour and
> stop reordering through all "asm volatile"s, the generated code could be
> slower.

The bug in IRA is that it would move volatile asm statements to wherever
else it felt like, including for example over another volatile asm.  That
is the bug, which my patch fixes.  Making volatile asm a "barrier" (what
does that even _mean_?!) is a bad idea, I certainly agree.
Comment 7 David Brown 2017-10-18 14:02:47 UTC
(In reply to Xi Ruoyao from comment #3)
There is no intention to make "asm volatile" a barrier, as you get with a memory clobber.  The intention is to stop it moving past other volatile code (such as other asm volatiles, and volatile memory accesses).  An "asm volatile" statement should still be moveable across other "normal" code.

(In reply to Xi Ruoyao from comment #4)
As for the comment in the kernel code, the gcc documentation says that an "asm" statement with no output is automatically considered as though it were "asm volatile".  So it should not be necessary to write "volatile" in the memory barrier here, as the compiler should treat it that way anyway.  If there is a compiler bug there, it should be fixed - or the documentation could be changed.  There is certainly no harm in having the "volatile" explicit in the barrier() definition.
Comment 8 Xi Ruoyao 2017-10-18 15:41:28 UTC
(In reply to David Brown from comment #7)

> There is no intention to make "asm volatile" a barrier, as you get with a
> memory clobber.  The intention is to stop it moving past other volatile code
> (such as other asm volatiles, and volatile memory accesses).  An "asm
> volatile" statement should still be moveable across other "normal" code.

I see... But then your code in the maillist

             uint32_t status;

             /* save the PRIMASK into 'status' */
             __asm volatile ("mrs %0,PRIMASK" : "=r" (status) :: );

             /* set PRIMASK to disable interrupts */
             __asm volatile ("cpsid i");

             foo(); /* call a function */

             /* restore PRIMASK from 'status' */
             __asm volatile ("msr PRIMASK,%0" :: "r" (status) : );

could still become

             uint32_t status;

             /* save the PRIMASK into 'status' */
             __asm volatile ("mrs %0,PRIMASK" : "=r" (status) :: );

             foo(); /* call a function */

             /* set PRIMASK to disable interrupts */
             __asm volatile ("cpsid i");

             /* restore PRIMASK from 'status' */
             __asm volatile ("msr PRIMASK,%0" :: "r" (status) : );

and still need a "memory" clobber.
Comment 9 Segher Boessenkool 2017-10-18 16:18:14 UTC
You cannot do that if you do not know what foo() does (it could for
example contain another volatile asm).  But yes, the code as written
is not so great.
Comment 10 Bernd Edlinger 2017-10-18 16:32:19 UTC
Yes, and moreover foo() could access non-volatile memory.
And only a memory clobber can prevent the compiler from
using cached values.
Comment 11 Segher Boessenkool 2017-10-18 16:50:38 UTC
(In reply to Bernd Edlinger from comment #10)
> Yes, and moreover foo() could access non-volatile memory.
> And only a memory clobber can prevent the compiler from
> using cached values.

But you *want* the compiler to use cached values if it can.

It is valid for the compiler to move all of foo (or part of it) to
outside the asm's, if the compiler can see the body of foo, e.g.
with LTO or if it is defined in this translation unit.  That's what
makes this code not so super.  There should _not_ be a memory clobber
in these asm statements.  Memory clobbers do not prevent using cached
values, btw.; not in the general case, anyway.  A memory clobber says
"this asm may write to some memory, and I'm not saying what".  It does
not force things to live in memory, and it does do nothing to things
in registers.
Comment 12 Segher Boessenkool 2017-10-18 21:08:50 UTC
Author: segher
Date: Wed Oct 18 21:08:18 2017
New Revision: 253869

URL: https://gcc.gnu.org/viewcvs?rev=253869&root=gcc&view=rev
Log:
ira: volatile asm's are not moveable (PR82602)

A volatile asm statement can not be moved (relative to other volatile
asm, etc.), but IRA would do it nevertheless.  This patch fixes it.


	PR rtl-optimization/82602
	* ira.c (rtx_moveable_p): Return false for volatile asm.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/ira.c
Comment 13 Segher Boessenkool 2017-10-18 21:13:48 UTC
Author: segher
Date: Wed Oct 18 21:13:16 2017
New Revision: 253870

URL: https://gcc.gnu.org/viewcvs?rev=253870&root=gcc&view=rev
Log:
ira: volatile asm's are not moveable (PR82602)

A volatile asm statement can not be moved (relative to other volatile
asm, etc.), but IRA would do it nevertheless.  This patch fixes it.


	PR rtl-optimization/82602
	* ira.c (rtx_moveable_p): Return false for volatile asm.

Modified:
    branches/gcc-7-branch/gcc/ChangeLog
    branches/gcc-7-branch/gcc/ira.c
Comment 14 Segher Boessenkool 2017-10-18 21:15:56 UTC
Author: segher
Date: Wed Oct 18 21:15:24 2017
New Revision: 253871

URL: https://gcc.gnu.org/viewcvs?rev=253871&root=gcc&view=rev
Log:
ira: volatile asm's are not moveable (PR82602)

A volatile asm statement can not be moved (relative to other volatile
asm, etc.), but IRA would do it nevertheless.  This patch fixes it.


	PR rtl-optimization/82602
	* ira.c (rtx_moveable_p): Return false for volatile asm.

Modified:
    branches/gcc-6-branch/gcc/ChangeLog
    branches/gcc-6-branch/gcc/ira.c
Comment 15 Segher Boessenkool 2017-10-18 21:17:21 UTC
Fixed on all open release branches.
Comment 16 David Brown 2017-10-19 07:11:01 UTC
(In reply to Xi Ruoyao from comment #8)
> (In reply to David Brown from comment #7)
> 
> > There is no intention to make "asm volatile" a barrier, as you get with a
> > memory clobber.  The intention is to stop it moving past other volatile code
> > (such as other asm volatiles, and volatile memory accesses).  An "asm
> > volatile" statement should still be moveable across other "normal" code.
> 
> I see... But then your code in the maillist
> 

Yes, of course if the compiler knows the contents of foo() (from LTO, or from a definition in the same file), then it can re-arrange any non-volatile statements there.  That is a separate issue, and a known issue.  For most purposes this can be handled just by using memory clobbers or appropriate use of volatile variables.  For more advanced uses, fake dependencies in asm statements can also be used.  This is all known territory.

In order to get correct code, you need control of particular memory accesses within a critical region like this - execution and calculations don't matter, nor do local variables.  That's why memory clobbers are nice - they may lead to some sub-optimal code, but they are an easy way to get it correct.

For the best possible code, you may want control of calculations and execution as well - in particular, you want to minimise the processor time within the critical region.  That is a lot harder, and there is currently no way to specify things fully (though with careful use of asm dependencies, you can get close).  But that is another issue entirely, and it is a matter of code efficiency rather than code correctness.

However, correctness here depends on the compiler honouring the ordering of volatile asm statements with respect to other volatile code, and this patch fixes that.