Created attachment 42390 [details] testcase https://gcc.gnu.org/ml/gcc-help/2017-10/msg00061.html reports a problem where volatile asm statements (without output) are moved around incorrectly. Testcase attached, build with arm-eabi-gcc -Wall -W -O1 -mcpu=cortex-m0plus -mthumb and see the asm statements reordered (on GCC 5 at least, it does not seem to trigger the bug with trunk). I have a patch.
Created attachment 42391 [details] proposed patch
For reference, the issue is also discussed here: https://bugs.launchpad.net/gcc-arm-embedded/+bug/1722849 For this particular case, there are usable workarounds (with extra dependencies or memory clobbers). But "asm volatile" statements should not be moveable across other asm volatile statements, volatile memory accesses, or other observable behaviour. (It is fine that they can be moved across other code, just like other volatile accesses.)
I'm still not convinced this is a bug. For example, all kernel code uses `asm volatile ("" ::: "memory")` as barrier to stop GCC to reorder code through it, not `asm volatile ("" :::)`. So the developers have been aware a "asm volatile" does NOT mean a barrier. If we change this behaviour and stop reordering through all "asm volatile"s, the generated code could be slower.
By the way, in kernel code (compiler-gcc.h) there is a comment: /* The "volatile" is due to gcc bugs */ #define barrier() __asm__ __volatile__("": : :"memory") So the developer(s) actually think "volatile" is unnecessary, instead of the "memory" clobber.
(In reply to Xi Ruoyao from comment #3) > I'm still not convinced this is a bug. For example, all kernel code > uses `asm volatile ("" ::: "memory")` as barrier to stop GCC to reorder code > through it, not `asm volatile ("" :::)`. So the developers have been aware > a "asm volatile" does NOT mean a barrier. If we change this behaviour and > stop reordering through all "asm volatile"s, the generated code could be > slower. Yes, spin locks need to have a "memory" clobber. I agree it will only work by chance without.
(In reply to Xi Ruoyao from comment #3) > I'm still not convinced this is a bug. For example, all kernel code > uses `asm volatile ("" ::: "memory")` as barrier to stop GCC to reorder code > through it, not `asm volatile ("" :::)`. So the developers have been aware > a "asm volatile" does NOT mean a barrier. If we change this behaviour and > stop reordering through all "asm volatile"s, the generated code could be > slower. The bug in IRA is that it would move volatile asm statements to wherever else it felt like, including for example over another volatile asm. That is the bug, which my patch fixes. Making volatile asm a "barrier" (what does that even _mean_?!) is a bad idea, I certainly agree.
(In reply to Xi Ruoyao from comment #3) There is no intention to make "asm volatile" a barrier, as you get with a memory clobber. The intention is to stop it moving past other volatile code (such as other asm volatiles, and volatile memory accesses). An "asm volatile" statement should still be moveable across other "normal" code. (In reply to Xi Ruoyao from comment #4) As for the comment in the kernel code, the gcc documentation says that an "asm" statement with no output is automatically considered as though it were "asm volatile". So it should not be necessary to write "volatile" in the memory barrier here, as the compiler should treat it that way anyway. If there is a compiler bug there, it should be fixed - or the documentation could be changed. There is certainly no harm in having the "volatile" explicit in the barrier() definition.
(In reply to David Brown from comment #7) > There is no intention to make "asm volatile" a barrier, as you get with a > memory clobber. The intention is to stop it moving past other volatile code > (such as other asm volatiles, and volatile memory accesses). An "asm > volatile" statement should still be moveable across other "normal" code. I see... But then your code in the maillist uint32_t status; /* save the PRIMASK into 'status' */ __asm volatile ("mrs %0,PRIMASK" : "=r" (status) :: ); /* set PRIMASK to disable interrupts */ __asm volatile ("cpsid i"); foo(); /* call a function */ /* restore PRIMASK from 'status' */ __asm volatile ("msr PRIMASK,%0" :: "r" (status) : ); could still become uint32_t status; /* save the PRIMASK into 'status' */ __asm volatile ("mrs %0,PRIMASK" : "=r" (status) :: ); foo(); /* call a function */ /* set PRIMASK to disable interrupts */ __asm volatile ("cpsid i"); /* restore PRIMASK from 'status' */ __asm volatile ("msr PRIMASK,%0" :: "r" (status) : ); and still need a "memory" clobber.
You cannot do that if you do not know what foo() does (it could for example contain another volatile asm). But yes, the code as written is not so great.
Yes, and moreover foo() could access non-volatile memory. And only a memory clobber can prevent the compiler from using cached values.
(In reply to Bernd Edlinger from comment #10) > Yes, and moreover foo() could access non-volatile memory. > And only a memory clobber can prevent the compiler from > using cached values. But you *want* the compiler to use cached values if it can. It is valid for the compiler to move all of foo (or part of it) to outside the asm's, if the compiler can see the body of foo, e.g. with LTO or if it is defined in this translation unit. That's what makes this code not so super. There should _not_ be a memory clobber in these asm statements. Memory clobbers do not prevent using cached values, btw.; not in the general case, anyway. A memory clobber says "this asm may write to some memory, and I'm not saying what". It does not force things to live in memory, and it does do nothing to things in registers.
Author: segher Date: Wed Oct 18 21:08:18 2017 New Revision: 253869 URL: https://gcc.gnu.org/viewcvs?rev=253869&root=gcc&view=rev Log: ira: volatile asm's are not moveable (PR82602) A volatile asm statement can not be moved (relative to other volatile asm, etc.), but IRA would do it nevertheless. This patch fixes it. PR rtl-optimization/82602 * ira.c (rtx_moveable_p): Return false for volatile asm. Modified: trunk/gcc/ChangeLog trunk/gcc/ira.c
Author: segher Date: Wed Oct 18 21:13:16 2017 New Revision: 253870 URL: https://gcc.gnu.org/viewcvs?rev=253870&root=gcc&view=rev Log: ira: volatile asm's are not moveable (PR82602) A volatile asm statement can not be moved (relative to other volatile asm, etc.), but IRA would do it nevertheless. This patch fixes it. PR rtl-optimization/82602 * ira.c (rtx_moveable_p): Return false for volatile asm. Modified: branches/gcc-7-branch/gcc/ChangeLog branches/gcc-7-branch/gcc/ira.c
Author: segher Date: Wed Oct 18 21:15:24 2017 New Revision: 253871 URL: https://gcc.gnu.org/viewcvs?rev=253871&root=gcc&view=rev Log: ira: volatile asm's are not moveable (PR82602) A volatile asm statement can not be moved (relative to other volatile asm, etc.), but IRA would do it nevertheless. This patch fixes it. PR rtl-optimization/82602 * ira.c (rtx_moveable_p): Return false for volatile asm. Modified: branches/gcc-6-branch/gcc/ChangeLog branches/gcc-6-branch/gcc/ira.c
Fixed on all open release branches.
(In reply to Xi Ruoyao from comment #8) > (In reply to David Brown from comment #7) > > > There is no intention to make "asm volatile" a barrier, as you get with a > > memory clobber. The intention is to stop it moving past other volatile code > > (such as other asm volatiles, and volatile memory accesses). An "asm > > volatile" statement should still be moveable across other "normal" code. > > I see... But then your code in the maillist > Yes, of course if the compiler knows the contents of foo() (from LTO, or from a definition in the same file), then it can re-arrange any non-volatile statements there. That is a separate issue, and a known issue. For most purposes this can be handled just by using memory clobbers or appropriate use of volatile variables. For more advanced uses, fake dependencies in asm statements can also be used. This is all known territory. In order to get correct code, you need control of particular memory accesses within a critical region like this - execution and calculations don't matter, nor do local variables. That's why memory clobbers are nice - they may lead to some sub-optimal code, but they are an easy way to get it correct. For the best possible code, you may want control of calculations and execution as well - in particular, you want to minimise the processor time within the critical region. That is a lot harder, and there is currently no way to specify things fully (though with careful use of asm dependencies, you can get close). But that is another issue entirely, and it is a matter of code efficiency rather than code correctness. However, correctness here depends on the compiler honouring the ordering of volatile asm statements with respect to other volatile code, and this patch fixes that.