In some cases the specified register for a local variable meant to be used with inline assembly code is not respected. This breaks code relying on this feature to implement non-standard calling convension. Provided test case works with all gcc versions tested so far except for all gcc-3.* including current CVS head.
Created attachment 6147 [details] test case showing the problem this test case should be compiled with: gcc -O2 -S test.c then inspection of the generated test.s file will contain a line like this: @ __i = r5 (should be r4)
Just to complement the information... The test case only needs to be modified slightly for gcc to change behavior and generate correct code. For example, simply removing the loop or assigning a value to l directly will mask the problem. Also compiling with -O0 produces good code in all cases.
Here's another data point: If in the test r4 is changed to r8 which is a totally unused register, the assignment is still done to r5. But, if r8 is also added to the clobber list for the inline asm, then compilation fails with "error: asm-specifier for variable `__i' conflicts with asm clobber list" even if without this clobber gcc stubbornly refuses to use r8.
Created attachment 6182 [details] patch It looks like the desired value does start out being loaded into r4, but loop deletes this assignment in the belief that it's redundant. Does this patch help?
Yep, proposed patch fixes the bug.
submitted the patch as http://gcc.gnu.org/ml/gcc-patches/2004-04/msg01925.html
Note even after your patch, the tree-ssa will fail still, DOM needs not to propagate stores past register variables.
This PR while present on gcc-3_3-branch seems to be of concerns only for gcc-3.4.x and mainline.
Is this a problem when tree-ssa is not in use on the 3.4 branch?
Yes the problem exists even on the 3.4 branch, I put the tree-ssa marker there because the patch would not fix the all of the problems on the tree-ssa when it was on the branch. The problem for loop pass is still a problem on the mainline but there are other problems on the mainline. The loop pass is the only known problem on the 3.4 branch.
OK for 3.4.1 if approved for mainline.
Jeff Law writes: "This code is using the "Variables in Specified Registers" extension to GCC for local variables. I'll quote from the GCC manual: Local register variables in specific registers do not reserve the registers. The compiler's data flow analysis is capable of determining where the specified registers contain live values, and where they are available for other uses. Stores into local register variables may be deleted when they appear to be dead according to dataflow analysis. References to local register variables may be deleted or moved or simplified. When __l is used in the asm, the writer of the bugzilla report expects to see register "r0". However, the value happens to be hanging around in a different register so GCC uses that other register instead of "r0". Similarly for uses of __i. The behavior of the compiler is precisely what I would expect." Therefore, closing this as INVALID.
Subject: Re: [3.4/3.5 Regression] [tree-ssa] local register variable with a specified register is bad "mmitchel at gcc dot gnu dot org" <gcc-bugzilla@gcc.gnu.org> writes: > When __l is used in the asm, the writer of the bugzilla report expects > to see register "r0". However, the value happens to be hanging around > in a different register so GCC uses that other register instead of > "r0". Similarly for uses of __i. The behavior of the compiler is > precisely what I would expect." But what about this comment from RTH? http://gcc.gnu.org/ml/gcc/2004-05/msg00657.html He is describing what appears to me to be an undocumented feature, but one which he says is important. Isn't this test case pretty much the same sort of thing? http://gcc.gnu.org/bugzilla/attachment.cgi?id=6147&action=view Ian
Subject: Re: [3.4/3.5 Regression] [tree-ssa] local register variable with a specified register is bad > Therefore, closing this as INVALID. I beg to disagree here. Let's analyse the quoted manual: Local register variables in specific registers do not reserve the registers. The compiler's data flow analysis is capable of determining where the specified registers contain live values, and where they are available for other uses. This is perfectly sensible, although the current behavior doesn't correspond to this, ence this bug report. The inline asm constitute a precise point of use which is not respected. Stores into local register variables may be deleted when they appear to be dead according to dataflow analysis. This is also perfectly fine, but that's not the case here since the misassigned register is actually defined to be an input operand to the inline assembly. It therefore can't be dead. References to local register variables may be deleted or moved or simplified. I still don't see how this sentence could mean "any other register can be used", otherwise this feature would be completely non sense since no register specification could ever be relied upon which is completely useless. When the manual says "References to local register variables may be deleted or moved or simplified", that means the "reference" can be reordered or discarded with regards to other operations which is again perfectly fine, but that doesn't mean the compiler may decide to use another register for the actual reference at all. > When __l is used in the asm, the writer of the bugzilla report expects > to see register "r0". Indeed! That's the expected result of the feature. > However, the value happens to be hanging around > in a different register so GCC uses that other register instead of > "r0". Similarly for uses of __i. The behavior of the compiler is > precisely what I would expect." This is of course the expected behavior when there is _no_ register specification for a given variable, which is absolutely not the case here. And yet the value happens to be defined by a constant right before being used as follows: l = 100; { register unsigned long long __l asm("r0") = l; register unsigned long long __r asm("r2"); asm ( "..." _ "=r" (__r) : "r" (__l) ); l = __r; } So there is no reason gcc would have used another register than the one we asked it to, especially since the original l value has no life passed the assignment to __l. And incidentally, when the proposed patch for this bug is applied then that's exactly what happens. Now let's suppose we rework the test case a bit so __l becomes an output as well as an input to the inline asm, and that both l and __l are referenced after the inline asm. Then gcc doesn't have the choice to copy the value of l into a separate register. Even in this case the selected register doesn't respect the register specification, even if that specification is changed to a completely unused register like, say, r8 instead of r0. And to further justify the proposed patch: all this is dependent on the presence of a surrounding loop. If the loop is removed from the code, it's then impossible to make the test case fail. I'm therefore asking for this bug to be reopened and the proposed fix applied. The reasons for marking this bug as invalid, are, IMHO, themselves invalid. This is a documented feature that has always worked perfectly fine in the past, and which has existing bodies of code relying on the documented behavior for legitimate purposes. Thank you. Nicolas
Subject: Re: [3.4/3.5 Regression] [tree-ssa] local register variable with a specified register is bad Ian Lance Taylor wrote: > "mmitchel at gcc dot gnu dot org" <gcc-bugzilla@gcc.gnu.org> writes: > > >>When __l is used in the asm, the writer of the bugzilla report expects >>to see register "r0". However, the value happens to be hanging around >>in a different register so GCC uses that other register instead of >>"r0". Similarly for uses of __i. The behavior of the compiler is >>precisely what I would expect." > > > But what about this comment from RTH? > http://gcc.gnu.org/ml/gcc/2004-05/msg00657.html > > He is describing what appears to me to be an undocumented feature, but > one which he says is important. I don't know; I don't have enough experience with these kinds of tricks. (When I need to do these kinds of things, I just write assembly code.) If Richard and Jeff agree to reopen the bug, that's fine with me.
Yes, this should work. Indeed, the patch looks plausible. We should update the documentation too, if that hasn't been done already. I thought it had, last time this came up...
Subject: Bug 15089 CVSROOT: /cvs/gcc Module name: gcc Branch: gcc-3_4-branch Changes by: mmitchel@gcc.gnu.org 2004-06-25 21:47:48 Modified files: gcc : ChangeLog loop.c Log message: 2004-06-25 Philip Blundell <philb@gnu.org> PR wrong-code/15089 * loop.c (scan_loop): Do not move user-specified register assignments. Patches: http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=2.2326.2.523&r2=2.2326.2.524 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/loop.c.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.488.2.3&r2=1.488.2.4
Subject: Bug 15089 CVSROOT: /cvs/gcc Module name: gcc Changes by: mmitchel@gcc.gnu.org 2004-06-25 21:48:39 Modified files: gcc : ChangeLog loop.c Log message: PR wrong-code/15089 * loop.c (scan_loop): Do not move user-specified register assignments. Patches: http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.4154&r2=2.4155 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/loop.c.diff?cvsroot=gcc&r1=1.499&r2=1.500
OK, I applied Phil's patch to 3.4.1 and mainline. RTH, I've now assigned this to you to write the docs and add a test case. :-p
> Note even after your patch, the tree-ssa will fail still, DOM needs not to > propagate stores past register variables. And CCP as well (strange that PRE does not catch it!)
Created attachment 7222 [details] Test case that fails on i686-pc-linux-gnu Here, __l is assigned to %eax instead of %ebx, while __r and __i are assigned correctly.
Subject: Bug 15089 CVSROOT: /cvs/gcc Module name: gcc Changes by: rth@gcc.gnu.org 2004-09-29 02:50:47 Modified files: gcc : ChangeLog tree-flow.h tree-ssa-ccp.c tree-ssa-copy.c tree-ssa-dom.c Added files: gcc/testsuite/gcc.dg/tree-ssa: asm-2.c Log message: PR 15089 * tree-ssa-copy.c (may_propagate_copy_into_asm): New. * tree-flow.h (may_propagate_copy_into_asm): Declare. * tree-ssa-ccp.c (replace_uses_in): Use it. * tree-ssa-dom.c (cprop_operand): Likewise. Patches: http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.5673&r2=2.5674 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-flow.h.diff?cvsroot=gcc&r1=2.48&r2=2.49 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-ssa-ccp.c.diff?cvsroot=gcc&r1=2.45&r2=2.46 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-ssa-copy.c.diff?cvsroot=gcc&r1=2.16&r2=2.17 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-ssa-dom.c.diff?cvsroot=gcc&r1=2.55&r2=2.56 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/gcc.dg/tree-ssa/asm-2.c.diff?cvsroot=gcc&r1=NONE&r2=1.1
The only thing left is documentation.
Removing the regression markes but not removing the target milestone. Richard you want to add the documenation or should someone else do it?
I've just re-read the Extended Asm section and find that the approprate docs have been present since revision 1.222 (hp 29-Sep-04).
Created attachment 44888 [details] Test case demonstrating the problem Looks like issue still exists. #include <array> uint64_t test() { register uint64_t t0; register uint64_t t1; register uint64_t t2 ; register uint64_t rdx; register uint64_t r0 asm("rsi"); register uint64_t r1 ; register uint64_t r2 ; register uint64_t r3 ; register uint64_t m3; register uint64_t y0; register uint64_t y1; register uint64_t y2; register uint64_t y3; asm( "mov %[res0],%[res0]" : "=&d"(rdx), [res0] "=&r"(r0), [res1] "=&r"(r1), [res2] "=&r"(r2), [res3] "=&r"(r3), [t0] "=&r"(t0), [t1] "=&r"(t1), [t2] "=&r"(t2) : [m3] "r"(m3), [y0] "r"(y0), [y1] "r"(y1), [y2] "r"(y2) : "cc"); return r0; } Results in: test(): push %r14 xor %eax,%eax xor %ecx,%ecx push %rbp xor %esi,%esi xor %edi,%edi push %rbx mov %r14,%r14 <--------- should be rsi. mov %r14,%rax pop %rbx pop %rbp pop %r14 retq https://godbolt.org/z/RLTDbY
No, this is a different issue (local reg var not honored for an earlyclobber operand), and it's better to have a separate bug: PR 87733.