Bug 46164 - Local variables in specified registers don't work correctly with inline asm operands
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 4.5.1
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: ra, wrong-code
Depends on:
Blocks:
 
Reported: 2010-10-25 10:13 UTC by Siarhei Siamashka
Modified: 2015-01-27 08:01 UTC (History)
CC List: 2 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail: 4.1.2, 4.3.2, 4.6.0
Last reconfirmed: 2010-10-27 21:13:00


Attachments
proposed testcase for x86_64 (213 bytes, text/plain)
2010-10-25 10:37 UTC, Siarhei Siamashka
updated testcase (x86_64) (205 bytes, text/x-c)
2010-10-25 12:32 UTC, Siarhei Siamashka
testcase for gcc 4.9.1 (299 bytes, text/plain)
2014-08-13 07:47 UTC, Tim Pambor
updated testcase for gcc 4.9.1 (288 bytes, text/plain)
2014-08-13 07:51 UTC, Tim Pambor
"-da" rtl files for testcase (185.15 KB, application/zip)
2014-08-13 07:55 UTC, Tim Pambor

Description Siarhei Siamashka 2010-10-25 10:13:31 UTC
When testing with gcc 4.5.1:

==== ARM ====

$ cat test.c

int f(int a)
{
  register int result asm("r0");
  asm (
    "add    r0, %[a], #123\n"
    : [result] "=&r" (result)
    : [a]      "r"   (a)
  );
  return result;
}

$ gcc -O2 -c test.c
$ objdump -d test.o

00000000 <f>:
   0:   e280007b        add     r0, r0, #123    ; 0x7b
   4:   e1a00003        mov     r0, r3
   8:   e12fff1e        bx      lr

Here the local variable 'result' gets assigned to register r3 instead of r0,
causing all kinds of problems.

==== x86-64 ====

$ cat test.c

int f(int a)
{
  register int result asm("edi");
  asm (
    "lea    0x7b(%[a]), %%edi\n"
    : [result] "=&r" (result)
    : [a]      "r"   (a)
  );
  return result;
}

$ gcc -O2 -c test.c
$ objdump -d test.o

0000000000000000 <f>:
   0:   67 8d 7f 7b             addr32 lea 0x7b(%edi),%edi
   4:   c3                      retq

=================================

And some final bits.

http://gcc.gnu.org/onlinedocs/gcc/Local-Reg-Vars.html#Local-Reg-Vars
http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html

The documentation is a bit confusing, but it gives at least one example of assigning variables to specified registers:

"Sometimes you need to make an asm operand be a specific register, but there's no matching constraint letter for that register by itself. To force the operand into that register, use a local variable for the operand and specify the register in the variable declaration. See Explicit Reg Vars. Then for the asm operand, use any register constraint letter that matches the register:

     register int *p1 asm ("r0") = ...;
     register int *p2 asm ("r1") = ...;
     register int *result asm ("r0");
     asm ("sysint" : "=r" (result) : "0" (p1), "r" (p2));"

Let's try to use something like that with x86-64:

/********************/
void abort();

int __attribute__((noinline)) f(int a)
{
  register int p1 asm ("edi");
  register int result asm ("edi");
  asm (
    "mov %2, %0\n"
    "add %2, %0\n"
    "add %2, %0\n"
    : "=r" (result) : "0"  (p1), "r" (a));
  return result;
}

int main()
{
    if (f(1) != 3)
        abort();
}

/********************/

This testcase fails.

So is this a bug in gcc? Or is the documentation wrong? Or am I missing something?
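
For comparison, here is a minimal sketch of the documented pattern in which the input register variable is actually initialized and the asm body is a single instruction, so no earlyclobber is involved. The name add_pinned, its second parameter, and the plain-C fallback for other targets are assumptions made for illustration; this is not one of the attached testcases.

```c
/* Sketch only: add_pinned and the fallback below are illustrative names,
   not part of the bug report or its attachments.
   __asm__ is the alternate spelling of asm that also works in strict ISO modes. */
#if defined(__GNUC__) && defined(__x86_64__)
int __attribute__((noinline)) add_pinned(int a, int b)
{
  register int p1 __asm__ ("edi") = a;   /* input pinned to edi, initialized */
  register int result __asm__ ("edi");   /* output pinned to the same register */
  __asm__ (
    "add %2, %0\n"                       /* AT&T order: %0 = %0 + %2 */
    : "=r" (result)
    : "0" (p1), "r" (b)
    : "cc");
  return result;
}
#else
/* Other targets: same result without inline asm. */
int add_pinned(int a, int b)
{
  return a + b;
}
#endif
```

Because the asm consists of one instruction, the result is a + b even if the compiler happens to place both inputs in the same register, which is why this variant does not need the "&" modifier.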
Comment 1 Siarhei Siamashka 2010-10-25 10:37:13 UTC
Created attachment 22144
proposed testcase for x86_64
Comment 2 Siarhei Siamashka 2010-10-25 12:32:01 UTC
Created attachment 22145
updated testcase (x86_64)

Actually the previous testcase was not very good. It tried to simulate an earlyclobber operand by specifying it as both input and output, but because "p1" was never initialized, gcc may be allowed to optimize it away and break everything (without any kind of warning, but that's another story).

So the problem actually arises when a specified register is used for an earlyclobber output operand and that same register also holds a function argument.
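
If the exact register is not actually required, one way to sidestep the conflict is to drop the register variable, let gcc pick the output register, and refer to it by operand name instead of hardcoding it in the asm template. A minimal sketch, assuming GNU C on x86-64; the name add123 and the plain-C fallback are illustrative, and this is only a workaround, not a fix for the allocator behaviour described above.

```c
/* Sketch only: add123 is an illustrative name, not taken from the attachments. */
#if defined(__GNUC__) && defined(__x86_64__)
int add123(int a)
{
  int result;                        /* no fixed register: gcc chooses one */
  __asm__ (
    "lea 0x7b(%[a]), %[result]\n"    /* result = a + 123 */
    : [result] "=r" (result)
    : [a]      "r"  (a));
  return result;
}
#else
/* Other targets: same result without inline asm. */
int add123(int a)
{
  return a + 123;
}
#endif
```

Since the template is a single instruction that reads its input and writes its output in one step, the output does not need to be marked earlyclobber here.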
Comment 3 Andrew Pinski 2010-10-27 21:13:00 UTC
Confirmed. The register allocator is causing it. I think it does not take the "&" into account, so that reload will correct it. (This was true at least in the old RA in 4.3 and before.)
Comment 4 Tim Pambor 2014-08-13 07:47:20 UTC
Created attachment 33307
testcase for gcc 4.9.1

I think this bug is still present in gcc 4.9.1 and 4.8.4.

I could reproduce the problem with the attached testcase using gcc 4.8.4 with -O1 and -Og and 4.9.1 with -O1. -O0, -O2, -O3, -Os generated correct code. It generated the following assembler code:

...
  mov r0, r0	@ r0
  mov r4, r4	@ r1
  mov r2, r2	@ r2
...

Expected would have been:

...
  mov r0, r0	@ r0
  mov r1, r1	@ r1
  mov r2, r2	@ r2
...
Comment 5 Tim Pambor 2014-08-13 07:51:04 UTC
Created attachment 33308
updated testcase for gcc 4.9.1
Comment 6 Tim Pambor 2014-08-13 07:55:12 UTC
Created attachment 33309
"-da" rtl files for testcase
Comment 7 Hale Wang 2015-01-22 11:02:01 UTC
(In reply to Tim Pambor from comment #4)
> Created attachment 33307 [details]
> testcase for gcc 4.9.1
> 
> I think this bug is still present in gcc 4.9.1 and 4.8.4.
> 
> I could reproduce the problem with the attached testcase using gcc 4.8.4
> with -O1 and -Og and 4.9.1 with -O1. -O0, -O2, -O3, -Os generated correct
> code. It generated the following assembler code:
> 
> ...
>   mov r0, r0	@ r0
>   mov r4, r4	@ r1
>   mov r2, r2	@ r2
> ...
> 
> Expected would have been:
> 
> ...
>   mov r0, r0	@ r0
>   mov r1, r1	@ r1
>   mov r2, r2	@ r2
> ...

The combine pass combined the insns involving the volatile register, which caused this bug.

The expected assembler code is:

  mov r4, .L_temp
  mov r1, r4
  ...
  mov r0, r0    @ r0
  mov r1, r1	@ r1
  mov r2, r2	@ r2

But GCC combined the insns, and the code is generated as:

  mov r4, .L_temp
  ...
  mov r0, r0    @ r0
  mov r4, r4	@ r1
  mov r2, r2	@ r2

The register 'r1' is defined as volatile in this case, so it should not be combined.
Comment 8 Hale Wang 2015-01-26 09:59:44 UTC
I have submitted a patch to the community for further discussion. See https://gcc.gnu.org/ml/gcc-patches/2015-01/msg02238.html
Comment 9 Hale Wang 2015-01-27 08:01:11 UTC
Hi Tim,

The failure in your testcase is caused by the combine pass. It is not the same issue as in Siarhei's testcase, so I think we should split your testcase off into a separate bug.

My patch only fixes the bug exposed by your testcase, so I will file a new bug to record your comments.

Thanks,
Hale