Bug 16331 - x86-64 inline asm register constraints insufficient WRT ABI
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 3.3.3
Importance: P2 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks: 39139
Reported: 2004-07-02 16:51 UTC by thutt
Modified: 2018-04-28 15:02 UTC

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Description thutt 2004-07-02 16:51:34 UTC
An x86-64 version of gcc 3.3.3 does not allow inline asm to specify constraints
for all the available registers.  Particularly, since r8 and r9 are used in the
ABI, it would be very beneficial to ensure that parameters to a function called
from inline assembly reside in the proper registers.
Comment 1 Falk Hueffner 2004-07-02 16:57:38 UTC
You can declare variables like

register unsigned long r8 asm("r8");

and pass it to the asm, and it will reside in r8. Therefore, I don't
think it makes sense to add further single register constraints.
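
A minimal sketch of this approach, assuming an x86-64 target and GNU C extensions. The helper name read_back_from_r8 is hypothetical and exists only for illustration. The asm template names %r8 directly instead of using the operand placeholder, so it returns the right value only if the operand really was placed in r8:

```c
/* Hypothetical helper (not from the bug report); x86-64 with GNU C only. */
static unsigned long read_back_from_r8(unsigned long value)
{
    /* Bind the local variable to r8 for use as an asm operand. */
    register unsigned long in_r8 __asm__("r8") = value;
    unsigned long out;

    /* The template reads %r8 by name; the "r" input operand tells the
       compiler that in_r8 must be live in its register at this point. */
    __asm__("mov %%r8, %0" : "=r"(out) : "r"(in_r8));
    return out;
}
```

Calling read_back_from_r8(42) should hand back 42 if the variable was in r8 when the asm executed.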
Comment 2 thutt 2004-07-02 18:49:01 UTC
This is only guaranteed to work for global variables.  From the gcc info pages:

Variables in Specified Registers
================================

   GNU C allows you to put a few global variables into specified
hardware registers.  You can also specify the register in which an
ordinary register variable should be allocated.

   * Global register variables reserve registers throughout the program.
     This may be useful in programs such as programming language
     interpreters which have a couple of global variables that are
     accessed very often.

   * Local register variables in specific registers do not reserve the
     registers.  The compiler's data flow analysis is capable of
     determining where the specified registers contain live values, and
     where they are available for other uses.  Stores into local
     register variables may be deleted when they appear to be dead
     according to dataflow analysis.  References to local register
     variables may be deleted or moved or simplified.

     These local variables are sometimes convenient for use with the
     extended `asm' feature (*note Extended Asm::.), if you want to
     write one output of the assembler instruction directly into a
     particular register.  (This will work provided the register you
     specify fits the constraints specified for that operand in the
     `asm'.)

I read this to say that the compiler may, or may not, actually use the suggested
register.
Comment 3 Andrew Pinski 2004-07-02 18:54:31 UTC
The documentation is wrong, there is another bug about this already.
Comment 4 thutt 2004-07-06 14:55:32 UTC
Could you reference the documentation defect number?

And, I'm not sure that just the documentation is wrong; consider:

extern int hokus(void);

void test(void)
{
    register int r8 asm("r8") = hokus();
    register int r9 asm("r9") = 6;
    
    __asm__ __volatile__("call pokus"
                         :
                         :
                         "D" (r8),
                         "S" (2),
                         "d" (3),
                         "c" (4),
                         "r" (r8),
                         "r" (r9)
                         );
    r8 = hokus();
    pokus(r8, 2, 3, 4, 5, 6);
}

0000000000000000 <test>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   e8 00 00 00 00          callq  9 <test+0x9>
                        5: R_X86_64_PC32        hokus+0xfffffffffffffffc
   9:   41 b9 06 00 00 00       mov    $0x6,%r9d
   f:   41 b8 05 00 00 00       mov    $0x5,%r8d
  15:   b9 04 00 00 00          mov    $0x4,%ecx
  1a:   ba 03 00 00 00          mov    $0x3,%edx
  1f:   be 02 00 00 00          mov    $0x2,%esi
  24:   44 89 c7                mov    %r8d,%edi
  27:   b8 00 00 00 00          mov    $0x0,%eax
  2c:   e8 00 00 00 00          callq  31 <test+0x31>
                        2d: R_X86_64_PC32       pokus+0xfffffffffffffffc
  31:   be 02 00 00 00          mov    $0x2,%esi
  36:   ba 03 00 00 00          mov    $0x3,%edx
  3b:   b9 04 00 00 00          mov    $0x4,%ecx
  40:   44 89 c7                mov    %r8d,%edi
  43:   e8 00 00 00 00          callq  48 <test+0x48>
                        44: R_X86_64_PC32       pokus+0xfffffffffffffffc
  48:   c9                      leaveq 
  49:   c3                      retq   

The first call of pokus() completely ignores the assigned value of the variable
r8 -- instead the value '6' is placed into it for the call.  The second call assumes
the register r8 should be used for the call, but by now the wrong value has been
placed into it.

Perhaps the documentation is wrong, but so is the code generated.  I'm reopening.
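
For reference, the Local Register Variables section of the GCC manual warns that a register bound this way can be clobbered by function calls made between the assignment and the asm, and suggests computing such values into ordinary temporaries first. A minimal sketch of that pattern, assuming x86-64 and GNU C; produce_value and sum_r8_r9 are hypothetical names, with produce_value standing in for hokus(). It does not address the separate question of issuing a call from inside the asm itself:

```c
/* Hypothetical stand-in for hokus() from the test case above. */
static long produce_value(void)
{
    return 36;
}

/* Hypothetical helper: adds the contents of r8 and r9 inside the asm. */
static long sum_r8_r9(void)
{
    /* Finish the function call first, into an ordinary temporary. */
    long tmp = produce_value();

    /* Only then bind the register variables; no calls follow before the asm. */
    register long a __asm__("r8") = tmp;
    register long b __asm__("r9") = 6;
    long out;

    /* lea computes r8 + r9 without touching the flags. */
    __asm__("lea (%%r8,%%r9), %0" : "=r"(out) : "r"(a), "r"(b));
    return out;
}
```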


Comment 5 Andrew Pinski 2005-07-02 01:27:13 UTC
(In reply to comment #4)
>     __asm__ __volatile__("call pokus"

This is wrong, you don't want to use call in an asm at all.
Comment 6 Falk Hueffner 2005-07-02 10:17:02 UTC
(In reply to comment #4)

> The first call of pokus() completely ignores the assigned value of the variable
> r8 -- instead the value '6' is placed into it for the call.  The second call assumes
> the register r8 should be used for the call, but by now the wrong value has been
> placed into it.

I cannot reproduce this with gcc 3.3.6, the generated assembly looks just
fine for me:

0000000000000000 <test>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   e8 00 00 00 00          callq  9 <test+0x9>
                        5: R_X86_64_PC32        hokus+0xfffffffffffffffc
   9:   41 89 c0                mov    %eax,%r8d
   c:   41 b9 06 00 00 00       mov    $0x6,%r9d
  12:   be 02 00 00 00          mov    $0x2,%esi
  17:   ba 03 00 00 00          mov    $0x3,%edx
  1c:   b9 04 00 00 00          mov    $0x4,%ecx
  21:   44 89 c7                mov    %r8d,%edi
  24:   e8 00 00 00 00          callq  29 <test+0x29>
                        25: R_X86_64_PC32       pokus+0xfffffffffffffffc
  29:   e8 00 00 00 00          callq  2e <test+0x2e>
                        2a: R_X86_64_PC32       hokus+0xfffffffffffffffc
  2e:   41 b9 06 00 00 00       mov    $0x6,%r9d
  34:   41 b8 05 00 00 00       mov    $0x5,%r8d
  3a:   b9 04 00 00 00          mov    $0x4,%ecx
  3f:   ba 03 00 00 00          mov    $0x3,%edx
  44:   be 02 00 00 00          mov    $0x2,%esi
  49:   44 89 c7                mov    %r8d,%edi
  4c:   b8 00 00 00 00          mov    $0x0,%eax
  51:   e8 00 00 00 00          callq  56 <test+0x56>
                        52: R_X86_64_PC32       pokus+0xfffffffffffffffc
  56:   c9                      leaveq 

Please retry and give the exact version and flags you used.
Comment 7 Andrew Pinski 2005-09-04 19:05:11 UTC
No feedback in 3 months.
Comment 8 H.J. Lu 2009-02-09 20:46:33 UTC
Reopened.
Comment 9 H.J. Lu 2009-02-09 20:46:56 UTC
*** Bug 39139 has been marked as a duplicate of this bug. ***
Comment 10 H.J. Lu 2009-02-09 20:47:45 UTC
The rationale for this request is at

http://gcc.gnu.org/bugzilla/attachment.cgi?id=17274
Comment 11 H.J. Lu 2009-02-09 20:49:49 UTC
Uros, how hard would it be to support this in the x86 backend?
Comment 12 Uroš Bizjak 2009-02-09 22:43:44 UTC
(In reply to comment #11)
> Uros, how hard would it be to support this in the x86 backend?

I remember there were concerns when the xmm0 single-register constraint was introduced... We would need a new constraint letter and a new regclass entry. I don't have the relevant mail at hand, but IIRC the cost of adding a new register class is O(n*n).

OTOH, constraints should be used to support correct register allocation for machine instructions, not to emulate ABI in order to support calls from inside asm statements.
Comment 13 thutt 2009-02-10 14:34:48 UTC
(In reply to comment #12)
> (In reply to comment #11)
> > Uros, how hard would it be to support this in the x86 backend?
>

<snip>

> OTOH, constraints should be used to support correct register
> allocation for machine instructions, not to emulate ABI in order to
> support calls from inside asm statements.

Please indulge me for a moment.

What exactly is a call?

Are you considering the only method of transferring control to be the
standard 'near call' & 'near ret' instructions on the x86?

What about the following modes of transferring control to another
address?

   int
   iret
   ret
   sysenter
   sysexit
   syscall
   sysret
   ud2
   int3
   jmp

Then, what of these?

   lcall
   lret
   ljmp

Every one of these is a method to transfer control to another address
with a programmer-defined set of input registers.  More importantly,
none of these are directly supported by gcc to invoke functions
without resorting to inline assembly.

If you're working on an operating system, a virtualization engine or
some other embedded device you might need to transfer control using
one of these methods.
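
As one illustration of the list above, here is a sketch of a raw six-argument syscall wrapper, assuming Linux on x86-64, where the kernel expects arguments in rdi, rsi, rdx, r10, r8 and r9. The name raw_syscall6 is hypothetical. The first three arguments have single-register constraint letters ("D", "S", "d"); r10, r8 and r9 do not, so the sketch falls back on local register variables for them:

```c
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper; Linux x86-64 only. */
static long raw_syscall6(long nr, long a1, long a2, long a3,
                         long a4, long a5, long a6)
{
    long ret;

    /* No constraint letters exist for these three registers. */
    register long r10 __asm__("r10") = a4;
    register long r8  __asm__("r8")  = a5;
    register long r9  __asm__("r9")  = a6;

    /* The syscall instruction overwrites rcx and r11. */
    __asm__ __volatile__("syscall"
                         : "=a"(ret)
                         : "a"(nr), "D"(a1), "S"(a2), "d"(a3),
                           "r"(r10), "r"(r8), "r"(r9)
                         : "rcx", "r11", "memory");
    return ret;
}
```

For example, raw_syscall6(SYS_getpid, 0, 0, 0, 0, 0, 0) should return the same process id as getpid().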

As a really simple example, consider a handler for a timer interrupt.
Let's say that the prologue for the interrupt handler (written in
assembly) stores all the machine registers into a data structure
accessible from C.  Then, the prologue transfers to the handler, which is
conveniently written in C.  Wouldn't it be really nice if one could
restore all the saved registers in C code using only an inline assembly
instruction?

    __asm__("iret"
            :
            : /* force restoring registers saved in data structure */);

I'm confident you can see the advantage of doing this in C and letting
the compiler deal with the bookkeeping details, rather than resorting
to another assembly language function which does such a simple feat.

I think I pretty clearly demonstrate here that calling other functions
using the x86 architecture isn't as simple as assuming it's going only
to be done with 'call', 'jmp' and 'ret', and many of those methods are
not possible with straight C, even with gcc's helpful extensions.

Should gcc prevent the developer from using the ABI just because the
inline assembly wasn't meant to 'support calls from inside assembly
statements'?
Comment 14 Uroš Bizjak 2009-02-10 15:20:39 UTC
> > OTOH, constraints should be used to support correct register
> > allocation for machine instructions, not to emulate ABI in order to
> > support calls from inside asm statements.
> 
> Please indulge me for a moment.
> 
> What exactly is a call?
> 
> Are you considering the only method of transferring control to be the
> standard 'near call' & 'near ret' instructions on the x86?

I was referring to the procedure call, where you need to set up outgoing arguments at the calling point and set up incoming arguments at the callee. gcc will magically match these two no matter what ABI-changing compile flag (e.g. -mregparm) you use. When you call a procedure from inside asm, gcc does not know about that, and there is no way that gcc will know where the arguments are to be found.
Comment 15 thutt 2009-02-10 15:35:46 UTC
(In reply to comment #14)
> > > OTOH, constraints should be used to support correct register
> > > allocation for machine instructions, not to emulate ABI in order to
> > > support calls from inside asm statements.
> >
> > Please indulge me for a moment.
> >
> > What exactly is a call?
> >
> > Are you considering the only method of transferring control to be the
> > standard 'near call' & 'near ret' instructions on the x86?
>
> I was referring to the procedure call, where you need to set up outgoing
> arguments at the calling point and set up incoming arguments at the callee.
> gcc will magically match these two no matter what ABI-changing compile flag
> (e.g. -mregparm) you use. When you call a procedure from inside asm, gcc does
> not know about that, and there is no way that gcc will know where the
> arguments are to be found.

I understand.

I'm not expecting gcc to handle those aspects of the inline assembly.

My request is just to be able to specify any GP hardware register in a
constraint.  Both input & output.
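
For comparison, a sketch of how a single register can serve as both input and output today through a local register variable and a "+r" operand, assuming x86-64 and GNU C. The helper name double_in_r9 is hypothetical. This is the workaround discussed earlier in the thread, not the dedicated constraint letter being requested:

```c
/* Hypothetical helper; x86-64 with GNU C only. */
static unsigned long double_in_r9(unsigned long value)
{
    /* "+r" marks the operand as read and written; the register
       variable pins it to r9 for the duration of the asm. */
    register unsigned long v __asm__("r9") = value;

    __asm__("add %%r9, %%r9" : "+r"(v) : : "cc");
    return v;
}
```

Here double_in_r9(21) is expected to return 42, since the asm adds r9 to itself.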

Comment 16 Christian Häggström 2012-06-21 17:11:11 UTC
Can this bug be closed as it works with gcc 3.3.6 as expressed in comment #6?
Otherwise, please state again clearly what is still broken since it is hard to follow the discussion.
Comment 17 Gerald Pfeifer 2014-06-25 23:47:53 UTC
David Wohlferd, who has been rewriting the asm documentation, among
others, indicated that this is actually now described in the Local Reg Vars
section, so I am closing this bug.

Please advise if you disagree and there is something left to document
or fix in the compiler proper.