extended asm and input parameters

dw limegreensocks@yahoo.com
Wed Jul 3 01:41:00 GMT 2013

I'm trying to understand how input parameters are used by gcc's extended 
asm.  Each time I think I've got a handle on it, it does something else.

For example, this C++ code:

  inline void moo(unsigned char *Dest, unsigned char Data, int Count) {
    asm volatile (
       "rep stosb"
       : /* no outputs */
       : "D" (Dest), "a" (Data), "c" (Count)
       : "memory");
  }

  int main() {
    unsigned char buff1[32];

    moo(buff1, 0, sizeof(buff1));
    moo(buff1, 0, sizeof(buff1));

    return 0;
  }

Compiling for 64-bit on i386 using -Os, I get:

0000000000407800 <main>:
   407800:    push   rdi
   407801:    sub    rsp,0x40
   407805:    call   4022d0 <__main>

   40780a:    lea    rdi,[rsp+0x20]
   40780f:    mov    ecx,0x20
   407814:    xor    eax,eax
   407816:    rep stos BYTE PTR es:[rdi],al

   407818:    lea    rdi,[rsp+0x20]
   40781d:    rep stos BYTE PTR es:[rdi],al

   40781f:    xor    eax,eax
   407821:    add    rsp,0x40
   407825:    pop    rdi
   407826:    ret

Now, there are a few noteworthy things here:

1) ecx gets loaded for the first call, but not the second.
2) rdi gets loaded for both calls.
3) eax gets zeroed before the first call, is not zeroed for the second, 
but then gets zeroed again for the return code.

When I saw that ecx wasn't getting re-loaded, I speculated that inputs 
are assumed to be unchanged by the asm unless they are also listed as 
output.  This was not what I expected, but upon reflection, I could see 
how that made sense.
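
If that reading is right, then moo() as written misleads the compiler, 
because rep stosb advances rdi and counts rcx down to zero. Below is a 
minimal sketch of how I assume the operands would have to be declared 
instead: the two registers the instruction changes become read-write 
operands ("+"), and only al stays input-only. The name moo_fixed, the 
lowercase parameter names, the switch from int to std::size_t for the 
count, and the portable fallback loop are my own assumptions for 
illustration, not something gcc requires.

```cpp
#include <cstddef>

// Hypothetical variant of moo(): operands that "rep stosb" modifies are
// declared read-write ("+") so the compiler knows their values change.
inline void moo_fixed(unsigned char *dest, unsigned char data,
                      std::size_t count) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    asm volatile (
        "rep stosb"
        : "+D" (dest), "+c" (count)  // rdi and rcx are changed by the asm
        : "a" (data)                 // al is only read
        : "memory");                 // the asm writes through dest
#else
    // Fallback (an assumption) so the sketch also builds on other targets.
    while (count--) {
        *dest++ = data;
    }
#endif
}
```

Since dest and count are by-value parameters, the asm only changes the 
function's local copies; the caller's buffer pointer and size are 
untouched, and the compiler should reload both registers before a second 
call.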

But if that's true, why does rdi get re-loaded each time?  My first 
guess was that the "memory" clobber was causing this.  But removing it 
didn't change the asm that got generated.

And what about the fact that rax gets zeroed for the first call, not 
for the second, and then zeroed again for the return value?  If the 
optimizer is assuming input values are unchanged by asm blocks, why did 
it need to re-assign it, but only sometimes?

I have tried other experiments in an attempt to understand the pattern 
here, but the more I try, the more unclear things become.  Rather than 
posting all my tests here and making this post harder to read, I'll just 
start with the most important question, then ask a followup or two:

When can and can't you (safely) modify extended asm input-only 
parameters?  Unlike output parameters (which must be lvalues), inputs 
are expressions.  Does this mean they are supposed to be modifiable at 
will?  Or must they (all and always) be treated as read-only?

