Bug 63359 - aarch64: 32bit registers in inline asm
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.9.1
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: documentation, inline-asm
Depends on:
Blocks:
 
Reported: 2014-09-24 13:26 UTC by Marc Glisse
Modified: 2022-11-15 05:50 UTC

See Also:
Host:
Target: aarch64-linux-gnu
Build:
Known to work:
Known to fail:
Last reconfirmed: 2016-08-25 00:00:00


Description Marc Glisse 2014-09-24 13:26:48 UTC
int f(int i){
  asm("clz %0, %0":"+r"(i));
  return i;
}

produces:

	clz x0, x0

I need to write "clz %w0, %w0" to get the expected:

	clz w0, w0

What I would like:
1) on x86, the type of i is used to get the right register name. If this can't be done on aarch64 (did ARM forbid it?), it may be useful to warn when I pass the wrong type.
2) I need a documented way to get 32bit regs, and %w0 is not documented in the gcc manual. Besides, clang rejects it, so please find a common syntax...
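
For reference, a minimal sketch of the working form described above, using the `w` operand modifier. The function name `clz32` and the non-AArch64 fallback are illustrative assumptions, not part of the report; the fallback exists only so the snippet builds on other hosts with GCC or Clang.

```c
#include <stdint.h>

/* Count leading zeros of a 32-bit value.
   (Hypothetical helper name; x must be non-zero on the fallback path.) */
static inline uint32_t clz32(uint32_t x)
{
#if defined(__aarch64__)
    /* %w0 prints the 32-bit name (e.g. w0) of the register chosen for
       operand 0; a bare %0 would print the 64-bit name (e.g. x0). */
    __asm__("clz %w0, %w0" : "+r"(x));
    return x;
#else
    /* Fallback for non-AArch64 hosts, using the GCC/Clang builtin. */
    return (uint32_t)__builtin_clz(x);
#endif
}
```

With a bare `%0` the instruction would count leading zeros over all 64 bits of the register, whose upper half is not guaranteed to hold anything meaningful for an `int` operand.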
Comment 1 James Molloy 2014-09-24 13:55:28 UTC
Hi,

> Besides, clang rejects it, so please find a common syntax...

It shouldn't. The "w" modifier should have been supported since clang 3.4, and is certainly supported in clang 3.5.

Clang 3.5 has a warning about this:

"""
/tmp/test.c:2:27: warning: value size does not match register size specified by the constraint and modifier [-Wasm-operand-widths]
    asm("clz %0, %0":"+r"(i));
                          ^
/tmp/test.c:2:14: note: use constraint modifier "w"
    asm("clz %0, %0":"+r"(i));
             ^~
             %w0
/tmp/test.c:2:27: warning: value size does not match register size specified by the constraint and modifier [-Wasm-operand-widths]
    asm("clz %0, %0":"+r"(i));
                          ^
/tmp/test.c:2:18: note: use constraint modifier "w"
    asm("clz %0, %0":"+r"(i));
                 ^~
                 %w0
2 warnings generated.
"""
Comment 2 Marc Glisse 2014-09-24 14:11:13 UTC
(In reply to James Molloy from comment #1)
> > Besides, clang rejects it, so please find a common syntax...
> 
> It shouldn't. The "w" modifier should have been supported since clang 3.4,
> and is certainly supported in clang 3.5.

Uh, you are right. I have no idea why my earlier tests failed... Sorry.

So I would mostly like to see this 'w' modifier documented in the gcc doc, and a warning that includes a hint about 'w'. (For clang I only have 3.5 and svn215195 of 3.6 and the note part is not present yet)

Thanks.
Comment 3 Richard Earnshaw 2014-09-24 14:43:56 UTC
Agree that this needs to be documented.  

I'm not so sure about a warning, however.  I could envisage cases where the warning would be incorrect and avoiding it would lead to code pessimisation.
Comment 4 James Molloy 2014-09-24 14:47:03 UTC
Hi Richard,

My two-pennyworth for what it's worth - we've had several people with broken code tripped by this bug, and Apple have reported seeing the same thing with their internal codebases. This one seems often to appear in real-world code.

Cheers,

James
Comment 5 Richard Earnshaw 2014-09-24 14:57:24 UTC
So consider:

int f(int i){
  long x;
  asm("lsl %0, %1, 33" : "=r"(x) : "r"(i)); // shift by more than the bit width of int
  return x;
}

We really don't care about the top bits in i, so we don't want to extend the value to 64 bits before we do the shift.  But we can't put "w" on the second operand since it has to be a 64-bit register.
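
A self-contained sketch of the same point, assuming a 64-bit return type so the shifted bits stay visible. The name `shl33` and the portable fallback are illustrative assumptions added here.

```c
#include <stdint.h>

/* Shift a 32-bit input left by 33 into a 64-bit result.
   (Hypothetical helper name.) */
static inline int64_t shl33(int32_t i)
{
#if defined(__aarch64__)
    int64_t x;
    /* %1 prints the 64-bit register name. Bits 63:32 of that register are
       unspecified for a 32-bit operand, but the shift by 33 discards them,
       so no extension of i is needed beforehand. */
    __asm__("lsl %0, %1, 33" : "=r"(x) : "r"(i));
    return x;
#else
    /* Fallback for other hosts: same bit pattern, computed in C. */
    return (int64_t)((uint64_t)(uint32_t)i << 33);
#endif
}
```

A warning on operand 1 here could only be silenced by widening `i` first, which is the extra extension the comment wants to avoid.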
Comment 6 James Molloy 2014-09-24 15:03:55 UTC
Good example, although I might argue slightly pathological.

So in this case currently, GCC doesn't even implicitly promote the argument, just uses it as-is. It seems a very dangerous behaviour to have as default. Could there not be a more sensible default and an explicit constraint modifier to allow this instead?
Comment 7 Marc Glisse 2014-09-24 15:07:21 UTC
(In reply to Richard Earnshaw from comment #3)
> I'm not so sure about a warning, however.  I could envisage cases where the
> warning would be incorrect and avoiding it would lead to code pessimisation.

I don't insist on the warning, I might not have needed it with a proper doc (that explains both that "r" is reserved to 64 bits (gcc won't adapt to the type of the argument) and that there is a 'w' modifier to get 32 bits).
Comment 8 Richard Earnshaw 2014-09-24 15:21:31 UTC
(In reply to James Molloy from comment #6)
> Good example, although I might argue slightly pathological.
> 

Agreed, this is somewhat pathological, but I only need to find one valid counter-example :-)

Furthermore, something similar will be quite common on results.  Eg:

int i, j;
unsigned long r;
asm("add %w0, %w1, %w2" : "=r"(r) : "r"(i), "r"(j));  // zero-extend result.

here we *want* the 64-bit result from the implicit zero-extend of writing the lower 32 bits.
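
Wrapped into a function, that pattern might look like the sketch below. The name `add32_zext` and the non-AArch64 fallback are assumptions made for illustration.

```c
#include <stdint.h>

/* 32-bit add whose result is consumed as a zero-extended 64-bit value.
   (Hypothetical helper name.) */
static inline uint64_t add32_zext(int32_t i, int32_t j)
{
#if defined(__aarch64__)
    uint64_t r;
    /* Writing a w register clears bits 63:32 of the corresponding x
       register, so r receives the zero-extended 32-bit sum. */
    __asm__("add %w0, %w1, %w2" : "=r"(r) : "r"(i), "r"(j));
    return r;
#else
    /* Fallback for other hosts: wrap in 32 bits, then widen. */
    return (uint64_t)((uint32_t)i + (uint32_t)j);
#endif
}
```

Here the operand type (64-bit) and the register form used in the template (32-bit) disagree on purpose.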

> So in this case currently, GCC doesn't even implicitly promote the argument,
> just uses it as-is. It seems a very dangerous behaviour to have as default.
> Could there not be a more sensible default and an explicit constraint
> modifier to allow this instead?

One of the things I dislike so much about GCC's inline assembly is that it's just an exposure to users of an internal API in the compiler.  That makes it very difficult to say precisely what will happen in all cases and *very* hard to fix problems with it when it exposes bugs.

I'm not saying I'll never accept a warning for this sort of code; but I'd need convincing that it won't unduly pessimize real code with no acceptable work-arounds.
Comment 9 James Molloy 2014-09-24 15:26:57 UTC
OK, given your second example I agree that the usecase isn't quite as pathological as I thought.

> I'm not saying I'll never accept a warning for this sort of code; but I'd need
> convincing that it won't unduly pessimize real code with no acceptable
> work-arounds.

Clang is committed to this warning as our community feels the error detection rate makes up for the lack of raw power. So unless we actively do something the two compilers will always differ in approach which probably isn't best for our users.

Would you be opposed to discussing a constraint modifier to mean "implicitly extend to 64-bits"?
Comment 10 Andrew Pinski 2016-01-13 23:30:27 UTC
This is a documentation issue rather than anything else.  Also note x86_64 has some good documentation for inline-asm.  It would be good if someone familiar with the aarch64 back-end wrote up documentation like what was done for x86_64 (Linaro or ARM would be two good companies to do it).
Comment 11 Jeremy 2016-06-23 11:07:42 UTC
int32_t n;
asm( "str %1,[%0],#4" : "+r" (ptr) : "r" (n) : "memory" );

Caught me until I just happened to examine the assembler.

Of course %w1 works - but then I need SEPARATE code for 32-bit ARM and for aarch64.

Now that ARMv8 has two register sizes, I would also ask: could it please work like x86 and use the operand size to determine which to emit, x or w?
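
One way to keep a single asm template for both targets today is a small preprocessor helper, sketched below. The macro `REG32`, the function `store_advance`, and the plain-C fallback are hypothetical names introduced for illustration; the 32-bit branch assumes A32 or Thumb-2, where post-indexed `str` is available.

```c
#include <stdint.h>

#if defined(__aarch64__)
#define REG32(n) "%w" #n   /* AArch64: request the w form explicitly */
#elif defined(__arm__)
#define REG32(n) "%" #n    /* 32-bit ARM: registers are already 32 bits */
#endif

/* Store n at *ptr and return the pointer advanced by one element. */
static inline int32_t *store_advance(int32_t *ptr, int32_t n)
{
#if defined(REG32)
    /* Post-indexed store: write 4 bytes, then add 4 to the address register.
       The address operand keeps its default (x form on AArch64). */
    __asm__("str " REG32(1) ", [%0], #4" : "+r"(ptr) : "r"(n) : "memory");
#else
    /* Fallback for hosts that are neither ARM nor AArch64. */
    *ptr++ = n;
#endif
    return ptr;
}
```

With a bare `%1` on AArch64 the instruction would store 8 bytes, overwriting the following element.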
Comment 12 Richard Earnshaw 2016-06-23 13:14:37 UTC
We considered that, but it won't work.  For example, in ILP32 address registers need to use the X form, but are still 32-bits in size.  There are other cases as well where a W or X form is required but that is not the natural size of the object.
Comment 13 nsz 2016-06-23 15:51:07 UTC
there could be a format for reducing porting work between ilp32 and lp64 inline asm:

asm ("ldr %?0, foo" : "=r"(ptr));

where %?0 behaves like %w0 on ilp32 and %x0 on lp64, which is a bit nicer than doing preprocessor hacks like

#ifdef __LP64__
#define XW "x"
#else
#define XW "w"
#endif
asm ("ldr %" XW "0, foo" : "=r"(ptr));
Comment 14 andysem 2020-06-15 11:07:30 UTC
I've been hit by this issue recently.

(In reply to Richard Earnshaw from comment #12)
> We considered that, but it won't work.  For example, in ILP32 address
> registers need to use the X form, but are still 32-bits in size.  There are
> other cases as well where a W or X form is required but that is not the
> natural size of the object.

For the cases where the size needs to be specified explicitly, you can keep the %w1/%x1 syntax. But the default register size, I think, should be chosen based on the asm argument size.

There is one case where I don't feel like %w1/%x1 syntax is correct, even if currently accepted. It's when the argument allows an alternative of an immediate constant:

asm("add %w0, %w1, %w2" : "=r"(r) : "r"(i), "Ir"(j));

Note the last argument. When it is a constant, it doesn't need the "w" modifier, but it needs one if it is a runtime value.
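
A sketch of that mixed case with a literal constant follows. The name `add_small_const`, the value 42, and the fallback are assumptions for illustration; the exact text GCC prints for the immediate operand is not confirmed here, only that the form is reported above as accepted.

```c
#include <stdint.h>

/* Add a small constant using an operand that may be an immediate or a
   register. (Hypothetical helper name.) */
static inline int32_t add_small_const(int32_t i)
{
#if defined(__aarch64__)
    int32_t r;
    /* 42 satisfies "I" (an immediate valid for ADD), so operand 2 is
       expected to be printed as an immediate and the w modifier has no
       register to rename. With a runtime value the compiler would pick
       the "r" alternative and %w2 would print a w register. */
    __asm__("add %w0, %w1, %w2" : "=r"(r) : "r"(i), "Ir"(42));
    return r;
#else
    /* Fallback for other hosts. */
    return i + 42;
#endif
}
```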