Bug 104236 - asm statements containing %= assembler templates getting merged
Summary: asm statements containing %= assembler templates getting merged
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: 12.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: documentation, inline-asm
Depends on:
Blocks:
 
Reported: 2022-01-26 00:54 UTC by Nick Desaulniers
Modified: 2022-01-26 01:52 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Nick Desaulniers 2022-01-26 00:54:09 UTC
https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile and
https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#AssemblerTemplate mention

    > Under certain circumstances, GCC may duplicate (or remove duplicates
    > of) your assembly code when optimizing. This can lead to unexpected
    > duplicate symbol errors during compilation if your asm code defines
    > symbols or labels. Using ‘%=’ (see AssemblerTemplate) may help resolve
    > this problem.
    >
    > ‘%=’ Outputs a number that is unique to each instance of the asm
    > statement in the entire compilation. This option is useful when
    > creating local labels and referring to them multiple times in a single
    > template that generates multiple assembler instructions.

I think I might have found a case where GCC is folding two asm strings that look identical, but technically contain a %= assembler template.  Perhaps there's somewhere that checks for parameter equality, and misses checking whether an asm string contains %= ? (or perhaps %= needs to be expanded sooner?)

Test files are in:
https://gist.github.com/nickdesaulniers/c2c37edcdbaf3a2deb914d2ea8011a96

I would have expected the inline asm string containing `.Lreachable%=:` (in media_request_object_complete()) to appear twice in the emitted asm output.  If I modify one of the two `.Lreachable` asm strings by one character, then I see both emitted.
Comment 1 Andrew Pinski 2022-01-26 01:18:31 UTC
On the gimple level we have:
  if (_2 == 0)
    goto <bb 4>; [10.00%]
  else
    goto <bb 3>; [90.00%]

  <bb 3> [local count: 966367639]:
  _6 = MEM[(struct media_request *)media_request_object_complete_obj.0_1].state;
  if (_6 != 0)
    goto <bb 5>; [10.00%]
  else
    goto <bb 6>; [90.00%]

  <bb 4> [local count: 107374184]:
  __asm__ __volatile__(".byte 0x0f, 0x0b");
  __asm__ __volatile__(".Lreachable%=:
	.pushsection .discard.reachable
	.long .Lreachable%= - .
	.popsection
	" :  :  : "memory");
  _raw_spin_unlock_irqrestore (pretmp_46, 0); [tail call]
  goto <bb 10>; [100.00%]

  <bb 5> [local count: 96636765]:
  __asm__ __volatile__(".byte 0x0f, 0x0b" :  : "i" 0);
  __asm__ __volatile__(".Lreachable%=:
	.pushsection .discard.reachable
	.long .Lreachable%= - .
	.popsection
	" :  :  : "memory");
  _raw_spin_unlock_irqrestore (pretmp_46, 0); [tail call]
  goto <bb 10>; [100.00%]

So what the RTL level is doing looks fine, BB 4 and BB 5 are exactly the same so its combining the two.

I don't see why this is a bug.
The documentation does have:
> (or remove duplicates of)


%= is only matters at the end.
Comment 2 Andrew Pinski 2022-01-26 01:25:39 UTC
>or perhaps %= needs to be expanded sooner?

The basic blocks are exactly the same. The %= is only there to resolve the issue where GCC duplicates the inline-asm and needs to be resolved at the end of compiling when outputting the asm itself.

I don't see why even in this case would cause a problem as you have an inline-asm which is the same as the other and they are combined together.

I think the original code should have had the two inline-asm combined together instead of having them seperated.
That is:

  __asm__ __volatile__(".byte 0x0f, 0x0b\n"
  ".Lreachable%=:
	.pushsection .discard.reachable
	.long .Lreachable%= - .
	.popsection
	" :  :  : "memory");

....

  __asm__ __volatile__(".byte 0x0f, 0x0b\n"
  ".Lreachable%=:
	.pushsection .discard.reachable
	.long .Lreachable%= - .
	.popsection
	" :  "i"(0)  : "memory");

That will fix the issue at hand in the code itself really.
Even if you used 1: and 1b inside the inline-asm you would run into the same issue.
Comment 3 Nick Desaulniers 2022-01-26 01:39:57 UTC
Thanks for the feedback. I guess I was expecting these two to be somewhat equivalent:

void x (int a) {
    if (a)
        asm("# %0"::"i"(__COUNTER__));
    else
        asm("# %0"::"i"(__COUNTER__));
}

void y (int a) {
    if (a)
        asm("# %=");
    else
        asm("# %=");
}

as was attempted in kernel commit

commit 3d1e236022cc ("objtool: Prevent GCC from merging annotate_unreachable()")
Comment 4 Andrew Pinski 2022-01-26 01:44:18 UTC
(In reply to Nick Desaulniers from comment #3)
> Thanks for the feedback. I guess I was expecting these two to be somewhat
> equivalent:
> 
> void x (int a) {
>     if (a)
>         asm("# %0"::"i"(__COUNTER__));
>     else
>         asm("# %0"::"i"(__COUNTER__));
> }
> 
> void y (int a) {
>     if (a)
>         asm("# %=");
>     else
>         asm("# %=");
> }

They are not. I was just about going to suggest the use of __COUNTER__ as the correct way of implementing the inline-asm also.

For GCC, %= gets resolved at the very end of compiling when outputting the asm to the output file.

GCC does touch the string or look into the inline-asm string during compiling except to duplicate it or to see if it is an exact match (remove duplicates).
Comment 5 Andrew Pinski 2022-01-26 01:52:18 UTC
You can still use %= if you want and combine it with the input constraint with counter.
That is:
void y (int a) {
    if (a)
        asm("# %="::"i"(__COUNTER__));
    else
        asm("# %="::"i"(__COUNTER__));
}

That way if you still get the same version to use between the two compilers if clang does not have __COUNTER__.