103616 – [11/12/13/14 Regression] ICE on ceph with systemtap macro since r8-5608

Status:	UNCONFIRMED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	middle-end
Version:	12.0

Importance:	P2 normal
Target Milestone:	11.5
Assignee:	Not yet assigned to anyone

URL:
Keywords:	ice-on-valid-code, inline-asm, ra

Depends on:
Blocks:

Reported:	2021-12-08 09:44 UTC by Jakub Jelinek
Modified:	2023-07-07 10:41 UTC
CC List:	3 users

See Also:	98991
Host:
Target:	x86_64--
Build:
Known to work:
Known to fail:
Last reconfirmed:

Description Jakub Jelinek 2021-12-08 09:44:39 UTC

Since r8-5608-gd555138e648961fdc572d8afdb234b52978828f9 the following testcase
ICEs at -O2 -fPIC on x86_64-linux:
long a, b;
void bar (char *, long);
void baz (char, char);
void qux (char *, char *);

void
foo (void)
{
  while (1)
    {
      char c, d, e, f;
      bar (&c, a);
      bar (&d, b);
      baz (c, d);
      qux (&e, &f);
      double g = 0;
      __asm__("" : : "norfxy" (g));
    }
}

during RTL pass: reload
dump file: rh2027386.c.301r.reload
rh2027386.c: In function ‘foo’:
rh2027386.c:19:1: internal compiler error: maximum number of generated reload insns per insn achieved (90)
   19 | }
      | ^
0x11156b7 lra_constraints(bool)
	../../gcc/lra-constraints.c:5084
0x10fe2de lra(_IO_FILE*)
	../../gcc/lra.c:2336
0x10a590d do_reload
	../../gcc/ira.c:5932
0x10a5dfc execute
	../../gcc/ira.c:6118
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

Similar LRA looping on these "norfxy" constraints was fixed by
r9-9463-g49cc1253d079bbefc1, but that fix does not cover this testcase.

First, it would be nice to avoid the LRA looping (I don't know yet whether LRA or the backend is at fault).

Second, I wonder whether the cheapest reload, when the insn allows memory,
wouldn't be to use the literal pool memory directly.  E.g. on
void
foo (void)
{
  double d = 0.0, e = 7.8;
  __asm ("# %0 %1" : : "m" (d), "m" (e));
}

void
bar (void)
{
  double d = 0.0, e = 7.8;
  __asm ("# %0 %1" : : "mr" (d), "mr" (e));
}

void
baz (void)
{
  double d = 0.0, e = 7.8;
  __asm ("# %0 %1" : : "mrx" (d), "mrx" (e));
}

void
qux (void)
{
  double d = 0.0, e = 7.8;
  __asm ("# %0 %1" : : "mrfx" (d), "mrfx" (e));
}

For foo we emit a weird load of the floating point constants from the constant pool,
store them on the stack and use those stack slots as operands (this is an expansion fault, not an RA fault), while for bar, baz and qux the combiner combines the constant pool memories into the inline asm and they survive RA there.
So, after the looping is fixed, it would be nice if the RA also considered propagating constant pool MEMs (they are constant, so they can't be clobbered by function calls etc. in between) into input operands that accept memory.
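
As a source-level illustration of the operand form being asked for, the sketch below passes read-only static doubles straight to "m" operands, so the source itself needs no stack copy. The names const_zero, const_e and quux are hypothetical and not part of the original testcase, and x86_64 is assumed as in the report. This only approximates what the RA might do with constant pool MEMs; it is not a fix for the bug.

```c
/* Hypothetical sketch, not from the original report: read-only statics
   stand in for constant pool entries.  Assumes x86_64, where '#' starts
   an assembler comment.  */
static const double const_zero = 0.0;
static const double const_e = 7.8;

double
quux (void)
{
  /* Both inputs are memory operands that name constant data directly;
     the asm text is only a comment, so it adds no instructions.  */
  __asm__ ("# %0 %1" : : "m" (const_zero), "m" (const_e));
  return const_zero + const_e;
}
```

Comparing the -O2 assembly of quux with that of foo above should show whether the extra stack stores are gone.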

Note, systemtap recently changed norfxy to norx for x86_64; I think both the y and the f in there are too dangerous.  But even with the norx constraint, if a floating point constant is used and the combiner doesn't combine it for some reason (e.g. multiple uses), it would be nice if the operands of the systemtap macros were as cheap as possible, avoiding runtime code to compute the values when possible.
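
A minimal sketch of the "multiple uses" case mentioned above, assuming x86_64: one double constant feeds two asm statements through the norx constraint. The function name probe_twice and the asm comment strings are hypothetical; the real systemtap macros expand to considerably more than this.

```c
/* Hypothetical sketch, not from the original report: one double constant
   used by two asm statements, i.e. the "multiple uses" case.
   Assumes x86_64 ('x' is the SSE register constraint there).  */
double
probe_twice (void)
{
  double g = 7.8;
  /* "norx" is the constraint string the report says systemtap now uses
     on x86_64.  */
  __asm__ ("# first probe %0" : : "norx" (g));
  __asm__ ("# second probe %0" : : "norx" (g));
  return g + g;
}
```

Looking at the operands substituted into the two asm comments in the -O2 output shows whether the constant reached them as a constant pool reference or through a register load.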

Comment 1 Vladimir Makarov 2022-01-28 14:57:35 UTC

I cannot reproduce the ICE with this week's GCC.  It was probably fixed (or switched off) by some recent RA patch.

As for the second issue (code generation for function foo), I thought for some time about how it could be fixed.  It seemed that the LRA inheritance sub-pass could be extended to work on memory in addition to regs.  But I came to the conclusion that this would further complicate the already complicated LRA inheritance sub-pass, as we would need to add sophisticated analysis (including aliasing) for memory.

I guess there is a simpler alternative solution.  The problem would disappear if the double constant were in the asm insn before LRA.  I think some pass before RA could do this.  It could be driven by the target, for example to promote double constants for x86-64.

The problem might also be solved if we had a pseudo<-double insn instead of a mem<-double insn before LRA: the LRA code dealing with equiv could then promote the double into the asm insn (I am not 100% sure about this, but if it is not the case, the code dealing with equiv could probably be tweaked to do this).

So my proposal is to solve the problem somehow outside RA.

Comment 2 Jakub Jelinek 2022-01-28 16:23:44 UTC

#c0 doesn't ICE on the trunk since
r12-5944-ga7acb6dca941db2b1c135107dac3a34a20650d5c

Comment 3 Richard Biener 2022-05-27 09:46:51 UTC

GCC 9 branch is being closed

Comment 4 Jakub Jelinek 2022-06-28 10:47:22 UTC

GCC 10.4 is being released, retargeting bugs to GCC 10.5.

Comment 5 Richard Biener 2023-07-07 10:41:44 UTC

GCC 10 branch is being closed.