Bug 86547

Summary: s390x: Maximum number of LRA assignment passes is achieved (30) when compiling a small inline assembler snippet
Product: gcc Reporter: Ilya Leoshkevich <iii>
Component: targetAssignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED    
Severity: normal CC: jeffreyalaw, krebbel, vmakarov
Priority: P3 Keywords: ra
Version: 9.0   
Target Milestone: ---   
Host: Target: s390
Build: Known to work:
Known to fail: Last reconfirmed:
Attachments: patch v1

Description Ilya Leoshkevich 2018-07-17 11:58:07 UTC
Shows up on master (6cfa970a):

$ cat test.c
int a;
struct {} __thread b;
void c() {
  __asm__(""
          :
	  : "r"(a), ""(b), ""(b), "r"(a), "r"(a), "m"(a), "m"(a), "m"(a), "m"(a), "m"(a), "m"(a)
          : "r12");
}

$ PATH=gcc:$PATH xgcc -c test.c
during RTL pass: reload
test.c: In function ‘c’:
test.c:8:1: internal compiler error: Maximum number of LRA assignment passes is achieved (30)

 }
 ^
0x1a2d43f lra_assign(bool&)
	../../gcc/gcc/lra-assigns.c:1669
0x1a24e79 lra(_IO_FILE*)
	../../gcc/gcc/lra.c:2482
0x19c147f do_reload
	../../gcc/gcc/ira.c:5465
0x19c147f execute
	../../gcc/gcc/ira.c:5649
Comment 1 Ilya Leoshkevich 2018-07-20 14:03:57 UTC
I did a bisect and found two relevant commits:

1) c312b100: PR target/83712

Before: error: ‘asm’ operand has impossible constraints
After:  internal compiler error: Segmentation fault

2) eaefe34f: PR target/84876

Before: internal compiler error: Segmentation fault
After:  internal compiler error: Maximum number of LRA assignment passes is achieved (30)
Comment 2 Ilya Leoshkevich 2018-07-23 11:22:29 UTC
I dug a bit deeper and found that this used to compile without errors on gcc-4_8_5-release.

Bisect points to s390-specific commit 7b1bda1c, which first appeared in gcc-4_9_0-release:

    2013-06-06  Vladimir Makarov  <vmakarov@redhat.com>

            * config/s390/s390.opt (mlra): New option.
            * config/s390/s390.c (s390_decompose_address): Check displacement
            for all registers for LRA.
            (s390_secondary_reload): Don't used secondary reloads for LRA.
            (s390_lra_p): New function.
            (TARGET_LRA_P): Define.
            * config/s390/s390.md (*movmem_short, *clrmem_short): Change value
            of attribute cpu_facility to zarch for the last alternative.
            (*cmpmem_short): Ditto.

This commit appears to have introduced LRA for s390x.
Comment 3 Ilya Leoshkevich 2018-07-24 12:09:09 UTC
I think I found an issue in spill_hard_reg_in_range().

The idea behind the rclass loop in spill_hard_reg_in_range() seems to
be: find a hard_regno, which in general conflicts with reload regno,
but does not do so between `from` and `to`, and then do the live range
splitting based on this information. To check the absence of conflicts,
we make use of insn_bitmap, which does not contain insns which clobber
the hard_regno.

My current solution is: when selecting the hard_regno, make sure no insn
between `from` and `to` clobbers it. This makes the compile fail with
‘asm’ operand has impossible constraints instead of an ICE, which is an
improvement, but is still not perfect.
Comment 4 Ilya Leoshkevich 2018-07-24 12:09:31 UTC
Created attachment 44430 [details]
patch v1
Comment 5 Ilya Leoshkevich 2018-07-27 12:10:56 UTC
I've further narrowed this down to the difference between the following
snippets: 

// error: impossible asm constraint
struct {} __thread b;
void c() {
  __asm__("la 0,%0\n"
          :
          : "m" (b)
          : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8", "r9",
            "r10", "r12");  // free: r11,r13,r14,r15
}

// works
struct {} b;
void c() {
  __asm__("la 0,%0\n"
          "la 1,%1\n"
          :
          : "m" (b), "m" (b)
          : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8", "r9",
            "r10", "r12");  // free: r11,r13,r14,r15
}

The problem appears to be that s390 never allows the register allocator
to use r11 and r15, and also r13 when the function uses the constant
pool, which is the case when TLS variables are used. This leaves us only
with 1 free register r14, and we need 2 in order to access TLS variable.

For z10 and later machines it's in theory possible to free up r13, but I
would use this bug to only fix the ICE.

Vladimir, could you please take a look at the attached patch? I
ran regression with and without it on x86_64, and compare_tests did not
show any new failures.
Comment 6 Vladimir Makarov 2018-07-27 18:03:23 UTC
(In reply to Ilya Leoshkevich from comment #5)
> 
> 
> Vladimir, could you please take a look at the attached patch? I
> ran regression with and without it on x86_64, and compare_tests did not
> show any new failures.

The patch looks good for me.  You can commit it to the trunk.  Thanks for working on this.
Comment 7 Andreas Krebbel 2018-07-30 08:32:44 UTC
Author: krebbel
Date: Mon Jul 30 08:30:06 2018
New Revision: 263063

URL: https://gcc.gnu.org/viewcvs?rev=263063&root=gcc&view=rev
Log:
lra: consider clobbers when selecting hard_regno to spill

The idea behind the rclass loop in spill_hard_reg_in_range() seems to
be: find a hard_regno, which in general conflicts with reload regno,
but does not do so between `from` and `to`, and then do the live range
splitting based on this information. To check the absence of conflicts,
we make use of insn_bitmap, which does not contain insns which clobber
the hard_regno.

gcc/ChangeLog:

2018-07-30  Ilya Leoshkevich  <iii@linux.ibm.com>

        PR target/86547
	* lra-constraints.c (spill_hard_reg_in_range): When selecting the
	hard_regno, make sure no insn between `from` and `to` clobbers it.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/lra-constraints.c
Comment 8 Jeffrey A. Law 2018-08-15 04:10:17 UTC
Author: law
Date: Wed Aug 15 04:09:45 2018
New Revision: 263548

URL: https://gcc.gnu.org/viewcvs?rev=263548&root=gcc&view=rev
Log:
        PR target/86547
	* lra-lives.c (remove_some_program_points_and_update_live_ranges):
        Check whether lra_live_max_point is 0 before dividing.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/lra-lives.c
Comment 9 Jeffrey A. Law 2018-08-15 04:12:23 UTC
Fixed on the trunk.