Bug 83712 - [7 Regression] "Unable to find a register to spill" when compiling for thumb1
Summary: [7 Regression] "Unable to find a register to spill" when compiling for thumb1
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 7.2.0
Importance: P2 normal
Target Milestone: 8.0
Assignee: sudi
URL:
Keywords: ice-on-valid-code
Depends on:
Blocks:
 
Reported: 2018-01-06 11:52 UTC by mikulas
Modified: 2021-10-11 23:35 UTC
CC List: 7 users

See Also:
Host: x86_64-linux-gnu
Target: arm-linux-gnueabi-gcc-7
Build: x86_64-linux-gnu
Known to work: 8.0
Known to fail: 4.8.5, 4.9.4, 5.4.1, 6.4.1, 7.2.1, 7.5.0
Last reconfirmed: 2018-01-08 00:00:00


Attachments
a reduced testcase for gcc bug (267 bytes, text/plain)
2018-01-06 11:52 UTC, mikulas

Description mikulas 2018-01-06 11:52:04 UTC
Created attachment 43050
a reduced testcase for gcc bug

I get the following error when attempting to compile the attached file with:
$ arm-linux-gnueabi-gcc-7 -mfloat-abi=softfp -mthumb -march=armv5t -O2 kbd-min.c

kbd-min.c: In function ‘handle_trm’:
kbd-min.c:23:1: error: unable to find a register to spill
 }
 ^
kbd-min.c:23:1: error: this is the insn:
(insn 17 43 44 2 (parallel [
            (set (mem:SI (reg/f:SI 129 [121]) [0  S4 A32])
                (mem:SI (reg/f:SI 135 [120]) [0  S4 A32]))
            (set (mem:SI (plus:SI (reg/f:SI 129 [121])
                        (const_int 4 [0x4])) [0  S4 A32])
                (mem:SI (plus:SI (reg/f:SI 135 [120])
                        (const_int 4 [0x4])) [0  S4 A32]))
            (set (mem:SI (plus:SI (reg/f:SI 129 [121])
                        (const_int 8 [0x8])) [0  S4 A32])
                (mem:SI (plus:SI (reg/f:SI 135 [120])
                        (const_int 8 [0x8])) [0  S4 A32]))
            (set (reg/f:SI 129 [121])
                (plus:SI (reg/f:SI 129 [121])
                    (const_int 12 [0xc])))
            (set (reg/f:SI 135 [120])
                (plus:SI (reg/f:SI 135 [120])
                    (const_int 12 [0xc])))
            (clobber (reg:SI 126))
            (clobber (reg:SI 127))
            (clobber (reg:SI 128))
        ]) "kbd-min.c":16 836 {movmem12b}
     (expr_list:REG_UNUSED (reg:SI 128)
        (expr_list:REG_UNUSED (reg:SI 127)
            (expr_list:REG_UNUSED (reg:SI 126)
                (nil)))))
kbd-min.c:23: confused by earlier errors, bailing out

I reproduced the bug on gcc-4.8.4, gcc-6.3.0, gcc-7.2.0.
Comment 1 Richard Earnshaw 2018-01-08 14:56:35 UTC
I can't reproduce this.  My compilers build the testcase without problems.  I've tried various releases of GCC as well.  Exactly how was your compiler configured?  What is the output of "gcc -v"?
Comment 2 mikulas 2018-01-08 15:35:13 UTC
I can reproduce it with
gcc-4.8.4 on Ubuntu 14.04 on native armhf (it doesn't reproduce with gcc-4.9 on the same machine)
arm-linux-gnueabi-gcc-6 cross-compiler on Debian Stretch
arm-linux-gnueabi-gcc-7 cross-compiler on Debian Sid

If you have x86-64 Debian Stretch or Sid, install the package "gcc-arm-linux-gnueabi" to reproduce this bug.
Comment 3 ktkachov 2018-01-08 15:40:48 UTC
I've reproduced it on every version from 4.8 onward.
The configure options for my arm-none-linux-gnueabi compiler are: --with-arch=armv7-a --with-fpu=vfpv3-d16 --with-float=softfp

Compiling with -mfloat-abi=softfp -mthumb -march=armv5t -O2 ICEs
Comment 4 sudi 2018-01-12 17:35:06 UTC
What I have observed so far is that the failure depends on how the
scheduler (sched1) chooses to schedule the movmem12b instruction (insn 16 in all the cases below). If that
instruction is scheduled a bit later (even by one instruction), it's all good!

Even though the movmem12b instruction has a very heavy demand on the registers, shouldn't the allocator and/or the scheduler be able to detect that? Is this a scheduler problem or an allocator problem or neither?

Example Passing cases:

-mfloat-abi=softfp -mthumb -march=armv6 m-bug.c -O2 -S -fdump-rtl-sched1
;; Pressure summary: GENERAL_REGS:8

;;        0--> b  0: i  13 r119=[`*.LC1']                          :(l_a+e_1),l_dc1,l_dc2,l_wb:GENERAL_REGS+1(1)
;;        1--> b  0: i  12 r118=sfp-0x10                           :e_1,e_2,e_3,e_wb:@GENERAL_REGS+1(1)
;;        2--> b  0: i   2 r111=r0                                 :e_1,e_2,e_3,e_wb:@GENERAL_REGS+1(0):model 0
;;        3--> b  0: i  16 {[r118]=[r119];[r118+0x4]=[r119+0x4];[r118+0x8]=[r119+0x8];r120=r118+0xc;r121=r119+0xc;clobber scratch;clobber scratch;clobber scratch;}:(l_a+e_1),l_dc1*2,l_dc2,l_wb:GENERAL_REGS+2(1)
...
...

Example Failing case:

-mfloat-abi=softfp -mthumb -march=armv6 m-bug.c -O2 -S -fdump-rtl-sched1 -mtune=cortex-m0plus
;; Pressure summary: GENERAL_REGS:8

;;        0--> b  0: i  13 r119=[`*.LC1']                          :core:GENERAL_REGS+1(1)
;;        1--> b  0: i  12 r118=sfp-0x10                           :core:@GENERAL_REGS+1(1)
;;        2--> b  0: i  16 {[r118]=[r119];[r118+0x4]=[r119+0x4];[r118+0x8]=[r119+0x8];r120=r118+0xc;r121=r119+0xc;clobber scratch;clobber scratch;clobber scratch;}:core*4:GENERAL_REGS+2(1)
...
...

Other passing option:
-mfloat-abi=softfp -mthumb -march=armv6 m-bug.c -O2 -S -fdump-rtl-sched1 -mtune=cortex-m7

Other failing option:
-mfloat-abi=softfp -mthumb -march=armv6 m-bug.c -O2 -S -fdump-rtl-sched1 -mtune=cortex-m4
Comment 5 Vladimir Makarov 2018-01-26 23:05:02 UTC
(In reply to sudi from comment #4)
> What I have observed so far is that the failure occurs based on how the
> scheduler (sched1) chooses to schedule the movmem12b instructions (insn 16
> in all the cases below). If that
> instruction is scheduled a bit later (even by one instruction), its all good!
> 
> Even though the movmem12b instruction has a very heavy demand on the
> registers, shouldn't the allocator and/or the scheduler be able to detect
> that? Is this a scheduler problem or an allocator problem or neither?
> 

It is hard to say which pass (scheduler or RA) is responsible for the bug.  Such bugs were frequent with the older reload pass, which is why the 1st insn scheduling was switched off for i386 long ago.

The newer LRA has a new feature to spill a hard reg live range in some cases. Unfortunately, it does not always work.  In this case the scheduler extends the live ranges of hard registers used for parameter passing.

To improve communication between the scheduler and RA, register-pressure sensitive scheduling was introduced a few years ago.  It should prevent the 'unable to find a register to spill' situation from occurring and decrease the number of spills in RA.

We have two different algorithms for register-pressure sensitive scheduling.  ARM by default uses the 2nd one (MODEL), probably because it results in better generated code.  The 1st algorithm (WEIGHTED) is less aggressive but safer IMHO.  So if you add --param sched-pressure-algorithm=1, the problem goes away.

So I see 3 possible ways to solve the problem:
  1. Fix it in RA which would be very hard.
  2. Fix it in the 2nd pressure-sensitive insn scheduling.  I think Richard Sandiford could do this as the author of the code, or at least say how hard it would be to fix the problem there.
  3. Use more conservative but safe 1st algorithm.  This is the simplest solution.

I'd like to see other people's opinions on which approach to use, because I have no particular preference except avoiding the 1st approach.
Comment 6 Vladimir Makarov 2018-03-07 23:07:28 UTC
I've decided to fix it in RA because it could help to fix analogous bugs when the existing hard reg splitting code fails. This particular bug is more complicated because it happens for a non-small reg class.  It requires a lot of changes in LRA and changes to its sub-pass flow.

I've been working on this PR for some time and now I can say that it will probably be fixed this week.
Comment 7 Vladimir Makarov 2018-03-09 16:01:36 UTC
Author: vmakarov
Date: Fri Mar  9 16:00:36 2018
New Revision: 258390

URL: https://gcc.gnu.org/viewcvs?rev=258390&root=gcc&view=rev
Log:
2018-03-09  Vladimir Makarov  <vmakarov@redhat.com>

	PR target/83712
	* lra-assigns.c (assign_by_spills): Return a flag of reload
	assignment failure.  Do not process the reload assignment
	failures.  Do not spill other reload pseudos if they has the same
	reg class.
	(lra_assign): Add a return arg.  Set up from the result of
	assign_by_spills call.
	(find_reload_regno_insns, lra_split_hard_reg_for): New functions.
	* lra-constraints.c (split_reg): Add a new arg.  Use it instead of
	usage_insns if it is not NULL.
	(spill_hard_reg_in_range): New function.
	(split_if_necessary, inherit_in_ebb): Pass a new arg to split_reg.
	* lra-int.h (spill_hard_reg_in_range, lra_split_hard_reg_for): New
	function prototypes.
	(lra_assign): Change prototype.
	* lra.c (lra): Add code to deal with fails by splitting hard reg
	live ranges.

2018-03-09  Vladimir Makarov  <vmakarov@redhat.com>

	PR target/83712
	* gcc.target/arm/pr83712.c: New.


Added:
    trunk/gcc/testsuite/gcc.target/arm/pr83712.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/lra-assigns.c
    trunk/gcc/lra-constraints.c
    trunk/gcc/lra-int.h
    trunk/gcc/lra.c
    trunk/gcc/testsuite/ChangeLog
Comment 8 Alexandre Oliva 2018-03-10 13:48:22 UTC
Hey, Vlad, I'm afraid bisection tells me r258390 caused a regression I'm seeing on i686-linux-gnu.  hsa-regalloc now fails to compile in stage2:

../../gcc/hsa-regalloc.c: In function ‘void hsa_regalloc()’:
../../gcc/hsa-regalloc.c:728:1: error: unable to find a register to spill
 }
 ^
../../gcc/hsa-regalloc.c:728:1: error: this is the insn:
(insn 1513 3740 3741 249 (parallel [
            (set (reg/v:SI 1258 [orig:435 ret ] [435])
                (div:SI (reg/v:SI 1258 [orig:435 ret ] [435])
                    (reg:SI 1162)))
            (set (reg:SI 1280 [709])
                (mod:SI (reg/v:SI 1258 [orig:435 ret ] [435])
                    (reg:SI 1162)))
            (clobber (reg:CC 17 flags))
        ]) "../../gcc/hsa-regalloc.c":305 312 {*divmodsi4}
     (expr_list:REG_UNUSED (reg:SI 1280 [709])
        (expr_list:REG_DEAD (reg:SI 1162)
            (expr_list:REG_UNUSED (reg:CC 17 flags)
                (nil)))))
during RTL pass: reload
../../gcc/hsa-regalloc.c:728:1: internal compiler error: in lra_split_hard_reg_for, at lra-assigns.c:1802
0x8c4e069 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
        ../../gcc/rtl-error.c:108
0x8aef4e2 lra_split_hard_reg_for()
        ../../gcc/lra-assigns.c:1802
0x8ae909a lra(_IO_FILE*)
        ../../gcc/lra.c:2506
0x8a9cce7 do_reload
        ../../gcc/ira.c:5465
0x8a9d17a execute
        ../../gcc/ira.c:5649
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
Makefile:1110: recipe for target 'hsa-regalloc.o' failed
Comment 9 H.J. Lu 2018-03-10 14:03:17 UTC
(In reply to Alexandre Oliva from comment #8)
> Hey, Vlad, I'm afraid bisection tells me r258390 caused a regression I'm
> seeing on i686-linux-gnu.  hsa-regalloc now fails to compile in stage2:

See

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84806

for simple testcases.
Comment 10 Vladimir Makarov 2018-03-10 16:32:53 UTC
Author: vmakarov
Date: Sat Mar 10 16:32:21 2018
New Revision: 258415

URL: https://gcc.gnu.org/viewcvs?rev=258415&root=gcc&view=rev
Log:
2018-03-10  Vladimir Makarov  <vmakarov@redhat.com>

	Reverting patch:
	2018-03-09  Vladimir Makarov  <vmakarov@redhat.com>

	PR target/83712
	* lra-assigns.c (assign_by_spills): Return a flag of reload
	assignment failure.  Do not process the reload assignment
	failures.  Do not spill other reload pseudos if they has the same
	reg class.
	(lra_assign): Add a return arg.  Set up from the result of
	assign_by_spills call.
	(find_reload_regno_insns, lra_split_hard_reg_for): New functions.
	* lra-constraints.c (split_reg): Add a new arg.  Use it instead of
	usage_insns if it is not NULL.
	(spill_hard_reg_in_range): New function.
	(split_if_necessary, inherit_in_ebb): Pass a new arg to split_reg.
	* lra-int.h (spill_hard_reg_in_range, lra_split_hard_reg_for): New
	function prototypes.
	(lra_assign): Change prototype.
	* lra.c (lra): Add code to deal with fails by splitting hard reg
	live ranges.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/lra-assigns.c
    trunk/gcc/lra-constraints.c
    trunk/gcc/lra-int.h
    trunk/gcc/lra.c
Comment 11 Vladimir Makarov 2018-03-13 20:43:21 UTC
Author: vmakarov
Date: Tue Mar 13 20:42:49 2018
New Revision: 258504

URL: https://gcc.gnu.org/viewcvs?rev=258504&root=gcc&view=rev
Log:
2018-03-13  Vladimir Makarov  <vmakarov@redhat.com>

	PR target/83712
	* lra-assigns.c (find_all_spills_for): Ignore uninteresting
	pseudos.
	(assign_by_spills): Return a flag of reload assignment failure.
	Do not process the reload assignment failures.  Do not spill other
	reload pseudos if they has the same reg class.  Update n if
	necessary.
	(lra_assign): Add a return arg.  Set up from the result of
	assign_by_spills call.
	(find_reload_regno_insns, lra_split_hard_reg_for): New functions.
	* lra-constraints.c (split_reg): Add a new arg.  Use it instead of
	usage_insns if it is not NULL.
	(spill_hard_reg_in_range): New function.
	(split_if_necessary, inherit_in_ebb): Pass a new arg to split_reg.
	* lra-int.h (spill_hard_reg_in_range, lra_split_hard_reg_for): New
	function prototypes.
	(lra_assign): Change prototype.
	* lra.c (lra): Add code to deal with fails by splitting hard reg
	live ranges.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/lra-assigns.c
    trunk/gcc/lra-constraints.c
    trunk/gcc/lra-int.h
    trunk/gcc/lra.c
Comment 12 Jakub Jelinek 2018-10-26 10:07:18 UTC
GCC 6 branch is being closed
Comment 13 Richard Biener 2018-11-22 08:51:04 UTC
The series of LRA changes are not going to be backported.
Comment 14 Richard Biener 2019-11-14 10:53:29 UTC
Fixed in GCC 8.