Bug 59477 - [4.8 Regression] ICE: in assign_by_spills, at lra-assigns.c:1281 with -O
Summary: [4.8 Regression] ICE: in assign_by_spills, at lra-assigns.c:1281 with -O
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: 4.9.0
: P3 normal
Target Milestone: 4.8.3
Assignee: Not yet assigned to anyone
URL:
Keywords: ice-on-valid-code
Depends on:
Blocks:
 
Reported: 2013-12-11 21:44 UTC by Zdenek Sojka
Modified: 2014-01-22 21:02 UTC (History)
7 users (show)

See Also:
Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu
Build:
Known to work: 4.7.4
Known to fail: 4.8.3, 4.9.0
Last reconfirmed: 2013-12-13 00:00:00


Attachments
partially reduced testcase (594 bytes, text/x-csrc)
2013-12-11 21:44 UTC, Zdenek Sojka
Details
gcc49-pr59477.patch (3.98 KB, patch)
2014-01-15 14:52 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Zdenek Sojka 2013-12-11 21:44:35 UTC
Created attachment 31422 [details]
partially reduced testcase

Compiler output:
$ gcc -O testcase.C         
testcase.C: In function 'void test1(_Bit_iterator)':
testcase.C:90:1: internal compiler error: in assign_by_spills, at lra-assigns.c:1281
 }
 ^
0xb697f3 assign_by_spills
        /mnt/svn/gcc-trunk/gcc/lra-assigns.c:1281
0xb6a57b lra_assign()
        /mnt/svn/gcc-trunk/gcc/lra-assigns.c:1439
0xb65ac2 lra(_IO_FILE*)
        /mnt/svn/gcc-trunk/gcc/lra.c:2361
0xb13656 do_reload
        /mnt/svn/gcc-trunk/gcc/ira.c:5455
0xb13656 rest_of_handle_reload
        /mnt/svn/gcc-trunk/gcc/ira.c:5584
0xb13656 execute
        /mnt/svn/gcc-trunk/gcc/ira.c:5613
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.


$ /mnt/svn/gcc-trunk/binary-latest/bin/gcc -v           
Using built-in specs.
COLLECT_GCC=/mnt/svn/gcc-trunk/binary-latest/bin/gcc
COLLECT_LTO_WRAPPER=/mnt/svn/gcc-trunk/binary-205881-lto-fortran-checking-yes-rtl-df/libexec/gcc/x86_64-unknown-linux-gnu/4.9.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: /mnt/svn/gcc-trunk//configure --enable-checking=yes,rtl,df --enable-languages=c,c++,lto,fortran --prefix=/mnt/svn/gcc-trunk/binary-205881-lto-fortran-checking-yes-rtl-df/ --without-cloog --without-ppl
Thread model: posix
gcc version 4.9.0 20131211 (experimental) (GCC) 


Tested revisions:
r205881 - crash
4.8 r204890 - crash
4.7 r204889 - OK
Comment 1 Tom de Vries 2013-12-13 13:54:38 UTC
I could reproduce this.
Comment 2 Vladimir Makarov 2013-12-13 19:21:41 UTC
 Reload pass gives

b.i: In function ‘void test1(_Bit_iterator)’:
b.i:90:1: error: unable to find a register to spill in class ‘CREG’
 }
 ^
b.i:90:1: error: this is the insn:
(insn 29 43 30 2 (parallel [
            (set (reg:DI 37 r8)
                (ashift:DI (reg:DI 37 r8 [92])
                    (subreg:QI (reg:SI 90 [ _M_offset ]) 0)))
            (clobber (reg:CC 17 flags))
        ]) b.i:54 528 {*ashldi3_1}
     (expr_list:REG_DEAD (reg:DI 37 r8 [92])
        (expr_list:REG_DEAD (reg:SI 90 [ _M_offset ])
            (expr_list:REG_UNUSED (reg:CC 17 flags)
                (nil)))))
b.i:90:1: internal compiler error: in spill_failure, at reload1.c:2106
0xaf796a _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
        /home/cygnus/vmakarov/build1/trunk2/gcc/gcc/rtl-error.c:109
0xaf63d3 spill_failure
        /home/cygnus/vmakarov/build1/trunk2/gcc/gcc/reload1.c:2106
0xaf63d3 find_reload_regs
        /home/cygnus/vmakarov/build1/trunk2/gcc/gcc/reload1.c:2032
0xaf63d3 select_reload_regs
...

Before combiner we have:

   28: cx:DI=0
   29: r8:SI=r97:DI#0
      REG_DEAD r97:DI
   30: dx:DI=r103:DI
      REG_DEAD r103:DI
      REG_EQUAL frame:DI-0x2
   31: si:DI=r104:DI
      REG_DEAD r104:DI
      REG_EQUAL frame:DI-0x1
   32: di:DI=r98:DI
      REG_DEAD r98:DI
      REG_EQUAL frame:DI-0x20
   33: call [`_Z9__find_ifI16reverse_iteratorI13_Bit_iteratorE17_Iter_equals_iterIS2_EET_S5_S5_T0_'] argc:0
      REG_DEAD r8:SI
      REG_DEAD di:DI
      REG_DEAD si:DI
      REG_DEAD cx:DI
      REG_DEAD dx:DI

After combiner we have:

   28: cx:DI=0
   29: {r8:DI=r92:DI<<r90:SI#0;clobber flags:CC;}
      REG_UNUSED flags:CC
      REG_DEAD r90:SI
      REG_DEAD r92:DI
   30: dx:DI=r103:DI
      REG_DEAD r103:DI
      REG_EQUAL frame:DI-0x2
   31: si:DI=r104:DI
      REG_DEAD r104:DI
      REG_EQUAL frame:DI-0x1
   32: di:DI=r98:DI
      REG_DEAD r98:DI
      REG_EQUAL frame:DI-0x20
   33: call [`_Z9__find_ifI16reverse_iteratorI13_Bit_iteratorE17_Iter_equals_iterIS2_EET_S5_S5_T0_'] argc:0
      REG_DEAD r8:SI
      REG_DEAD di:DI
      REG_DEAD si:DI
      REG_DEAD cx:DI
      REG_DEAD dx:DI

As r90 in inns 29 needs to be in cx and cx is already alive, neither reload nor LRA can do anything.

I'll still investigate what can be done in LRA to fix it but I am not sure I'll succeed.
Comment 3 Jakub Jelinek 2014-01-02 11:17:28 UTC
Started with r175063, but I guess that just made the problem no longer latent.
More reduced testcase:

struct A
{
  unsigned *a, b;
  A (unsigned x) : a (), b (x) {}
};

struct B
{
  B (int);
  B (const B &) {}
};

B bar (B, B, A);
int v;

void
foo ()
{
  B c = 0;
  bar (c, c, A (1ULL << v));
}

I don't think this is a RA bug, the problem is I think that combine changes:
 (insn 20 19 21 2 (set (reg:DI 2 cx)
         (const_int 0 [0])) pr59477.C:20 62 {*movdi_internal_rex64}
      (nil))
 
-(insn 21 20 22 2 (set (reg:SI 37 r8)
-        (subreg:SI (reg:DI 64) 0)) pr59477.C:20 64 {*movsi_internal}
-     (expr_list:REG_DEAD (reg:DI 64)
-        (nil)))
+(insn 21 20 22 2 (parallel [
+            (set (reg:DI 37 r8)
+                (ashift:DI (reg:DI 65)
+                    (subreg:QI (reg:SI 63 [ v ]) 0)))
+            (clobber (reg:CC 17 flags))
+        ]) pr59477.C:20 503 {*ashldi3_1}
+     (expr_list:REG_UNUSED (reg:CC 17 flags)
+        (expr_list:REG_DEAD (reg:SI 63 [ v ])
+            (expr_list:REG_DEAD (reg:DI 65)
+                (nil)))))

and thus moves a complex operation (which in this case will need %ecx register)
to a place where hard regs need to be live across that and thus makes it hard for RA to do it's job.

Usually combiner will reject such combination on i?86/x86_64 in the fn argument hard reg setup insns because of cant_combine_insn_p, but here r8 isn't likely spilled.  Also, typically cx is initialized after r8, because hard register arguments are initialized from last argument to first.  But in this case there is one argument passed in rcx/r8 pair and within a single argument we typically initialize the first half before the second one.

As no arguments are passed in more than two hard registers, supposedly
the rcx/r8 case is the only problematic one for function arguments, so perhaps
ix86_legitimate_combined_insn could if the output is hard register r8 look for previous insn if it sets rcx and in that case reject the combination.

In general, e.g. for local register asm variables (e.g. as used for various
syscall inline asm wrappers etc.), I wonder if it isn't unsafe to combine anything across instructions that set likely spilled hard registers, so perhaps
alternative would be to take that into account when creating LOG_LINKS in
create_log_links.  As clearing the whole next_use array upon seeing
  if (HARD_REGISTER_NUM_P (regno)
      && ! TEST_HARD_REG_BIT (fixed_reg_set, regno)
      && targetm.class_likely_spilled_p (REGNO_REG_CLASS (regno)))
early in the
  for (def_vec = DF_INSN_DEFS (insn); *def_vec; def_vec++)
loop might be too expensive, perhaps we'd need to add another array with a tick count when next_use was set and increment the tick count on the above mentioned condition, and consider uses with tick count smaller than the current one as next_use[] == NULL.

Thoughts on this?
Comment 4 Eric Botcazou 2014-01-02 15:45:51 UTC
> Usually combiner will reject such combination on i?86/x86_64 in the fn
> argument hard reg setup insns because of cant_combine_insn_p, but here r8
> isn't likely spilled.  Also, typically cx is initialized after r8, because
> hard register arguments are initialized from last argument to first.  But in
> this case there is one argument passed in rcx/r8 pair and within a single
> argument we typically initialize the first half before the second one.

Is rcx likely spilled?  If so, the situation looks similar to the one dealt with by likely_spilled_retval_p so maybe the machinery could be extended to argument registers.
Comment 5 Jakub Jelinek 2014-01-02 16:02:00 UTC
(In reply to Eric Botcazou from comment #4)
> > Usually combiner will reject such combination on i?86/x86_64 in the fn
> > argument hard reg setup insns because of cant_combine_insn_p, but here r8
> > isn't likely spilled.  Also, typically cx is initialized after r8, because
> > hard register arguments are initialized from last argument to first.  But in
> > this case there is one argument passed in rcx/r8 pair and within a single
> > argument we typically initialize the first half before the second one.
> 
> Is rcx likely spilled?  If so, the situation looks similar to the one dealt
> with by likely_spilled_retval_p so maybe the machinery could be extended to
> argument registers.

Yes, rcx is likely spilled, just r8 isn't and there is rcx = 0; r8 = pseudo1;
in that order before combine and combine is changing it to rcx = 0; r8 = pseudo2 << pseudo3;
Comment 6 Jakub Jelinek 2014-01-02 16:50:45 UTC
A problem with doing this in create_log_links with a simpler tick counter might be that it would prevent any combination across most of the CALL_INSNs.
Dunno how often that happens in practice though.
So perhaps for calls we'd need to do something smarter, like if we see a CALL_INSN, look at CALL_INSN_FUNCTION_USAGE and if there are any likely spilled
regs used for arguments or return value, try to find the range of insns between
last copying of return value reg (if likely spilled) into pseudo(s), or CALL_INSN if not returning value or return value is not likely spilled, and
first insn to set up call argument in likely spilled argument.  We could then treat the whole range of insns as a range that clears next_use rather than setting it to insns in that range, and ignore the above mentioned special handling of likely spilled hard reg setters in that range.
Comment 7 Jakub Jelinek 2014-01-15 14:52:46 UTC
Created attachment 31844 [details]
gcc49-pr59477.patch

Untested fix.
Comment 8 Vladimir Makarov 2014-01-22 19:39:19 UTC
Author: vmakarov
Date: Wed Jan 22 19:38:47 2014
New Revision: 206938

URL: http://gcc.gnu.org/viewcvs?rev=206938&root=gcc&view=rev
Log:
2014-01-22  Vladimir Makarov  <vmakarov@redhat.com>

	PR rtl-optimization/59477
	* lra-constraints.c (inherit_in_ebb): Process call for living hard
	regs.  Update reloads_num and potential_reload_hard_regs for all
	insns.

2014-01-22  Vladimir Makarov  <vmakarov@redhat.com>

	PR rtl-optimization/59477
	* g++.dg/pr59477.C: New.


Added:
    trunk/gcc/testsuite/g++.dg/pr59477.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/lra-constraints.c
    trunk/gcc/testsuite/ChangeLog
Comment 9 Jeffrey A. Law 2014-01-22 21:02:38 UTC
Fixed by Vlad's commit on the trunk.