This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug target/65249] unable to find a register to spill in class 'R0_REGS' when compiling protobuf on sh4


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65249

--- Comment #8 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Created attachment 34910
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34910&action=edit
Various R0 pre-alloc splits (2)

(In reply to Oleg Endo from comment #7)
> Created attachment 34902 [details]
> Various R0 pre-alloc splits
> 
> I've tried compiling CSiBE with the two split patterns in comment #5 ...
> It's a can of worms.  I'm dumping my current state as a patch here, maybe
> it's helpful in some way.  I ended up adding one r0 split pattern after
> another, but probably some are still missing.
> 
> bzip2 (haven't tried anything else so far) fails with:
> 
> internal compiler error: in spill_failure, at reload1.c:2143
> error: this is the insn:
> (insn 8965 8964 8966 70 (set (mem:SI (plus:SI (reg:SI 0 r0)
>                 (reg:SI 945 [ D.5064 ])) [1 MEM[base: _692, index: _1412,
> offset: 0B]+0 S4 A32])
>         (reg:SI 7 r7 [orig:592 D.5066 ] [592])) bzip2-1.0.2/blocksort.c:564
> 252 {movsi_ie}
>      (expr_list:REG_DEAD (reg:SI 7 r7 [orig:592 D.5066 ] [592])
>         (expr_list:REG_DEAD (reg:SI 0 r0)
>             (nil))))


The problem is this:

#define INDEX_REG_CLASS \
  (!ALLOW_INDEXED_ADDRESS ? NO_REGS : TARGET_SHMEDIA ? GENERAL_REGS : R0_REGS)

which makes reload insist on trying to put reg:SI 945 into r0, although it's
not really needed.

Changing it to '#define INDEX_REG_CLASS  GENERAL_REGS' helps.
With the attached patch bzip2 compiles.

I've taken out the save/restore move insns for r0 in the split patterns because
the trailing r0 restores create artificial/wrong r0 live ranges and trigger
further r0 spill failures down the code path.  Of course this will generate
wrong code in some situations if r0 is already live at the point, but it allows
a first code comparison.

There are code size increases.  One of the reasons is that the reg+reg mem
access patterns always force the first addr op into r0.  This is not always a
good choice.  It would be OK to do so, if there was proper address mode
selection done before RA.  

Generally, it seems reg+reg address modes are not utilized in some cases.  E.g.
this code (without the patch, trunk compiler):
.L113:
        mov.b   @r6+,r1
        dt      r3
        extu.b  r1,r1
        shll2   r1
        add     r5,r1
        mov.l   @r1,r7
        add     #1,r7
        bf/s    .L113
        mov.l   r7,@r1

should be:
        mov     r5,r0
.L113:
        mov.b   @r6+,r1
        dt      r3
        extu.b  r1,r1
        shll2   r1
        mov.l   @(r0,r1),r7
        add     #1,r7
        bf/s    .L113
        mov.l   r7,@(r0,r1)

But that's a whole different topic.
With the patched compiler the loop above gets worse:

.L113:
        mov.b    @r6+,r1
        dt    r3
        mov.w    .L187,r4    <<
        extu.b    r1,r1
        shll2    r1
        add    r15,r4      <<
        add    r4,r1
        mov.l    @r1,r7
        add    #1,r7
        bf/s    .L113
        mov.l    r7,@r1

The marked insns re-calculate the address of something on the stack inside the
loop, although that address already has been calculated before the loop.  Maybe
mem aliasing info is dropped/wrong somewhere in the split patterns, I don't
know...

At the moment, my impression is that pre-allocating r0 could work and result in
some improvements, probably even with LRA.  But it's a lot more effort than
simply adding the split patterns as in the patches.  For instance, r0 live
ranges have to be determined in order to place r0 save/restore move insns and
r0 uses should be propagated into surrounding insns to reduce the amount of
move insns to/from r0 which RA has to resolve.  Additionally, an address mode
selection pass needs to be in place which figures out how to assign index/base
regs for reg+reg mems etc.

I guess all in all this would end up being a highly specialized RA for the
single register r0.

All of that is definitely not going to help GCC 4.9 or 5.

Kaz, maybe you have an idea how to quick fix this problem?  I'm too obsessed
with the r0-prealloc idea at the moment and can't see clearly.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]