Too much pressure on R0 for SH4

Joern Rennecke joern.rennecke@superh.com
Mon Aug 9 16:16:00 GMT 2004


> Hi Joern, Hi Alex,
> 
>   I have come across a case where GCC (from today's sources) is unable
>   to find a spill register in class R0_REG for an sh4-linux target.
>   It was building one of the source files in libstdc++-v3 with this
>   command line:
> 
>   cc1plus -fpreprocessed demangle.ii -quiet -dumpbase demangle.cc \
>    -auxbase-strip .libs/demangle.o -g -O2 -Wall -W -Wwrite-strings \
>    -Wcast-qual -fno-implicit-templates -fdiagnostics-show-location=once \
>    -ffunction-sections -fdata-sections -fimplicit-templates -fPIC \
>    -o demangle.s
> 
>  (Preprocessed file demangle.ii is attached).
>  
>   If I change the definition of OVERRIDE_OPTIONS in gcc/config/sh/sh.h
>   so that flag_schedule_insns is set to 0 when SMALL_REGISTER_CLASSES
>   is true, regardless of whether TARGET_HARD_SH4 is true or not then
>   the problem goes away.
> 
>   Is this the correct way to solve this problem?

No.  The problem is that the CFG-based jump optimization has once again
torn apart an EBB in which the return value of a function is live.

The EBB first appears in the demangle.cc.03.eh dump:

[at the end of bb 7]:

(call_insn 94 93 1505 7 /work/builds/devo/branches/sh2a-040305-branch/sh4-linux/sources/tools-src/libstdc++-v3/src/demangle.cc:123 (parallel [
            (set (reg:SI 0 r0)
                (call (mem:SI (symbol_ref/i:SI ("_ZN9__gnu_cxx9demangler7sessionISaIcEE15decode_encodingERSsPKciRKNS0_22implementation_detailsE") [flags 0x1] <function_decl 0x40951e0c decode_encoding>) [0 S4 A32])
                    (const_int 0 [0x0])))
            (use (reg:PSI 151 ))
            (use (reg:SI 12 r12))
            (clobber (reg:SI 146 pr))
            (clobber (scratch:SI))
        ]) -1 (nil)
    (expr_list:REG_EH_REGION (const_int 2 [0x2])
        (nil))
    (expr_list (use (reg:SI 7 r7))
        (expr_list (use (reg:SI 6 r6))
            (expr_list (use (reg:SI 5 r5))
                (expr_list (use (reg:SI 4 r4))
                    (nil))))))

(note 1505 94 95 8 [bb 8] NOTE_INSN_BASIC_BLOCK)

(insn 95 1505 97 8 /work/builds/devo/branches/sh2a-040305-branch/sh4-linux/sources/tools-src/libstdc++-v3/src/demangle.cc:123 (set (reg/v:SI 172 [ cnt ])
        (reg:SI 0 r0)) -1 (nil)
    (nil))

The exception handling logic dictates that call_insn 94 ends a basic block,
so insn 95, which copies the return value, is in another basic block.
But since both basic blocks are part of the same extended basic block,
and the instructions are otherwise adjacent, that's still fine with reload.

However, the next dump, demangle.cc.04.jump, shows us that the EBB has
been butchered:

(call_insn 94 93 1618 7 /work/builds/devo/branches/sh2a-040305-branch/sh4-linux/sources/tools-src/libstdc++-v3/src/demangle.cc:123 (parallel [
            (set (reg:SI 0 r0)
                (call (mem:SI (symbol_ref/i:SI ("_ZN9__gnu_cxx9demangler7sessionISaIcEE15decode_encodingERSsPKciRKNS0_22implementation_detailsE") [flags 0x1] <function_decl 0x40951e0c decode_encoding>) [0 S4 A32])
                    (const_int 0 [0x0])))
            (use (reg:PSI 151 ))
            (use (reg:SI 12 r12))
            (clobber (reg:SI 146 pr))
            (clobber (scratch:SI))
        ]) -1 (nil)
    (expr_list:REG_EH_REGION (const_int 2 [0x2])
        (nil))
    (expr_list (use (reg:SI 7 r7))
        (expr_list (use (reg:SI 6 r6))
            (expr_list (use (reg:SI 5 r5))
                (expr_list (use (reg:SI 4 r4))
                    (nil))))))
;; End of basic block 7, registers live:
 (nil)

;; Start of basic block 8, registers live: (nil)
(note 1618 94 1620 8 [bb 8] NOTE_INSN_BASIC_BLOCK)

(jump_insn 1620 1618 1621 8 (set (pc)
        (label_ref 1619)) -1 (nil)
    (nil))
;; End of basic block 8, registers live:
 (nil)

(barrier 1621 1620 1610)

;; Start of basic block 9, registers live: (nil)
(code_label/s 1610 1621 1613 9 5461 "" [1 uses])

(note 1613 1610 1611 9 [bb 9] NOTE_INSN_BASIC_BLOCK)

(insn 1611 1613 1612 9 (set (reg:SI 313 [ save_eptr.3735 ])
        (reg:SI 4 r4)) -1 (nil)
    (nil))


You have to find out what is tearing the call apart from the return-value
copy, and stop it from doing that.
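
To make the property concrete, here is a minimal sketch in C of the check
the two dumps are being compared against: after the call, skipping only
notes, the very next insn should be the copy out of r0.  This is a toy
model and not GCC code.  The toy_insn structure, the toy_kind enum and all
of the toy_* functions are hypothetical names made up for illustration;
only the insn uids (94, 95, 1505, 1618, 1620, 1621) and hard register 0
are taken from the dumps above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for RTL insn kinds.  These are NOT GCC's real data
   structures; they only mirror what the dumps print.  */
enum toy_kind { TOY_INSN, TOY_CALL, TOY_JUMP, TOY_NOTE, TOY_BARRIER };

struct toy_insn
{
  enum toy_kind kind;
  int uid;                /* insn uid as printed in the dumps */
  int src_regno;          /* hard register read by a copy, or -1 */
  struct toy_insn *next;  /* next insn in the chain, or NULL */
};

/* Return 1 if the first non-note insn after CALL is a plain insn that
   reads hard register REGNO, and 0 otherwise.  */
int
toy_return_copy_adjacent (const struct toy_insn *call, int regno)
{
  const struct toy_insn *p;

  assert (call != NULL && call->kind == TOY_CALL);
  for (p = call->next; p != NULL; p = p->next)
    {
      if (p->kind == TOY_NOTE)
        continue;  /* a basic block note does not separate the pair */
      return p->kind == TOY_INSN && p->src_regno == regno;
    }
  return 0;
}

/* Chain as in the .03.eh dump: call 94, note 1505, copy insn 95.  */
int
toy_check_eh_dump (void)
{
  struct toy_insn copy = { TOY_INSN, 95, 0, NULL };
  struct toy_insn note = { TOY_NOTE, 1505, -1, &copy };
  struct toy_insn call = { TOY_CALL, 94, -1, &note };

  return toy_return_copy_adjacent (&call, 0);
}

/* Chain as in the .04.jump dump: call 94, note 1618, jump 1620,
   barrier 1621.  */
int
toy_check_jump_dump (void)
{
  struct toy_insn barrier = { TOY_BARRIER, 1621, -1, NULL };
  struct toy_insn jump = { TOY_JUMP, 1620, -1, &barrier };
  struct toy_insn note = { TOY_NOTE, 1618, -1, &jump };
  struct toy_insn call = { TOY_CALL, 94, -1, &note };

  return toy_return_copy_adjacent (&call, 0);
}
```

In this model the .03.eh chain passes the check and the .04.jump chain
fails it, because jump_insn 1620 now sits between the call and the copy.
A real check inside GCC would have to work on the actual insn chain and
basic block structures, which this sketch does not attempt.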

jump_insn 1620 is generated here:

#0  make_jump_insn_raw (pattern=0x41118330) at ../../srcw/gcc/emit-rtl.c:3437
#1  0x082536d1 in emit_jump_insn (x=0x41118330)
    at ../../srcw/gcc/emit-rtl.c:4550
#2  0x082f0162 in gen_jump (operand0=0x4062e264) at insn-emit.c:10224
#3  0x08210854 in force_nonfallthru_and_redirect (e=0x40d7ab40, 
    target=0x41071a6c) at ../../srcw/gcc/cfgrtl.c:1145
#4  0x082108ba in force_nonfallthru (e=0x40d7ab40)
    at ../../srcw/gcc/cfgrtl.c:1166
#5  0x08209403 in merge_blocks_move (e=0x40b825c0, b=0x41071a6c, c=0x41062bc8, 
    mode=41) at ../../srcw/gcc/cfgcleanup.c:860
#6  0x0820ac0b in try_optimize_cfg (mode=41)
    at ../../srcw/gcc/cfgcleanup.c:1895
#7  0x0820aeca in cleanup_cfg (mode=41) at ../../srcw/gcc/cfgcleanup.c:2073
#8  0x083e66e6 in rest_of_handle_jump2 () at ../../srcw/gcc/passes.c:1568
#9  0x083e6b75 in rest_of_compilation () at ../../srcw/gcc/passes.c:1791
#10 0x08181db9 in execute_one_pass (pass=0x852eac0)
    at ../../srcw/gcc/tree-optimize.c:453
#11 0x08181e3f in execute_pass_list (pass=0x852eac0)
    at ../../srcw/gcc/tree-optimize.c:478
#12 0x081820a7 in tree_rest_of_compilation (fndecl=0x408f4000, 
    nested_p=0 '\000') at ../../srcw/gcc/tree-optimize.c:556
#13 0x08109519 in expand_body (fn=0x408f4000)
    at ../../srcw/gcc/cp/semantics.c:2882

I can make gdb stop just before force_nonfallthru is called like this:

6   breakpoint     keep y   0x08209b44 in merge_blocks_move
                                       at ../../srcw/gcc/cfgcleanup.c:860
        stop only if cfun->emit->x_cur_insn_uid >= 1620 - 10 && input_location.line == 166

Breakpoint 6, merge_blocks_move (e=0x40b825c0, b=0x41071a6c, c=0x41062bc8, 
    mode=41) at ../../srcw/gcc/cfgcleanup.c:860
860               bb = force_nonfallthru (b_fallthru_edge);
(gdb) call debug_rtx(b_fallthru_edge->src->end_)
(call_insn 94 93 1505 7 /work/builds/devo/branches/sh2a-040305-branch/sh4-linux/sources/tools-src/libstdc++-v3/src/demangle.cc:123 (parallel [
            (set (reg:SI 0 r0)
                (call (mem:SI (symbol_ref/i:SI ("_ZN9__gnu_cxx9demangler7sessionISaIcEE15decode_encodingERSsPKciRKNS0_22implementation_detailsE") [flags 0x1] <function_decl 0x40951e0c decode_encoding>) [0 S4 A32])
                    (const_int 0 [0x0])))
            (use (reg:PSI 151 ))
            (use (reg:SI 12 r12))
            (clobber (reg:SI 146 pr))
            (clobber (scratch:SI))
        ]) -1 (nil)
    (expr_list:REG_EH_REGION (const_int 2 [0x2])
        (nil))
    (expr_list (use (reg:SI 7 r7))
        (expr_list (use (reg:SI 6 r6))
            (expr_list (use (reg:SI 5 r5))
                (expr_list (use (reg:SI 4 r4))
                    (nil))))))
(gdb) call debug_rtx(b_fallthru_edge->dest->head_)
(note 1505 94 95 8 [bb 8] NOTE_INSN_BASIC_BLOCK)
(gdb) call debug_rtx_list(b_fallthru_edge->dest->head_,3)
(note 1505 94 95 8 [bb 8] NOTE_INSN_BASIC_BLOCK)

(insn 95 1505 97 8 /work/builds/devo/branches/sh2a-040305-branch/sh4-linux/sources/tools-src/libstdc++-v3/src/demangle.cc:123 (set (reg/v:SI 172 [ cnt ])
        (reg:SI 0 r0)) -1 (nil)
    (nil))

(note 97 95 98 8 ("/work/builds/devo/branches/sh2a-040305-branch/sh4-linux/build-devo/branches/sh2a-040305/tools-stage2/sh4-linux/libstdc++-v3/include/bits/demangle.h") 333)

(gdb) out c->index   
10(gdb) out b->index
8(gdb) 

So, definitely, this cfgcleanup.c code is at fault:

      /* Otherwise, we're going to try to move C after B.  If C does
         not have an outgoing fallthru, then it can be moved
         immediately after B without introducing or modifying jumps.  */
      if (! c_has_outgoing_fallthru)
        {
          merge_blocks_move_successor_nojumps (b, c);
          return next == ENTRY_BLOCK_PTR ? next->next_bb : next;
        }

      /* If B does not have an incoming fallthru, then it can be moved
         immediately before C without introducing or modifying jumps.
         C cannot be the first block, so we do not have to worry about
         accessing a non-existent block.  */

      if (b_has_incoming_fallthru)
        {
          basic_block bb;

          if (b_fallthru_edge->src == ENTRY_BLOCK_PTR)
            return NULL;
          bb = force_nonfallthru (b_fallthru_edge);


