alpha loop bug #2

Richard Henderson rth@cygnus.com
Sun Jan 31 20:04:00 GMT 1999


Configured for alphapca56-linux, compile ~rth/reload1.i with
./cc1 -O2 -dp -dL -fno-schedule-insns -fno-schedule-insns2

I believe I have your latest patches (derived_from bits) installed.

In choose_reload_regs, there is the loop

  for (i = 0; i < reload_n_operands; i++)
    {
      CLEAR_HARD_REG_SET (reload_reg_used_in_output[i]);
      CLEAR_HARD_REG_SET (reload_reg_used_in_input[i]);
      CLEAR_HARD_REG_SET (reload_reg_used_in_input_addr[i]);
      CLEAR_HARD_REG_SET (reload_reg_used_in_inpaddr_addr[i]);
      CLEAR_HARD_REG_SET (reload_reg_used_in_output_addr[i]);
      CLEAR_HARD_REG_SET (reload_reg_used_in_outaddr_addr[i]);
    }

for which we get the code

        lda $8,reload_reg_used_in_outaddr_addr   # 4915 movsi+2/6
        lda $7,reload_reg_used_in_output_addr    # 4920 movsi+2/6
        lda $6,reload_reg_used_in_inpaddr_addr   # 4925 movsi+2/6
        lda $5,reload_reg_used_in_input_addr     # 4930 movsi+2/6
        lda $4,reload_reg_used_in_input		 # 4935 movsi+2/6
        lda $3,reload_reg_used_in_output         # 4940 movsi+2/6
        lda $2,1688($30)         		 # 4944 adddi3/3
        .align 4
$L1898:
        ldt $f10,0($3)  	 # 779  movsi+2/11      [length = 4]
        stt $f10,160($12)        # 780  movsi+2/12      [length = 4]
        ldt $f10,0($4)  	 # 795  movsi+2/11      [length = 4]
        stt $f10,80($12)         # 796  movsi+2/12      [length = 4]
        ldt $f10,0($5)  	 # 811  movsi+2/11      [length = 4]
        stt $f10,-240($12)       # 812  movsi+2/12      [length = 4]
        ldt $f10,0($6)  	 # 827  movsi+2/11      [length = 4]
        stt $f10,-160($12)       # 828  movsi+2/12      [length = 4]
        ldt $f10,0($7)  	 # 843  movsi+2/11      [length = 4]
        stt $f10,-80($12)        # 844  movsi+2/12      [length = 4]
        subq $2,160,$12 	 # 851  adddi3/2        [length = 4]
        ldt $f10,0($8)  	 # 859  movsi+2/11      [length = 4]
        stt $f10,-160($2)        # 860  movsi+2/12      [length = 4]
        addq $8,8,$8    	 # 4912 adddi3/1        [length = 4]
        addq $7,8,$7    	 # 4917 adddi3/1        [length = 4]
        addq $6,8,$6    	 # 4922 adddi3/1        [length = 4]
        addq $5,8,$5    	 # 4927 adddi3/1        [length = 4]
        addq $4,8,$4    	 # 4932 adddi3/1        [length = 4]
        addq $3,8,$3    	 # 4937 adddi3/1        [length = 4]
        addq $2,8,$2    	 # 4942 adddi3/1        [length = 4]
        addl $11,1,$11  	 # 868  addsi3+3/1      [length = 4]
        cmplt $11,$22,$1         # 759  sqrtdf2+1       [length = 4]
        bne $1,$L1898   	 # 760  umindi3+2       [length = 4]

This is the bug I presented earlier -- $12 is used (insn 780) before
it is initialized (insn 851).

I believe the problem is that giv combinations are not taken into
account when determining a giv's lifetime.

We have

giv at 860 combined with giv at 851
giv at 844 combined with giv at 851
giv at 828 combined with giv at 851
giv at 812 combined with giv at 851
giv at 796 combined with giv at 851
giv at 780 combined with giv at 851

Note that insn 851 is the DEST_REG giv related to the last of the
sequence of stores, and that combine_givs_p decided to express all
of the stores in terms of it.

We then have

>> giv at 787 derived from 771 as (reg:DI 336)
>> giv at 803 derived from 771 as (reg:DI 345)
>> giv at 819 derived from 771 as (reg:DI 354)
>> giv at 835 derived from 771 as (reg:DI 363)
>> giv at 851 derived from 771 as (reg:DI 372)

This adjusts all of the DEST_REG givs to express them in terms of
the first of the sequence.  (I'm a bit surprised, given your goal,
that they are not derived from each other, but all from the first.)
The problem is that all of the DEST_ADDR givs are still expressed in
terms of 851, which creates the uses before initialization.
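
To make the failure concrete, here is a minimal sketch of the check that
is effectively being violated.  The struct and function names are
hypothetical and are not the data structures in loop.c; positions are
simply the insn numbers from the dump above, which happen to appear in
order within the loop body.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model; NOT the structures used by loop.c.  */
struct addr_giv_sketch
{
  int use_pos;      /* position of the store that uses the combined register */
  int reg_set_pos;  /* position of the insn that sets that register */
};

/* Count address givs that read their register before the insn that
   sets it, looking at a single loop iteration.  */
size_t
count_uses_before_set (const struct addr_giv_sketch *givs, size_t n)
{
  size_t bad = 0;

  assert (givs != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (givs[i].use_pos < givs[i].reg_set_pos)
      bad++;
  return bad;
}
```

Fed the six stores (780, 796, 812, 828, 844, 860) against insn 851,
this would flag the first five, which matches the five stores through
$12 that precede the subq in the assembly.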

I'm unsure of the best way to resolve the dilemma.  We could, for each
combined giv, go through and find the first use, and so get all the 
lifetimes correct.  But I'm beginning to think this approach of
combining then recombining is flawed.  If we want to do more
intelligent combinations, then we should do that the first time.
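
The first option (finding the first use of each combined giv) might
look roughly like the sketch below.  It assumes a flat array of givs in
which each entry records the index of the giv it was combined with;
that layout, and every name in it, is made up for illustration and is
not how loop.c stores this information.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified record; NOT the struct induction in loop.c.  */
struct giv_sketch
{
  int insn_pos;       /* position of the giv's insn within the loop body */
  int combined_with;  /* index of the giv this one was combined with, or -1 */
};

/* Return the earliest position at which the register of giv TARGET is
   needed: its own insn, or the insn of any giv combined with it.  */
int
first_use_pos (const struct giv_sketch *givs, size_t n, size_t target)
{
  int first;

  assert (target < n);
  first = givs[target].insn_pos;
  for (size_t i = 0; i < n; i++)
    if (givs[i].combined_with == (int) target && givs[i].insn_pos < first)
      first = givs[i].insn_pos;
  return first;
}
```

With the combinations listed above, the lifetime of the giv at 851
would start at 780 rather than at 851, so deriving it from 771 could
not be allowed to move its initialization past the earlier stores.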

Thoughts?


r~


