alpha loop bug #2
Richard Henderson
rth@cygnus.com
Sun Jan 31 20:04:00 GMT 1999
Configured for alphapca56-linux, compile ~rth/reload1.i with
./cc1 -O2 -dp -dL -fno-schedule-insns -fno-schedule-insns2
I believe I have your latest patches (derived_from bits) installed.
In choose_reload_regs, there is the loop
  for (i = 0; i < reload_n_operands; i++)
    {
      CLEAR_HARD_REG_SET (reload_reg_used_in_output[i]);
      CLEAR_HARD_REG_SET (reload_reg_used_in_input[i]);
      CLEAR_HARD_REG_SET (reload_reg_used_in_input_addr[i]);
      CLEAR_HARD_REG_SET (reload_reg_used_in_inpaddr_addr[i]);
      CLEAR_HARD_REG_SET (reload_reg_used_in_output_addr[i]);
      CLEAR_HARD_REG_SET (reload_reg_used_in_outaddr_addr[i]);
    }
For it we get the code
lda $8,reload_reg_used_in_outaddr_addr # 4915 movsi+2/6
lda $7,reload_reg_used_in_output_addr # 4920 movsi+2/6
lda $6,reload_reg_used_in_inpaddr_addr # 4925 movsi+2/6
lda $5,reload_reg_used_in_input_addr # 4930 movsi+2/6
lda $4,reload_reg_used_in_input # 4935 movsi+2/6
lda $3,reload_reg_used_in_output # 4940 movsi+2/6
lda $2,1688($30) # 4944 adddi3/3
.align 4
$L1898:
ldt $f10,0($3) # 779 movsi+2/11 [length = 4]
stt $f10,160($12) # 780 movsi+2/12 [length = 4]
ldt $f10,0($4) # 795 movsi+2/11 [length = 4]
stt $f10,80($12) # 796 movsi+2/12 [length = 4]
ldt $f10,0($5) # 811 movsi+2/11 [length = 4]
stt $f10,-240($12) # 812 movsi+2/12 [length = 4]
ldt $f10,0($6) # 827 movsi+2/11 [length = 4]
stt $f10,-160($12) # 828 movsi+2/12 [length = 4]
ldt $f10,0($7) # 843 movsi+2/11 [length = 4]
stt $f10,-80($12) # 844 movsi+2/12 [length = 4]
subq $2,160,$12 # 851 adddi3/2 [length = 4]
ldt $f10,0($8) # 859 movsi+2/11 [length = 4]
stt $f10,-160($2) # 860 movsi+2/12 [length = 4]
addq $8,8,$8 # 4912 adddi3/1 [length = 4]
addq $7,8,$7 # 4917 adddi3/1 [length = 4]
addq $6,8,$6 # 4922 adddi3/1 [length = 4]
addq $5,8,$5 # 4927 adddi3/1 [length = 4]
addq $4,8,$4 # 4932 adddi3/1 [length = 4]
addq $3,8,$3 # 4937 adddi3/1 [length = 4]
addq $2,8,$2 # 4942 adddi3/1 [length = 4]
addl $11,1,$11 # 868 addsi3+3/1 [length = 4]
cmplt $11,$22,$1 # 759 sqrtdf2+1 [length = 4]
bne $1,$L1898 # 760 umindi3+2 [length = 4]
This is the bug I presented earlier -- $12 is used before it is
initialized. I believe the problem is that giv combinations are not
taken into account when determining the giv's lifetime.
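To make the use-before-initialization concrete, here is a minimal sketch
(not code from loop.c) that replays the loads and stores of the loop
body above in emitted order and reports the first insn that reads an
integer register nothing has written yet. The struct, the function name
and the bitmask encoding are all hypothetical; only the insn uids and
register numbers are taken from the dump.

```c
#include <stdio.h>

/* Hypothetical, simplified model of one insn: its uid from the dump,
   the integer register it reads as a base, and the integer register it
   writes (-1 if none).  The FP register $f10 is ignored.  */
struct insn
{
  int uid;
  int reads;
  int writes;
};

/* Loads, stores and the subq from the loop body, in emitted order.  */
const struct insn loop_body[] = {
  {779,  3, -1}, {780, 12, -1},
  {795,  4, -1}, {796, 12, -1},
  {811,  5, -1}, {812, 12, -1},
  {827,  6, -1}, {828, 12, -1},
  {843,  7, -1}, {844, 12, -1},
  {851,  2, 12},                /* subq $2,160,$12 */
  {859,  8, -1}, {860,  2, -1}
};

const int loop_body_len = sizeof loop_body / sizeof loop_body[0];

/* LIVE is a bitmask of registers already initialized on loop entry.
   Return the uid of the first insn that reads a register not yet
   written, or -1 if every read is preceded by a write.  */
int
first_uninit_use (const struct insn *body, int n, unsigned long live)
{
  int i;

  for (i = 0; i < n; i++)
    {
      if (body[i].reads >= 0 && !(live & (1UL << body[i].reads)))
        return body[i].uid;
      if (body[i].writes >= 0)
        live |= 1UL << body[i].writes;
    }
  return -1;
}
```

With $2 through $8 marked live on entry (mask 0x1FC, matching the lda
insns before the loop), first_uninit_use returns 780: the first store
through $12 precedes insn 851, the only insn that sets $12.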
We have
giv at 860 combined with giv at 851
giv at 844 combined with giv at 851
giv at 828 combined with giv at 851
giv at 812 combined with giv at 851
giv at 796 combined with giv at 851
giv at 780 combined with giv at 851
Note that insn 851 is the DEST_REG giv related to the last of the
sequence of stores, and that combine_givs_p decided to express all
of the stores in terms of it.
We then have
>> giv at 787 derived from 771 as (reg:DI 336)
>> giv at 803 derived from 771 as (reg:DI 345)
>> giv at 819 derived from 771 as (reg:DI 354)
>> giv at 835 derived from 771 as (reg:DI 363)
>> giv at 851 derived from 771 as (reg:DI 372)
This adjusts all of the DEST_REG givs to express them in terms of
the first of the sequence. (I'm a bit surprised, given your goal,
that they are not derived from each other, but all from the first.)
The problem is that all of the DEST_ADDR givs are still derived
from 851, which creates the uses before initialization.
I'm unsure of the best way to resolve the dilemma. We could, for each
combined giv, go through and find the first use, and so get all the
lifetimes correct. But I'm beginning to think this approach of
combining then recombining is flawed. If we want to do more
intelligent combinations, then we should do that the first time.
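As a rough illustration of the first option (find the first use of each
combined giv), the sketch below walks a list of givs and, for every giv
that was combined with another, pulls the target's first-use position
back to the earliest insn that now depends on it. This is not the
loop.c data structure: struct giv, its fields and compute_first_use are
hypothetical names, and insn uids stand in for program-order positions
on the assumption that they increase through the loop body.

```c
#include <stdio.h>

/* Hypothetical giv record.  POS is the giv's position in program order;
   SAME is the index of the giv it was combined with, or -1 if it was
   not combined.  */
struct giv
{
  int pos;
  int same;
};

/* For each giv I, store in FIRST_USE[I] the earliest position at which
   its value is needed: its own position, or that of any earlier giv
   that was combined with it.  */
void
compute_first_use (const struct giv *g, int n, int *first_use)
{
  int i;

  for (i = 0; i < n; i++)
    first_use[i] = g[i].pos;

  for (i = 0; i < n; i++)
    {
      int target = g[i].same;

      if (target >= 0 && g[i].pos < first_use[target])
        first_use[target] = g[i].pos;
    }
}
```

Fed the six DEST_ADDR givs at 780 through 860, each combined with the
giv at 851, this yields a first use of 780 for the 851 giv, so its
register would have to be set before insn 780 rather than at 851.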
Thoughts?
r~