This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug rtl-optimization/33222] New: failing rtl iv analysis (maybe due to df)
- From: "dorit at gcc dot gnu dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 29 Aug 2007 07:01:51 -0000
- Subject: [Bug rtl-optimization/33222] New: failing rtl iv analysis (maybe due to df)
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
In the testcase below, after the inner loop is completely unrolled, the
enclosing i-loop is not unrolled because the RTL analysis of the loop's
induction variable (iv) fails, possibly due to a bug in df:
#define N 40
#define M 10
float in[N+M], coeff[M], out[N];

void fir (){
  int i,j,k;
  float diff;

  for (i = 0; i < N; i++) {
    diff = 0;
    for (j = 0; j < M; j++) {
      diff += in[j+i]*coeff[j];
    }
    out[i] = diff;
  }
}
Compiler options used:
/Develop/mainline-dn1/bin/gcc -O3 -maltivec -funroll-loops
vect-outer-fir2-kernel.c -S --param max-completely-peeled-insns=5000 --param
max-completely-peel-times=40 -fdump-tree-all -da -ftree-vectorize
(Without -ftree-vectorize, the i-loop does get unrolled.)
Detailed description and discussion here:
http://gcc.gnu.org/ml/gcc/2007-08/msg00482.html
Here are the relevant pieces from the RTL dump (at loop3_unroll):
bb2:
(insn 40 39 41 2 vect-outer-fir2-kernel.c:38 (set (reg:DI 187 [ ivtmp.59 ])
(mem/u/c:DI (plus:DI (reg:DI 2 2)
(const:DI (minus:DI (symbol_ref/u:DI ("*.LC4") [flags 0x2])
(symbol_ref:DI ("*.LCTOC1"))))) [7 S8 A8])) 344
{*movdi_internal64} (expr_list:REG_EQUAL (symbol_ref:DI ("fir_out") [flags
0x80] <var_decl 0xf7d571c0 fir_out>)
(nil)))
...
(insn 289 288 68 2 (set (reg/f:DI 319)
(plus:DI (reg:DI 187 [ ivtmp.59 ])
(const_int 160 [0xa0]))) 80 {*adddi3_internal1} (expr_list:REG_DEAD
(reg:DI 2 2)
(expr_list:REG_EQUAL (const:DI (plus:DI (symbol_ref:DI ("fir_out")
[flags 0x80] <var_decl 0xf7d571c0 fir_out>)
(const_int 160 [0xa0])))
(nil))))
...
loop:
bb3 (loop-header):
...
(insn 255 254 256 3 vect-outer-fir2-kernel.c:47 (set (reg:DI 187 [ ivtmp.59 ])
(plus:DI (reg:DI 187 [ ivtmp.59 ])
(const_int 16 [0x10]))) 80 {*adddi3_internal1} (nil))
...
(insn 265 263 266 3 vect-outer-fir2-kernel.c:47 (set (reg:CC 316)
(compare:CC (reg:DI 187 [ ivtmp.59 ])
(reg/f:DI 319))) 459 {*cmpdi_internal1} (expr_list:REG_EQUAL
(compare:CC (reg:DI 187 [ ivtmp.59 ])
(const:DI (plus:DI (symbol_ref:DI ("fir_out") [flags 0x80]
<var_decl 0xf7d571c0 fir_out>)
(const_int 160 [0xa0]))))
(nil)))
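
For readers following the dump, here is a minimal sketch of the iteration
count that iv analysis would be expected to derive from these insns. It is
not GCC code: the struct and function names are hypothetical, and it assumes
the loop exits when reg 187 equals reg 319, with the increment (insn 255)
executed before the compare (insn 265). Offsets are in bytes relative to
fir_out. The bound of 160 matches N = 40 floats, assuming a 4-byte float, and
the step of 16 matches one 4-float vector.

```c
/* Hypothetical, simplified model of the iv in the dump above.
   All values are byte offsets relative to the start of the array.  */
struct simple_iv
{
  long base;   /* initial value of the iv (insn 40: fir_out + 0)       */
  long step;   /* increment per iteration (insn 255: 16)               */
  long bound;  /* value the iv is compared against (insn 289: + 160)   */
};

/* Return the number of iterations of a loop that increments the iv by
   STEP and then exits once it equals BOUND.  Return -1 if the iv would
   never hit BOUND exactly under this simple model.  */
long
trip_count (struct simple_iv iv)
{
  long span = iv.bound - iv.base;

  if (iv.step <= 0 || span <= 0 || span % iv.step != 0)
    return -1;
  return span / iv.step;
}

/* The values read off insns 40, 255 and 289.  */
long
fir_outer_trip_count (void)
{
  struct simple_iv iv = { 0, 16, 160 };
  return trip_count (iv);
}
```

Under these assumptions the vectorized i-loop runs 160 / 16 = 10 times, which
is the constant the unroller needs and does not obtain here.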
Below is the output of df_ref_debug for adef in each iteration of the loop in
latch_dominating_def:
d40 reg 187 bb 3 insn 255 flag 0x0 type 0x0 loc 0xf7da4608(0xf7d9a4e0) chain { }
d93 reg 187 bb 2 insn 40 flag 0x0 type 0x0 loc 0xf7d89cc8(0xf7d9a4e0) chain { }
For both definitions (d40 and d93) the bitmap bit is set.
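
To illustrate why this matters, here is a toy model of a "single definition
reaching the latch" check. It is a sketch only, not the code in GCC: the
struct, its fields and the function name are hypothetical, and it assumes
that the bitmap in question records which definitions of the register reach
the end of the loop latch. Under that assumption, two definitions with the
bit set make the check fail, so the register is not recognized as a simple
iv.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one definition of a register.  */
struct toy_def
{
  int id;                   /* e.g. 40 for d40, 93 for d93            */
  int bb;                   /* basic block containing the definition   */
  bool reaches_latch;       /* models the bitmap bit for this def      */
  bool once_per_iteration;  /* def executes exactly once per iteration */
};

/* Return false if more than one definition reaches the latch, or if the
   one that does is not executed exactly once per iteration.  Otherwise
   return true and store the unique reaching definition (or NULL if there
   is none) in *OUT.  */
bool
toy_latch_dominating_def (const struct toy_def *defs, size_t n,
                          const struct toy_def **out)
{
  const struct toy_def *single = NULL;
  size_t i;

  for (i = 0; i < n; i++)
    {
      if (!defs[i].reaches_latch)
        continue;

      /* More than one reaching definition.  */
      if (single)
        return false;

      if (!defs[i].once_per_iteration)
        return false;

      single = &defs[i];
    }

  *out = single;
  return true;
}
```

With both d40 and d93 marked as reaching the latch, this model returns false.
If only d40 (the increment in bb 3) were marked, it would succeed and return
d40.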
--
Summary: failing rtl iv analysis (maybe due to df)
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: dorit at gcc dot gnu dot org
GCC build triplet: powerpc64-linux
GCC host triplet: powerpc64-linux
GCC target triplet: powerpc64-linux
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33222