This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug rtl-optimization/50037] Unroll factor exceeds max trip count
- From: "rguenth at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Thu, 11 Aug 2011 11:39:44 +0000
- Subject: [Bug rtl-optimization/50037] Unroll factor exceeds max trip count
- Auto-submitted: auto-generated
- References: <bug-50037-4@http.gcc.gnu.org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50037
--- Comment #6 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-08-11 11:39:44 UTC ---
It probably doesn't help that tree IVOPTs replaces the nice induction variable
with a pointer one:
# BLOCK 2 freq:900
# PRED: ENTRY [100.0%] (fallthru,exec)
count_5 = n_4(D) & 3;
D.2721_17 = (long unsigned int) count_5;
D.2722_16 = D.2721_17 * 4;
D.2723_15 = (long unsigned int) addr_6(D);
D.2724_21 = D.2723_15 + 4;
D.2725_22 = D.2724_21 + D.2722_16;
D.2726_23 = (int *) D.2725_22;
# SUCC: 3 [100.0%] (fallthru,exec)
# BLOCK 3 freq:9100
# PRED: 3 [90.1%] (dfs_back,true,exec) 2 [100.0%] (fallthru,exec)
# addr_18 = PHI <addr_11(3), addr_6(D)(2)>
# sum_20 = PHI <sum_9(3), sum_7(D)(2)>
D.2703_8 = MEM[base: addr_18, offset: 0B];
sum_9 = D.2703_8 + sum_20;
addr_11 = addr_18 + 4;
if (addr_11 != D.2726_23)
  goto <bb 3>;
else
  goto <bb 4>;
# SUCC: 3 [90.1%] (dfs_back,true,exec) 4 [9.9%] (false,exec)
but even without IVOPTs, RTL has difficulties and unrolls 8 times:
Loop 1 is simple:
simple exit 3 -> 4
number of iterations: (reg/v:SI 62 [ count ])
upper bound: -2
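For orientation, the GIMPLE loop above reads back into C roughly as follows. This is a hand-written sketch, not compiler output: the name sum_ptr_iv is made up, a 4-byte int is assumed, and treating sum as a parameter is an assumption (the dump only shows it as the default definition sum_7(D)).

```c
/* Hand-written C rendering of the post-IVOPTs GIMPLE; not compiler output.
   sum_ptr_iv is a hypothetical name, and passing sum in as a parameter
   is an assumption made so the sketch has defined behaviour.
   Assumes sizeof (int) == 4, matching the "+ 4" steps in the dump.  */
int sum_ptr_iv(int n, int *addr, int sum)
{
    int count = n & 3;              /* count_5 = n_4(D) & 3 */
    int *end = addr + 1 + count;    /* D.2726_23: addr + 4 + count * 4 bytes */

    do {
        sum += *addr;               /* sum_9 = MEM[addr_18] + sum_20 */
        addr++;                     /* addr_11 = addr_18 + 4 */
    } while (addr != end);          /* if (addr_11 != D.2726_23) goto <bb 3> */

    return sum;
}
```

Since count is at most 3, the body of this loop runs at most four times, so an unroll factor of 8 can never execute one full unrolled iteration.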
Most canonical testcase:
int foo (int n, int *addr)
{
  int count, sum;
  n = n & 0x3;
  for (count = 0; count < n; count++)
    sum += addr[count];
  return sum;
}
with IVOPTs on (which preserves a count != n exit test and count):
Loop 1 is simple:
simple exit 4 -> 5
does not roll if: (expr_list:REG_DEP_TRUE (le:SI (and:SI (reg/v:SI 86 [ n ])
            (const_int 3 [0x3]))
        (const_int 0 [0]))
    (expr_list:REG_DEP_TRUE (eq:SI (and:SI (reg/v:SI 86 [ n ])
                (const_int 3 [0x3]))
            (const_int -2147483648 [0xffffffff80000000]))
        (nil)))
number of iterations: (subreg:SI (plus:DI (not:DI (reg:DI 83 [ ivtmp.11 ]))
        (sign_extend:DI (reg/v:SI 76 [ n ]))) 0)
upper bound: 4294967295
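The reported upper bound of 4294967295 is just the largest 32-bit unsigned value, i.e. no useful bound at all. At the source level the bound is much tighter: after n = n & 0x3, the loop body can run at most three times. A small instrumented variant of the testcase shows this (the name foo_trip_count is made up for this sketch):

```c
/* Hypothetical instrumented variant of the testcase: returns how many
   times the loop body executes instead of the sum.  */
int foo_trip_count(int n)
{
    int count, trips = 0;

    n = n & 0x3;                    /* same masking as in the testcase */
    for (count = 0; count < n; count++)
        trips++;                    /* stands in for sum += addr[count] */

    return trips;
}
```

Whatever value of n is passed in, the result stays in the range 0..3.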
it looks like we do not look for the definition of n in the exit test
(insn 31 29 32 4 (set (reg:CCGC 17 flags)
        (compare:CCGC (reg/v:SI 76 [ n ])
            (subreg:SI (reg:DI 83 [ ivtmp.11 ]) 0))) t.c:5 6 {*cmpsi_1}
     (expr_list:REG_DEAD (reg/v:SI 76 [ n ])
        (nil)))
which would show that n is the result of masking with 3:
(insn 23 19 24 2 (parallel [
            (set (reg/v:SI 76 [ n ])
                (and:SI (reg/v:SI 86 [ n ])
                    (const_int 3 [0x3])))
            (clobber (reg:CC 17 flags))
        ]) t.c:4 378 {*andsi_1}
     (expr_list:REG_DEAD (reg/v:SI 86 [ n ])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))
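If the analysis did follow n back to insn 23, the bound would fall out of the and directly: for a non-negative constant mask, x & mask always lies in [0, mask]. The following only illustrates that arithmetic in plain C. The names and_upper_bound and cap_unroll_factor are made up, the capping policy is an assumption, and none of this is GCC code or a GCC internal API.

```c
/* Upper bound of (x & mask) for arbitrary x and a non-negative constant
   mask: AND can only clear bits of mask, never set new ones.
   Hypothetical helper, not a GCC function.  */
unsigned and_upper_bound(unsigned mask)
{
    return mask;
}

/* Assumed policy for illustration: never unroll by more than the maximum
   trip count, and fall back to a factor of 1 (no unrolling) if the loop
   is known not to iterate at all.  */
unsigned cap_unroll_factor(unsigned requested, unsigned max_trips)
{
    if (max_trips == 0)
        return 1;
    return requested < max_trips ? requested : max_trips;
}
```

For the testcase, cap_unroll_factor(8, and_upper_bound(3)) yields 3 rather than 8.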