[PATCH]Enable elimination of IV use with unsigned type candidate
Bin Cheng
bin.cheng@arm.com
Mon Jun 23 09:49:00 GMT 2014
Hi,
For below simplified case:
#define LEN (32000)
__attribute__((aligned(16))) float a[LEN],b[LEN];
int foo (int M)
{
for (int i = 0; i < M; i++)
a[i+M] = a[i] + b[i];
}
Compiling it with command like:
$ aarch64-elf-gcc -O3 -S foo.c -o foo.S -std=c99
The assembly code of vectorized loop is in below form:
mov x1, 0
mov w2, 0
.L4:
ldr q0, [x1, x3]
add w2, w2, 1
ldr q1, [x1, x4]
cmp w2, w5
fadd v0.4s, v0.4s, v1.4s
str q0, [x6, x1]
add x1, x1, 16
bcc .L4
Induction variable w2 is unnecessary and can be eliminated with x1. This is
safe because X1 will never overflow during all iterations of loop. The
optimal assembly should be like:
mov x1, 0
.L4:
ldr q0, [x1, x2]
ldr q1, [x1, x4]
fadd v0.4s, v0.4s, v1.4s
str q0, [x5, x1]
add x1, x1, 16
cmp x1, x3
bcc .L4
This case can be handled if we do more complex overflow check on unsigned
type in function iv_elimination_compare_lt.
Also there is another blocker for the transformation, function
number_of_iterations_lt calls fold_build2 to build folded form of
may_be_zero, while iv_elimination_compare_lt only handles simple form tree
expressions. It's possible to have iv_elimination_compare_lt to do some
undo transformation on may_be_zero, but I found it's difficult for cases
involving signed/unsigned conversion like case loop-41.c. Since I think
there is no obvious benefit to fold may_be_zero here (somehow because the
two operands are already in folded forms), this patch just calls build2_loc
instead.
This optimization is picked up by patch B, but patch A is necessary since
the check in iv_elimination_compare_lt of two aff_trees isn't enough when
two different types (one in signed type, the other in unsigned) involved. I
have to use tree comparison here instead. Considering below simple case:
Analyzing # of iterations of loop 5
exit condition [1, + , 1](no_overflow) <= i_1
bounds on difference of bases: -3 ... 1
result:
zero if i_1 + 1 < 1
# of iterations (unsigned int) i_1, bounded by 2
number of iterations (unsigned int) i_1; zero if i_1 + 1 < 1
use 0
compare
in statement if (S.7_9 > i_1)
at position
type integer(kind=4)
base 1
step 1
is a biv
related candidates
candidate 0 (important)
var_before ivtmp.28
var_after ivtmp.28
incremented at end
type unsigned int
base 0
step 1
When GCC trying to eliminate use 0 with cand 0, the miscellaneous trees in
iv_elimination_compare_lt are like below with i_1 of signed type:
B: i_1 + 1
A: 0
niter->niter: (unsigned int)i_1
Apparently, (B-A-1) is "i_1", which doesn't equal to "(unsigned int)i_1".
Without this patch, it is considered equal to each other.
Note that the induction variable IS necessary on 32 bit systems since
otherwise there is type overflow.
These two patch fix the mentioned problem.
They pass bootstrap and regression test on x86_64/x86/aarch64/arm, so any
comments?
Thanks,
bin
PATCH A)
2014-06-23 Bin Cheng <bin.cheng@arm.com>
* tree-ssa-loop-ivopts.c (iv_elimination_compare_lt): Check number
of iteration using tree comparison.
PATCH B)
2014-06-23 Bin Cheng <bin.cheng@arm.com>
* tree-ssa-loop-niter.c (number_of_iterations_lt): Build unfolded
form of may_be_zero.
* tree-ssa-loop-ivopts.c (iv_nowrap_period)
(nowrap_cand_for_loop_niter_p): New functions.
(period_greater_niter_exit): New function refactored from
may_eliminate_iv.
(iv_elimination_compare_lt): New parameter. Relax overflow check.
Handle special forms may_be_zero expression.
(may_eliminate_iv): Call period_greater_niter_exit. Pass new
argument for iv_elimination_compare_lt.
gcc/testsuite/ChangeLog
2014-06-23 Bin Cheng <bin.cheng@arm.com>
* gcc.dg/tree-ssa/loop-40.c: New test.
* gcc.dg/tree-ssa/loop-41.c: New test.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: condition-iv_use-elimination-20140623-b.txt
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20140623/8634b65d/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: condition-iv_use-elimination-20140623-a.txt
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20140623/8634b65d/attachment-0001.txt>
More information about the Gcc-patches
mailing list