This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Loop induction variable optimization question
- From: "Steve Ellcey" <sellcey at mips dot com>
- To: <gcc at gcc dot gnu dot org>
- Date: Mon, 17 Jun 2013 10:07:41 -0700
- Subject: Loop induction variable optimization question
I have a loop induction variable question involving post increment.
If I have this loop:
void *memcpy_word_ptr(int * __restrict d, int * __restrict s, unsigned int n )
{
  int i;
  for(i=0; i<n; i++) {*d++ = *s++; }
  return d;
}
and compile it with: -O3 -fno-tree-loop-distribute-patterns,
the loop induction variable pass (ivopts) converts this loop:
<bb 4>:
# d_22 = PHI <d_10(6), d_5(D)(3)>
# s_23 = PHI <s_11(6), s_6(D)(3)>
# i_24 = PHI <i_14(6), 0(3)>
d_10 = d_22 + 4;
s_11 = s_23 + 4;
_12 = *s_23;
*d_22 = _12;
i_14 = i_24 + 1;
i.2_8 = (unsigned int) i_14;
if (i.2_8 < n_9(D))
goto <bb 6>; # bb 6 just loops back to bb 4
else
goto <bb 5>;
into this loop (using -4 offsets, printed in the dump as 4294967292B, to
compensate for incrementing the 's' and 'd' variables before their use):
<bb 4>:
# d_22 = PHI <d_10(6), d_5(D)(3)>
# s_23 = PHI <s_11(6), s_6(D)(3)>
# i_24 = PHI <i_14(6), 0(3)>
d_10 = d_22 + 4;
s_11 = s_23 + 4;
_12 = MEM[base: s_11, offset: 4294967292B];
MEM[base: d_10, offset: 4294967292B] = _12;
i_14 = i_24 + 1;
if (i_14 != _2)
goto <bb 6>; # bb 6 just loops back to bb 4
else
goto <bb 5>;
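To make the offset in that dump concrete, here is a minimal sketch in C. It assumes a 32-bit offset type and a 4-byte int, and the helper names offset_as_printed and load_before are hypothetical, made up for this illustration (they are not GCC internals). It shows that 4294967292 is just -4 wrapped modulo 2^32, and that a load through the already-incremented pointer at offset -4 reads the same element as the original *s_23.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: a signed byte offset as an unsigned 32-bit
   dump would print it. The conversion wraps modulo 2^32. */
uint32_t offset_as_printed(int32_t byte_offset)
{
    return (uint32_t)byte_offset;
}

/* Hypothetical helper: load one int through a pointer that has
   already been advanced by one element, stepping back sizeof(int)
   bytes. This mirrors MEM[base: s_11, offset: -4] when int is
   4 bytes wide. */
int load_before(const int *incremented)
{
    assert(incremented != NULL);
    return *(const int *)((const char *)incremented - sizeof(int));
}
```

For example, offset_as_printed(-4) yields 4294967292, and for int src[2] = { 11, 22 } the call load_before(src + 1) reads src[0].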
But if I increment s and d by hand after the copy like this:
void *memcpy_word_ptr(int * __restrict d, int * __restrict s, unsigned int n )
{
  int i;
  for(i=0; i<n; i++) {*d = *s; d++; s++; }
  return d;
}
Then ivopts converts this loop:
<bb 4>:
# d_22 = PHI <d_12(6), d_5(D)(3)>
# s_23 = PHI <s_13(6), s_6(D)(3)>
# i_24 = PHI <i_14(6), 0(3)>
_10 = *s_23;
*d_22 = _10;
d_12 = d_22 + 4;
s_13 = s_23 + 4;
i_14 = i_24 + 1;
i.0_8 = (unsigned int) i_14;
if (i.0_8 < n_9(D))
goto <bb 6>; # bb 6 just loops back to bb 4
else
goto <bb 5>;
into this loop (with 0 offsets):
<bb 4>:
# d_22 = PHI <d_12(6), d_5(D)(3)>
# s_23 = PHI <s_13(6), s_6(D)(3)>
# i_24 = PHI <i_14(6), 0(3)>
_10 = MEM[base: s_23, offset: 0B];
MEM[base: d_22, offset: 0B] = _10;
d_12 = d_22 + 4;
s_13 = s_23 + 4;
i_14 = i_24 + 1;
if (i_14 != _2)
goto <bb 6>; # bb 6 just loops back to bb 4
else
goto <bb 5>;
My question is: why (and where) did ivopts decide to move the
post-increments above the uses in the first loop? In my case
(MIPS) the second loop generates better code than the first
loop, and I would like to avoid the '-4' offsets that are used.
Ideally, one would think that GCC should generate the same code
for both of these loops, but it does not.
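For reference, the two addressing forms can be written out by hand in C. This is only a sketch of what the two dumps compute, not what ivopts does internally. The function names and the forms_agree checker are hypothetical, and the 8-element scratch buffers are an arbitrary choice for the illustration.

```c
#include <assert.h>

/* First form: bump both pointers, then access the element just
   behind them (offset -1 element, i.e. -4 bytes for a 4-byte int). */
void *copy_increment_first(int * __restrict d, const int * __restrict s,
                           unsigned int n)
{
    unsigned int i;
    for (i = 0; i < n; i++) {
        d++;
        s++;
        d[-1] = s[-1];
    }
    return d;
}

/* Second form: access at offset 0, then bump both pointers. */
void *copy_increment_last(int * __restrict d, const int * __restrict s,
                          unsigned int n)
{
    unsigned int i;
    for (i = 0; i < n; i++) {
        d[0] = s[0];
        d++;
        s++;
    }
    return d;
}

/* Hypothetical checker: returns 1 if both forms copy the same data
   and return the same end position, 0 otherwise. Handles n <= 8. */
int forms_agree(const int *src, unsigned int n)
{
    int a[8] = { 0 }, b[8] = { 0 };
    unsigned int i;
    char *end_a, *end_b;

    assert(n <= 8);
    end_a = (char *)copy_increment_first(a, src, n);
    end_b = (char *)copy_increment_last(b, src, n);
    if (end_a - (char *)a != end_b - (char *)b)
        return 0;
    for (i = 0; i < n; i++)
        if (a[i] != b[i] || a[i] != src[i])
            return 0;
    return 1;
}
```

Both functions compute the same result, so the difference between the two dumps is only in how the addresses are formed, which is why I would expect the same code for both.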
Steve Ellcey
sellcey@mips.com