This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Remove some restrictions on loop shape in tree-if-conv.c


Sorry, I realize I forgot to attach the patch to the original email, this followed a couple of minutes later in message <553F91B9.7050703@arm.com> at https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01745.html .

Cheers, Alan

Jeff Law wrote:
On 04/28/2015 07:55 AM, Alan Lawrence wrote:
Tree if-conversion currently bails out for loops that (a) contain nested
loops; (b) have more than one exit; (c) where the exit block (source of
the exit edge) does not dominate the loop latch; (d) where the exit
block is the loop header, or there are statements after the exit.

This patch removes restrictions (c) and (d). The intuition is that, for
(c), "if (P) {... if (Q) break;}" is equivalent to "if (P) {...}; if
(P&&Q) break;" and this is mostly handled by existing code for
propagating conditions. For (d), "if (P) break; stmts" is equivalent to
"if (!P) stmts; if (P) break;" - this requires inserting the predicated
stmts before the branch rather than after.

Mostly thus this patch is just removing assumptions about when we
do/don't need to store predicates. One 'gotcha' was in some test cases
the latch block passed into if-conversion is non-empty; in such cases,
if-conversion will now restore "good form" by moving the statement into
the exit block (predicated with !exit-condition).

The condition on dominance in add_to_predicate_list, I haven't quite
managed to convince myself is right; we _do_ want to store a predicate
for the latch block to handle the above case, but I'm not totally sure
of the postdominance condition - I think it may store conditions in
cases where we don't really need to (e.g. "for (;;) { ... if (P) { for
(;;) ; } }" which might look nested but isn't, and has no route to the
function exit). However, storing conditions when we don't need to, is
OK, unlike failing to store when we do need to ;).

A simple example of the patch at work:

int
foo ()
{
   for (int i = 0; i < N ; i++)
   {
     int m = (a[i] & i) ? 5 : 4;
     b[i] = a[i] * m;
   }
}

compiled at -O3, -fdump-tree-ivcanon shows this immediately before
tree-if-conversion:

...function entry, variables, etc...
   <bb 2>:
   _10 = a[0];
   goto <bb 6>;

   <bb 3>:
   _5 = a[i_9];
   _6 = _5 & i_9;
   if (_6 != 0)
     goto <bb 5>;
   else
     goto <bb 4>;

   <bb 4>:

   <bb 5>:
   # m_14 = PHI <5(3), 4(4)>

   <bb 6>:
   # m_2 = PHI <m_14(5), 4(2)>
   # _15 = PHI <_5(5), _10(2)>
   # i_16 = PHI <i_9(5), 0(2)>
   # ivtmp_13 = PHI <ivtmp_3(5), 32(2)>
   _7 = m_2 * _15;
   b[i_16] = _7;
   i_9 = i_16 + 1;
   ivtmp_3 = ivtmp_13 - 1;
   if (ivtmp_3 != 0)
     goto <bb 3>;
   else
     goto <bb 7>;

which previously was not if-converted. With this patch:

   <bb 2>:
   _10 = a[0];
   goto <bb 4>;

   <bb 3>:

   <bb 4>:
   # m_2 = PHI <m_14(3), 4(2)>
   # _15 = PHI <_5(3), _10(2)>
   # i_16 = PHI <i_9(3), 0(2)>
   # ivtmp_13 = PHI <ivtmp_3(3), 32(2)>
   _7 = m_2 * _15;
   b[i_16] = _7;
   i_9 = i_16 + 1;
   ivtmp_3 = ivtmp_13 - 1;
   _5 = a[i_9];
   _6 = _5 & i_9;
   m_14 = _6 != 0 ? 5 : 4;
   if (ivtmp_3 != 0)
     goto <bb 3>;
   else
     goto <bb 5>;

   <bb 5>:
   return;

(Unfortunately the vectorizer still doesn't handle this loop either, but
that's another issue/patch...)

Bootstrapped + check-gcc on x86_64-unknown-linux-gnu and
aarch64-none-linux-gnu.
Cross-tested check-gcc on aarch64-none-elf.
I'm investigating impact on benchmarks - on AArch64 Spec2k6, this
touches a number of object files, leading to an overall slight decrease
in the number of instructions, but no change that looks significant
(specifically, no more or less vectorization).

Is this OK for trunk?
ENOPATCH
jeff



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]