This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Hoist loop invariant statements containing data refs with zero-step during loop-versioning in vectorization.
- From: Richard Biener <rguenther at suse dot de>
- To: Cong Hou <congh at google dot com>
- Cc: Jeff Law <law at redhat dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Michael Matz <matz at suse dot de>
- Date: Fri, 6 Dec 2013 10:27:33 +0100 (CET)
- Subject: Re: [PATCH] Hoist loop invariant statements containing data refs with zero-step during loop-versioning in vectorization.
- Authentication-results: sourceware.org; auth=none
- References: <CAK=A3=33JChF2hpKEhSDcEpYwvoO5y-cLJC9T-TNERJZMUJF9g at mail dot gmail dot com> <CAK=A3=343pS3FBpJbqJL-kRsVBaV4a5khyE9i0gUcQSY=M5k+Q at mail dot gmail dot com> <525D9871 dot 2030304 at redhat dot com> <CAK=A3=3vavn7u7vnziZPOw2d6uhvbhKx1AVTROnAYFCGCOF22Q at mail dot gmail dot com> <alpine dot LNX dot 2 dot 00 dot 1310161033330 dot 11149 at zhemvz dot fhfr dot qr> <CAK=A3=3n0fncGKfbHnQ6mue3SZA-Risv6suVo_t9QXE53xQ=3Q at mail dot gmail dot com>
On Thu, 5 Dec 2013, Cong Hou wrote:
> Hi Richard
>
> You mentioned that Micha has a patch pending that enables of zero-step
> stores. What is the status of this patch? I could not find it through
> searching "Micha".
Well, it's pending on his harddisk :/ Too late for 4.9.
Richard.
> Thank you!
>
>
> Cong
>
>
> On Wed, Oct 16, 2013 at 2:02 AM, Richard Biener <rguenther@suse.de> wrote:
> > On Tue, 15 Oct 2013, Cong Hou wrote:
> >
> >> Thank you for your reminder, Jeff! I just noticed Richard's comment. I
> >> have modified the patch according to that.
> >>
> >> The new patch is attached.
> >
> > (posting patches inline is easier for review, now you have to deal
> > with no quoting markers ;))
> >
> > Comments inline.
> >
> > diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> > index 8a38316..2637309 100644
> > --- a/gcc/ChangeLog
> > +++ b/gcc/ChangeLog
> > @@ -1,3 +1,8 @@
> > +2013-10-15 Cong Hou <congh@google.com>
> > +
> > + * tree-vect-loop-manip.c (vect_loop_versioning): Hoist loop invariant
> > + statement that contains data refs with zero-step.
> > +
> > 2013-10-14 David Malcolm <dmalcolm@redhat.com>
> >
> > * dumpfile.h (gcc::dump_manager): New class, to hold state
> > diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> > index 075d071..9d0f4a5 100644
> > --- a/gcc/testsuite/ChangeLog
> > +++ b/gcc/testsuite/ChangeLog
> > @@ -1,3 +1,7 @@
> > +2013-10-15 Cong Hou <congh@google.com>
> > +
> > + * gcc.dg/vect/pr58508.c: New test.
> > +
> > 2013-10-14 Tobias Burnus <burnus@net-b.de>
> >
> > PR fortran/58658
> > diff --git a/gcc/testsuite/gcc.dg/vect/pr58508.c b/gcc/testsuite/gcc.dg/vect/pr58508.c
> > new file mode 100644
> > index 0000000..cb22b50
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/vect/pr58508.c
> > @@ -0,0 +1,20 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
> > +
> > +
> > +/* The GCC vectorizer generates loop versioning for the following loop
> > + since there may exist aliasing between A and B. The predicate checks
> > + if A may alias with B across all iterations. Then for the loop in
> > + the true body, we can assert that *B is a loop invariant so that
> > + we can hoist the load of *B before the loop body. */
> > +
> > +void foo (int* a, int* b)
> > +{
> > + int i;
> > + for (i = 0; i < 100000; ++i)
> > + a[i] = *b + 1;
> > +}
> > +
> > +
> > +/* { dg-final { scan-tree-dump-times "hoist" 2 "vect" } } */
> > +/* { dg-final { cleanup-tree-dump "vect" } } */
> > diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
> > index 574446a..f4fdec2 100644
> > --- a/gcc/tree-vect-loop-manip.c
> > +++ b/gcc/tree-vect-loop-manip.c
> > @@ -2477,6 +2477,92 @@ vect_loop_versioning (loop_vec_info loop_vinfo,
> > adjust_phi_and_debug_stmts (orig_phi, e, PHI_RESULT (new_phi));
> > }
> >
> >
> > Note that applying this kind of transform at this point invalidates
> > some of the earlier analysis the vectorizer performed (namely the
> > def-kind which now effectively gets vect_external_def from
> > vect_internal_def). In this case it doesn't seem to cause any
> > issues (we re-compute the def-kind everytime we need it (how wasteful)).
> >
> > + /* Extract load and store statements on pointers with zero-stride
> > + accesses. */
> > + if (LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo))
> > + {
> > + /* In the loop body, we iterate each statement to check if it is a load
> > + or store. Then we check the DR_STEP of the data reference. If
> > + DR_STEP is zero, then we will hoist the load statement to the loop
> > + preheader, and move the store statement to the loop exit. */
> >
> > We don't move the store yet. Micha has a patch pending that enables
> > vectorization of zero-step stores.
> >
> > + for (gimple_stmt_iterator si = gsi_start_bb (loop->header);
> > + !gsi_end_p (si);)
> >
> > While technically ok now (vectorized loops contain a single basic block)
> > please use LOOP_VINFO_BBS () to get at the vector of basic-blcoks
> > and iterate over them like other code does.
> >
> > + {
> > + gimple stmt = gsi_stmt (si);
> > + stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
> > + struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
> > +
> > + if (dr && integer_zerop (DR_STEP (dr)))
> > + {
> > + if (DR_IS_READ (dr))
> > + {
> > + if (dump_enabled_p ())
> > + {
> > + dump_printf_loc
> > + (MSG_NOTE, vect_location,
> > + "hoist the statement to outside of the loop ");
> >
> > "hoisting out of the vectorized loop: "
> >
> > + dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt, 0);
> > + dump_printf (MSG_NOTE, "\n");
> > + }
> > +
> > + gsi_remove (&si, false);
> > + gsi_insert_on_edge_immediate (loop_preheader_edge (loop), stmt);
> >
> > Note that this will result in a bogus VUSE on the stmt at this point which
> > will be only fixed because of implementation details of loop versioning.
> > Either get the correct VUSE from the loop header virtual PHI node
> > preheader edge (if there is none then the current VUSE is the correct one
> > to use) or clear it.
> >
> > + }
> > + /* TODO: We also consider vectorizing loops containing zero-step
> > + data refs as writes. For example:
> > +
> > + int a[N], *s;
> > + for (i = 0; i < N; i++)
> > + *s += a[i];
> > +
> > + In this case the write to *s can be also moved after the
> > + loop. */
> >
> > Note that you then invalidate even more vectorizer analysis - you
> > basically introduce a scalar reduction (you have to add a PHI node).
> > Which means that the transform has to happen elsewhere.
> >
> > As Jakub now tries with if-conversion this would also be a candidate
> > for applying the loop versioning before even starting the rest of the
> > vectorizer analysis code.
> >
> > That said, I'd remove the TODO at this point because it's clearly not
> > possible to implement just here ;)
> >
> > + continue;
> > + }
> > + else if (!dr)
> > + {
> > + bool hoist = true;
> > + for (size_t i = 0; i < gimple_num_ops (stmt); i++)
> >
> > You are checking all kinds of statements, including assignments
> > of which you are also checking the LHS ... restricting to
> > assignments you can start walking at i = 1.
> >
> > + {
> > + tree op = gimple_op (stmt, i);
> > + if (TREE_CODE (op) == INTEGER_CST
> > + || TREE_CODE (op) == REAL_CST)
> >
> > There are other constants as well - just check
> >
> > if (is_gimple_min_invariant (op))
> >
> > + continue;
> > + if (TREE_CODE (op) == SSA_NAME)
> > + {
> > + gimple def = SSA_NAME_DEF_STMT (op);
> > + if (def == stmt
> >
> > with starting at op 1 you can drop this.
> >
> > + || gimple_nop_p (def)
> > + || !flow_bb_inside_loop_p (loop, gimple_bb (def)))
> > + continue;
> > + }
> >
> > Note that you fail to hoist if-converted code this way because
> > op1 of
> >
> > name_1 = a_2 < b_4 ? x_5 : y_6;
> >
> > is 'a_2 < b_4', a tree expression and not an SSA name (ugh). Maybe
> > we don't care (it's just a missed optimization), but if you are
> > restricting yourself to hoisting assignments without a data-ref
> > then you can walk over SSA uses on the stmt (instead of over gimple
> > ops) with
> >
> > FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_USE)
> >
> > which would automagically take care of that case.
> >
> > Can you add a testcase which involves invariant if-conversion?
> > Can you add a testcase with just an invariant store to make sure
> > we don't wreck it? Can you add a testcase with an invariant store
> > and load (the reduction case), too?
> >
> > + hoist = false;
> > + break;
> > + }
> >
> > For example you'll hoist all labels this way (no ops), as well as the
> > loop exit GIMPLE_COND in case it's operands were loop invariant,
> > breaking the CFG ... so please add && is_gimple_assign () to the
> > if (!dr) check ;)
> >
> > + if (hoist)
> > + {
> > + gsi_remove (&si, false);
> > + gsi_insert_on_edge_immediate (loop_preheader_edge (loop), stmt);
> > +
> > + if (dump_enabled_p ())
> > + {
> > + dump_printf_loc
> > + (MSG_NOTE, vect_location,
> > + "hoist the statement to outside of the loop ");
> > + dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt, 0);
> > + dump_printf (MSG_NOTE, "\n");
> > + }
> > + continue;
> > + }
> > + }
> > + gsi_next (&si);
> > + }
> > + }
> > +
> > /* End loop-exit-fixes after versioning. */
> >
> > if (cond_expr_stmt_list)
> >
> >>
> >> thanks,
> >> Cong
> >>
> >>
> >> On Tue, Oct 15, 2013 at 12:33 PM, Jeff Law <law@redhat.com> wrote:
> >> > On 10/14/13 17:31, Cong Hou wrote:
> >> >>
> >> >> Any comment on this patch?
> >> >
> >> > Richi replied in the BZ you opened.
> >> >
> >> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58508
> >> >
> >> > Essentially he said emit the load on the edge rather than in the block
> >> > itself.
> >> > jeff
> >> >
>
>
--
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer