This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Fixing PR59006 and PR58921 by delaying loop invariant hoisting in vectorizer.
- From: Jakub Jelinek <jakub at redhat dot com>
- To: Richard Biener <rguenther at suse dot de>
- Cc: Cong Hou <congh at google dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 27 Nov 2013 12:37:31 +0100
- Subject: Re: [PATCH] Fixing PR59006 and PR58921 by delaying loop invariant hoisting in vectorizer.
- Authentication-results: sourceware.org; auth=none
- References: <CAK=A3=3m-dK_HHR8K5CF0n+uTnu5k4=swXHf3SzdH3KHoLzuTw at mail dot gmail dot com> <alpine dot LNX dot 2 dot 00 dot 1311271020320 dot 8615 at zhemvz dot fhfr dot qr>
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
On Wed, Nov 27, 2013 at 10:53:56AM +0100, Richard Biener wrote:
> Hmm. I'm still thinking that we should handle this during the regular
> transform step.
I wonder if it can't be done instead just in vectorizable_load,
if LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo) and the load is
invariant, just emit the (broadcasted) load not inside of the loop, but on
the loop preheader edge.
> *** gcc/tree-vect-data-refs.c (revision 205435)
> --- gcc/tree-vect-data-refs.c (working copy)
> *************** again:
> *** 3668,3673 ****
> --- 3668,3682 ----
> }
> STMT_VINFO_STRIDE_LOAD_P (stmt_info) = true;
> }
> + else if (loop_vinfo
> + && integer_zerop (DR_STEP (dr)))
> + {
> + /* All loads from a non-varying address will be disambiguated
> + by data-ref analysis or via a runtime alias check and thus
> + they will become invariant. Force them to be vectorized
> + as external. */
> + STMT_VINFO_DEF_TYPE (stmt_info) = vect_external_def;
> + }
I think this is unsafe for simd loops.
I'd say:
int a[1024], b[1024];
int foo (void)
{
int i;
#pragma omp simd safelen(8)
for (i = 0; i < 1024; i++)
{
a[i] = i;
b[i] = a[0];
}
}
is valid testcase, the loop behaves the same if you execute it
sequentially, or vectorize using SIMD (hardware or emulated) instructions
with vectorization factor of 2, 4 or 8, as long as you do all memory
operations (either using scalar insns or simd instructions) in the order
they were written, which I believe the vectorizer right now handles
correctly, but the hoisting this patch wants to perform is not fine,
unless data ref analysis would prove that it can't alias. For non-simd
loops we of course perform that data ref analysis and either version for
alias, or prove that the drs can't alias, but for simd loops we take as
given that the loop is safe to be vectorized. It is, but not for hoisting.
So, the above say with emulated SIMD is safe to be executed as:
for (i = 0; i < 1024; i += 8)
{
int tmp;
for (tmp = i; tmp < i + 8; tmp++)
a[tmp] = tmp;
for (tmp = i; tmp < i + 8; tmp++)
b[tmp] = a[0];
}
but not as:
int tempa[8], tmp;
/* Hoisted HW or emulated load + splat. */
for (tmp = 0; tmp < 8; tmp++)
tempa[tmp] = a[0];
for (i = 0; i < 1024; i += 8)
{
for (tmp = i; tmp < i + 8; tmp++)
a[tmp] = tmp;
for (tmp = i; tmp < i + 8; tmp++)
b[tmp] = tempa[tmp];
}
Jakub