This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [gomp4] Some progress on #pragma omp simd
- From: Jakub Jelinek <jakub at redhat dot com>
- To: Aldy Hernandez <aldyh at redhat dot com>
- Cc: Richard Henderson <rth at redhat dot com>, Richard Biener <rguenther at suse dot de>, "Iyer, Balaji V" <balaji dot v dot iyer at intel dot com>, gcc-patches at gcc dot gnu dot org
- Date: Thu, 13 Jun 2013 22:20:52 +0200
- Subject: Re: [gomp4] Some progress on #pragma omp simd
- References: <BF230D13CA30DD48930C31D40993300032A37760 at FMSMSX101 dot amr dot corp dot intel dot com> <20130424060117 dot GV12880 at tucnak dot redhat dot com> <20130424062536 dot GW12880 at tucnak dot redhat dot com> <20130424064054 dot GX12880 at tucnak dot redhat dot com> <5178692F dot 2010902 at redhat dot com> <BF230D13CA30DD48930C31D40993300032A39A2B at FMSMSX101 dot amr dot corp dot intel dot com> <517C0B34 dot 3050804 at redhat dot com> <20130427181734 dot GX28963 at tucnak dot redhat dot com> <20130611165616 dot GG2336 at tucnak dot redhat dot com> <51BA2871 dot 60701 at redhat dot com>
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
On Thu, Jun 13, 2013 at 03:15:45PM -0500, Aldy Hernandez wrote:
>
> >it. Also, not sure what to do for lastprivate, probably use the magic
> >arrays and just in the epilogue of the loop compute which of the array items
> >belonged to the last iteration somehow.
>
> Can't you do (for lastprivate(abc)) something like:
>
> if (i == 1024) {
>   abc = magic_abc[__builtin_GOMP.simd_lane (1)];
> }
Well, if you do that inside of the loop, you probably make the loop
non-vectorizable. So you need something after the loop, like:
abc = magic_abc[(count - 1) & (__builtin_GOMP.simd_vf (1) - 1)];
or so.
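
A minimal sketch of that index computation, as plain C++ that only models
the idea. It assumes the vectorization factor is a power of two and that
iteration i writes the array slot i & (vf - 1); body_value and
lastprivate_via_lane_array are hypothetical names, and the std::vector
stands in for the magic array.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for whatever the loop body assigns to the
// privatized variable in iteration i.
static int body_value(int i) { return 3 * i + 7; }

// Models lastprivate(abc) via a per-lane array: the loop body only
// writes its own lane's slot, and the value of the last iteration is
// picked out once, after the loop.
int lastprivate_via_lane_array(int count, int vf) {
    assert(count > 0);
    assert(vf > 0 && (vf & (vf - 1)) == 0);  // assumption: vf is a power of two

    std::vector<int> magic_abc(vf);
    for (int i = 0; i < count; ++i) {
        int lane = i & (vf - 1);             // lane that executes iteration i
        magic_abc[lane] = body_value(i);
    }
    // No conditional inside the loop; select the lane of iteration count - 1.
    return magic_abc[(count - 1) & (vf - 1)];
}
```

With count = 1024 and vf = 8 this selects slot 7, which holds the value
written by iteration 1023.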
> >#pragma omp declare simd
> >__attribute__((noinline, noclone)) void
> >bar (int &x, int &y)
> >{
> > x += 4;
> > y += 4;
> >}
>
> Does bar() have anything to do with this example, or was this an oversight?
It was there just to make the variables addressable during gimplification,
and possibly no longer addressable afterwards.
> >using the magic arrays and so is reduction. While the vectorizer can
> >recognize some reductions, e.g. without -ffast-math it will not vectorize
> >any floating point ones because that means changing the order of
> >computations, while when they are mandated to be one copy per simd lane,
> >the order of computations is clear and thus can be vectorized.
>
> Let me see if I understand (all things floating point confuse me).
> You're saying that the vectorizer, in its present state will refuse
> to vectorize reductions with floats because it may possibly change
> the order of computations, but we should override that behavior for
> OMP simd loops?
No, I'm saying that in simd loops the order of computations is different
(and depends on the vectorization factor), as each SIMD lane is supposed
to have its own private variable and at the end everything is reduced
together.
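
To illustrate the difference in ordering, here is a small sketch in plain
C++ (not compiler output). It assumes iteration i is handled by lane
i % vf and that the lane accumulators are combined in lane order after the
loop; reduce_sequential and reduce_per_lane are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar reduction: strictly left-to-right, the order the vectorizer
// must preserve for floating point without -ffast-math.
float reduce_sequential(const std::vector<float> &v) {
    float sum = 0.0f;
    for (float x : v) sum += x;
    return sum;
}

// Models the simd reduction: one private accumulator per lane,
// combined only once the loop has finished.
float reduce_per_lane(const std::vector<float> &v, int vf) {
    assert(vf > 0);
    std::vector<float> lane_sum(vf, 0.0f);
    for (std::size_t i = 0; i < v.size(); ++i)
        lane_sum[i % vf] += v[i];            // lane-private partial sum

    float sum = 0.0f;
    for (float s : lane_sum) sum += s;       // final reduction across lanes
    return sum;
}
```

For the input {1e8f, 1.0f, -1e8f, 1.0f} on a typical IEEE 754
single-precision target, the sequential order loses the first 1.0f to
rounding, while with vf = 2 the two 1.0f values are added together in
their own lane first. The results differ, but for a given vf the per-lane
order is well defined.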
> > D.2717[D.2714].s = D.2702;
> > D.2703 = b[i];
> > a.0 = a;
> > D.2705 = a.0 + x;
> > D.2701 = D.2717[D.2714].s;
>
> Is there some subtlety in which we have to dereference D.2717 twice
> here, or can we reuse D.2702?
Usually it is FRE/PRE that optimizes at least the loads, and DSE the stores,
but I think FRE/PRE isn't run after vectorization.
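
As a source-level sketch of what removing that redundant load amounts to
(hand-written C++, not what any pass emits; Lane, store_then_reload and
store_then_reuse are hypothetical names, and it assumes nothing between
the store and the load can modify the slot):

```cpp
struct Lane { int s; };

// Shape of the dump: store into the lane's slot, then load the same
// slot back a few statements later.
int store_then_reload(Lane *magic, int lane, int value, int x) {
    magic[lane].s = value;        // D.2717[D.2714].s = D.2702;
    int tmp = magic[lane].s;      // D.2701 = D.2717[D.2714].s;
    return tmp + x;
}

// With the load eliminated, the stored value is simply reused.
int store_then_reuse(Lane *magic, int lane, int value, int x) {
    magic[lane].s = value;
    return value + x;             // uses the stored value directly
}
```

Both functions return the same result and leave the same value in the
slot; the second just avoids going through memory for the read.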
Jakub