This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug ipa/81877] [7/8 Regression] Incorrect results with lto and -fipa-cp and -fipa-cp-clone
- From: "rguenther at suse dot de" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 18 Aug 2017 13:27:01 +0000
- Subject: [Bug ipa/81877] [7/8 Regression] Incorrect results with lto and -fipa-cp and -fipa-cp-clone
- Auto-submitted: auto-generated
- References: <bug-81877-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81877
--- Comment #12 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 18 Aug 2017, amonakov at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81877
>
> --- Comment #11 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
> (In reply to Richard Biener from comment #10)
> > Now - for refs that have an invariant address in such loop the interleaving
> > effectively means that they are independent even in the same iteration.
>
> Not if there's no "other iteration", i.e. runtime iteration count is 1.
Not sure about this; I interpret 'ivdep' as saying there is no dependence
for any number of iterations.
> [...]
> > for example. I suppose it's not their intent. So maybe there's an
> > additional restriction on the interleaving? Preserve iteration order of
> > individual stmts? That would prevent autopar in face of just ivdep for
> > example.
> >
> > Note that any "handle must-defs 'correctly'" writing is inherently fishy.
>
> I think you're saying that pragma-ivdep and do-concurrent are too hand-wavy
> about how the compiler may or must privatize variables, whether it must detect
> and handle reductions/inductions etc. But note that LIM is keying on 'simdlen',
> and simdlen is also set by OpenMP-SIMD which is more rigorous in that regard,
> i.e. privatization is explicit in GIMPLE. And there I believe LIM does not have
> the license to disregard may-alias relations *unless* it verifies that loop
> iterates at least twice and repeated writes are UB. On this example:
Yeah, the middle-end uses safelen which is also used for simdlen. It has
to adhere to the most conservative definition.
> void g(int p, int *out)
> {
>   int x, y = 0, *r = p ? &x : &y;
>   unsigned n = 0;
>   asm("" : "+r" (n));
>   #pragma omp simd
>   for (int i = 0; i <= n; i++)
>     {
>       //#pragma omp ordered simd
>       x = 42;
>       out[i] = *r;
>     }
> }
>
> I believe LSM is wrong for n=0, and for any n if the pragma-ordered is
> uncommented.
I see. I wonder if we handle pragma-ordered correctly in vectorization
for, say,
  #pragma omp simd
  for (int i = 0; i <= n; i++)
    {
      #pragma omp ordered simd
      out[i+2] = 0;
      out[i+1] = 1;
      out[i] = 2;
    }
I believe we vectorize this with SLP and unrolling with VF 12 as
  out[i..i+3] = {2, 1, 0, 2};
  out[i+4..i+7] = {1, 0, 2, 1};
  out[i+8..i+11] = {0, 2, 1, 0};
I guess doing all the stores "at the same time" fulfils 'ordered', but does
splitting them as above do so as well? That moves the out[i+3] store before
the out[i+5] store.
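
To make the reordering concrete, here is a minimal sketch that replays the
stores in scalar order and in the split-vector order and records when each
element is written. It is a hand-written simulation, not vectorizer output.
It assumes the three stores of iteration i go to 3*i+2, 3*i+1 and 3*i, so
that four iterations fill exactly the twelve elements of the three vectors
shown above; that stride is an assumption made for the illustration, the
loop as written indexes with i+2, i+1 and i. The names scalar_order,
split_vector_order, stored_before and same_contents are hypothetical helpers.

```c
#include <assert.h>
#include <string.h>

enum { ITERS = 4, LEN = 3 * ITERS, VLEN = 4 };

/* Scalar order: iteration i stores to 3*i+2, then 3*i+1, then 3*i
   (assumed stride, see the note above).  when[k] records the step at
   which element k was stored.  */
void scalar_order(int out[LEN], int when[LEN])
{
    int step = 0;
    for (int i = 0; i < ITERS; i++) {
        out[3 * i + 2] = 0; when[3 * i + 2] = step++;
        out[3 * i + 1] = 1; when[3 * i + 1] = step++;
        out[3 * i]     = 2; when[3 * i]     = step++;
    }
}

/* Split-vector order: three 4-wide stores, lowest addresses first.
   All lanes of one vector store share a single step.  */
void split_vector_order(int out[LEN], int when[LEN])
{
    static const int pattern[LEN] = { 2, 1, 0, 2,  1, 0, 2, 1,  0, 2, 1, 0 };
    int step = 0;
    for (int v = 0; v < LEN; v += VLEN) {
        for (int lane = 0; lane < VLEN; lane++) {
            out[v + lane] = pattern[v + lane];
            when[v + lane] = step;
        }
        step++;
    }
}

/* Nonzero if element a was stored strictly before element b.  */
int stored_before(const int when[LEN], int a, int b)
{
    return when[a] < when[b];
}

/* Nonzero if both arrays hold the same final values.  */
int same_contents(const int a[LEN], const int b[LEN])
{
    return memcmp(a, b, sizeof(int) * LEN) == 0;
}

/* Checks the observation: same final memory, different store order.  */
void check_reordering(void)
{
    int s[LEN], v[LEN], ws[LEN], wv[LEN];
    scalar_order(s, ws);
    split_vector_order(v, wv);
    assert(same_contents(s, v));
    assert(stored_before(ws, 5, 3));  /* scalar: element 5 first */
    assert(stored_before(wv, 3, 5));  /* split vectors: element 3 first */
}
```

Calling check_reordering() from main passes under these assumptions: the
final contents agree, but elements 3 and 5, which belong to the same scalar
iteration, are written in the opposite order once the stores are split
across two vectors.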
The safest thing would be to remove safelen handling from invariant
motion.
More rigorously defining the semantics of loop->safelen (the
middle-end term) is necessary nevertheless. I believe omp ordered
doesn't have any middle-end representation?