This is the mail archive of the mailing list for the GCC project.

Re: Patch: Add #pragma ivdep support to the ME and C FE

On Wed, 16 Oct 2013, Tobias Burnus wrote:

> Frederic Riss wrote:
> > Just one question. You describe the pragma in the doco patch as:
> >
> > +This pragma tells the compiler that the immediately following @code{for}
> > +loop can be executed in any loop index order without affecting the result.
> > +The pragma aids optimization and in particular vectorization as the
> > +compiler can then assume a vectorization safelen of infinity.
> >
> > I'm not a specialist, but I was always told that the 'original'
> > meaning of ivdep (which I believe was introduced by Cray), was that
> > the compiler could assume that there are only forward dependencies in
> > the loop, but not that it can be executed in any order.
> The nice thing about #pragma ivdep is that there is no real standard. And
> the explanation of the different vendors is also not completely clear.
> Some overview about this is given in the following file on pages 13-14 for
> Cray Research PVP, MIPSPRO & Open64, Intel ICC, Multiflow.
> That's summarized as:
> - vector: ignore lexical upward dependencies (Cray PVP, Intel ICC)
> - parallel: ignore loop-carried dependencies (MIPSPRO, Open64)
> - liberal: ignore loop-variant dependencies (Multiflow)
> The quotes for Cray and Intel are below.
> Cray:
> "The ivdep directive tells the compiler to ignore vector dependencies for
>  the loop immediately following the directive. Conditions other than vector
>  dependencies can inhibit vectorization. If these conditions are satisfactory,
>  the loop vectorizes. This directive is useful for some loops that contain
>  pointers and indirect addressing. The format of this directive is as follows:
>  #pragma _CRI ivdep"

Which suggests we use

#pragma GCC ivdep

to avoid colliding with possibly different semantics in existing programs
that use variants of this pragma?
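
A minimal sketch of how the namespaced spelling might look in user code, assuming `#pragma GCC ivdep` is the form that gets accepted. The helper `scatter_add` and its parameters are hypothetical and only illustrate the "pointers and indirect addressing" case from the Cray quote. The compiler cannot prove that `idx` has no repeated entries, so the caller asserts it through the pragma.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: adds b[i] into a[idx[i]] for i in [0, n).
   The caller promises that idx holds no repeated entries and that
   a and b do not overlap; the compiler cannot prove either fact. */
void scatter_add(double *a, const double *b, const size_t *idx, size_t n)
{
    assert(a != NULL && b != NULL && idx != NULL);

    /* Assumed spelling, as proposed in this message. The guard keeps
       compilers that do not know the pragma from warning about it. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC ivdep
#endif
    for (size_t i = 0; i < n; i++) {
        a[idx[i]] += b[i];
    }
}
```

If `idx` did contain duplicates, the promise would be false and a vectorized loop could lose updates, so the pragma is only as safe as the caller's knowledge of the data.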

> Intel:
> "The ivdep pragma instructs the compiler to ignore assumed vector dependencies.
>  To ensure correct code, the compiler treats an assumed dependence as a proven
>  dependence, which prevents vectorization. This pragma overrides that decision.
>  Use this pragma only when you know that the assumed loop dependencies are safe
>  to ignore."

This suggests that _known_ dependences are still treated as dependences.
But what is known obviously depends on the implementation which
may not know that a[i] and a[i+1] depend but merely assume it.  Not
a standard-proof definition of the pragma ;)
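
To make the a[i] / a[i+1] case concrete, here is a small sketch of the difference between "only forward dependencies" and "any loop index order", as raised earlier in the thread. It models SIMD execution as "load a whole chunk, then store a whole chunk", which is a simplification chosen for illustration and not a description of what the vectorizer does internally. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_WIDTH = 16 };

/* Scalar reference: ascending index order. */
void shift_forward(int *a, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++)
        a[i] = a[i + 1] + 1;
}

/* Same loop body, descending index order. */
void shift_reverse(int *a, size_t n)
{
    if (n < 2)
        return;
    for (size_t i = n - 1; i-- > 0; )
        a[i] = a[i + 1] + 1;
}

/* Simplified SIMD model: each chunk performs all loads, then all stores. */
void shift_chunked(int *a, size_t n, size_t width)
{
    int lane[MAX_WIDTH];

    assert(width > 0 && width <= MAX_WIDTH);
    for (size_t i = 0; i + 1 < n; i += width) {
        size_t left = n - 1 - i;
        size_t len = left < width ? left : width;

        for (size_t k = 0; k < len; k++)   /* all loads first */
            lane[k] = a[i + k + 1] + 1;
        for (size_t k = 0; k < len; k++)   /* then all stores */
            a[i + k] = lane[k];
    }
}
```

In this model the chunked version matches the ascending scalar loop for any width, because every read of a[i+1] happens before that element is overwritten. The descending loop reads values it has just written and so produces a different result. Such a loop is vectorizable under the "vector" reading of ivdep, yet it is not order-independent.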

That said, safelen even overrides known dependences (but with unknown
distance vector)! (That looks like a bug to me, or at least a QOI issue.)

> > The Intel docs give this example:
> ...
> > Given your description, this loop wouldn't be a candidate for ivdep,
> > as reversing the loop index order changes the semantics. I believe
> > that the way you interpret it (ie. setting vectorization safe length
> > to INT_MAX) is correct with respect to this other definition, though.
> Do you have a suggestion for a better wording? My idea was to interpret
> this part similar to OpenMP's simd with safelen=infinity. (Actually, I
> believe loop->safelen was added for OpenMPv4's and/or Cilk Plus's "simd".)
> OpenMPv4.0 states
> for this (excerpt from page 70):
> "A SIMD loop has logical iterations numbered 0, 1,...,N-1 where N is the
> number of loop iterations, and the logical numbering denotes the sequence
> in which the iterations would be executed if the associated loop(s) were
> executed with no SIMD instructions. If the safelen clause is used then no
> two iterations executed concurrently with SIMD instructions can have a
> greater distance in the logical iteration space than its value. The
> parameter of the safelen clause must be a constant positive integer
> expression. The number of iterations that are executed concurrently at
> any given time is implementation defined. Each concurrent iteration will
> be executed by a different SIMD lane. Each set of concurrent iterations
> is a SIMD chunk."

OTOH, if we are mapping ivdep to safelen, why not simply allow the following?

#pragma GCC safelen 4
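
A sketch of why a finite bound can matter, using a loop with a dependence distance of 4. The `#pragma GCC safelen` spelling above is only a suggestion in this message; OpenMP 4.0 expresses a comparable bound with the `safelen(4)` clause on its simd construct. The code below does not use either pragma. It simulates chunked execution by hand (all loads, then all stores), and reading "safelen 4" as "chunks of at most 4 iterations" is an assumption made for illustration. Function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { DIST = 4, MAX_WIDTH = 16 };

/* Scalar reference: a[i] uses the value written DIST iterations earlier. */
void recur_scalar(int *a, size_t n)
{
    for (size_t i = DIST; i < n; i++)
        a[i] = a[i - DIST] + 1;
}

/* Simplified SIMD model: each chunk of `width` iterations performs
   all of its loads before any of its stores. */
void recur_chunked(int *a, size_t n, size_t width)
{
    int lane[MAX_WIDTH];

    assert(width > 0 && width <= MAX_WIDTH);
    for (size_t i = DIST; i < n; i += width) {
        size_t left = n - i;
        size_t len = left < width ? left : width;

        for (size_t k = 0; k < len; k++)   /* all loads first */
            lane[k] = a[i + k - DIST] + 1;
        for (size_t k = 0; k < len; k++)   /* then all stores */
            a[i + k] = lane[k];
    }
}
```

With a zero-filled array of 16 elements, a width of 4 reproduces the scalar result, while a width of 8 reads a[4..7] before the same chunk has updated them and diverges from index 8 onward. An unbounded ivdep promise would be wrong for this loop, whereas a bound of 4 would still let it vectorize.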


> > Oh, and are there any plans to maintain this information in some way
> > till the back-end? Software pipelining could be another huge winner
> > for that kind of dependency analysis simplification.
> I don't know until when loop->safelen is kept. As it is late in the
> middle-end, providing the backend with this information should be
> simple.

It's kept as long as we preserve loops which at the moment is until
after RTL loop optimizations are finished.  Extending this isn't
hard, I just didn't see a reason to do that.

