[PATCH] openmp, fortran: Add support for declare variant in Fortran

Wed Oct 6 12:53:04 GMT 2021

On Wed, Oct 06, 2021 at 12:39:01PM +0100, Kwok Cheung Yeung wrote:
> In section 2.3.1 of the OpenMP 5.0 spec, it says:
> 
> 3. For functions within a declare target block, the target trait is added to
> the beginning of the set...
> 
> But OpenMP in Fortran doesn't have the notion of a declare target _block_
> (i.e. the #pragma omp declare target/#pragma omp end declare target form),
> only the !$omp declare target (extended-list)/[clause] form (which C/C++
> also has). The C FE differentiates between the two (it applies an 'omp
> declare target block' attribute for the first, an 'omp declare target' for
> the second) but only the first matches against the 'target' construct in a
> context selector. I opted to match against 'omp declare target' for Fortran
> only, otherwise this functionality won't get exercised in Fortran at all.
> This difference is tested in test3 of declare-variant-8.f90, which I have
> XFAILed for now.

Let me answer this separately.  The 5.0 wording I believe means it doesn't
apply to Fortran at all.  This has been noticed in 5.1 and changed to:

For device routines, the target trait is added to the beginning of the set...

which is actually far worse for the GCC way of doing things (see
below).  Whether something is a device routine is determined in one of
three ways: through an explicit
#pragma omp {,begin} declare target
...
#pragma omp end declare target
block, which has the advantage that it is known at compile time during
parsing (that is the reason why the 5.0 wording was written that way);
through explicit or implicit to clauses on an explicit declare target
directive (which can appear before the call site to the function or after
it); or during the implicit declare target to propagation, which is done
even later.
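
To make the three cases concrete, here is a minimal C sketch.  The function
names (in_block, listed_later, helper, run) are made up for illustration, and
the to clause is the 5.0/5.1 spelling.  Only the first function is known to be
a device routine at the point where it is parsed; for the other two that is
only known after a later directive or after the implicit discovery.

```c
#include <assert.h>

/* (1) Block form: everything between the two directives is a device
   routine, and that is known as soon as the function is parsed.  */
#pragma omp declare target
int in_block(int x) { return x + 1; }
#pragma omp end declare target

/* (2) Clause form: nothing at the definition says this is a device
   routine; the directive naming it only comes afterwards.  */
int listed_later(int x) { return x * 2; }
#pragma omp declare target to(listed_later)

/* (3) Implicit form: helper is never named in any declare target
   directive; it becomes a device routine only because it is called
   from a target region, which is discovered later still.  */
int helper(int x) { return x - 1; }

int run(int x)
{
    int r = 0;
    /* x is a scalar and is implicitly firstprivate in the region.  */
    #pragma omp target map(tofrom: r)
    r = helper(x);
    return r;
}
```

Compiled without OpenMP support the pragmas are ignored and the functions
behave as plain host code, which is enough to check that the sketch is well
formed; it says nothing about how any particular compiler tracks these cases
internally.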

Now, in 5.2 some language committee members wanted to make the presence or
absence of target in the construct selector a dynamic property, but that
would make all of the score computations dynamic as well, because the
presence or absence of target in the construct selector affects what bit
positions other selectors get during the score computation.
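
The bit position effect can be seen in a small model of the scoring rules as
I read them in OpenMP 5.0 section 2.3.3: a construct trait at 1-based
position p in the construct set scores 2^(p-1), and device={kind(...)} scores
2^l where l is the number of traits in the set.  This is a hedged sketch,
not GCC code; the helper names and the two example sets are hypothetical, and
a trait that appears more than once is simply matched at its first position.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example construct sets: the same code seen without and
   with target prepended to the beginning of the set.  */
const char *const host_set[] = { "parallel", "for" };
const char *const device_set[] = { "target", "parallel", "for" };

/* 1-based position of trait in the construct set, 0 if absent.
   Simplification: only the first occurrence is considered.  */
static unsigned position_of(const char *const *set, size_t n,
                            const char *trait)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(set[i], trait) == 0)
            return (unsigned)(i + 1);
    return 0;
}

/* Score contribution of one construct trait: 2^(p-1), or 0 if the
   trait is not in the set at all.  */
unsigned long construct_score(const char *const *set, size_t n,
                              const char *trait)
{
    unsigned p = position_of(set, n, trait);
    assert(n < 30);             /* keep the shift well defined */
    return p ? 1UL << (p - 1) : 0;
}

/* Score contribution of device={kind(...)}: 2^l for a construct set
   of l traits.  */
unsigned long kind_score(size_t l)
{
    assert(l < 30);
    return 1UL << l;
}
```

With these sets, construct_score(host_set, 2, "parallel") is 1 but
construct_score(device_set, 3, "parallel") is 2, and kind_score goes from 4
to 8: prepending target shifts every other contribution up by one bit, so
none of the scores can be fixed before it is known whether target is there.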

For procedures that are determined to be target function variants by
a declare target directive, the target trait is added to the beginning of the set...

So, with the 5.2 wording and the current GCC implementation of offloading,
the presence or absence of target in the construct set, and everything that
depends on it, needs to be deferred until omp_discover_implicit_declare_target
(i.e. before gimplification) for functions that aren't going to be marked
as "declare target to", and until after IPA for functions that are marked
that way (in that case, the host copy will not get target in the construct
set and the offloading lto1 copies will get it).

As has been said multiple times, the way we do it in GCC is different from
the way LLVM/ICC etc. do it.  They preprocess, parse, analyze etc. the
source code multiple times, once for each offloading target, and trust that
the user isn't doing anything nasty, e.g. that preprocessor macros will not
make the structures used during mapping, the target regions etc. differ
between the host and the offloading targets.  That approach then shows in
the number of features that assume it is the only way, e.g. begin declare
variant ... end declare variant effectively allowing the compiler not to
parse what is in between, which assumes that pretty much all the static
conditions can be resolved already during parsing; that is rarely the case
for GCC.
So, perhaps one day we'll need to reconsider what we do.  We could, say,
preprocess just once but parse multiple times if we determine we need to
offload, and at that point questions like "is this the host or offloading
target variant of declare target?" could be answered already during that
parsing.
The C++ FE isn't that far from it: it creates an array of lexical tokens
and then parses those tokens (but I think from time to time it modifies the
token array, e.g. the purged_p or error_reported bits).
The C FE doesn't do that.
And users can supply - as the source filename and have the source read from
stdin, in which case there is no file to parse again.

So, for this question, my preference would be to implement the 5.0
semantics for now and never add target to the construct set for Fortran
unless inside an explicit !$omp target body.
When we implement the 5.1 semantics, that would basically mean we have to
defer everything related to the target construct in the set, outside of an
explicit target, until gimplification time (after the implicit declare
target discovery).  Note, though, that at least the C/C++ FEs decide
everything declare variant related at gimplification time only (perhaps to
be changed in the future).

	Jakub