[gomp4] Tweak GOMP_target{,_data,_update} arguments

Jakub Jelinek jakub@redhat.com
Thu Sep 19 12:42:00 GMT 2013


On Thu, Sep 19, 2013 at 12:58:28PM +0400, Michael V. Zolotukhin wrote:
> Thanks for the explanation, it's getting a bit clearer, though I still have some
> questions.
> 
> > __OPENMP_TARGET__ would be a linker plugin inserted symbol at the start of
> > some linker plugin created data section, which would start with some header
> > and then data.
> > Say
> > uleb128 number_of_supported_targets - n
> > uleb128 number_of_host_var_pairs - m
> > [ name of offload target (asciiz?)
> >   relative offset to the start of the offload data for the target (in MIC case embedded DSO)
> >   size of the offload data
> >   perhaps something how to find the target addresses array
> > ] repeated n times
> > [ host_address, size ] repeated m times
> > (for the functions passed to GOMP_target the pair would be [ foobar.omp_fn.25, 1 ] ).
> So, in this table we store host addresses of global variables, marked with
> 'pragma omp declare target', and addresses of host-versions of OMP-versioned
> functions.  Correct?  Also, there are pointers to images of target-binaries,
> which are (presumably) placed in other (or the same?) data sections.

Yeah.  How exactly we define the section is up to us, but it should have all
the information that GOMP_target* will need to offload the stuff from the
current shared library or binary, and everything needed to initialize the
{ host_addr, size } -> { target_addr } mapping of declare target global var
definitions and functions passed to GOMP_target.  The fewer relocations
the section has, the better.  But if we need any relocations, it will need
to be in a relro section.  The embedded shared library (resp. libraries)
presumably doesn't need any relocations and will be large, so it should
probably live in a different section, with the header just pointing to it
(e.g. using an offset relative to __OPENMP_TARGET__ or something else
that doesn't need a dynamic relocation).  Similarly, if the linker plugin puts in the
array of [ host_address, size ] rewritten such that host_address is an
offset from __OPENMP_TARGET__, then we won't need dynamic relocations for
that.  Another complication is dependent shared libraries.
Consider
liba.c:
#pragma omp declare target
int i;
int foo (void)
{
  return ++i;
}
#pragma omp end declare target
main.c:
#include <stdlib.h>
#pragma omp declare target
extern int i;
extern int foo (void);
#pragma omp end declare target
int main ()
{
  int j;
  #pragma omp target
    {
      j = i;
      j += foo ();
    }
  if (j != 1)
    abort ();
  return 0;
}
gcc -shared -O2 -fpic -fopenmp -o liba.so -Wl,-soname,liba.so liba.c
gcc -O2 -fopenmp -o main main.c -L. -la
./main

Perhaps the linker plugin can extract the target shared libraries from
the embedded sections of dependent shared libraries (if any), and link the
"main" shared library against that, but GOMP_target will need to know that
it can't just offload main.so, but also has to offload the dependent
liba.so (and of course libgomp.so.1 from the libgomp plugin).
What does ICC do in this case?

> > So, when GOMP_target{,_data,_update} is called, it could easily determine
> > if the calling shared library resp. binary has been offloaded or not
> That's right.
> Then, if no initialization has been performed yet, GOMP_target{,_data,_update}
> is initialized.  Now let's look at the initialization.
> In initialization GOMP_target* looks at the __OPENMP_TARGET__ table (its address
> is passed as the 3rd argument), finds pointer to a data section with
> target-binary image, loads it to memory, runs a process on a target from it
> (e.g. in COI using COIProcessCreateFromFile and/or
> COIProcessLoadLibraryFromMemory).
> Global variables are mapped and the corresponding host<->target address pairs
> are inserted to the splay tree, as usual.
> Also, GOMP_target* should do the same for function addresses.  Could you please
> describe this step in more details?  Do we want to just add some offset to
> host_function_address (as we want host versions of functions to be ordered
> exactly as the target versions)?

The idea was that the host [ host_addr, size ] array (in some named section)
would be ordered exactly the same as corresponding [ targ_addr ] array in
the target shared library.  So, [25] pair in the host array will correspond
to [25] in the target shared library array.
So you just walk the whole arrays, and in each iteration pick nth host array
pair plus corresponding nth target array address, and put it into the splay
tree.

In the above testcase, host liba.so would contain a [ &i, sizeof(int) ]
pair and target liba.so corresponding [ &i ] entry (target i in that case).
In host main there would be [ &main.omp_fn.0, 1 ] and in target main.so
corresponding [ &main.omp_fn.0 ] (target main.omp_fn.0 in that case).
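
To make the lockstep walk concrete, here is a rough sketch in C.  None of
this is existing libgomp code: struct host_entry, struct mapping and
build_mappings are hypothetical names, the host address is assumed to be
stored as an offset from the section start (the relocation-free variant
mentioned earlier), and a flat array stands in for the splay tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side entry as the linker plugin might emit it.
   The address is kept as an offset from the section start symbol, so
   the entry itself needs no dynamic relocation.  */
struct host_entry
{
  uintptr_t offset;   /* host_address minus the section base */
  size_t size;        /* object size, or 1 for a function */
};

/* Hypothetical mapping record.  libgomp would insert these into its
   splay tree; a plain array is used here only to keep the sketch short.  */
struct mapping
{
  uintptr_t host_start;
  uintptr_t host_end;
  uintptr_t target_addr;
};

/* Walk both arrays in lockstep: entry k of the host array pairs with
   entry k of the target address array.  Returns the number of records
   written to OUT, which must have room for N of them.  */
static size_t
build_mappings (uintptr_t base, const struct host_entry *host,
                const uintptr_t *target, size_t n, struct mapping *out)
{
  assert (host != NULL && target != NULL && out != NULL);
  for (size_t k = 0; k < n; k++)
    {
      out[k].host_start = base + host[k].offset;
      out[k].host_end = out[k].host_start + host[k].size;
      out[k].target_addr = target[k];
    }
  return n;
}
```

For the testcase above, host liba.so would contribute a single entry of
size sizeof (int) and host main a single entry of size 1, each paired with
the one entry of the matching target array.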

> > See above, names are just a bad idea.  You can just use some magic wrapper
> > name in the target binary (the one sitting in libgomp), to which you just
> > pass the pair of function address and its argument and the named function
> > will just read the (target) function pointer and (target) pointer argument
> > from misc data block and tail call that function.
> Yes, if we know target function pointer, we can do this.
> 
> Basically, the main question I have now is how would we figure out target
> function address?  Of course, after initialization we just look for it in our
> splay tree, so the question relates to the initialization step.

After you ensure the shared library is offloaded and splay tree initialized,
you just splay_tree_lookup the host [ fnaddr, 1 ] and get corresponding
target address (if not present in splay tree, that would be toolchain bug,
so gomp_fatal or something).  And, then you just make sure you call that
function in the target, whether it is done by calling a named wrapper
function to which you pass that target fn address and target address of the
pointer array, or something else.
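
A rough C sketch of that lookup-then-call step, simulated inside a single
process.  All names here (lookup, target_wrapper, offload_call, host_body,
target_body) are made up for illustration, a linear scan stands in for
splay_tree_lookup, abort () stands in for gomp_fatal, and the "target"
function is just another host function, since a real device is not
available in a sketch like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef void (*target_fn_t) (void *);

/* Hypothetical mapping record: host range [host_start, host_end) and
   the corresponding target address.  */
struct mapping
{
  uintptr_t host_start;
  uintptr_t host_end;
  uintptr_t target_addr;
};

/* Linear stand-in for the splay tree lookup: find the record whose host
   range covers [addr, addr + size).  */
static const struct mapping *
lookup (const struct mapping *map, size_t n, uintptr_t addr, size_t size)
{
  for (size_t k = 0; k < n; k++)
    if (map[k].host_start <= addr && addr + size <= map[k].host_end)
      return &map[k];
  return NULL;
}

/* Stand-in for the named wrapper in the target-side libgomp: it receives
   the target function address and the target argument block, then calls
   the function.  */
static void
target_wrapper (uintptr_t target_fn, void *target_args)
{
  assert (target_fn != 0);
  ((target_fn_t) target_fn) (target_args);
}

/* Look up the host [ fnaddr, 1 ] key and run the mapped function.  A
   missing entry would be a toolchain bug, so it is treated as fatal.  */
static void
offload_call (const struct mapping *map, size_t n, target_fn_t host_fn,
              void *args)
{
  const struct mapping *m = lookup (map, n, (uintptr_t) host_fn, 1);
  if (m == NULL)
    {
      fprintf (stderr, "host function missing from mapping table\n");
      abort ();
    }
  target_wrapper (m->target_addr, args);
}

/* Hypothetical host and "target" versions of one outlined region.  */
static void host_body (void *p) { *(int *) p = -1; }
static void target_body (void *p) { *(int *) p += 1; }
```

With a one-entry table mapping host_body to target_body, calling
offload_call (map, 1, host_body, &counter) runs target_body on counter
rather than host_body.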

	Jakub


