This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [RFC][gomp4] Offloading patches (1/3): Add '-fopenmp_target' option


On 01/22/2014 11:53 AM, Andrey Turetskiy wrote:
We have some testcases, but they require XeonPhi hardware and a
working libgomp plugin. Our current version of the plugin depends on
some libraries that are not open-sourced yet, so we can’t currently
share it.

However, you can examine what these patches do by following these steps:
1) Build GCC with patches:
         http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01484.html
         http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01485.html
         http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01486.html
         http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01896.html
2) Set environment variables (e.g. for two ‘targets’):
         export OFFLOAD_TARGET_NAMES=mic:hsail
             (for now, the names don’t really matter)
         export OFFLOAD_TARGET_COMPILERS=./gcc:./gcc
             (use the GCC built with the patches above as the target
             compiler, because it must support the -fopenmp_target option)
3) Build any example with #pragma omp target (e.g. see attachment):
         ./gcc -flto -fopenmp test.c -o test.exe
     The -flto and -fopenmp options are both required.

Now you have a binary with target images embedded and tables properly
filled. You can’t run it due to reasons mentioned above, though you
could examine it with objdump/nm/readelf to see new sections and their
content: there will be .offload_image_section with ‘target’ code and
.offload_func_table_section with ‘target’ function table.

I played around with this for a while last week. To have a slightly more realistic scenario where the offload compiler is for a different target, I built an aarch64-linux compiler and used that in OFFLOAD_TARGET_COMPILERS. This exposed some problems.

+  /* Run gcc for target.  */
+  obstack_init (&argv_obstack);
+  obstack_ptr_grow (&argv_obstack, compiler);
+  obstack_ptr_grow (&argv_obstack, "-shared");
+  obstack_ptr_grow (&argv_obstack, "-fPIC");
+  obstack_ptr_grow (&argv_obstack, "-xlto");
+  obstack_ptr_grow (&argv_obstack, "-fopenmp_target");
+  obstack_ptr_grow (&argv_obstack, "-o");
+  obstack_ptr_grow (&argv_obstack, target_image_file_name);

Since environment variables such as GCC_EXEC_PREFIX and COMPILER_PATH are already set at this point, the compiler we're running here won't find the correct lto1. In the best case it doesn't find anything; in the worst case it finds the host compiler's lto1 and produces an image for the host, not the target. (This fails with an arm compiler since the host assembler doesn't understand -meabi=5, but it could silently do the wrong thing with other offload toolchains.)

Once I worked around this by unsetting the environment variables around this compiler invocation, the next problem was exposed: the code tries to link together files compiled for the target (created by the code quoted above) and for the host (the _omp_descr file, I believe). Linker errors ensue.
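
For reference, the kind of workaround described above (clearing the host driver's variables around the offload compiler invocation) could look roughly like this. This is a sketch and not the change I actually made: the helper name run_without_host_env is made up, and system () merely stands in for however lto-wrapper really spawns the compiler.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

/* Variables set by the host driver that would mislead the offload
   compiler when it looks for its own lto1.  */
static const char *const host_only_vars[] =
  { "GCC_EXEC_PREFIX", "COMPILER_PATH" };
#define N_VARS (sizeof host_only_vars / sizeof host_only_vars[0])

/* Hypothetical helper: run CMD with the host-only variables removed
   from the environment, then restore them.  Returns the status from
   system ().  If strdup fails, that variable is not restored.  */
static int
run_without_host_env (const char *cmd)
{
  char *saved[N_VARS];
  size_t i;
  int status;

  /* Save a copy of each value, then remove it from the environment.  */
  for (i = 0; i < N_VARS; i++)
    {
      const char *val = getenv (host_only_vars[i]);
      saved[i] = val ? strdup (val) : NULL;
      unsetenv (host_only_vars[i]);
    }

  status = system (cmd);

  /* Put back whatever was set before, so later host steps still work.  */
  for (i = 0; i < N_VARS; i++)
    if (saved[i])
      {
        setenv (host_only_vars[i], saved[i], 1);
        free (saved[i]);
      }

  return status;
}
```

Restoring the variables afterwards matters because the rest of the host link still relies on them.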

As mentioned before, I think all this target-specific code has no place in lto-wrapper to begin with. For ptx, we're going to require some quite different mechanisms, so I think it might be best to invoke a new tool, maybe called $target-gen-offload, which knows how to produce an image that can be linked into the host executable. Different offload targets can then use different strategies to produce such an image. Probably each such image should contain its own code to register itself with libgomp, so that we don't have to construct a table.
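
To illustrate the self-registration idea, here is a rough sketch of what a generated image wrapper might contain. Everything in it is an assumption made for illustration: struct offload_image and register_offload_image are invented names, the registration function is stubbed out locally rather than provided by libgomp, and the image bytes are a placeholder.

```c
#include <stddef.h>

/* Hypothetical descriptor for one offload image; not a libgomp type.  */
struct offload_image
{
  const char *target_name;  /* e.g. "ptx" */
  const void *data;         /* start of the embedded image */
  size_t size;              /* size of the image in bytes */
};

/* Local stand-in for an entry point the runtime would provide.
   Hypothetical: a real interface would live in libgomp.  */
#define MAX_IMAGES 8
static const struct offload_image *registered_images[MAX_IMAGES];
static size_t n_registered_images;

static void
register_offload_image (const struct offload_image *img)
{
  if (n_registered_images < MAX_IMAGES)
    registered_images[n_registered_images++] = img;
}

/* What a $target-gen-offload tool might emit for each image.  The
   bytes below are a placeholder, not real target code.  */
static const unsigned char image_bytes[] = { 0xde, 0xad, 0xbe, 0xef };

static const struct offload_image this_image =
  { "ptx", image_bytes, sizeof image_bytes };

/* Runs before main, so the image announces itself and the linker
   step does not have to build a central table.  */
__attribute__ ((constructor))
static void
register_this_image (void)
{
  register_offload_image (&this_image);
}
```

With this shape, each offload target is free to choose its own image format, since only the wrapper needs to be linkable into the host executable.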

Some other observations:
 * Is OFFLOAD_TARGET_NAMES actually useful, or would any string
   generated at link time suffice?
 * Is the user expected to set OFFLOAD_TARGET_COMPILERS, or should
   this be done by the gcc driver, possibly based on command line
   options (I'd much prefer that)?
 * Do we actually need an -fopenmp_target option? The way I imagine it
   (and which was somewhat present in the Makefile patches I posted
   last year) is that an offload compiler is specially configured to
   know that that's how it will be used, and to know what the host
   architecture is. A $target-gen-offload could then be built with
   knowledge of the host architecture and installed in the host
   compiler's libexec install directory.

I think I'll need to implement my own set of mechanisms for ptx, since this code doesn't seem suitable for inclusion in its current state. I'll try to take on board some of the ideas I've found here in the hope that we'll converge on something that works for everybody.


Bernd

