This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: Fwd: [RFC][gomp4] Offloading patches (2/3): Add tables generation
- From: Bernd Schmidt <bernds@codesourcery.com>
- To: Ilya Verbin <iverbin@gmail.com>
- Cc: Thomas Schwinge <thomas@codesourcery.com>, "Michael V. Zolotukhin" <michael.v.zolotukhin@gmail.com>, Jakub Jelinek <jakub@redhat.com>, Richard Biener <rguenther@suse.de>, Kirill Yukhin <kirill.yukhin@gmail.com>, Andrey Turetskiy <andrey.turetskiy@gmail.com>, Ilya Tocar <tocarip.intel@gmail.com>, gcc-patches <gcc-patches@gcc.gnu.org>, Nathan Sidwell <nathan_sidwell@mentor.com>
- Date: Fri, 28 Feb 2014 17:21:37 +0100
- Subject: Re: Fwd: [RFC][gomp4] Offloading patches (2/3): Add tables generation
- Authentication-results: sourceware.org; auth=none
- References: <20131217113957.GA39975@msticlxl57.ims.intel.com> <52E7927B.8030509@codesourcery.com> <CADG=Z0GQ8ORLe1XRUU7VMYeLhwuWisMyCcGLQj-nY_bhkbD_1Q@mail.gmail.com> <CADG=Z0HRb1ojtTc4xEAG=hH_GcfAARDAmn70XGB5khF0mME4pQ@mail.gmail.com> <52E9137C.4020706@codesourcery.com> <CADG=Z0HkhefrBJ_tKyhEHv+p+AMTvpbxf=Md6JOCv6rAUu1u9g@mail.gmail.com> <CADG=Z0GW==Wax+3B5Z2JiieOWoz_gWpqtdhHA_L9-Nzb6u4bnA@mail.gmail.com> <530648F8.2010409@codesourcery.com> <CADG=Z0HE6AudmZuQK2vWz+E4fh8PnqoJ-aq9GXjZXgn-ZRW0kw@mail.gmail.com>
On 02/28/2014 05:09 PM, Ilya Verbin wrote:
> 2014-02-20 22:27 GMT+04:00 Bernd Schmidt <bernds@codesourcery.com>:
>> * Functions and variables now go into different tables; otherwise,
>> intermixing between them could cause the tables to go out of sync
>> between host and target (imagine one big table being generated by
>> ptx lto1/mkoffload, and multiple small table fragments being linked
>> together on the host side).
> If you need two different tables for funcs and vars, we can also use
> them. But I still don't understand how that will help keep the host
> and target tables synchronized.
I think it won't help that much - I still think this entire scheme is
likely to fail on nvptx. I'll try to construct an example at some point.
One other thing about the split tables is that we don't have to write a
useless size of 1 for functions.
>> * I've put the begin/end fragments for the host tables into crtstuff,
>> which seems like the standard way of doing things.
> Our plan was that the host-side descriptor __OPENMP_TARGET__ will
> contain (in addition to the func/var table) pointers to the images for
> all enabled accelerators (e.g. omp_image_nvptx_start and
> omp_image_intelmic_start), which is why we generated it in the
> lto-wrapper.
The concept of an "image" is likely to vary somewhat between
accelerators. For ptx it's just a string, and it can't really be
generated the same way as for your target, where you can manipulate ELF
images. So I think it is better to have a call to a gomp registration
function for every offload target. That should also give you the
ordering you said you wanted between shared libraries.
>> * Is there a reason to call a register function for the host tables?
>> The way I've set it up, we register a target function/variable table
>> while also passing a pointer to the __OPENMP_TARGET__ symbol, which
>> holds information about the host-side tables.
> In our case we can't register the target table with a call to libgomp;
> it can be obtained only from the accelerator. Therefore we propose a
> target-independent approach: during device initialization, libgomp
> calls two functions from the plugin (or this could be implemented as a
> single function):
> 1. devicep->device_load_image_func, which will load the target image
> (its pointer will be taken from the host descriptor);
> 2. devicep->device_get_table_func, which in our case connects to the
> device and receives its table, and in your case would return
> func_mappings and var_mappings. Will that work for you?
Probably. I think the constructor call to the gomp registration function
would contain an opaque pointer to whatever data the target wants, so it
can arrange its image/table data in whatever way it likes.
It would help to see the code you have on the libgomp side; I don't
believe that's been posted yet?
> Unfortunately I don't fully understand this configure magic... When a
> user specifies 2 or 3 accelerators during configuration with
> --enable-accelerators, will several different accel-gccs be built?
No - the idea is that --enable-accelerator= is likely specific to ptx,
where we really just want to build a gcc and no target libraries, so
building it alongside the host in an accel-gcc subdirectory is ideal.
For your use case, I'd imagine the offload compiler would be built
relatively normally as a full build with
"--enable-as-accelerator-for=x86_64-linux", which would install it into
locations where the host will eventually be able to find it. Then the
host compiler would be built with another new configure option (as yet
unimplemented in my patch set) "--enable-offload-targets=mic,..." which
would tell the host compiler about the pre-built offload target
compilers. On the ptx side, "--enable-accelerator=ptx" would then also
add ptx to the list of --enable-offload-targets.
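To make the two-build scheme concrete, the invocations might look roughly like this; only the option names come from the discussion above, while the install prefixes (and the idea of building in a separate directory) are purely illustrative:

```shell
# Offload compiler for MIC: a relatively normal full build, installed
# where the host compiler will later be able to find it.
../gcc/configure --enable-as-accelerator-for=x86_64-linux \
    --prefix=/opt/offload-gcc
make && make install

# Host compiler: told about the pre-built offload compilers via the
# (as yet unimplemented) --enable-offload-targets option; ptx is built
# alongside in an accel-gcc subdirectory via --enable-accelerator=ptx,
# which also adds ptx to the offload target list.
../gcc/configure --enable-offload-targets=mic \
    --enable-accelerator=ptx --prefix=/opt/gcc
make && make install
```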
Naming of all these configure options can be discussed; I have no real
preference for any of them.
Bernd