This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [gomp] Move openacc vector& worker single handling to RTL
- From: Nathan Sidwell <nathan at acm dot org>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 07 Jul 2015 10:43:05 -0400
- Subject: Re: [gomp] Move openacc vector& worker single handling to RTL
- References: <5597120D dot 2080308 at acm dot org> <20150703231159 dot GP10247 at tucnak dot redhat dot com> <559844EF dot 6010208 at acm dot org> <559AD85B dot 2050102 at acm dot org> <20150707095408 dot GD10247 at tucnak dot redhat dot com> <559BDE68 dot 9010302 at acm dot org> <20150707142229 dot GG10247 at tucnak dot redhat dot com>
On 07/07/15 10:22, Jakub Jelinek wrote:
> On Tue, Jul 07, 2015 at 10:12:56AM -0400, Nathan Sidwell wrote:
> Wouldn't function attributes be better for that case, and just use the internal
> functions for the case when the mode is being changed in the middle of
> function?
It may be. I've been thinking about how the top-level offloaded function (the
kernel) should be marked to specify gang/worker/vector dimensions, to allow a
less device-specific launch mechanism. I suspect kernels and routines will end
up with similar solutions.
> I agree that fork/join might be less confusing.
> BTW, where do you plan to lower the internal functions for non-PTX?
> Doing it in RTL mach reorg is too late for those, we shouldn't be writing it
> for each single target, as for non-PTX (perhaps non-HSA) I bet the behavior
> is the same.
I suspect other devices can add a new device-specific lowering pass somewhere
soon after the LTO readback. I think we're going to need that pass for some
other pieces of PTX.
FWIW on a device that has a PTX-like architecture, I think this specific piece
should be done as late as possible. Perhaps pieces of the PTX mach-dep-reorg
can be abstracted for general use?
nathan
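
For readers unfamiliar with the gang/worker/vector dimensions discussed above, the following is a minimal source-level sketch of where they show up in OpenACC C code. It is not taken from the thread or from the patch. The function names (scale_row, scale_matrix) and the launch sizes (32 gangs, 8 workers, vector length 32) are hypothetical and chosen only for illustration. How GCC records this information internally (function attributes versus internal functions) is exactly the open question in the messages above, and the sketch says nothing about that.

```c
#include <stddef.h>

/* Hypothetical device routine.  The "vector" clause tells the compiler
   that this routine may contain vector-partitioned loops, so callers
   must not already be running in vector-partitioned mode.  */
#pragma acc routine vector
void scale_row(float *row, size_t n, float factor)
{
#pragma acc loop vector
    for (size_t i = 0; i < n; i++)
        row[i] *= factor;
}

/* Hypothetical top-level offloaded region (the "kernel" in the
   discussion).  The num_gangs/num_workers/vector_length clauses give
   the launch dimensions; the values here are illustrative only.  */
void scale_matrix(float *m, size_t rows, size_t cols, float factor)
{
#pragma acc parallel loop gang worker \
    num_gangs(32) num_workers(8) vector_length(32) \
    copy(m[0:rows * cols])
    for (size_t r = 0; r < rows; r++)
        scale_row(m + r * cols, cols, factor);
}
```

Built without OpenACC support the pragmas are ignored and the code runs sequentially on the host, which makes the sketch easy to try; with an OpenACC-enabled compiler the same source describes a gang/worker-partitioned outer loop calling a vector routine.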