This is the mail archive of the
mailing list for the GCC project.
Re: nvptx offloading patches [2/n]
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: Bernd Schmidt <bernds at codesourcery dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 19 Feb 2015 14:38:48 +0100
- Subject: Re: nvptx offloading patches [2/n]
- Authentication-results: sourceware.org; auth=none
- References: <5454C944 dot 4090107 at codesourcery dot com> <20150204105554 dot GL1746 at tucnak dot redhat dot com> <CAFiYyc090-qas3vPUgLxQ_LHtZPvy5ee5NK3+BHPQLDVcBbfpA at mail dot gmail dot com> <54E36E30 dot 8040408 at codesourcery dot com> <20150217171001 dot GZ1746 at tucnak dot redhat dot com> <54E3AAC4 dot 2050601 at codesourcery dot com> <20150219130929 dot GG1746 at tucnak dot redhat dot com>
On Thu, Feb 19, 2015 at 2:09 PM, Jakub Jelinek <firstname.lastname@example.org> wrote:
> On Tue, Feb 17, 2015 at 09:55:32PM +0100, Bernd Schmidt wrote:
>> On 02/17/2015 06:10 PM, Jakub Jelinek wrote:
>> >What exact testcase are you trying to fix with this patch, and how do you
>> >think offloading of code using va_list can work?
>> The exact testcase is any offloaded program - streaming in lto will crash if
>> there is a mismatch in these preloaded nodes.
>> For OpenACC programs using va_list - I don't expect them to work at all. I
>> don't believe the spec considers such issues, and ptx isn't expected to
>> support variadic functions in the first place ("The current version of PTX
>> does not support variadic functions" is what the spec has to say; the gcc
>> port overachieves a little by implementing them anyway).
> How do you support printf etc.? Those are all varargs functions.
> Anyway, could following untested patch be used as a temporary hack?
> It might pessimize for GCC5 slightly the intelmic offloading, but I hope
> the way forward is the stdarg late lowering for GCC 6.
> Richard on IRC said it might be better for the lto_stream_offload_p
> path to whitelist nodes it has to preload rather than blacklist the ones
> that it doesn't, otherwise e.g. if you try to offload from say x86_64-mingw
> with 32-bit long, but 64-bit pointers to 64-bit intelmic, preloading
> the long_type_node will certainly break lots of things, while not preloading
> them would only be problematic for builtins, we'll need some pass over the
> builtins in the IL in any case, to find out if they are compatible or not,
> adjust if needed and give up otherwise. But I'd hope it can be worked on
> incrementally, if this patch (plus the approved nvptx offloading patches,
> plus mode_table streaming) makes the nvptx offloading work.
The patch works for me if it helps anything.
> 2015-02-19 Bernd Schmidt <email@example.com>
> Jakub Jelinek <firstname.lastname@example.org>
> * tree-streamer.c (preload_common_nodes): Don't preload
> TI_VA_LIST* for offloading.
> * tree-stdarg.c (pass_stdarg::gate): Disable for ACCEL_COMPILER
> --- gcc/tree-streamer.c.jj 2015-02-18 12:36:20.000000000 +0100
> +++ gcc/tree-streamer.c 2015-02-19 13:57:26.089626006 +0100
> @@ -342,7 +342,14 @@ preload_common_nodes (struct streamer_tr
> && i != TI_TARGET_OPTION_DEFAULT
> && i != TI_TARGET_OPTION_CURRENT
> && i != TI_CURRENT_TARGET_PRAGMA
> - && i != TI_CURRENT_OPTIMIZE_PRAGMA)
> + && i != TI_CURRENT_OPTIMIZE_PRAGMA
> + /* Skip va_list* related nodes if offloading. For native LTO
> + we want them to be merged for the stdarg pass, for offloading
> + they might not be identical between host and offloading target. */
> + && (!lto_stream_offload_p
> + || (i != TI_VA_LIST_TYPE
> + && i != TI_VA_LIST_GPR_COUNTER_FIELD
> + && i != TI_VA_LIST_FPR_COUNTER_FIELD)))
> record_common_node (cache, global_trees[i]);
> --- gcc/tree-stdarg.c.jj 2015-02-18 22:55:52.000000000 +0100
> +++ gcc/tree-stdarg.c 2015-02-19 14:00:11.882905823 +0100
> @@ -705,6 +705,13 @@ public:
> virtual bool gate (function *fun)
> return (flag_stdarg_opt
> +#ifdef ACCEL_COMPILER
> + /* Disable for GCC5 in the offloading compilers, as
> + va_list and gpr/fpr counter fields are not merged.
> + In GCC6 when stdarg is lowered late this shouldn't be
> + an issue. */
> + && !in_lto_p
> /* This optimization is only for stdarg functions. */
> && fun->stdarg != 0);