nvptx offloading patches [1/n]

Mon Nov 3 22:22:00 GMT 2014

On 11/01/14 05:47, Bernd Schmidt wrote:
> This is one of the patches required to make offloading via the LTO path
> work when the machines involved differ.
>
> x86 requires bigger alignments for some types than nvptx does, which
> becomes an issue when reading LTO produced by the host compiler. The
> problem with having a variable with DECL_ALIGN larger than the stack
> alignment is that gcc will try to align the variable dynamically with an
> alloca/rounding operation, and there isn't a working alloca on nvptx.
> Besides, the overhead would be pointless.
>
> The patch below restricts the alignments to the maximum possible when
> reading in LTO data in an offload compiler. Unfortunately
> BIGGEST_ALIGNMENT isn't suitable for this, as it can vary at runtime
> with attribute((target)), and because vector modes can exceed it, so a
> limit based on BIGGEST_ALIGNMENT would be unsuitable for some ports.
> Instead I've added a hook called limit_offload_alignment which is called
> when reading LTO on an offload compiler. It does nothing anywhere except
> on ptx where it limits alignments to 64 bits.
>
> Bootstrapped and tested on x86_64-linux. Ok?
Not ideal.

Doesn't this affect our ability to pass data back and forth between the 
host and GPU?  Or is this strictly a problem with stack objects and thus 
lives entirely on the GPU?
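
To make the question concrete, here is a minimal sketch of the clamp as I
read the description above. It is not the code from the patch. Every name
(sketch_limit_offload_alignment, sketch_needs_dynamic_realign,
SKETCH_OFFLOAD_MAX_ALIGN_BITS) is hypothetical, alignments are in bits as
with DECL_ALIGN, and the 64-bit stack alignment used in the usage note is
an assumption for illustration only.

```c
#include <assert.h>

/* Hypothetical stand-in for the nvptx limit mentioned in the patch
   description (alignments are expressed in bits).  */
#define SKETCH_OFFLOAD_MAX_ALIGN_BITS 64u

/* Hypothetical helper, NOT the real limit_offload_alignment hook:
   clamp an alignment read from host-produced LTO data to what the
   offload target can honour without dynamic realignment.  */
static unsigned int
sketch_limit_offload_alignment (unsigned int align_bits)
{
  /* Alignments are nonzero powers of two.  */
  assert (align_bits != 0 && (align_bits & (align_bits - 1)) == 0);

  return (align_bits > SKETCH_OFFLOAD_MAX_ALIGN_BITS
          ? SKETCH_OFFLOAD_MAX_ALIGN_BITS
          : align_bits);
}

/* Nonzero if a stack variable with this alignment would need the
   alloca/rounding sequence, i.e. it is more aligned than the stack.  */
static int
sketch_needs_dynamic_realign (unsigned int decl_align_bits,
                              unsigned int stack_align_bits)
{
  return decl_align_bits > stack_align_bits;
}
```

With an assumed 64-bit stack alignment, a 128-bit-aligned host variable
would need dynamic realignment before the clamp and would not after it.
The sketch only touches the alignment recorded on the variable; it says
nothing about whether host and GPU still agree on the layout of shared
data, which is the part I'm asking about.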

jeff