[gomp-nvptx 2/9] nvptx backend: new "uniform SIMT" codegen variant

Nathan Sidwell nathan@acm.org
Wed Dec 2 15:18:00 GMT 2015


On 12/02/15 10:12, Jakub Jelinek wrote:

> If we have a reasonable IPA pass to discover which addressable variables can
> be shared by multiple threads and which can't, then we could use soft-stack
> for those that can be shared by multiple PTX threads (different warps, or
> same warp, different threads in it), then we shouldn't need to copy any
> stack, just broadcast the scalar vars.

Note that the current scalar (.reg) broadcasting uses the whole live register
set, not the subset of it that is actually read within the partitioned region.
Restricting it to that subset would be a relatively straightforward
optimization, I think.
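
To make the idea concrete, here is a rough sketch of the narrowing, written as a
toy model in Python rather than as backend code. Everything in it is
hypothetical: the Insn type, the regs_to_broadcast helper and the register
names are made up for illustration and are not taken from the nvptx backend.
It assumes we already have the live set at region entry and a flat list of the
instructions in the partitioned region.

```python
from typing import Iterable, NamedTuple, Set


class Insn(NamedTuple):
    """Toy instruction (hypothetical): registers it writes and reads."""
    defs: frozenset
    uses: frozenset


def regs_to_broadcast(live_in: Iterable[str], region: Iterable[Insn]) -> Set[str]:
    """Return the live registers that some instruction in the region reads."""
    read_in_region = set()
    for insn in region:
        read_in_region |= insn.uses
    # Only registers that are both live at entry and read inside the
    # region need broadcasting; the rest of the live set can be skipped.
    return set(live_in) & read_in_region


# Made-up example: four registers are live at region entry,
# but the region only reads two of them.
live_in = {"%r1", "%r2", "%r3", "%r4"}
region = [
    Insn(defs=frozenset({"%r5"}), uses=frozenset({"%r1", "%r2"})),
    Insn(defs=frozenset({"%r6"}), uses=frozenset({"%r5"})),
]
to_broadcast = regs_to_broadcast(live_in, region)
```

In this example %r3 and %r4 are live across the region but never read inside
it, so only %r1 and %r2 would be broadcast. %r5 is read but is defined inside
the region and is not in the live set, so it is not broadcast either.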

nathan


