This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [gomp-nvptx 2/9] nvptx backend: new "uniform SIMT" codegen variant
- From: Alexander Monakov <amonakov at ispras dot ru>
- To: Nathan Sidwell <nathan at acm dot org>
- Cc: Jakub Jelinek <jakub at redhat dot com>, gcc-patches at gcc dot gnu dot org, Bernd Schmidt <bschmidt at redhat dot com>, Dmitry Melnik <dm at ispras dot ru>, Thomas Schwinge <thomas at codesourcery dot com>
- Date: Wed, 2 Dec 2015 17:41:01 +0300 (MSK)
- References: <1448983707-18854-1-git-send-email-amonakov at ispras dot ru> <1448983707-18854-3-git-send-email-amonakov at ispras dot ru> <20151202104034 dot GG5675 at tucnak dot redhat dot com> <565EEBF7 dot 8070105 at acm dot org>
On Wed, 2 Dec 2015, Nathan Sidwell wrote:
> On 12/02/15 05:40, Jakub Jelinek wrote:
> > Don't know the HW good enough, is there any power consumption, heat etc.
> > difference between the two approaches? I mean does the HW consume different
> > amount of power if only one thread in a warp executes code and the other
> > threads in the same warp just jump around it, vs. having all threads busy?
>
> Having all threads busy will increase power consumption.
Is that from general principles (i.e. "if it doesn't increase power
consumption, the GPU is poorly optimized"), or is it based on specific
knowledge of how existing GPUs operate (presumably reverse-engineered or
privately communicated -- I've never seen any public statements on this
point)?

The only case I can be certain of is instructions that go to the SFU rather
than the normal SPs -- but those are relatively rare.
> It's also bad if the other vectors are executing memory access instructions.
How so? The memory accesses are the same regardless of whether you are
reading the same data from 1 thread or from 32 synchronous threads.
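
To make the memory-access point concrete, here is a minimal host-side sketch
in C. It is a toy model, not a description of any particular GPU: it assumes
that a warp-wide load is serviced as one transaction per distinct aligned
segment touched by the active lanes, and the 128-byte segment size is an
assumption chosen for illustration. All function names are hypothetical. The
model says nothing about power consumption.

```c
#include <assert.h>
#include <stdint.h>

#define WARP_SIZE     32
#define SEGMENT_BYTES 128u  /* assumed transaction granularity, illustrative only */

/* Count the distinct aligned segments touched by the active lanes of one warp. */
unsigned count_segments(const uintptr_t addr[WARP_SIZE], uint32_t active_mask)
{
    uintptr_t seen[WARP_SIZE];
    unsigned n = 0;

    for (int lane = 0; lane < WARP_SIZE; lane++) {
        if (!(active_mask & (UINT32_C(1) << lane)))
            continue;                       /* inactive lane issues no access */
        uintptr_t seg = addr[lane] / SEGMENT_BYTES;
        unsigned i;
        for (i = 0; i < n; i++)
            if (seen[i] == seg)
                break;                      /* segment already counted */
        if (i == n)
            seen[n++] = seg;
    }
    return n;
}

/* Every lane reads the same address (the "uniform" case discussed above). */
unsigned uniform_load_segments(uintptr_t address, uint32_t active_mask)
{
    uintptr_t addr[WARP_SIZE];
    for (int lane = 0; lane < WARP_SIZE; lane++)
        addr[lane] = address;
    return count_segments(addr, active_mask);
}

/* Lane i reads base + i * stride, for contrast with the uniform case. */
unsigned strided_load_segments(uintptr_t base, uintptr_t stride, uint32_t active_mask)
{
    uintptr_t addr[WARP_SIZE];
    for (int lane = 0; lane < WARP_SIZE; lane++)
        addr[lane] = base + (uintptr_t)lane * stride;
    return count_segments(addr, active_mask);
}

/* Under this model, 1 active lane and 32 active lanes cost the same. */
void check_uniform_claim(void)
{
    assert(uniform_load_segments(0x1000, 0x00000001u) ==
           uniform_load_segments(0x1000, 0xFFFFFFFFu));
}
```

Calling check_uniform_claim() from a main() passes under this model: a load
of one address touches a single segment whether 1 lane or all 32 lanes are
active. Only when lanes read different, widely spaced addresses (for example
a 128-byte stride) does the segment count grow with the number of active
lanes.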
Alexander