This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH, OpenACC] Make reduction arguments addressable
- From: Chung-Lin Tang <cltang at codesourcery dot com>
- To: Thomas Schwinge <thomas at codesourcery dot com>, Jakub Jelinek <jakub at redhat dot com>, Cesar Philippidis <cesar_philippidis at mentor dot com>
- Cc: gcc-patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 31 May 2016 17:51:00 +0800
- Subject: Re: [PATCH, OpenACC] Make reduction arguments addressable
- Authentication-results: sourceware.org; auth=none
- References: <fac841d6-320a-6de8-2bcd-07145151d0f3 at codesourcery dot com> <20160530165341 dot GS28550 at tucnak dot redhat dot com> <878tyqsn32 dot fsf at kepler dot schwinge dot homeip dot net>
On 2016/5/31 3:28 PM, Thomas Schwinge wrote:
> Hi!
>
> On Mon, 30 May 2016 18:53:41 +0200, Jakub Jelinek <jakub@redhat.com> wrote:
>> On Mon, May 30, 2016 at 10:38:59PM +0800, Chung-Lin Tang wrote:
>>> Hi, a previous patch of Cesar's has made the middle-end omp-lowering
>>> automatically create and insert a tofrom (i.e. present_or_copy) map for
>>> parallel reductions. This allowed the user to not need explicit
>>> clauses to copy out the reduction result, but because reduction arguments
>>> are not marked addressable, async does not work as expected,
>>> i.e. the asynchronous copy-out results are not used in the compiler generated code.
>>
>> If you need it only for async parallel/kernels? regions, can't you do that
>> only for those and not for others?
That is achievable, but not in line with how we currently treat all other
data clause OMP_CLAUSE_MAPs, which are all marked addressable. Is this special
case handling really better here?
> Also, please add comments to the source code to document the need for
> such special handling.
>
>>> This patch fixes this in the front-ends, I've tested this patch without
>>> new regressions, and fixes some C++ OpenACC tests that regressed after
>>> my last OpenACC async patch. Is this okay for trunk?
>>
>> Testcases in the testsuite or others? If the latter, we should add them.
>
> The r236772 commit "[PATCH, libgomp] Rewire OpenACC async",
> <http://news.gmane.org/find-root.php?message_id=%3C56FA4F69.1060101%40codesourcery.com%3E>
> regressed (or, triggered/exposed the existing wrong behavior?)
> libgomp.oacc-c++/template-reduction.C execution testing for nvptx
> offloading. (Please always send email about such known regressions, and
> XFAIL them with your commit -- that would have saved me an hour
> yesterday, when I bisected recent changes to figure out why that test
> suddenly fails.)
Sorry, Thomas. I was going to quickly send this follow-up patch, so glossed over XFAILing.
> For reference, here is a test case, a reduced C version of
> libgomp/testsuite/libgomp.oacc-c++/template-reduction.C. This test case
> works (without Chung-Lin's "[PATCH, OpenACC] Make reduction arguments
> addressable" patch) if I enable "POCs", which surprises me a bit, because
> I thought after Cesar's recent changes, the gimplifier is doing the same
> thing of adding a data clause next to the reduction clause. Probably
> it's not doing the exactly same thing, though. Should it? Cesar, do you
> have any comments on this? For example (just guessing), should
> TREE_ADDRESSABLE be set where the gimplifier does its work, instead of in
> the three front ends?
There's really nothing wrong about Cesar's patch. The marking addressable needs
to be done earlier, or it may be too late during gimplification. I already
tried to fix this in gimplify.c before, but didn't completely work.
I'll add more testcases for this before I commit any final patches.
Thanks,
Chung-Lin
>
> // Reduced C version of libgomp/testsuite/libgomp.oacc-c++/template-reduction.C.
>
> const int n = 100;
>
> // Check present and async and an explicit firstprivate
>
> int
> async_sum (int c)
> {
> int s = 0;
>
> #define POCs //present_or_copy(s)
> #pragma acc parallel loop num_gangs (10) gang reduction (+:s) POCs firstprivate (c) async
> for (int i = 0; i < n; i++)
> s += i+c;
>
> #pragma acc wait
>
> return s;
> }
>
> int
> main()
> {
> int result = 0;
>
> for (int i = 0; i < n; i++)
> {
> result += i+1;
> }
>
> if (async_sum (1) != result)
> __builtin_abort ();
>
> return 0;
> }
>
>
> GrÃÃe
> Thomas
>