This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: Un-parallelized OpenACC kernels constructs with nvptx offloading: "avoid offloading"

From: Thomas Schwinge <thomas at codesourcery dot com>
To: Bernd Schmidt <bschmidt at redhat dot com>, Jakub Jelinek <jakub at redhat dot com>
Cc: <gcc-patches at gcc dot gnu dot org>, Tom de Vries <vries at codesourcery dot com>
Date: Wed, 10 Feb 2016 18:37:55 +0100
Subject: Re: Un-parallelized OpenACC kernels constructs with nvptx offloading: "avoid offloading"
Authentication-results: sourceware.org; auth=none
References: <87r3hac1w9 dot fsf at hertz dot schwinge dot homeip dot net> <569D2059 dot 4010105 at mentor dot com> <87d1subnu5 dot fsf at hertz dot schwinge dot homeip dot net> <87a8nyawph dot fsf at hertz dot schwinge dot homeip dot net> <20160122083625 dot GL3017 at tucnak dot redhat dot com> <56A22C2E dot 6000408 at redhat dot com> <20160122132538 dot GT3017 at tucnak dot redhat dot com> <56A22F37 dot 5010505 at redhat dot com> <87zivg8rcy dot fsf at hertz dot schwinge dot homeip dot net> <87h9hg9450 dot fsf at hertz dot schwinge dot homeip dot net> <56BB3A5E dot 6000506 at redhat dot com> <87d1s48w97 dot fsf at hertz dot schwinge dot homeip dot net> <56BB56EC dot 90707 at redhat dot com> <8737t08rgi dot fsf at hertz dot schwinge dot homeip dot net> <56BB674A dot 7050401 at redhat dot com>

Hi!

On Wed, 10 Feb 2016 17:37:30 +0100, Bernd Schmidt <bschmidt@redhat.com> wrote:
> On 02/10/2016 05:23 PM, Thomas Schwinge wrote:
> > Why?  A user of GCC has no intrinsic interest in getting OpenACC kernels
> > constructs' code offloaded; the user wants his code to execute as fast as
> > possible.
> >
> > If you consider the whole of OpenACC kernels code offloading as a
> > compiler optimization, then it's fine for GCC to abort this
> > "optimization" if it's reasonably clear that this transformation (code
> > offloading) will not be profitable -- just like what GCC does with other
> > possible code optimizations/transformations.
> 
> Yes, but if a single kernel (which might not even get executed at 
> run-time) can inhibit offloading for the whole program, then we're not 
> making an intelligent decision, and IMO violating user expectations. 

Sure, I agree it's a pretty coarse-grained decision.  (That is owing to
the non-shared-memory offloading architecture -- with shared-memory
offloading, such decisions can indeed be made case by case.)

> IIUC it's also disabling offloading for parallels rather than just 
> kernels, which we previously said shouldn't happen.

Ah, you're talking about mixed OpenACC parallel/kernels codes -- I
understood the earlier discussion to apply to parallel-only codes, where
the "avoid offloading" flag will never be set.  In mixed parallel/kernels
code with one un-parallelized kernels construct, offloading would also
(have to be) disabled for the parallel constructs (for the same data
consistency reasons explained before).  The majority of codes I've seen
use either parallel or kernels constructs, typically not both.
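
To make the all-or-nothing behavior concrete, here is a minimal sketch
of the decision as described above.  This is only a toy model, not the
actual GCC/libgomp implementation; all type and function names
(struct region, avoid_offloading, runs_on_device) are hypothetical and
made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an OpenACC compute construct.  */
enum construct_kind { ACC_PARALLEL, ACC_KERNELS };

struct region
{
  enum construct_kind kind;
  /* For kernels constructs: did the auto-parallelizer succeed?
     Parallel constructs are parallelized as written by the user.  */
  bool parallelized;
};

/* Program-wide flag: set as soon as one kernels construct was left
   un-parallelized.  Parallel-only codes never set it.  */
bool
avoid_offloading (const struct region *r, size_t n)
{
  assert (r != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (r[i].kind == ACC_KERNELS && !r[i].parallelized)
      return true;
  return false;
}

/* Without shared memory, host and device copies of the data have to
   stay consistent, so in this model either every region is offloaded
   or none is; WHICH does not influence the result.  */
bool
runs_on_device (const struct region *r, size_t n, size_t which)
{
  assert (which < n);
  (void) which;
  return !avoid_offloading (r, n);
}
```

In this model, a mixed code with one parallel construct and one
un-parallelized kernels construct ends up with both regions executing
in host-fallback mode, whereas a parallel-only code is always offloaded.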

> > As I've said before,
> > profiling the execution times of several real-world codes has shown that
> > under the assumption that parloops fails to parallelize one kernel (one
> > out of possibly many), this one kernel has always been a "hot spot", and
> > avoiding offloading in this case has always helped prevent performance
> > degradation below host-fallback performance.
> 
> IMO a warning for the specific kernel that's problematic would be better 

That's something Tom suggested,
<http://news.gmane.org/find-root.php?message_id=%3C569D2059.4010105%40mentor.com%3E>,
and which motivated my patch, in going one step further:

> so that users can selectively apply -fopenacc to files where it is 
> profitable.

This puts it into the hands of the user to selectively mark kernels
constructs as suitable for GCC's current parloops processing (for
example, by disabling OpenACC/offloading on a per-file basis) -- which is
something we wanted to avoid.  The idea is that GCC will improve in the
future and will be able to handle kernels constructs better; the user
would then have to re-visit/un-do their earlier changes with each GCC
release, instead of just recompiling their code.

> > It's of course unfortunate that we have to disable our offloading
> > machinery for a lot of codes using OpenACC kernels, but given the current
> > state of OpenACC kernels parallelization analysis (parloops), doing so is
> > still profitable for a user, compared to regressed performance with
> > single-threaded offloaded execution.
> 
> How often does this occur on real-world code?

Quite a lot for code using the kernels construct, as discussed before,
given that parloops fails to handle a lot of constructs in real-world
code.

> Will we end up supporting 
> OpenACC by not doing offloading at all in the usual case?

This whole discussion does not at all apply to the body of OpenACC code
using the parallel instead of the kernels construct, which will be
parallelized/offloaded just fine.
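
For illustration, here is a minimal sketch of the same loop written with
each of the two constructs.  The function names are hypothetical, and
whether a given GCC version manages to parallelize the kernels variant
is not something this example claims; it only shows where the
responsibility lies in each case.

```c
#include <assert.h>

/* Parallel construct: the user states that the loop iterations are
   independent, so no compiler analysis is needed to parallelize it.  */
void
saxpy_parallel (int n, float a, const float *x, float *y)
{
  assert (n >= 0);
#pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
  for (int i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}

/* Kernels construct: the compiler has to determine on its own whether
   the loop can be parallelized (for example, that x and y do not
   overlap).  If that analysis fails, the loop is executed
   sequentially.  */
void
saxpy_kernels (int n, float a, const float *x, float *y)
{
  assert (n >= 0);
#pragma acc kernels copyin(x[0:n]) copy(y[0:n])
  for (int i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}
```

Compiled without -fopenacc, the pragmas are ignored and both functions
are ordinary sequential C, computing the same result.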

> The way you 
> describe it, it sounds like we should recommend that -fopenacc not be 
> used in gcc-6 and restore the previous invoke.texi language that marks 
> it as experimental.

Huh?  Like, at random, discouraging users from using GCC's SIMD
vectorizer just because that one fails to vectorize some code that it
could/should vectorize?  (Of course, I'm well aware that GCC's SIMD
vectorizer is much more mature than the OpenACC kernels/parloops
handling; it's seen many more years of development.)

Certainly we should document that there is still a lot of room for
improvement in OpenACC kernels handling (just like it's the case for a
lot of other generic compiler optimizations) -- and we're doing exactly
that on <https://gcc.gnu.org/wiki/OpenACC>.  I don't follow how that
translates to discouraging use of -fopenacc however?

Grüße
 Thomas
