This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [gomp4 11/14] libgomp: avoid variable-length stack allocation in team.c
- From: Jakub Jelinek <jakub at redhat dot com>
- To: Alexander Monakov <amonakov at ispras dot ru>
- Cc: gcc-patches at gcc dot gnu dot org, Dmitry Melnik <dm at ispras dot ru>
- Date: Wed, 21 Oct 2015 11:58:23 +0200
- Subject: Re: [gomp4 11/14] libgomp: avoid variable-length stack allocation in team.c
- Authentication-results: sourceware.org; auth=none
- References: <1445366076-16082-1-git-send-email-amonakov at ispras dot ru> <1445366076-16082-12-git-send-email-amonakov at ispras dot ru>
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
On Tue, Oct 20, 2015 at 09:34:33PM +0300, Alexander Monakov wrote:
> NVPTX does not support alloca or variable-length stack allocations, thus
> heap allocation needs to be used instead. I've opted to make this a generic
> change instead of guarding it with an #ifdef: libgomp usually leaves thread
> stack size up to libc, so avoiding unbounded stack allocation makes sense.
>
> * task.c (GOMP_task): Use a fixed-size on-stack buffer or a heap
> allocation instead of a variable-size on-stack allocation.
I don't like this unconditionally.
This really isn't unbounded, the buffer just contains the privatized
variables. If one uses
int c[124];
void foo (void)
{
int a, b[10];
#pragma omp parallel firstprivate (a, b, c)
{
use (a, b, c);
}
}
then the private copies of the variables are allocated on the stack too,
a, b already in addition to the original non-privatized vars a and b,
c, which has been above a global var, is automatic just in the private copy.
Now, for #pragma omp task firstprivate (a, b, c) if there are copy constructors
involved, the copy ctors for the firstprivate vars need to be run before
GOMP_task returns, and therefore we let those variables live in the heap
rather than on the stack; but if we know the task needs to execute immediately,
with the alloca we do pretty much the same thing as parallel does, all the
privatized variables are allocated on the stack (just all of them together
using alloca instead of individually by the compiler).
I'm fine with temporarily having some #ifdef HAVE_BROKEN_ALLOCA or similar,
but as nvptx I think doesn't support setjmp/longjmp nor computed goto,
I think just supporting alloca by using malloc instead and freeing at the
end of function if any allocations happened in the function is the right
thing.
> ---
> libgomp/task.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/libgomp/task.c b/libgomp/task.c
> index 74920d5..ffb7ed2 100644
> --- a/libgomp/task.c
> +++ b/libgomp/task.c
> @@ -162,11 +162,16 @@ GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) (void *, void *),
> thr->task = &task;
> if (__builtin_expect (cpyfn != NULL, 0))
> {
> - char buf[arg_size + arg_align - 1];
> + long buf_size = arg_size + arg_align - 1;
> + char buf_fixed[2048], *buf = buf_fixed;
> + if (sizeof(buf_fixed) < buf_size)
> + buf = gomp_malloc (buf_size);
> char *arg = (char *) (((uintptr_t) buf + arg_align - 1)
> & ~(uintptr_t) (arg_align - 1));
> cpyfn (arg, data);
> fn (arg);
> + if (buf != buf_fixed)
> + free (buf);
> }
> else
> fn (data);
Jakub