Bug 63979 - [openacc] undefined reference to main._omp_fn.x
Status: RESOLVED WORKSFORME
Alias: None
Product: gcc
Classification: Unclassified
Component: other
Version: unknown
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Keywords: openacc
 
Reported: 2014-11-19 20:02 UTC by Tom de Vries
Modified: 2015-01-20 15:43 UTC

Attachments
asyncwait-2.c (17.00 KB, text/x-csrc)
2014-11-19 20:02 UTC, Tom de Vries
ltrans0.s (1.43 KB, text/plain)
2014-11-20 21:38 UTC, Tom de Vries
ltrans1.s (5.14 KB, text/plain)
2014-11-20 21:39 UTC, Tom de Vries

Description Tom de Vries 2014-11-19 20:02:58 UTC
Created attachment 34041
asyncwait-2.c

When compiling an OpenACC test-case with the gomp-4_0-branch, I run into:
...
$ ./lean-c/install/bin/gcc asyncwait-2.c -fopenacc -flto --param lto-min-partition=900
/tmp/ccRhKrwN.ltrans0.ltrans.o:(__gnu_offload_funcs+0x0): undefined reference to `main._omp_fn.20'
/tmp/ccRhKrwN.ltrans0.ltrans.o:(__gnu_offload_funcs+0x8): undefined reference to `main._omp_fn.19'
/tmp/ccRhKrwN.ltrans1.ltrans.o: In function `main':
ccRhKrwN.ltrans1.o:(.text+0xe19): undefined reference to `cuStreamCreate'
collect2: error: ld returned 1 exit status
...

Note that I'm using the patch https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00085.html on top of gomp-4_0-branch; otherwise any OpenACC testcase fails in LTO when processing OpenACC builtins.
Comment 1 Tom de Vries 2014-11-19 20:05:41 UTC
I only run into this with -flto-partition=balanced.

From the exe.wpa.000i.cgraph dump:
...
Total unit size: 2034, partition size: 1000
Step 0: added main._omp_fn.0/24, size 22, cost 1/0 best 1/0, step 0
Step 1: added main._omp_fn.1/23, size 44, cost 2/0 best 2/0, step 1
  ...
Step 17: added main._omp_fn.17/7, size 694, cost 18/0 best 18/0, step 17
Step 18: added main._omp_fn.18/6, size 735, cost 19/0 best 19/0, step 18
Step 19: added main._omp_fn.19/5, size 775, cost 20/0 best 19/0, step 18
Step 20: added main._omp_fn.20/4, size 816, cost 21/0 best 19/0, step 18
Step 21: added main/3, size 2034, cost 53/21 best 19/0, step 18
Unwinding 3 insertions to step 18
New partition
Step 19: added main._omp_fn.19/5, size 40, cost 1/21 best 1/21, step 19
Step 20: added main._omp_fn.20/4, size 81, cost 2/21 best 2/21, step 20
Step 21: added main/3, size 1299, cost 72/23 best 2/21, step 20
Privatizing symbol name: main._omp_fn.0 -> main._omp_fn.0.lto_priv.0
Promoting as hidden: main._omp_fn.0
Privatizing symbol name: main._omp_fn.1 -> main._omp_fn.1.lto_priv.1
Promoting as hidden: main._omp_fn.1
...
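
To read off which symbols end up in which partition, the dump above can be replayed mechanically. The following is a minimal sketch, assuming the log format shown in this excerpt ("Step N: added NAME/ID, ...", "Unwinding K insertions to step S", "New partition"). The function name partitions_from_dump is made up for illustration, and the elided steps are simply absent from the result.

```python
import re

# Patterns for the three line kinds that matter in the excerpt above.
STEP = re.compile(r"^Step \d+: added (\S+)/\d+,")
UNWIND = re.compile(r"^Unwinding (\d+) insertions to step \d+")

# Excerpt copied from the cgraph dump in this comment (elided steps stay elided).
DUMP = """\
Step 0: added main._omp_fn.0/24, size 22, cost 1/0 best 1/0, step 0
Step 1: added main._omp_fn.1/23, size 44, cost 2/0 best 2/0, step 1
  ...
Step 17: added main._omp_fn.17/7, size 694, cost 18/0 best 18/0, step 17
Step 18: added main._omp_fn.18/6, size 735, cost 19/0 best 19/0, step 18
Step 19: added main._omp_fn.19/5, size 775, cost 20/0 best 19/0, step 18
Step 20: added main._omp_fn.20/4, size 816, cost 21/0 best 19/0, step 18
Step 21: added main/3, size 2034, cost 53/21 best 19/0, step 18
Unwinding 3 insertions to step 18
New partition
Step 19: added main._omp_fn.19/5, size 40, cost 1/21 best 1/21, step 19
Step 20: added main._omp_fn.20/4, size 81, cost 2/21 best 2/21, step 20
Step 21: added main/3, size 1299, cost 72/23 best 2/21, step 20
"""


def partitions_from_dump(text):
    """Replay the partitioner log; return one list of symbol names per partition."""
    parts = [[]]
    for raw in text.splitlines():
        line = raw.strip()
        step = STEP.match(line)
        unwind = UNWIND.match(line)
        if step:
            # A symbol was tentatively added to the current partition.
            parts[-1].append(step.group(1))
        elif unwind:
            # The last K tentative insertions are taken back.
            count = int(unwind.group(1))
            if count:
                del parts[-1][-count:]
        elif line == "New partition":
            parts.append([])
    return parts


for index, symbols in enumerate(partitions_from_dump(DUMP)):
    print(index, symbols)
```

On this excerpt the replay puts main._omp_fn.18 in partition 0 and main._omp_fn.19, main._omp_fn.20 and main in partition 1, which matches the ltrans0.s and ltrans1.s observations below.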

In .exe.ltrans0.s, main._omp_fn.18 is privatized, but exported as global hidden:
...
        .text
        .globl  main._omp_fn.18.lto_priv.18
        .hidden main._omp_fn.18.lto_priv.18
        .type   main._omp_fn.18.lto_priv.18, @function
main._omp_fn.18.lto_priv.18:
...

In .exe.ltrans1.s, it is referenced, and declared as hidden:
...
        .hidden main._omp_fn.18.lto_priv.18
...

Conversely, in .exe.ltrans1.s, main._omp_fn.20 is not privatized:
...
        .type   main._omp_fn.20, @function
main._omp_fn.20:
...

But in .exe.ltrans0.s, main._omp_fn.20 is referenced, and not declared:
...
.omp_func_table.4851:
        .quad   main._omp_fn.20
...
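
The mismatch can also be spotted by scanning the ltrans assembly files directly. This is a rough sketch under simplifying assumptions: it only considers labels, `.globl` directives and `.quad` references, which is far less than what the linker actually checks. The helper names scan and unresolved_refs are hypothetical, and the inputs are just the fragments quoted in this comment.

```python
import re

LABEL = re.compile(r"^([A-Za-z_.$][\w.$]*):")
GLOBL = re.compile(r"^\s*\.globl\s+(\S+)")
QUAD = re.compile(r"^\s*\.quad\s+([A-Za-z_.$][\w.$]*)")

# Fragments quoted above; not the full attachments.
FILES = {
    "ltrans0.s": """\
        .text
        .globl  main._omp_fn.18.lto_priv.18
        .hidden main._omp_fn.18.lto_priv.18
        .type   main._omp_fn.18.lto_priv.18, @function
main._omp_fn.18.lto_priv.18:
.omp_func_table.4851:
        .quad   main._omp_fn.20
""",
    "ltrans1.s": """\
        .hidden main._omp_fn.18.lto_priv.18
        .type   main._omp_fn.20, @function
main._omp_fn.20:
""",
}


def scan(text):
    """Return (defined labels, .globl symbols, .quad-referenced symbols)."""
    defined, exported, referenced = set(), set(), set()
    for line in text.splitlines():
        label = LABEL.match(line)
        globl = GLOBL.match(line)
        quad = QUAD.match(line)
        if label:
            defined.add(label.group(1))
        elif globl:
            exported.add(globl.group(1))
        elif quad:
            referenced.add(quad.group(1))
    return defined, exported, referenced


def unresolved_refs(files):
    """Map file name -> symbols referenced there but not visible to that file."""
    info = {name: scan(text) for name, text in files.items()}
    # A symbol is visible across files only if it is both defined and .globl.
    visible = set()
    for defined, exported, _ in info.values():
        visible |= defined & exported
    result = {}
    for name, (defined, _, referenced) in info.items():
        missing = sorted(referenced - defined - visible)
        if missing:
            result[name] = missing
    return result


print(unresolved_refs(FILES))
```

On the fragments above, only main._omp_fn.20 in ltrans0.s is reported: it is defined in ltrans1.s but without `.globl`, so the reference from the function table in ltrans0.s cannot be resolved.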
Comment 2 Ilya Verbin 2014-11-20 11:09:05 UTC
I tried to reproduce this issue using trunk gcc and OpenMP:

gcc -fopenmp -flto -flto-partition=balanced -lgfortran -save-temps libgomp/testsuite/libgomp.fortran/target2.f90

But all functions are privatized. For example, __target2_MOD_foo._omp_fn.3.lto_priv.5 is exported as global hidden in partition 1 and referenced in the offload table in partition 0, as planned.

We should figure out why in your case main._omp_fn.19 and main._omp_fn.20 were not marked as global...
Comment 3 Bernd Schmidt 2014-11-20 15:27:02 UTC
Can you reproduce this with the trunk patch kit I posted internally? gomp-4_0-branch is somewhat out of date wrt offloading.
Comment 4 Tom de Vries 2014-11-20 17:26:57 UTC
> Can you reproduce this with the trunk patch kit I posted internally?
> gomp-4_0-branch is somewhat out of date wrt offloading.

No, it does not reproduce that way.

The split falls somewhat differently:
...
Total unit size: 1929, partition size: 900
Step 0: added main._omp_fn.20/24, size 37, cost 1/0 best 1/0, step 0
Step 1: added main._omp_fn.19/23, size 73, cost 2/0 best 2/0, step 1
Step 2: added main._omp_fn.18/22, size 110, cost 3/0 best 3/0, step 2
  ...
Step 17: added main._omp_fn.3/7, size 660, cost 18/0 best 18/0, step 17
Step 18: added main._omp_fn.2/6, size 696, cost 19/0 best 18/0, step 17
Step 19: added main._omp_fn.1/5, size 714, cost 20/0 best 18/0, step 17
Step 20: added main._omp_fn.0/4, size 732, cost 21/0 best 18/0, step 17
Step 21: added main/3, size 1929, cost 53/21 best 18/0, step 17
Unwinding 4 insertions to step 17
New partition
Step 18: added main._omp_fn.2/6, size 36, cost 1/21 best 1/21, step 18
Step 19: added main._omp_fn.1/5, size 54, cost 2/21 best 2/21, step 19
Step 20: added main._omp_fn.0/4, size 72, cost 3/21 best 3/21, step 20
Step 21: added main/3, size 1269, cost 71/24 best 3/21, step 20
...

But I think the main difference is that the offload table and main (using the offload table) are now in the same partition. I don't know whether that's by design or accident.
Comment 5 Ilya Verbin 2014-11-20 17:34:17 UTC
(In reply to vries from comment #4)
> But I think the main difference is that the offload table and main (using
> the offload table) are now in the same partition. I don't know whether
> that's by design or accident.

What do you mean by "main (using the offload table)"?
The design was to have the offload table in the first partition (number zero), and the table should be used only in libgomp through the GOMP_offload_register function.
Comment 6 Tom de Vries 2014-11-20 21:38:27 UTC
Created attachment 34058
ltrans0.s
Comment 7 Tom de Vries 2014-11-20 21:39:02 UTC
Created attachment 34059
ltrans1.s
Comment 8 Tom de Vries 2014-11-20 21:42:17 UTC
(In reply to Ilya Verbin from comment #5)
> (In reply to vries from comment #4)
> > But I think the main difference is that the offload table and main (using
> > the offload table) are now in the same partition. I don't know whether
> > that's by design or accident.
> 
> What do you mean by "main (using the offload table)"?
> The design was to have the offload table in the first partition (number
> zero),

It seems to be in partition 1:
...
$ grep -c OFFLOAD_TABLE ltrans1.s ltrans0.s 
ltrans1.s:33
ltrans0.s:0
...

> and the table should be used only in libgomp through the
> GOMP_offload_register function.

It's used like this, in main:
...
	movl	$__OFFLOAD_TABLE__, %esi
...
Comment 9 Ilya Verbin 2014-11-20 23:38:56 UTC
(In reply to vries from comment #8)
> (In reply to Ilya Verbin from comment #5)
> > (In reply to vries from comment #4)
> > > But I think the main difference is that the offload table and main (using
> > > the offload table) are now in the same partition. I don't know whether
> > > that's by design or accident.
> > What do you mean by "main (using the offload table)"?
> It's used like this, in main:
> 	movl	$__OFFLOAD_TABLE__, %esi

Ah, I see. This is something OpenACC-specific: for some reason it passes __OFFLOAD_TABLE__ to all functions.
Anyway, this is just a weak symbol that points to the start of the offload table. It's defined by mkoffload when all partitions are ready. I don't think it could affect the LTO partitioning or the functions' visibility.
Comment 10 Tom de Vries 2015-01-20 15:43:10 UTC
Can't reproduce this with current trunk.