Compiling gfortran -fopenmp libgomp.fortran/use_device_ptr-optional-3.f90 \ -O1 -foffload=-lgfortran -ftree-pre with actual offloading (amdgcn, nvidia) fails with: /tmp/ccUEo7uX.o:(.gnu.offload_vars+0x10): undefined reference to `A.12.5' It works with -fno-tree-pre (or when compiling without actual offloading). The optimization happens on the host side as -foffload="-O0 -lgfortran" does not solve the issue. In the Fortran code, this array (A.12) appears in a device function ("omp declare target") as: if (any (c_z /= [1,2,3])) stop 37 As mentioned below, the other array (A.9) appears in: if (any (x /= [3,4,6,2])) stop 44 And in the dump as: static integer(kind=4) A.12[3] = {1, 2, 3}; static integer(kind=4) A.9[4] = {3, 4, 6, 2}; … _20 = A.9[S.10]; … _26 = A.12[S.13_67]; In the optimized dump (-fno-tree-pre): ivtmp.333_78 = (unsigned long) &A.9; … ivtmp.325_89 = (unsigned long) &A.12; But with -ftree-pre, the last assignment is gone – but <bb 43> [local count: 428295]: _gfortran_stop_numeric (37, 0); still exists. Here, the array has been "unrolled", i.e.: if (_61 != 1) goto <bb 43>; [5.50%] (Followed by the conditions for "2" and "3".) That's perfectly fine and optimizes "A.12" away. * * * If I look at the dumps (-fdump-tree-all) on the device side, those (still) contain: pretmp_157 = A.12[_15]; … if (_134 != pretmp_157) goto <bb 45>; [5.50%] My impression is that the local static variable "A.12" is removed before writing the LTO data – based on the -ftree-pre analysis. But the LTO expression usage is written before that removal. – At least that would explain why it fails on the device side.
Similar: * PR71536 – *ICE* with *unused* static variable in declare-target function Richard wrote in https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544905.html > IIRC there was a bugreport similar to this where offload > variables were computed "early" instead of at the point > of streaming. > > It's obvious that the offload "part" use is not reflected in > the non-offloat "part" IL (like via a function call parameter > or so), so I assume this is a local static in a function > we simply compile twice (but did not actually clone), > once for each offloat target and once for the host. So it's > likely the above issue.
Also possibly related (static local device variable) are the fixed PRs, which might give a hint how to solve it (although, those look more like a band aid): * PR85063 – local variable for switch statement; solution: "Mark CSWTCH.x variable with attribute omp declare target for offloading functions." * PR84592 → see previous PR above → PR 90779 - comment 10 has: "Add "omp declare target" attributes to static block scope variables inside of target region or target functions."
The master branch has been updated by Tobias Burnus <burnus@gcc.gnu.org>: https://gcc.gnu.org/g:1c0fdaf79e3618fd7512608a2e5c62b6b306e9e8 commit r11-1075-g1c0fdaf79e3618fd7512608a2e5c62b6b306e9e8 Author: Tobias Burnus <tobias@codesourcery.com> Date: Mon Jun 8 23:24:57 2020 +0200 openmp: ensure variables in offload table are streamed out (PRs 94848 + 95551) gcc/ChangeLog: PR lto/94848 PR middle-end/95551 * omp-offload.c (add_decls_addresses_to_decl_constructor, omp_finish_file): Skip removed items. * lto-cgraph.c (output_offload_tables): Likewise; set force_output to this node for variables and functions. libgomp/ChangeLog: PR lto/94848 PR middle-end/95551 * testsuite/libgomp.fortran/target-var.f90: New test.
The releases/gcc-10 branch has been updated by Tobias Burnus <burnus@gcc.gnu.org>: https://gcc.gnu.org/g:9074deee2c5039ff26c8587cc598bc658d8ff32f commit r10-8305-g9074deee2c5039ff26c8587cc598bc658d8ff32f Author: Tobias Burnus <tobias@codesourcery.com> Date: Mon Jun 8 23:24:57 2020 +0200 openmp: ensure variables in offload table are streamed out (PRs 94848 + 95551) gcc/ChangeLog: PR lto/94848 PR middle-end/95551 * omp-offload.c (add_decls_addresses_to_decl_constructor, omp_finish_file): Skip removed items. * lto-cgraph.c (output_offload_tables): Likewise; set force_output to this node for variables and functions. libgomp/ChangeLog: PR lto/94848 PR middle-end/95551 * testsuite/libgomp.fortran/target-var.f90: New test. (cherry picked from commit 1c0fdaf79e3618fd7512608a2e5c62b6b306e9e8)
Seemingly, I forgot the PR .../* in the commit. Hence, the follow-up commits were now recorded here: https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547641.html GCC mainline: https://gcc.gnu.org/g:bf4ab2689bb586971d5b2ab6b13d078cd7ac45af GCC 10: https://gcc.gnu.org/g:0a76844b5d2db59768835318d001aa6548265cd1 openmp: ensure variables in offload table are streamed out (PRs 94848 + 95551) gcc/ChangeLog: * omp-offload.c (add_decls_addresses_to_decl_constructor, omp_finish_file): With in_lto_p, stream out all offload-table items even if the symtab_node does not exist. Close as FIXED. – Follow up: PR95583.
The master branch has been updated by Thomas Schwinge <tschwinge@gcc.gnu.org>: https://gcc.gnu.org/g:5864930754f63e2dcef9606f2514ae20e80f436e commit r11-1466-g5864930754f63e2dcef9606f2514ae20e80f436e Author: Thomas Schwinge <thomas@codesourcery.com> Date: Tue Jan 7 17:40:14 2020 +0100 Add 'dg-do run' to 'libgomp.fortran/use_device_ptr-optional-3.f90' [PR94848] Fix-up for r279858/commit f760c0c77fe350616da9dbeaea16442b0acfb09c "Fortran] OpenMP/OpenACC â fix more issues with OPTIONAL". With offloading enabled, we then saw: PASS: libgomp.fortran/use_device_ptr-optional-3.f90 -O0 (test for excess errors) PASS: libgomp.fortran/use_device_ptr-optional-3.f90 -O0 execution test PASS: libgomp.fortran/use_device_ptr-optional-3.f90 -O1 (test for excess errors) PASS: libgomp.fortran/use_device_ptr-optional-3.f90 -O1 execution test FAIL: libgomp.fortran/use_device_ptr-optional-3.f90 -O2 (test for excess errors) UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90 -O2 compilation failed to produce executable FAIL: libgomp.fortran/use_device_ptr-optional-3.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions compilation failed to produce executable FAIL: libgomp.fortran/use_device_ptr-optional-3.f90 -O3 -g (test for excess errors) UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90 -O3 -g compilation failed to produce executable FAIL: libgomp.fortran/use_device_ptr-optional-3.f90 -Os (test for excess errors) UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90 -Os compilation failed to produce executable ... due to: /tmp/cciVc43I.o:(.gnu.offload_vars+0x10): undefined reference to `A.12.4064' [...] ..., but after the recent PR94848, PR95551 changes, that problem is now gone. libgomp/ PR lto/94848 * testsuite/libgomp.fortran/use_device_ptr-optional-3.f90: Add 'dg-do run'.
The releases/gcc-10 branch has been updated by Thomas Schwinge <tschwinge@gcc.gnu.org>: https://gcc.gnu.org/g:61c896d84bdefbfffa7573a8af89119d4db7b3de commit r10-8319-g61c896d84bdefbfffa7573a8af89119d4db7b3de Author: Thomas Schwinge <thomas@codesourcery.com> Date: Tue Jan 7 17:40:14 2020 +0100 Add 'dg-do run' to 'libgomp.fortran/use_device_ptr-optional-3.f90' [PR94848] Fix-up for r279858/commit f760c0c77fe350616da9dbeaea16442b0acfb09c "Fortran] OpenMP/OpenACC â fix more issues with OPTIONAL". With offloading enabled, we then saw: PASS: libgomp.fortran/use_device_ptr-optional-3.f90 -O0 (test for excess errors) PASS: libgomp.fortran/use_device_ptr-optional-3.f90 -O0 execution test PASS: libgomp.fortran/use_device_ptr-optional-3.f90 -O1 (test for excess errors) PASS: libgomp.fortran/use_device_ptr-optional-3.f90 -O1 execution test FAIL: libgomp.fortran/use_device_ptr-optional-3.f90 -O2 (test for excess errors) UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90 -O2 compilation failed to produce executable FAIL: libgomp.fortran/use_device_ptr-optional-3.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions compilation failed to produce executable FAIL: libgomp.fortran/use_device_ptr-optional-3.f90 -O3 -g (test for excess errors) UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90 -O3 -g compilation failed to produce executable FAIL: libgomp.fortran/use_device_ptr-optional-3.f90 -Os (test for excess errors) UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90 -Os compilation failed to produce executable ... due to: /tmp/cciVc43I.o:(.gnu.offload_vars+0x10): undefined reference to `A.12.4064' [...] ..., but after the recent PR94848, PR95551 changes, that problem is now gone. libgomp/ PR lto/94848 * testsuite/libgomp.fortran/use_device_ptr-optional-3.f90: Add 'dg-do run'. (cherry picked from commit 5864930754f63e2dcef9606f2514ae20e80f436e)