Bug 94848 - [Offloading][LTO] error due to only partially eliminated var / -ftree-pre causes link errors | libgomp.fortran/use_device_ptr-optional-3.f90 failures
Summary: [Offloading][LTO] error due to only partially eliminated var / -ftree-pre cau...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: lto (show other bugs)
Version: 10.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks: 95551
  Show dependency treegraph
 
Reported: 2020-04-29 13:51 UTC by Tobias Burnus
Modified: 2020-06-17 22:16 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Tobias Burnus 2020-04-29 13:51:16 UTC
Compiling
  gfortran -fopenmp libgomp.fortran/use_device_ptr-optional-3.f90 \
      -O1 -foffload=-lgfortran -ftree-pre

with actual offloading (amdgcn, nvidia) fails with:
  /tmp/ccUEo7uX.o:(.gnu.offload_vars+0x10): undefined reference to `A.12.5'

It works with -fno-tree-pre (or when compiling without actual offloading).

The optimization happens on the host side as -foffload="-O0 -lgfortran"
does not solve the issue.


In the Fortran code, this array (A.12) appears in a device function ("omp declare target") as:
    if (any (c_z /= [1,2,3])) stop 37

As mentioned below, the other array (A.9) appears in:
    if (any (x /= [3,4,6,2]))  stop 44

And in the dump as:
  static integer(kind=4) A.12[3] = {1, 2, 3};
  static integer(kind=4) A.9[4] = {3, 4, 6, 2};
…
  _20 = A.9[S.10];
…
  _26 = A.12[S.13_67];


In the optimized dump (-fno-tree-pre):
  ivtmp.333_78 = (unsigned long) &A.9;
…
  ivtmp.325_89 = (unsigned long) &A.12;

But with -ftree-pre, the last assignment is gone – but 
  <bb 43> [local count: 428295]:
  _gfortran_stop_numeric (37, 0);
still exists. Here, the array has been "unrolled", i.e.:

  if (_61 != 1)
    goto <bb 43>; [5.50%]

(Followed by the conditions for "2" and "3".)

That's perfectly fine and optimizes "A.12" away.

 * * *

If I look at the dumps (-fdump-tree-all) on the device side, those (still) contain:
  pretmp_157 = A.12[_15];
…
  if (_134 != pretmp_157)
     goto <bb 45>; [5.50%]

My impression is that the local static variable "A.12" is removed before writing the LTO data – based on the -ftree-pre analysis.

But the LTO expression usage is written before that removal. – At least that would explain why it fails on the device side.
Comment 1 Tobias Burnus 2020-04-30 07:22:24 UTC
Similar:
* PR71536 – *ICE* with *unused* static variable in declare-target function

Richard wrote in https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544905.html
> IIRC there was a bugreport similar to this where offload
> variables were computed "early" instead of at the point
> of streaming.
>
> It's obvious that the offload "part" use is not reflected in
> the non-offloat "part" IL (like via a function call parameter
> or so), so I assume this is a local static in a function
> we simply compile twice (but did not actually clone),
> once for each offloat target and once for the host.  So it's
> likely the above issue.
Comment 2 Tobias Burnus 2020-04-30 07:47:16 UTC
Also possibly related (static local device variable) are the fixed PRs,
which might give a hint how to solve it (although, those look more like a band aid):

* PR85063 – local variable for switch statement; solution:
  "Mark CSWTCH.x variable with attribute omp declare target for
   offloading functions."

* PR84592
  → see previous PR above
  → PR 90779 - comment 10 has: "Add "omp declare target" attributes
	to static block scope variables inside of target region or target
	functions."
Comment 3 GCC Commits 2020-06-08 21:25:33 UTC
The master branch has been updated by Tobias Burnus <burnus@gcc.gnu.org>:

https://gcc.gnu.org/g:1c0fdaf79e3618fd7512608a2e5c62b6b306e9e8

commit r11-1075-g1c0fdaf79e3618fd7512608a2e5c62b6b306e9e8
Author: Tobias Burnus <tobias@codesourcery.com>
Date:   Mon Jun 8 23:24:57 2020 +0200

    openmp: ensure variables in offload table are streamed out (PRs 94848 + 95551)
    
    gcc/ChangeLog:
    
            PR lto/94848
            PR middle-end/95551
            * omp-offload.c (add_decls_addresses_to_decl_constructor,
            omp_finish_file): Skip removed items.
            * lto-cgraph.c (output_offload_tables): Likewise; set force_output
            to this node for variables and functions.
    
    libgomp/ChangeLog:
    
            PR lto/94848
            PR middle-end/95551
            * testsuite/libgomp.fortran/target-var.f90: New test.
Comment 4 GCC Commits 2020-06-15 09:07:14 UTC
The releases/gcc-10 branch has been updated by Tobias Burnus <burnus@gcc.gnu.org>:

https://gcc.gnu.org/g:9074deee2c5039ff26c8587cc598bc658d8ff32f

commit r10-8305-g9074deee2c5039ff26c8587cc598bc658d8ff32f
Author: Tobias Burnus <tobias@codesourcery.com>
Date:   Mon Jun 8 23:24:57 2020 +0200

    openmp: ensure variables in offload table are streamed out (PRs 94848 + 95551)
    
    gcc/ChangeLog:
    
            PR lto/94848
            PR middle-end/95551
            * omp-offload.c (add_decls_addresses_to_decl_constructor,
            omp_finish_file): Skip removed items.
            * lto-cgraph.c (output_offload_tables): Likewise; set force_output
            to this node for variables and functions.
    
    libgomp/ChangeLog:
    
            PR lto/94848
            PR middle-end/95551
            * testsuite/libgomp.fortran/target-var.f90: New test.
    
    (cherry picked from commit 1c0fdaf79e3618fd7512608a2e5c62b6b306e9e8)
Comment 5 Tobias Burnus 2020-06-15 09:11:34 UTC
Seemingly, I forgot the PR .../* in the commit. Hence, the follow-up commits were now recorded here:
https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547641.html

GCC mainline: https://gcc.gnu.org/g:bf4ab2689bb586971d5b2ab6b13d078cd7ac45af
GCC 10: https://gcc.gnu.org/g:0a76844b5d2db59768835318d001aa6548265cd1

    openmp: ensure variables in offload table are streamed out (PRs 94848 + 95551)
    
    gcc/ChangeLog:
    
            * omp-offload.c (add_decls_addresses_to_decl_constructor,
            omp_finish_file): With in_lto_p, stream out all offload-table
            items even if the symtab_node does not exist.
 
Close as FIXED. – Follow up: PR95583.
Comment 6 GCC Commits 2020-06-17 22:16:25 UTC
The master branch has been updated by Thomas Schwinge <tschwinge@gcc.gnu.org>:

https://gcc.gnu.org/g:5864930754f63e2dcef9606f2514ae20e80f436e

commit r11-1466-g5864930754f63e2dcef9606f2514ae20e80f436e
Author: Thomas Schwinge <thomas@codesourcery.com>
Date:   Tue Jan 7 17:40:14 2020 +0100

    Add 'dg-do run' to 'libgomp.fortran/use_device_ptr-optional-3.f90' [PR94848]
    
    Fix-up for r279858/commit f760c0c77fe350616da9dbeaea16442b0acfb09c "Fortran]
    OpenMP/OpenACC â fix more issues with OPTIONAL".
    
    With offloading enabled, we then saw:
    
        PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  (test for excess errors)
        PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  execution test
        PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  (test for excess errors)
        PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  execution test
        FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  (test for excess errors)
        UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  compilation failed to produce executable
        FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
        UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  compilation failed to produce executable
        FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  (test for excess errors)
        UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  compilation failed to produce executable
        FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  (test for excess errors)
        UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  compilation failed to produce executable
    
     ... due to:
    
        /tmp/cciVc43I.o:(.gnu.offload_vars+0x10): undefined reference to `A.12.4064'
        [...]
    
    ..., but after the recent PR94848, PR95551 changes, that problem is now gone.
    
            libgomp/
            PR lto/94848
            * testsuite/libgomp.fortran/use_device_ptr-optional-3.f90: Add
            'dg-do run'.
Comment 7 GCC Commits 2020-06-17 22:16:41 UTC
The releases/gcc-10 branch has been updated by Thomas Schwinge <tschwinge@gcc.gnu.org>:

https://gcc.gnu.org/g:61c896d84bdefbfffa7573a8af89119d4db7b3de

commit r10-8319-g61c896d84bdefbfffa7573a8af89119d4db7b3de
Author: Thomas Schwinge <thomas@codesourcery.com>
Date:   Tue Jan 7 17:40:14 2020 +0100

    Add 'dg-do run' to 'libgomp.fortran/use_device_ptr-optional-3.f90' [PR94848]
    
    Fix-up for r279858/commit f760c0c77fe350616da9dbeaea16442b0acfb09c "Fortran]
    OpenMP/OpenACC â fix more issues with OPTIONAL".
    
    With offloading enabled, we then saw:
    
        PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  (test for excess errors)
        PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  execution test
        PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  (test for excess errors)
        PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  execution test
        FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  (test for excess errors)
        UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  compilation failed to produce executable
        FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
        UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  compilation failed to produce executable
        FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  (test for excess errors)
        UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  compilation failed to produce executable
        FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  (test for excess errors)
        UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  compilation failed to produce executable
    
     ... due to:
    
        /tmp/cciVc43I.o:(.gnu.offload_vars+0x10): undefined reference to `A.12.4064'
        [...]
    
    ..., but after the recent PR94848, PR95551 changes, that problem is now gone.
    
            libgomp/
            PR lto/94848
            * testsuite/libgomp.fortran/use_device_ptr-optional-3.f90: Add
            'dg-do run'.
    
    (cherry picked from commit 5864930754f63e2dcef9606f2514ae20e80f436e)