Bug 71499 - ICE in LTO1 when attempting NVPTX offloading (-fopenacc)
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: lto
Version: unknown
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: ice-on-valid-code, lto, openacc
Depends on:
Blocks:
 
Reported: 2016-06-10 22:24 UTC by Joel Yliluoma
Modified: 2019-01-15 10:24 UTC

See Also:
Host:
Target: nvptx
Build:
Known to work:
Known to fail:
Last reconfirmed: 2016-06-29 00:00:00


Description Joel Yliluoma 2016-06-10 22:24:40 UTC
Error message:

    lto1: internal compiler error: in input_overwrite_node, at lto-cgraph.c:1203

On GCC 6.1.0.

Compiling this code:

    void test()
    {
    }
    int main()
    {
      #pragma acc parallel
      test();
    }

With this commandline:

    gcc tmpe.c  -O0 -fopenacc -v

Complete output of GCC:

    Using built-in specs.
    COLLECT_GCC=/usr/local/bin/gcc
    COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/lto-wrapper
    OFFLOAD_TARGET_NAMES=nvptx-none
    Target: x86_64-pc-linux-gnu
    Configured with: ../configure --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --enable-offload-targets=nvptx-none=/usr/local/nvptx-none --enable-languages=c,c++ --with-cuda-driver=/usr --disable-bootstrap
    Thread model: posix
    gcc version 6.1.0 (GCC) 
    COLLECT_GCC_OPTIONS='-O0' '-fopenacc' '-v' '-mtune=generic' '-march=x86-64' '-pthread'
     /usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/cc1 -quiet -v -imultiarch x86_64-linux-gnu -D_REENTRANT tmpe.c -quiet -dumpbase tmpe.c -mtune=generic -march=x86-64 -auxbase tmpe -O0 -version -fopenacc -o /tmp/ccPHnCW0.s
    GNU C11 (GCC) version 6.1.0 (x86_64-pc-linux-gnu)
    	compiled by GNU C version 6.1.0, GMP version 6.0.0, MPFR version 3.1.4, MPC version 1.0.3, isl version 0.15
    GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
    ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
    ignoring nonexistent directory "/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/../../../../x86_64-pc-linux-gnu/include"
    #include "..." search starts here:
    #include <...> search starts here:
     /usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/include
     /usr/local/include
     /usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/include-fixed
     /usr/include/x86_64-linux-gnu
     /usr/include
    End of search list.
    GNU C11 (GCC) version 6.1.0 (x86_64-pc-linux-gnu)
    	compiled by GNU C version 6.1.0, GMP version 6.0.0, MPFR version 3.1.4, MPC version 1.0.3, isl version 0.15
    GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
    Compiler executable checksum: 1c46fde4e47f1157bf1461541c266a3c
    COLLECT_GCC_OPTIONS='-O0' '-fopenacc' '-v' '-mtune=generic' '-march=x86-64' '-pthread'
     as -v --64 -o /tmp/ccd60m4w.o /tmp/ccPHnCW0.s
    GNU assembler version 2.26 (x86_64-linux-gnu) using BFD version (GNU Binutils for Debian) 2.26
    COMPILER_PATH=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/:/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/:/usr/local/libexec/gcc/x86_64-pc-linux-gnu/:/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/:/usr/local/lib/gcc/x86_64-pc-linux-gnu/
    LIBRARY_PATH=/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/:/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/../../../../lib64/:/lib/x86_64-linux-gnu/:/lib/../lib64/:/usr/lib/x86_64-linux-gnu/:/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/../../../:/lib/:/usr/lib/
    Reading specs from /usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/../../../../lib64/libgomp.spec
    COLLECT_GCC_OPTIONS='-O0' '-fopenacc' '-v' '-mtune=generic' '-march=x86-64' '-pthread'
     /usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/collect2 -plugin /usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/liblto_plugin.so -plugin-opt=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/lto-wrapper -plugin-opt=-fresolution=/tmp/ccPr7Oc3.res -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lpthread -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o /usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/crtbegin.o /usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/crtoffloadbegin.o -L/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0 -L/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/../../../../lib64 -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu -L/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/../../.. /tmp/ccd60m4w.o -lgomp -lgcc --as-needed -lgcc_s --no-as-needed -lpthread -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/crtend.o /usr/lib/x86_64-linux-gnu/crtn.o /usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/crtoffloadend.o
    /usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0//accel/nvptx-none/mkoffload @/tmp/ccDURU7y
    /usr/local/bin/x86_64-pc-linux-gnu-accel-nvptx-none-gcc @/tmp/cchbn3g5
    Using built-in specs.
    COLLECT_GCC=/usr/local/bin/x86_64-pc-linux-gnu-accel-nvptx-none-gcc
    COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/accel/nvptx-none/lto-wrapper
    Target: nvptx-none
    Configured with: ../configure --target=nvptx-none --enable-as-accelerator-for=x86_64-pc-linux-gnu --disable-sjlj-exceptions --enable-newlib-io-long-long --enable-languages=c,c++ --with-build-time-tools=/usr/local/nvptx-none/bin : (reconfigured) ../configure --target=nvptx-none --enable-as-accelerator-for=x86_64-pc-linux-gnu --disable-sjlj-exceptions --enable-newlib-io-long-long --enable-languages=c,c++ --with-build-time-tools=/usr/local/nvptx-none/bin
    Thread model: single
    gcc version 6.1.0 (GCC) 
    COLLECT_GCC_OPTIONS='-v' '-m64' '-v' '-fmath-errno' '-fsigned-zeros' '-ftrapping-math' '-fno-trapv' '-fno-strict-overflow' '-fno-openmp' '-foffload-abi=lp64' '-O0' '-fopenacc' '-o' '/tmp/ccbsSD6y.mkoffload'
     /usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/accel/nvptx-none/lto1 -quiet -dumpbase ccd60m4w.o -m64 -auxbase ccd60m4w -O0 -version -fmath-errno -fsigned-zeros -ftrapping-math -fno-trapv -fno-strict-overflow -fno-openmp -foffload-abi=lp64 -fopenacc @/tmp/ccBVSiU2 -o /tmp/ccSbKQ4y.s
    GNU GIMPLE (GCC) version 6.1.0 (nvptx-none)
    	compiled by GNU C version 4.9.2, GMP version 6.0.0, MPFR version 3.1.4, MPC version 1.0.3, isl version 0.15
    GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
    GNU GIMPLE (GCC) version 6.1.0 (nvptx-none)
    	compiled by GNU C version 4.9.2, GMP version 6.0.0, MPFR version 3.1.4, MPC version 1.0.3, isl version 0.15
    GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
    lto1: internal compiler error: in input_overwrite_node, at lto-cgraph.c:1203
    0x7984cd input_overwrite_node
    	../../gcc/lto-cgraph.c:1201
    0x7984cd input_node
    	../../gcc/lto-cgraph.c:1296
    0x7984cd input_cgraph_1
    	../../gcc/lto-cgraph.c:1546
    0x7984cd input_symtab()
    	../../gcc/lto-cgraph.c:1849
    0x53e23f read_cgraph_and_symbols
    	../../gcc/lto/lto.c:2856
    0x53e23f lto_main()
    	../../gcc/lto/lto.c:3304
    Please submit a full bug report,
    with preprocessed source if appropriate.
    Please include the complete backtrace with any bug report.
    See <http://gcc.gnu.org/bugs.html> for instructions.
    mkoffload: fatal error: /usr/local/bin/x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status
    compilation terminated.
    lto-wrapper: fatal error: /usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0//accel/nvptx-none/mkoffload returned 1 exit status
    compilation terminated.
    /usr/bin/ld: error: lto-wrapper failed
    collect2: error: ld returned 1 exit status

Operating system: Debian Jessie on x86_64.

GCC was installed using the following sequence of commands, following the instructions at https://gcc.gnu.org/wiki/Offloading:

        apt-get install -t jessie-backports nvidia-cuda-toolkit libcuda1:i386 build-essential git gcc-multilib
        apt-get install -t binutils/testing binutils-multiarch/testing
        cd /usr/local/src
                git clone https://github.com/MentorEmbedded/nvptx-tools
                cd nvptx-tools
                ./configure; make -j8; make install
        cd /usr/local/src
                git clone https://github.com/MentorEmbedded/nvptx-newlib
        cd /dev/shm
                wget ftp://ftp.uvsq.fr/pub/gcc/releases/gcc-6.1.0/gcc-6.1.0.tar.bz2
                tar xvfj gcc-6.1.0.tar.bz2
                cd gcc-6.1.0
                mkdir build ; cd build
                ln /usr/local/src/nvptx-newlib/newlib -s ../newlib
                ../configure --target=nvptx-none --enable-as-accelerator-for=x86_64-pc-linux-gnu \
                        --disable-sjlj-exceptions --enable-newlib-io-long-long \
                        --enable-languages=c,c++ --with-build-time-tools=/usr/local/nvptx-none/bin
                make -j8 ; make install
                rm -rf *
                ../configure --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu \
                        --target=x86_64-pc-linux-gnu \
                        --enable-offload-targets=nvptx-none=/usr/local/nvptx-none \
                        --enable-languages=c,c++ \
                        --with-cuda-driver=/usr \
                        --disable-bootstrap
                make -j8 ; make install

The exact same error was also reproduced when --disable-bootstrap was not used.

For reference, the problem seems to be in the transfer of code from the host compiler to the nvptx cross compiler. These individual commands run successfully:

    x86_64-pc-linux-gnu-accel-nvptx-none-gcc tmpe.c -S -o-
    gcc tmpe.c  -O0 -fopenacc -c

But if I run the host GCC separately like that, I get a different ICE from mkoffload:

    COLLECT_GCC=x86_64-pc-linux-gnu-accel-nvptx-none-gcc \
    /usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0//accel/nvptx-none/mkoffload tmpe.o
    mkoffload: internal compiler error: in main, at config/nvptx/mkoffload.c:502
    Please submit a full bug report,
    with preprocessed source if appropriate.
    Please include the complete backtrace with any bug report.
    See <http://gcc.gnu.org/bugs.html> for instructions.
Comment 1 Joel Yliluoma 2016-06-10 23:24:38 UTC
Addendum: While this works, reading LTO data and producing HOST code:

    /usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/lto1 -dumpbase tmpe.o -auxbase tmpe -version -fopenacc tmpe.o -o /tmp/ccZgHvRO.s

This does not:

    /usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/accel/nvptx-none/lto1 -dumpbase tmpe.o -auxbase tmpe -version -fopenacc tmpe.o -o /tmp/ccZgHvRO.s
    GNU GIMPLE (GCC) version 6.1.0 (nvptx-none)
            compiled by GNU C version 6.1.0, GMP version 6.0.0, MPFR version 3.1.4, MPC version 1.0.3, isl version 0.15
    GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
    GNU GIMPLE (GCC) version 6.1.0 (nvptx-none)
            compiled by GNU C version 6.1.0, GMP version 6.0.0, MPFR version 3.1.4, MPC version 1.0.3, isl version 0.15
    GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
    options passed:  -fopenacc tmpe.o
    options enabled:  -faggressive-loop-optimizations -fauto-inc-dec
     -fchkp-check-incomplete-type -fchkp-check-read -fchkp-check-write
     -fchkp-instrument-calls -fchkp-narrow-bounds -fchkp-optimize
     -fchkp-store-bounds -fchkp-use-static-bounds
     -fchkp-use-static-const-bounds -fchkp-use-wrappers -fcommon
     -fdelete-null-pointer-checks -fearly-inlining
     -feliminate-unused-debug-types -ffunction-cse -fgcse-lm -fgnu-runtime
     -fgnu-unique -fident -finline-atomics -fipa-pta -fira-hoist-pressure
     -fira-share-save-slots -fira-share-spill-slots -fivopts
     -fkeep-static-consts -fleading-underscore -flifetime-dse
     -flto-odr-type-merging -fmath-errno -fmerge-debug-strings -fpeephole -fplt
     -fprefetch-loop-arrays -freg-struct-return -fsched-critical-path-heuristic
     -fsched-dep-count-heuristic -fsched-group-heuristic -fsched-interblock
     -fsched-last-insn-heuristic -fsched-rank-heuristic -fsched-spec
     -fsched-spec-insn-heuristic -fsched-stalled-insns-dep -fschedule-fusion
     -fsemantic-interposition -fshow-column -fsigned-zeros
     -fsplit-ivs-in-unroller -fssa-backprop -fstdarg-opt
     -fstrict-volatile-bitfields -fsync-libcalls -ftoplevel-reorder
     -ftrapping-math -ftree-cselim -ftree-forwprop -ftree-loop-if-convert
     -ftree-loop-im -ftree-loop-ivcanon -ftree-loop-optimize
     -ftree-parallelize-loops= -ftree-phiprop -ftree-reassoc -ftree-scev-cprop
     -funit-at-a-time -fvar-tracking-assignments -fzero-initialized-in-bss -m64
    Reading object files: tmpe.o {GC start 776k} 
    Reading the callgraph
    lto1: internal compiler error: in input_overwrite_node, at lto-cgraph.c:1203
    0x7a73c5 input_overwrite_node
            ../../gcc/lto-cgraph.c:1201
    0x7a73c5 input_node
            ../../gcc/lto-cgraph.c:1296
    0x7a73c5 input_cgraph_1
            ../../gcc/lto-cgraph.c:1546
    0x7a73c5 input_symtab()
            ../../gcc/lto-cgraph.c:1849
    0x5537fb read_cgraph_and_symbols
            ../../gcc/lto/lto.c:2856
    0x5537fb lto_main()
            ../../gcc/lto/lto.c:3304
    Please submit a full bug report,
    with preprocessed source if appropriate.
    Please include the complete backtrace with any bug report.

I also tried different combinations of --enable-languages=c,c++,lto and --enable-lto, including neither, but none of them affected the problem. The svn version of GCC exhibits the same problem.
The nvptx-newlib revision is aadc8eb0ec43b7cd0dd2dfb484bae63c8b05ef24 and the nvptx-tools revision is c28050f60193b3b95a18866a96f03334e874e78f.
Comment 2 Thomas Schwinge 2016-06-29 13:20:22 UTC
This is the OpenACC variant of OpenMP's PR71535.

You need to add "#pragma acc routine" for function "test".  (Of course, we shouldn't run into an ICE nevertheless.)
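
For illustration, a minimal sketch of the reproducer with that directive added. The return value 42 and the helper name run_region are assumptions made only so the call has an observable result; they are not part of the original test case, which used a void function called directly from main.

```c
/* Ask the compiler to also emit a device version of test(), so the call
   inside the parallel region refers to a function that exists in the
   offloaded code. */
#pragma acc routine
int test(void)
{
  return 42;  /* arbitrary value, chosen only for illustration */
}

/* Hypothetical helper (not in the original report) wrapping the region. */
int run_region(void)
{
  int result = 0;

  /* copy(result) makes the value written on the device visible on the host. */
  #pragma acc parallel copy(result)
  {
    result = test();
  }
  return result;
}
```

Built without -fopenacc the pragmas are ignored and the code runs on the host. Whether a particular GCC 6.1.0 offloading build then compiles it cleanly with -fopenacc has not been verified here.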
Comment 3 Tom de Vries 2019-01-15 10:24:23 UTC
(In reply to Thomas Schwinge from comment #2)
> This is the OpenACC variant of OpenMP's PR71535.
> 
> You need to add "#pragma acc routine" for function "test".  (Of course, we
> shouldn't run into an ICE nevertheless.)

This is a duplicate of the test case libgomp/testsuite/libgomp.oacc-c-c++-common/function-not-offloaded.c, which was fixed here (https://gcc.gnu.org/ml/gcc-cvs/2018-12/msg00431.html):
...
Author: vries
Date: Fri Dec 14 13:48:56 2018
New Revision: 267134

URL: https://gcc.gnu.org/viewcvs?rev=267134&root=gcc&view=rev
Log:
[offloading] Error on missing symbols

When compiling an OpenMP or OpenACC program containing a reference in the
offloaded code to a symbol that has not been included in the offloaded code,
the offloading compiler may ICE in lto1.

Fix this by erroring out instead, mentioning the problematic symbol:
...
error: variable 'var' has been referenced in offloaded code but hasn't
  been marked to be included in the offloaded code
lto1: fatal error: errors during merging of translation units
compilation terminated.
...

Build x86_64 with nvptx accelerator and reg-tested libgomp.

Build x86_64 and reg-tested libgomp.

2018-12-14  Tom de Vries  <tdevries@suse.de>

	* lto-cgraph.c (verify_node_partition): New function.
	(input_overwrite_node, input_varpool_node): Use verify_node_partition.

	* testsuite/libgomp.c-c++-common/function-not-offloaded-aux.c: New test.
	* testsuite/libgomp.c-c++-common/function-not-offloaded.c: New test.
	* testsuite/libgomp.c-c++-common/variable-not-offloaded.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/function-not-offloaded.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/variable-not-offloaded.c: New test.

Added:
    trunk/libgomp/testsuite/libgomp.c-c++-common/function-not-offloaded-aux.c
    trunk/libgomp/testsuite/libgomp.c-c++-common/function-not-offloaded.c
    trunk/libgomp/testsuite/libgomp.c-c++-common/variable-not-offloaded.c
    trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/function-not-offloaded.c
    trunk/libgomp/testsuite/libgomp.oacc-c-c++-common/variable-not-offloaded.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/lto-cgraph.c
    trunk/libgomp/ChangeLog
...
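
The commit log above also covers OpenMP and variables. As a hedged sketch of what "marked to be included in the offloaded code" can look like on the OpenMP side, the example below places a variable and a function inside a declare target region. The names var, get_var and run_target and the value 7 are illustrative assumptions; this is not one of the test cases added by the commit.

```c
/* Everything between these two directives is also compiled for the
   offload target, so references from a target region can be resolved. */
#pragma omp declare target
int var = 7;  /* arbitrary value, chosen only for illustration */

int get_var(void)
{
  return var;
}
#pragma omp end declare target

/* Hypothetical helper wrapping the target region. */
int run_target(void)
{
  int result = 0;

  /* map(tofrom: result) copies the value back to the host afterwards. */
  #pragma omp target map(tofrom: result)
  {
    result = get_var();
  }
  return result;
}
```

Dropping the declare target pair is the situation the commit describes: the target region then references symbols that were never included in the offloaded code, and per the log lto1 reports an error instead of an ICE.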