This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug libgomp/71646] New: incompability between ptx code and GPU hardware

From: "didu31 at hotmail dot fr" <gcc-bugzilla at gcc dot gnu dot org>
To: gcc-bugs at gcc dot gnu dot org
Date: Fri, 24 Jun 2016 13:23:54 +0000
Subject: [Bug libgomp/71646] New: incompability between ptx code and GPU hardware
Auto-submitted: auto-generated

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71646

            Bug ID: 71646
           Summary: incompability between ptx code and GPU hardware
           Product: gcc
           Version: 6.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: didu31 at hotmail dot fr
                CC: jakub at gcc dot gnu.org
  Target Milestone: ---

Created attachment 38758
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38758&action=edit
Very simple OpenACC program

Hardware: Core 2 Quad + Nvidia GeForce GT 430

OS: Linux 4.4.0-24-generic x86_64

Library environment:
              - gcc 6.1 (compiled from sources)
              - nvidia-toolkit-7.5
              - libcudart 7.5
              - libcuda1-361
              - nvptx-tools, master branch as of June 17 (compiled from sources)

The attached source program is compiled and linked with this command:

gcc t.c -fopenacc -foffload=nvptx-none -foffload="-O3" -O3 -o t -lgomp -Wl,-rpath=/usr/local/lib64

After typing this: export ACC_DEVICE_TYPE=

and then executing ./t, these messages appear:

libgomp: Link error log ptxas fatal   : SM version specified by .target is higher than default SM version assumed


libgomp: cuLinkAddData (ptx_code) error: no kernel image is available for execution on the device

Moreover, ./t hangs.

This is expected: my video card supports at most sm_20 PTX code, while gcc
generates sm_30 instructions, and .target sm_30 is even hardcoded at
gcc/config/nvptx/nvptx.c:3904:

fputs ("\t.target\tsm_30\n", asm_out_file);
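
To illustrate the mismatch described above, here is a minimal sketch in C. It
is not code from gcc or libgomp: the helper names ptx_target_sm and
device_can_run are hypothetical, and encoding a compute capability as
major * 10 + minor for the comparison is an assumption made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the SM number named by the first
   ".target sm_NN" directive in a PTX text, or -1 if none is found. */
int ptx_target_sm(const char *ptx)
{
    assert(ptx != NULL);

    const char *p = strstr(ptx, ".target");
    int sm;

    if (p == NULL)
        return -1;
    p += strlen(".target");

    /* The leading blank in the format skips the tab or spaces
       between ".target" and "sm_NN". */
    if (sscanf(p, " sm_%d", &sm) != 1)
        return -1;
    return sm;
}

/* Hypothetical check (assumption): a device with compute capability
   major.minor is treated as able to run PTX targeting sm_NN only if
   major * 10 + minor >= NN. */
int device_can_run(int major, int minor, int target_sm)
{
    return target_sm >= 0 && major * 10 + minor >= target_sm;
}
```

With the hardcoded directive, ptx_target_sm yields 30, so a card limited to
sm_20 fails the check, which matches the ptxas error quoted earlier.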

From my point of view, since only sm_30 PTX code is generated, int
nvptx_get_num_devices (void) (libgomp/plugin/plugin-nvptx.c:680) should be
aware of that and should not count such a video card.
As a result, the gomp runtime would fall back to the host, as it does when
cuInit(0) != CUDA_SUCCESS.
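
The proposal can be sketched as follows. This is only an illustration under
stated assumptions, not the actual libgomp code: struct device_cap,
MIN_TARGET_SM and count_usable_devices are hypothetical names. In a real plugin
the capability values would presumably be queried through the CUDA driver API
(cuDeviceGetAttribute with CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR and
CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR); here they are passed in as plain
data so the sketch compiles without CUDA.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one device's compute capability. */
struct device_cap {
    int major;
    int minor;
};

/* Assumption: mirrors the ".target sm_30" directive that the report
   says is hardcoded in the nvptx back end. */
#define MIN_TARGET_SM 30

/* Count only the devices able to run the generated PTX. Returning 0
   would let the runtime treat the machine as having no usable GPU. */
int count_usable_devices(const struct device_cap *devs, size_t n)
{
    assert(devs != NULL || n == 0);

    int usable = 0;
    for (size_t i = 0; i < n; i++) {
        if (devs[i].major * 10 + devs[i].minor >= MIN_TARGET_SM)
            usable++;
    }
    return usable;
}
```

For a machine whose only GPU is limited to sm_20, the function returns 0, which
is the condition under which the report expects a fallback to the host.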
