This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug libgomp/71646] New: incompability between ptx code and GPU hardware
- From: "didu31 at hotmail dot fr" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 24 Jun 2016 13:23:54 +0000
- Subject: [Bug libgomp/71646] New: incompability between ptx code and GPU hardware
- Auto-submitted: auto-generated
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71646
Bug ID: 71646
Summary: incompability between ptx code and GPU hardware
Product: gcc
Version: 6.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libgomp
Assignee: unassigned at gcc dot gnu.org
Reporter: didu31 at hotmail dot fr
CC: jakub at gcc dot gnu.org
Target Milestone: ---
Created attachment 38758
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38758&action=edit
Very simple OpenACC program
Hardware : Core 2 Quad + Nvidia Geforce GT 430
OS : Linux 4.4.0-24-generic x86_64
Library environment:
- gcc 6.1 (compiled from sources)
- nvidia-toolkit-7.5
- libcudart 7.5
- libcuda1-361
- nvptx-tools, master branch as of June 17th (compiled from sources)
The attached source program is compiled and linked with this command:
gcc t.c -fopenacc -foffload=nvptx-none -foffload="-O3" -O3 -o t -lgomp -Wl,-rpath=/usr/local/lib64
After typing this: export ACC_DEVICE_TYPE=
and then executing ./t, these messages appear:
libgomp: Link error log ptxas fatal : SM version specified by .target is higher than default SM version assumed
libgomp: cuLinkAddData (ptx_code) error: no kernel image is available for execution on the device
Moreover, ./t hangs.
This is expected: my video card supports at most sm_20 PTX code, whereas gcc
generates sm_30 instructions, and .target sm_30 is even hardcoded at
gcc/config/nvptx/nvptx.c:3904:
fputs ("\t.target\tsm_30\n", asm_out_file);
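To make the mismatch concrete, the SM level that a PTX module requires can be read back from its .target directive. The following is only a minimal sketch in C; the helper name ptx_target_sm is hypothetical and is not part of GCC or libgomp, and the input string is just the line emitted by the fputs call quoted above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of GCC or libgomp): return the SM version
   named by the first ".target sm_NN" directive in PTX, or -1 if no such
   directive can be parsed.  */
static int
ptx_target_sm (const char *ptx)
{
  const char *p;
  int sm;

  assert (ptx != NULL);

  p = strstr (ptx, ".target");
  if (p == NULL)
    return -1;

  /* Skip the directive name and the blanks that separate it from its
     operand.  */
  p += strlen (".target");
  while (*p == ' ' || *p == '\t')
    p++;

  /* The operand is expected to look like "sm_30".  */
  if (sscanf (p, "sm_%d", &sm) != 1)
    return -1;
  return sm;
}
```

Called on the string "\t.target\tsm_30\n", this helper returns 30, which is the value a runtime check would have to compare against what the device supports.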
In my view, since only sm_30 PTX code is generated, int
nvptx_get_num_devices (void) (libgomp/plugin/plugin-nvptx.c:680) should take
that into account and should not count such a video card.
The gomp runtime would then fall back to the host, as it does when
cuInit(0) != CUDA_SUCCESS.
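The filtering suggested here could look roughly like the sketch below. It is not the actual libgomp code: struct compute_cap, device_supports_sm and count_usable_devices are hypothetical names, and the capabilities are passed in as plain data so that the example needs no GPU. In a real plugin they would presumably be queried through the CUDA driver API, for example cuDeviceGetAttribute with CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR and CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR.

```c
#include <assert.h>
#include <stddef.h>

/* Compute capability of one device, e.g. { 3, 0 } for an sm_30 device.
   Hypothetical type; a real plugin would fill it from the driver API.  */
struct compute_cap
{
  int major;
  int minor;
};

/* Nonzero if a device with capability CAP can run PTX built for
   ".target sm_<REQUIRED_SM>".  Assumes the usual naming where sm_30 means
   major 3, minor 0, with a single-digit minor.  */
static int
device_supports_sm (struct compute_cap cap, int required_sm)
{
  return cap.major * 10 + cap.minor >= required_sm;
}

/* Count only the devices that can run the generated PTX.  A result of 0
   would let the caller behave as if no device were present and fall back
   to host execution.  */
static int
count_usable_devices (const struct compute_cap *caps, size_t n,
                      int required_sm)
{
  int usable = 0;

  assert (caps != NULL || n == 0);

  for (size_t i = 0; i < n; i++)
    if (device_supports_sm (caps[i], required_sm))
      usable++;
  return usable;
}
```

With required_sm set to 30, a machine whose only device is below sm_30 yields a count of 0, which corresponds to the host fallback described above.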