On Linux/x86-64, the %fs segment register is used to implement the thread pointer. The linear address of the thread pointer is stored at offset 0 relative to the %fs segment register. The following code loads the thread pointer into the %rax register:

    movq %fs:0, %rax

On Linux/i386, the %gs segment register is used:

    movl %gs:0, %eax

We need to:

1. Implement __builtin_thread_pointer for Linux/x86.
2. Document its behavior.
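For reference, here is a minimal sketch of what user code has to do today without the builtin, using the two instructions quoted above in GNU inline asm. The names read_tp_asm and tp_self_check are made up for this illustration. It assumes Linux on x86-64 (LP64) or i386 with a flat address space; on any other target it just returns NULL.

```c
#include <stddef.h>

/* Read the thread pointer with the instructions quoted above.
   Assumption: Linux on x86-64 (LP64) or i386; elsewhere return NULL. */
static void *read_tp_asm(void)
{
    void *tp = NULL;
#if defined(__x86_64__) && defined(__LP64__)
    __asm__ ("movq %%fs:0, %0" : "=r" (tp));
#elif defined(__i386__)
    __asm__ ("movl %%gs:0, %0" : "=r" (tp));
#endif
    return tp;
}

/* Hypothetical sanity check: the word at offset 0 from the thread
   pointer holds the thread pointer's own linear address, so
   dereferencing the result should give the same value back.
   Returns 1 on success, or when the target is not covered above. */
static int tp_self_check(void)
{
    void *tp = read_tp_asm();
    if (tp == NULL)
        return 1;
    return *(void **) tp == tp;
}
```

With the builtin implemented, read_tp_asm would presumably reduce to a plain call to __builtin_thread_pointer ().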
And if possible, optimize it, so that if one writes, say,

    int *p = (int *)__builtin_thread_pointer ();
    return p[4];

or

    return p[i];

it will not read %fs:0 into a register and then read 16(%reg), but rather read %fs:16 directly (of course only if not -mno-tls-direct-seg-refs), and likewise not read 16(%reg,%regI,4) but %fs:16(,%regI,4), etc.
Do we also need "__builtin_set_thread_pointer"?
(In reply to Jakub Jelinek from comment #1)
> And if possible, optimize, so that if one does say
> int *p = (int *)__builtin_thread_pointer ();
> return p[4];
> or
> return p[i];
> it will not read %fs:0 into a register and read 16(%reg), but rather read
> %fs:16
> etc. (of course only if not -mno-tls-direct-seg-refs) or not read
> 16(%reg,%regI,4) but %fs:16(,%regI,4) etc.

This optimization already exists in the i386/x86-64 backend.
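To make the two cases under discussion concrete, here is a small sketch. The function names and the HAVE_TP_BUILTIN macro are invented for this example. The guard assumes that GCC 11 or later on x86 provides the builtin; other compilers and targets fall back to returning 0 so the file still builds. The assembly mentioned in the comments is what the discussion above expects, not verified output.

```c
/* Assumption: the builtin is usable with GCC >= 11 on x86 targets. */
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 11 \
    && (defined(__x86_64__) || defined(__i386__))
#define HAVE_TP_BUILTIN 1
#else
#define HAVE_TP_BUILTIN 0
#endif

/* Constant offset: hoped to become a single segment-relative load
   such as "movl %fs:16, %eax" on x86-64, unless
   -mno-tls-direct-seg-refs is given. */
int tls_word_const(void)
{
#if HAVE_TP_BUILTIN
    int *p = (int *) __builtin_thread_pointer ();
    return p[4];
#else
    return 0;
#endif
}

/* Variable index: hoped to fold the index into the segment-relative
   address instead of first loading %fs:0 into a register. */
int tls_word_indexed(int i)
{
#if HAVE_TP_BUILTIN
    int *p = (int *) __builtin_thread_pointer ();
    return p[i];
#else
    (void) i;
    return 0;
#endif
}
```

Compiling this with -O2 -S, with and without -mno-tls-direct-seg-refs, should show whether the segment override is folded into the memory operand.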
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:e470d8af81d390df1166e9d9cf10b00c0692a495

commit r11-3067-ge470d8af81d390df1166e9d9cf10b00c0692a495
Author: liuhongt <hongtao.liu@intel.com>
Date:   Tue Sep 8 15:44:58 2020 +0800

    Implement __builtin_thread_pointer for x86 TLS.

    gcc/ChangeLog:
            PR target/96955
            * config/i386/i386.md (get_thread_pointer<mode>): New
            expander.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/builtin_thread_pointer.c: New test.
Fixed in GCC 11.
Fixed for GCC 11.
The master branch has been updated by H.J. Lu <hjl@gcc.gnu.org>:

https://gcc.gnu.org/g:bf69edf8ce47ca618eff30df2308279a40b22096

commit r11-3081-gbf69edf8ce47ca618eff30df2308279a40b22096
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Wed Sep 9 10:29:47 2020 -0700

    x32: Update gcc.target/i386/builtin_thread_pointer.c

    Update gcc.target/i386/builtin_thread_pointer.c for x32.  For

    int
    foo3 (int i)
    {
      int* p = (int*) __builtin_thread_pointer ();
      return p[i];
    }

    we can't generate:

            movl    %fs:0(,%edi,4), %eax
            ret

    for x32 since the address of %fs:0(,%edi,4) is %fs + zero-extended to
    64 bits of 0(,%edi,4).  Instead, we generate:

            movl    %fs:0, %eax
            movl    (%eax,%edi,4), %eax

            PR target/96955
            * gcc.target/i386/builtin_thread_pointer.c: Update
            scan-assembler for x32.
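The x32 restriction is easiest to see with a negative index. Below is a rough arithmetic model of the two address computations, not a description of the actual GCC code. The function names are hypothetical, and the model assumes that the %fs base fits in 32 bits on x32 and that (%eax,%edi,4) is evaluated with 32-bit wraparound.

```c
#include <stdint.h>

/* Model of "movl %fs:0(,%edi,4), %eax" on x32: the offset i*4 is
   computed in 32 bits, zero-extended to 64 bits, and only then added
   to the segment base. */
static uint64_t addr_segment_form(uint64_t fs_base, int32_t i)
{
    uint32_t offset = (uint32_t) i * 4u;   /* wraps modulo 2^32 */
    return fs_base + (uint64_t) offset;    /* no wraparound here */
}

/* Model of "movl %fs:0, %eax; movl (%eax,%edi,4), %eax": the thread
   pointer is loaded first, so the whole sum wraps modulo 2^32. */
static uint64_t addr_two_insn_form(uint64_t fs_base, int32_t i)
{
    uint32_t tp = (uint32_t) fs_base;      /* assumed to fit in 32 bits */
    uint32_t ea = tp + (uint32_t) i * 4u;  /* wraps modulo 2^32 */
    return (uint64_t) ea;
}
```

For example, with fs_base = 0x10000 and i = -1, the two-instruction form yields 0xFFFC (the int just below the thread pointer), while the segment form yields 0x10000FFFC, which is outside the 32-bit address space. For non-negative indices that do not wrap, the two forms agree.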