Bug 96955 - Implement __builtin_thread_pointer for x86 TLS
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 11.0
Importance: P3 normal
Target Milestone: 11.0
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks: 96200
Reported: 2020-09-07 13:46 UTC by H.J. Lu
Modified: 2020-09-09 17:44 UTC

See Also:
Host:
Target: i386,x86-64
Build:
Known to work:
Known to fail:
Last reconfirmed:


Description H.J. Lu 2020-09-07 13:46:28 UTC
On Linux/x86-64, the %fs segment register is used to implement the thread
pointer. The linear address of the thread pointer is stored at offset 0
relative to the %fs segment register. The following code loads the thread
pointer in the %rax register:

movq %fs:0, %rax

On Linux/i386, the %gs segment register is used:

movl %gs:0, %eax

We need to

1. Implement __builtin_thread_pointer for Linux/x86.
2. Document its behavior.
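
As a rough illustration of the request above, the sketch below compares an explicit segment-relative load with __builtin_thread_pointer. The helper names thread_pointer_asm and thread_pointer are hypothetical, and the preprocessor guards assume that GCC 11 or later provides the builtin on x86; on older compilers the sketch falls back to the inline assembly.

```c
#include <stddef.h>

/* Hypothetical helper: load the thread pointer with the explicit
   instructions quoted in the description.  The unsuffixed "mov" lets the
   assembler pick the operand size from the destination register.  */
static void *thread_pointer_asm (void)
{
  void *tp = NULL;
#if defined(__linux__) && defined(__x86_64__)
  __asm__ ("mov %%fs:0, %0" : "=r" (tp));   /* movq %fs:0, %rax */
#elif defined(__linux__) && defined(__i386__)
  __asm__ ("mov %%gs:0, %0" : "=r" (tp));   /* movl %gs:0, %eax */
#else
  /* Other targets: rely on the builtin where the target supports it.  */
  tp = __builtin_thread_pointer ();
#endif
  return tp;
}

/* Hypothetical wrapper: prefer the builtin where this PR makes it
   available (assumed here: GCC 11 or later).  */
static void *thread_pointer (void)
{
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 11
  return __builtin_thread_pointer ();
#else
  return thread_pointer_asm ();
#endif
}
```

On Linux/x86 both helpers should return the same non-null address, since the word at offset 0 of the segment holds the thread pointer itself.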
Comment 1 Jakub Jelinek 2020-09-07 14:05:34 UTC
And if possible, optimize, so that if one does say
int *p = (int *)__builtin_thread_pointer ();
return p[4];
or
return p[i];
it will not read %fs:0 into a register and read 16(%reg), but rather read %fs:16
etc. (of course only if not -mno-tls-direct-seg-refs) or not read 16(%reg,%regI,4) but %fs:16(,%regI,4) etc.
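
The two access patterns from this comment can be written out as complete functions, roughly as follows. The names THREAD_POINTER, thread_pointer_fallback, load_fixed and load_indexed are hypothetical; the fallback path exists only so the sketch still compiles on x86 compilers without the builtin, and the folding described in the comments is not expected on that path.

```c
#if (defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 11) \
    || !(defined(__x86_64__) || defined(__i386__))
/* The builtin is visible to the optimizer.  */
# define THREAD_POINTER() __builtin_thread_pointer ()
#else
/* Assumed fallback for x86 compilers without the builtin.  */
static inline void *thread_pointer_fallback (void)
{
  void *tp;
# ifdef __x86_64__
  __asm__ ("mov %%fs:0, %0" : "=r" (tp));
# else
  __asm__ ("mov %%gs:0, %0" : "=r" (tp));
# endif
  return tp;
}
# define THREAD_POINTER() thread_pointer_fallback ()
#endif

/* Constant offset: the hoped-for x86-64 code is a single load from
   %fs:16 rather than a load of %fs:0 followed by a load of 16(%reg).  */
static int load_fixed (void)
{
  int *p = (int *) THREAD_POINTER ();
  return p[4];
}

/* Variable index: the hoped-for code is a single %fs-relative indexed
   load rather than materializing the thread pointer first.  */
static int load_indexed (int i)
{
  int *p = (int *) THREAD_POINTER ();
  return p[i];
}
```

Compiling with -O2 -S, with and without -mno-tls-direct-seg-refs, is one way to check whether the segment-relative forms are actually emitted.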
Comment 2 Hongtao.liu 2020-09-08 06:54:09 UTC
Do we also need "__builtin_set_thread_pointer" ?
Comment 3 Hongtao.liu 2020-09-08 08:15:02 UTC
(In reply to Jakub Jelinek from comment #1)
> And if possible, optimize, so that if one does say
> int *p = (int *)__builtin_thread_pointer ();
> return p[4];
> or
> return p[i];
> it will not read %fs:0 into a register and read 16(%reg), but rather read
> %fs:16
> etc. (of course only if not -mno-tls-direct-seg-refs) or not read
> 16(%reg,%regI,4) but %fs:16(,%regI,4) etc.

This optimization already exists in i386/x86-64 backend.
Comment 4 GCC Commits 2020-09-09 08:29:33 UTC
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:e470d8af81d390df1166e9d9cf10b00c0692a495

commit r11-3067-ge470d8af81d390df1166e9d9cf10b00c0692a495
Author: liuhongt <hongtao.liu@intel.com>
Date:   Tue Sep 8 15:44:58 2020 +0800

    Implement __builtin_thread_pointer for x86 TLS.
    
    gcc/ChangeLog:
            PR target/96955
            * config/i386/i386.md (get_thread_pointer<mode>): New
            expander.
    
    gcc/testsuite/ChangeLog:
    
            * gcc.target/i386/builtin_thread_pointer.c: New test.
Comment 5 Hongtao.liu 2020-09-09 08:40:12 UTC
Fixed in GCC 11.
Comment 6 H.J. Lu 2020-09-09 11:05:39 UTC
Fixed for GCC 11.
Comment 7 GCC Commits 2020-09-09 17:44:05 UTC
The master branch has been updated by H.J. Lu <hjl@gcc.gnu.org>:

https://gcc.gnu.org/g:bf69edf8ce47ca618eff30df2308279a40b22096

commit r11-3081-gbf69edf8ce47ca618eff30df2308279a40b22096
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Wed Sep 9 10:29:47 2020 -0700

    x32: Update gcc.target/i386/builtin_thread_pointer.c
    
    Update gcc.target/i386/builtin_thread_pointer.c for x32.  For
    
    int
    foo3 (int i)
    {
      int* p = (int*) __builtin_thread_pointer ();
      return p[i];
    }
    
    we can't generate:
    
            movl    %fs:0(,%edi,4), %eax
            ret
    
    for x32 since the address of %fs:0(,%edi,4) is the %fs base plus the value
    of 0(,%edi,4) zero-extended to 64 bits.  Instead, we generate:
    
            movl    %fs:0, %eax
            movl    (%eax,%edi,4), %eax
    
            PR target/96955
            * gcc.target/i386/builtin_thread_pointer.c: Update scan-assembler
            for x32.
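
The address arithmetic behind the x32 restriction can be modelled in plain C. This is only a sketch of the calculation described in the commit message, not of GCC's code: the function names are hypothetical, and the model assumes the 32-bit effective address is zero-extended before the 64-bit segment base is added, and that the default data segment has base 0.

```c
#include <stdint.h>

/* Model of %fs:0(,%edi,4) on x32: the 32-bit effective address wraps
   modulo 2^32, is zero-extended, and is then added to the %fs base.  */
static uint64_t x32_segment_indexed (uint64_t fs_base, int32_t i)
{
  uint32_t effective = (uint32_t) i * 4u;      /* 0(,%edi,4) */
  return fs_base + (uint64_t) effective;
}

/* Model of the two-instruction sequence: the thread pointer is loaded
   into %eax first, so the index is added inside the 32-bit wrap.  */
static uint64_t x32_two_step (uint32_t tp, int32_t i)
{
  uint32_t effective = tp + (uint32_t) i * 4u; /* (%eax,%edi,4) */
  return (uint64_t) effective;
}
```

With a thread pointer of 0x1000 and i = -1, the first model yields 0x100000ffc while the second yields 0xffc, the intended address. For non-negative i that stays within 32 bits, the two agree.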