Bug 58066 - __tls_get_addr is called with misaligned stack on x86-64
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 6.0
Importance: P3 normal
Target Milestone: 4.9.4
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-08-02 23:11 UTC by Paul Pluzhnikov
Modified: 2017-06-30 19:31 UTC
CC List: 5 users

See Also:
Host:
Target: x86_64
Build:
Known to work:
Known to fail:
Last reconfirmed: 2015-07-11 00:00:00


Attachments
A patch (2.82 KB, patch)
2014-03-12 21:05 UTC, H.J. Lu
Combined middle/end/target patch (1.00 KB, patch)
2015-07-13 11:08 UTC, Uroš Bizjak

Description Paul Pluzhnikov 2013-08-02 23:11:57 UTC
Google ref: b/10151411

Reproduced with current trunk, but it has been broken since at least gcc-4.3.1.

On Linux/x86_64, libstdc++.so.6 __cxa_get_globals looks like so:

Dump of assembler code for function __cxa_get_globals:
   0x00000000000cb430 <+0>:     lea    0x233131(%rip),%rdi
   0x00000000000cb437 <+7>:     callq  0x4f570 <__tls_get_addr@plt>
   0x00000000000cb43c <+12>:    add    $0x0,%rax
   0x00000000000cb442 <+18>:    retq   

This calls the external function __tls_get_addr with a misaligned stack.
__tls_get_addr may itself call malloc; malloc is user-replaceable
and may assume that the stack is properly aligned (and crash when it isn't).

Trivial test case:


static __thread char ccc;
extern "C" void* __cxa_get_globals() throw()
{
 return &ccc;
}

  g++ -fPIC -S -O2 t.cc

results in:

__cxa_get_globals:
       leaq    _ZL3ccc@tlsld(%rip), %rdi
       call    __tls_get_addr@PLT
       addq    $_ZL3ccc@dtpoff, %rax
       ret



Ian Lance Taylor says:

  There is code in the i386 backend that is designed to avoid this.
  However, it appears to have only been fully implemented for the GNU2 TLS
  descriptor style ...

  I suspect that the right fix is to add the line

     ix86_tls_descriptor_calls_expanded_in_cfun = true;

  to tls_global_dynamic_64_<mode> and tls_local_dynamic_base_64_<mode>
  in gcc/config/i386/i386.md.
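As a rough illustration of why such a flag matters (this is a toy model, not GCC's code): the TLS pattern hides the call to __tls_get_addr, so a function containing only that pattern can look like it makes no calls and gets no alignment padding. A per-function flag lets the frame layout treat the hidden call like a visible one. The struct, its fields and toy_frame_padding are hypothetical names invented for this sketch; only ix86_tls_descriptor_calls_expanded_in_cfun comes from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-function backend state; not GCC's types. */
struct toy_function {
    size_t depth_at_call;      /* return address + pushes + locals, in bytes */
    bool has_visible_call;     /* an ordinary call insn is present */
    bool tls_call_expanded;    /* models ix86_tls_descriptor_calls_expanded_in_cfun */
};

/* Bytes of padding the toy frame layout reserves so that calls are
   issued with a 16-byte-aligned stack. */
size_t toy_frame_padding(const struct toy_function *fn)
{
    assert(fn != NULL);
    /* With no call of any kind, call alignment is never needed. */
    if (!fn->has_visible_call && !fn->tls_call_expanded)
        return 0;
    return (16 - fn->depth_at_call % 16) % 16;
}
```

With depth_at_call = 8 and both flags clear the model reserves nothing, which mirrors the test case; setting tls_call_expanded makes it reserve 8 bytes.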
Comment 1 Andrew Pinski 2013-08-02 23:57:42 UTC
> However, it appears to have only been fully implemented for the GNU2 TLS
>  descriptor style ...

Which most Linux distros default to anyway ...
Comment 2 Paul Pluzhnikov 2013-08-03 00:21:23 UTC
(In reply to Andrew Pinski from comment #1)

> Which most Linux distros default to anyway ...

Ubuntu 12.04.1 LTS doesn't.
Configuring trunk GCC on it doesn't default to GNU2 TLS either.

What is the way to turn it on?
Comment 3 Paul Pluzhnikov 2013-08-06 23:14:35 UTC
(In reply to Paul Pluzhnikov from comment #2)

> What is the way to turn it on?

Compiling the test case with -mtls-dialect=gnu2 does appear to improve the picture:

g++ -fPIC -O2 -S t.cc -mtls-dialect=gnu2

__cxa_get_globals:
        leaq    _ZL3ccc@TLSDESC(%rip), %rax
        call    *_ZL3ccc@TLSCALL(%rax)
        addq    %fs:0, %rax
        ret

The indirect call goes to _dl_tlsdesc_dynamic in ld-linux-x86-64.so.2 with a misaligned stack, and the latter realigns it.
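How the dynamic loader realigns is not shown in this thread. The usual idiom is to round the stack pointer down to the required boundary, which in assembly is a single `and` with a mask. A sketch of that rounding in C, with an illustrative function name:

```c
#include <assert.h>
#include <stdint.h>

/* Round a stack address down to a power-of-two boundary.  The stack grows
   toward lower addresses on x86-64, so rounding down only reserves space
   and never overwrites the caller's data. */
uintptr_t realign_stack_down(uintptr_t sp, uintptr_t alignment)
{
    /* The mask trick below is only valid for powers of two. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return sp & ~(alignment - 1);
}
```

A callee that does this on entry must keep the original stack pointer somewhere so it can restore it before returning.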
Comment 4 H.J. Lu 2014-03-12 21:05:29 UTC
Created attachment 32341 [details]
A patch

This patch sets ix86_tls_descriptor_calls_expanded_in_cfun after
reload is complete and checks it for stack boundary in ix86_frame_pointer_required.
Comment 5 H.J. Lu 2014-03-12 22:22:47 UTC
Another problem:

[hjl@gnu-6 gcc]$ cat /tmp/c.i 
static __thread char ccc;

void* __cxa_get_globals()
{
 return &ccc;
}
[hjl@gnu-6 gcc]$ ./xgcc -B./ -S -O2 -fPIC /tmp/c.i 
[hjl@gnu-6 gcc]$ cat /tmp/c.i 
static __thread char ccc;

void* __cxa_get_globals()
{
 return &ccc;
}
[hjl@gnu-6 gcc]$ ./xgcc -B./ -S -O2 -fPIC /tmp/c.i  -m32
[hjl@gnu-6 gcc]$ cat c.s
	.file	"c.i"
	.section	.text.unlikely,"ax",@progbits
.LCOLDB0:
	.text
.LHOTB0:
	.p2align 4,,15
	.globl	__cxa_get_globals
	.type	__cxa_get_globals, @function
__cxa_get_globals:
.LFB0:
	.cfi_startproc
	pushl	%ebx
	.cfi_def_cfa_offset 8
	.cfi_offset 3, -8
	call	__x86.get_pc_thunk.bx
	addl	$_GLOBAL_OFFSET_TABLE_, %ebx
	subl	$8, %esp
	.cfi_def_cfa_offset 16
	addl	$8, %esp
	.cfi_def_cfa_offset 8
	leal	ccc@tlsgd(,%ebx,1), %eax
	call	___tls_get_addr@PLT
	popl	%ebx
	.cfi_restore 3
	.cfi_def_cfa_offset 4
	ret
	.cfi_endproc
.LFE0:
	.size	__cxa_get_globals, .-__cxa_get_globals

sched2 doesn't know that

(insn:TI 15 25 13 2 (parallel [
            (set (reg:SI 0 ax [86])
                (unspec:SI [
                        (reg:SI 3 bx)
                        (symbol_ref:SI ("ccc") [flags 0x1a]  <var_decl 0x7f2b2be5e980 ccc>)
                        (symbol_ref:SI ("___tls_get_addr"))
                    ] UNSPEC_TLS_GD))
            (clobber (reg:SI 1 dx [88]))
            (clobber (reg:SI 2 cx [89]))
            (clobber (reg:CC 17 flags))
        ]) /tmp/c.i:5 772 {*tls_global_dynamic_32_gnu}
     (expr_list:REG_DEAD (reg:SI 3 bx)
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (expr_list:REG_UNUSED (reg:SI 2 cx [89])
                (expr_list:REG_UNUSED (reg:SI 1 dx [88])
                    (expr_list:REG_EQUIV (unspec:SI [
                                (reg:SI 3 bx)
                                (symbol_ref:SI ("ccc") [flags 0x1a]  <var_decl 0x7f2b2be5e980 ccc>)
                                (symbol_ref:SI ("___tls_get_addr"))
                            ] UNSPEC_TLS_GD)
                        (nil)))))))

is a function call, so it moves the stack adjustment across it.
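The scheduling problem can be pictured with a toy dependency check: two instructions may be swapped when neither writes a register the other reads or writes. The TLS insn above mentions bx and ax but not the stack pointer, so nothing ties it to the `subl`/`addl` on %esp. The sketch below is a simplified model with hypothetical names, not the scheduler's real dependency analysis, and it deliberately ignores the flags, cx and dx clobbers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register bits for the toy model. */
enum { TOY_AX = 1u << 0, TOY_BX = 1u << 1, TOY_SP = 1u << 2 };

struct toy_insn {
    unsigned reads;    /* registers the insn uses */
    unsigned writes;   /* registers the insn sets */
};

/* Two insns may be reordered only if there is no read-after-write,
   write-after-read, or write-after-write conflict between them. */
bool toy_can_reorder(struct toy_insn a, struct toy_insn b)
{
    return (a.writes & b.reads) == 0
        && (a.reads & b.writes) == 0
        && (a.writes & b.writes) == 0;
}
```

A stack adjustment {reads SP, writes SP} can be reordered with a TLS insn {reads BX, writes AX}. Once the TLS insn also reads SP, the check fails and the adjustment stays on its side of the call.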
Comment 6 wmi 2014-05-08 16:45:23 UTC
Author: wmi
Date: Thu May  8 16:44:52 2014
New Revision: 210222

URL: http://gcc.gnu.org/viewcvs?rev=210222&root=gcc&view=rev
Log:
gcc/
2014-05-08  Wei Mi  <wmi@google.com>

	PR target/58066
	* config/i386/i386.c (ix86_compute_frame_layout):
	Update preferred_stack_boundary for call, expanded from
	tls descriptor.
	* config/i386/i386.md:
	(*tls_global_dynamic_32_gnu): Update RTX to depend on
	SP register.
	(*tls_local_dynamic_base_32_gnu): Ditto.
	(*tls_local_dynamic_32_once): Ditto.
	(tls_global_dynamic_64_<mode>): Set
	ix86_tls_descriptor_calls_expanded_in_cfun.
	(tls_local_dynamic_base_64_<mode>): Ditto.
	(tls_global_dynamic_32): Set
	ix86_tls_descriptor_calls_expanded_in_cfun. Update RTX
	to depend on SP register.
	(tls_local_dynamic_base_32): Ditto.

gcc/testsuite/
2014-05-08  Wei Mi  <wmi@google.com>

	PR target/58066
	* gcc.target/i386/pr58066.c: New test.


Added:
    trunk/gcc/testsuite/gcc.target/i386/pr58066.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/i386.md
    trunk/gcc/testsuite/ChangeLog
Comment 7 wmi 2014-05-19 05:26:18 UTC
Author: wmi
Date: Mon May 19 05:25:45 2014
New Revision: 210601

URL: http://gcc.gnu.org/viewcvs?rev=210601&root=gcc&view=rev
Log:
2014-05-18  Wei Mi  <wmi@google.com>

        PR target/58066
        * gcc.target/i386/pr58066.c: Replace pattern matching of .cfi
        directive with rtl insns. Add effective-target of fpic and
        tls_native.

Modified:
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/i386/pr58066.c
Comment 8 Dmitry Vyukov 2014-12-18 17:54:08 UTC
Is there any progress on this? Is it fixed?

I've hit this issue in ThreadSanitizer. It intercepts __tls_get_addr and then code that uses MOVDQA [rbp] crashes. I remember that I hit it previously in some other context as well.
Comment 9 H.J. Lu 2015-07-11 21:03:34 UTC
__tls_get_addr is called with misaligned stack on x86-64. It
crashes ld.so when it tries to save and restore XMM registers
with aligned load/store:

https://sourceware.org/ml/libc-alpha/2015-07/msg00365.html
Comment 10 H.J. Lu 2015-07-13 03:58:43 UTC
Another testcase:

[hjl@gnu-tools-1 pr58066]$ cat x.i
struct in_addr
  {
    int s_addr;
  };

typedef long unsigned int size_t;
extern void __snprintf (char *__restrict __s, size_t __maxlen,
         const char *__restrict __format, ...)
     __attribute__ ((__format__ (__printf__, 3, 4)));

static __thread char buffer[18];

char *
inet_ntoa (struct in_addr in)
{
  unsigned char *bytes = (unsigned char *) &in;
  __snprintf (buffer, sizeof (buffer), "%d.%d.%d.%d",
       bytes[0], bytes[1], bytes[2], bytes[3]);

  return buffer;
}
[hjl@gnu-tools-1 pr58066]$ gcc -S -fPIC -O2 x.i
[hjl@gnu-tools-1 pr58066]$ cat x.s
	.file	"x.i"
	.section	.rodata.str1.1,"aMS",@progbits,1
.LC0:
	.string	"%d.%d.%d.%d"
	.section	.text.unlikely,"ax",@progbits
.LCOLDB1:
	.text
.LHOTB1:
	.p2align 4,,15
	.globl	inet_ntoa
	.type	inet_ntoa, @function
inet_ntoa:
.LFB0:
	.cfi_startproc
	pushq	%r14
	.cfi_def_cfa_offset 16
	.cfi_offset 14, -16
	pushq	%r13
	.cfi_def_cfa_offset 24
	.cfi_offset 13, -24
	movzbl	%dil, %r13d
	pushq	%r12
	.cfi_def_cfa_offset 32
	.cfi_offset 12, -32
	pushq	%rbp
	.cfi_def_cfa_offset 40
	.cfi_offset 6, -40
	movl	%edi, %r12d
	pushq	%rbx
	.cfi_def_cfa_offset 48
	.cfi_offset 3, -48
	movl	%edi, %ebx
	shrl	$16, %r12d
	movzbl	%bh, %eax
	shrl	$24, %ebx
	movzbl	%r12b, %r12d
	subq	$8, %rsp
	.cfi_def_cfa_offset 56
	movl	%eax, %r14d
	leaq	buffer@tlsld(%rip), %rdi
	call	__tls_get_addr@PLT
	pushq	%rbx
	.cfi_def_cfa_offset 64
	leaq	.LC0(%rip), %rdx
	movl	%r12d, %r9d
	leaq	buffer@dtpoff(%rax), %rbp
	movl	%r14d, %r8d
	movl	%r13d, %ecx
	xorl	%eax, %eax
	movl	$18, %esi
	movq	%rbp, %rdi
	call	__snprintf@PLT
	popq	%rax
	.cfi_def_cfa_offset 56
	movq	%rbp, %rax
	popq	%rdx
	.cfi_def_cfa_offset 48
	popq	%rbx
	.cfi_def_cfa_offset 40
	popq	%rbp
	.cfi_def_cfa_offset 32
	popq	%r12
	.cfi_def_cfa_offset 24
	popq	%r13
	.cfi_def_cfa_offset 16
	popq	%r14
	.cfi_def_cfa_offset 8
	ret
	.cfi_endproc
.LFE0:
	.size	inet_ntoa, .-inet_ntoa
	.section	.text.unlikely
.LCOLDE1:
	.text
.LHOTE1:
	.section	.tbss,"awT",@nobits
	.type	buffer, @object
	.size	buffer, 18
buffer:
	.zero	18
	.ident	"GCC: (GNU) 5.1.1 20150707 (Red Hat 5.1.1-5)"
	.section	.note.GNU-stack,"",@progbits
[hjl@gnu-tools-1 pr58066]$ 

__tls_get_addr is called with misaligned stack.
Comment 11 Uroš Bizjak 2015-07-13 06:40:56 UTC
Please make 64bit TLS patterns dependent on SP_REG, in the same way as 32bit are.
Comment 12 Uroš Bizjak 2015-07-13 09:16:26 UTC
(In reply to Uroš Bizjak from comment #11)
> Please make 64bit TLS patterns dependent on SP_REG, in the same way as 32bit
> are.

This won't fix this particular case, but this dependency would be nice to have.

The problem with the testcase from Comment #10 is caused by stack anti-adjustment, emitted from calls.c:

    1: NOTE_INSN_DELETED
    4: NOTE_INSN_BASIC_BLOCK 2
    2: r96:SI=di:SI
    3: NOTE_INSN_FUNCTION_BEG
    6: {sp:DI=sp:DI-0x8;clobber flags:CC;}               <<--- *** here ***
      REG_ARGS_SIZE 0x8
    7: {r98:SI=r96:SI 0>>0x10;clobber flags:CC;}
    8: {r99:QI=r98:SI#0&0xffffffffffffffff;clobber flags:CC;}
    9: r100:SI=zero_extend(r99:QI)
   10: r101:QI#0=zero_extract(r96:SI,0x8,0x8)
   11: r102:SI=zero_extend(r101:QI)
   12: r103:SI=zero_extend(r96:SI#0)
   13: ax:DI=call [`__tls_get_addr'] argc:0
      REG_EH_REGION 0xffffffff80000000
   14: r105:DI=ax:DI
      REG_EQUAL unspec[0] 21
   15: {r106:DI=r105:DI+const(unspec[`buffer'] 6);clobber flags:CC;}
   16: r104:DI=r106:DI
      REG_EQUAL `buffer'
   17: {r108:SI=r96:SI 0>>0x18;clobber flags:CC;}
   18: r109:SI=zero_extend(r108:SI#0)
   19: [pre sp:DI+=0xfffffffffffffff8]=r109:SI
      REG_ARGS_SIZE 0x10
   20: r9:SI=r100:SI
   21: r8:SI=r102:SI
   22: cx:SI=r103:SI
   23: dx:DI=`*.LC0'
   24: si:DI=0x12
   25: di:DI=r104:DI
   26: ax:QI=0
   27: call [`__snprintf'] argc:0x10
      REG_CALL_DECL `__snprintf'
   28: ax:DI=call [`__tls_get_addr'] argc:0
      REG_EH_REGION 0xffffffff80000000
   29: r111:DI=ax:DI
      REG_EQUAL unspec[0] 21
   30: {r112:DI=r111:DI+const(unspec[`buffer'] 6);clobber flags:CC;}
   31: r95:DI=r112:DI
      REG_EQUAL `buffer'
   32: {sp:DI=sp:DI+0x10;clobber flags:CC;}
      REG_ARGS_SIZE 0
   36: ax:DI=r95:DI
   37: use ax:DI

Putting a breakpoint on anti_adjust_stack will show where it happens:

Breakpoint 1, anti_adjust_stack (adjust=0x2aaaae7b0500) at /home/uros/gcc-svn/trunk/gcc/explow.c:902
902       if (adjust == const0_rtx)
(gdb) bt
#0  anti_adjust_stack (adjust=0x2aaaae7b0500) at /home/uros/gcc-svn/trunk/gcc/explow.c:902
#1  0x000000000080f24c in expand_call (exp=0x2aaaae7b3680, target=0x0, ignore=1) at /home/uros/gcc-svn/trunk/gcc/calls.c:3165
#2  0x0000000000966084 in expand_expr_real_1 (exp=0x2aaaae7b3680, target=0x0, tmode=VOIDmode, modifier=EXPAND_NORMAL, alt_rtl=0x0, inner_reference_p=false)
    at /home/uros/gcc-svn/trunk/gcc/expr.c:10362

There is already a precompute_register_parameters function, which contains:

	/* If the value is a non-legitimate constant, force it into a
	   pseudo now.  TLS symbols sometimes need a call to resolve.  */
	if (CONSTANT_P (args[i].value)
	    && !targetm.legitimate_constant_p (args[i].mode, args[i].value))
	  args[i].value = force_reg (args[i].mode, args[i].value);

So the core of the problem is in the call infrastructure, which should emit precomputed register parameters before anti_adjust_stack is emitted.

After this infrastructure problem is fixed, the proposed SP_REG dependency will prevent the stack adjustment from being scheduled above TLS patterns.
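The numbers in the RTL dump fit the usual pre-alignment arithmetic: __snprintf takes one 8-byte stack argument, so 8 bytes are subtracted first (insn 6), the argument is pushed (insn 19), and 0x10 bytes are released afterwards (insn 32). Any call emitted between the subtraction and the push, such as the __tls_get_addr call in insn 13, sees the stack off by the padding. A sketch of that arithmetic, assuming the stack is 16-byte aligned when call expansion begins; the function names are hypothetical and this is not the code in calls.c.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes to subtract from the stack pointer before pushing `args_bytes`
   of outgoing arguments, so that the stack is again a multiple of
   `boundary` once the last argument has been pushed. */
size_t toy_pre_push_padding(size_t args_bytes, size_t boundary)
{
    assert(boundary != 0);
    return (boundary - args_bytes % boundary) % boundary;
}

/* Misalignment seen by the TLS call.  If the register parameters are
   precomputed first, the TLS call runs before any padding exists;
   otherwise it runs after the padding has been subtracted. */
size_t toy_misalignment_at_tls_call(int precompute_first, size_t args_bytes)
{
    return precompute_first ? 0 : toy_pre_push_padding(args_bytes, 16);
}
```

For the test case, toy_misalignment_at_tls_call(0, 8) gives 8, and with the precompute step moved first it gives 0.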

Re-confirmed as RTL-optimization problem.
Comment 13 Uroš Bizjak 2015-07-13 11:08:11 UTC
Created attachment 35964 [details]
Combined middle/end/target patch

Patch in testing.
Comment 14 Uroš Bizjak 2015-07-13 11:10:50 UTC
(In reply to Uroš Bizjak from comment #13)
 
> Patch in testing.

This patch fixes the testcase, now we get:

0000000000000000 <inet_ntoa>:
   0:   41 56                   push   %r14
   2:   41 55                   push   %r13
   4:   44 0f b6 ef             movzbl %dil,%r13d
   8:   41 54                   push   %r12
   a:   55                      push   %rbp
   b:   41 89 fc                mov    %edi,%r12d
   e:   53                      push   %rbx
   f:   89 fb                   mov    %edi,%ebx
  11:   41 c1 ec 10             shr    $0x10,%r12d
  15:   0f b6 c7                movzbl %bh,%eax
  18:   c1 eb 18                shr    $0x18,%ebx
  1b:   45 0f b6 e4             movzbl %r12b,%r12d
  1f:   41 89 c6                mov    %eax,%r14d
  22:   48 8d 3d 00 00 00 00    lea    0(%rip),%rdi        # 29 <inet_ntoa+0x29>
                        25: R_X86_64_TLSLD      buffer+0xfffffffffffffffc
  29:   e8 00 00 00 00          callq  2e <inet_ntoa+0x2e>
                        2a: R_X86_64_PLT32      __tls_get_addr+0xfffffffffffffffc
  2e:   48 83 ec 08             sub    $0x8,%rsp
  32:   48 8d 15 00 00 00 00    lea    0(%rip),%rdx        # 39 <inet_ntoa+0x39>
                        35: R_X86_64_PC32       .LC0+0xfffffffffffffffc
  39:   45 89 e1                mov    %r12d,%r9d
  3c:   48 8d a8 00 00 00 00    lea    0x0(%rax),%rbp
                        3f: R_X86_64_DTPOFF32   buffer
  43:   53                      push   %rbx
  44:   45 89 f0                mov    %r14d,%r8d
  47:   44 89 e9                mov    %r13d,%ecx
  4a:   31 c0                   xor    %eax,%eax
  4c:   be 12 00 00 00          mov    $0x12,%esi
  51:   48 89 ef                mov    %rbp,%rdi
  54:   e8 00 00 00 00          callq  59 <inet_ntoa+0x59>
                        55: R_X86_64_PLT32      __snprintf+0xfffffffffffffffc
  59:   58                      pop    %rax
  5a:   48 89 e8                mov    %rbp,%rax
  5d:   5a                      pop    %rdx
  5e:   5b                      pop    %rbx
  5f:   5d                      pop    %rbp
  60:   41 5c                   pop    %r12
  62:   41 5d                   pop    %r13
  64:   41 5e                   pop    %r14
  66:   c3                      retq   

The difference between patched (+++) and unpatched (---) code is:

--- pr58066_.s  2015-07-13 11:58:23.000000000 +0200
+++ pr58066.s   2015-07-13 11:58:26.000000000 +0200
@@ -28,16 +28,16 @@
        movzbl  %bh, %eax
        shrl    $24, %ebx
        movzbl  %r12b, %r12d
-       subq    $8, %rsp
-.LCFI5:
        movl    %eax, %r14d
        leaq    buffer@tlsld(%rip), %rdi
        call    __tls_get_addr@PLT
-       pushq   %rbx
-.LCFI6:
+       subq    $8, %rsp
+.LCFI5:
        leaq    .LC0(%rip), %rdx
        movl    %r12d, %r9d
        leaq    buffer@dtpoff(%rax), %rbp
+       pushq   %rbx
+.LCFI6:
        movl    %r14d, %r8d
        movl    %r13d, %ecx
        xorl    %eax, %eax
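The effect of the reordering can be checked by replaying the stack operations of both versions and testing the depth at each call. The sketch below is a toy replay with hypothetical names. It starts at depth 8 for the return address, so the running depth matches the .cfi_def_cfa_offset values in the unpatched listing (56 at the __tls_get_addr call).

```c
#include <assert.h>
#include <stddef.h>

enum toy_op_kind { TOY_PUSH, TOY_SUB, TOY_CALL };

struct toy_op {
    enum toy_op_kind kind;
    size_t bytes;              /* used by TOY_SUB only */
};

/* Returns the number of TOY_CALL ops reached with a stack depth that is
   not a multiple of 16.  The walk starts at depth 8 because the return
   address is already on the stack at function entry. */
size_t toy_count_misaligned_calls(const struct toy_op *ops, size_t n)
{
    size_t depth = 8, bad = 0;
    assert(ops != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        switch (ops[i].kind) {
        case TOY_PUSH: depth += 8; break;              /* pushq */
        case TOY_SUB:  depth += ops[i].bytes; break;   /* subq $N, %rsp */
        case TOY_CALL: if (depth % 16 != 0) bad++; break;
        }
    }
    return bad;
}
```

Unpatched, the sequence is five pushes, `subq $8`, call, push, call: the first call happens at depth 56 and is counted. Patched, it is five pushes, call, `subq $8`, push, call: the calls happen at depths 48 and 64 and none is counted.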

HJ, can you please test whether the patch fixes your problem?
Comment 15 H.J. Lu 2015-07-13 12:08:32 UTC
(In reply to Uroš Bizjak from comment #13)
> Created attachment 35964 [details]
> Combined middle/end/target patch
> 
> Patch in testing.

I tried it on GCC 5 and it works on glibc.  Thanks.
Comment 16 uros 2015-07-15 07:40:01 UTC
Author: uros
Date: Wed Jul 15 07:39:30 2015
New Revision: 225807

URL: https://gcc.gnu.org/viewcvs?rev=225807&root=gcc&view=rev
Log:
	PR rtl-optimization/58066
	* calls.c (expand_call): Precompute register parameters before stack
	alignment is performed.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/calls.c
Comment 17 Uroš Bizjak 2015-07-15 07:41:07 UTC
Back to target component.
Comment 18 uros 2015-07-15 13:42:39 UTC
Author: uros
Date: Wed Jul 15 13:42:07 2015
New Revision: 225829

URL: https://gcc.gnu.org/viewcvs?rev=225829&root=gcc&view=rev
Log:
	PR target/58066
	* config/i386/i386.md (*tls_global_dynamic_64_<mode>): Depend on SP_REG.
	(*tls_local_dynamic_base_64_<mode>): Ditto.
	(*tls_local_dynamic_base_64_largepic): Ditto.
	(tls_global_dynamic_64_<mode>): Update expander pattern.
	(tls_local_dynamic_base_64_<mode>): Ditto.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.md
Comment 19 uros 2015-07-23 18:52:28 UTC
Author: uros
Date: Thu Jul 23 18:51:56 2015
New Revision: 226119

URL: https://gcc.gnu.org/viewcvs?rev=226119&root=gcc&view=rev
Log:
	Backport from mainline:
	2015-07-17  Uros Bizjak  <ubizjak@gmail.com>

	PR rtl-optimization/66891
	* calls.c (expand_call): Wrap precompute_register_parameters with
	NO_DEFER_POP/OK_DEFER_POP to prevent deferred pops.

	2015-07-15  Uros Bizjak  <ubizjak@gmail.com>

	PR target/58066
	* config/i386/i386.md (*tls_global_dynamic_64_<mode>): Depend on SP_REG.
	(*tls_local_dynamic_base_64_<mode>): Ditto.
	(*tls_local_dynamic_base_64_largepic): Ditto.
	(tls_global_dynamic_64_<mode>): Update expander pattern.
	(tls_local_dynamic_base_64_<mode>): Ditto.

	2015-07-15  Uros Bizjak  <ubizjak@gmail.com>

	PR rtl-optimization/58066
	* calls.c (expand_call): Precompute register parameters before stack
	alignment is performed.

testsuite/ChangeLog:

	Backport from mainline:
	2015-07-17  Uros Bizjak  <ubizjak@gmail.com>

	PR target/66891
	* gcc.target/i386/pr66891.c: New test.


Added:
    branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr66891.c
Modified:
    branches/gcc-5-branch/gcc/ChangeLog
    branches/gcc-5-branch/gcc/calls.c
    branches/gcc-5-branch/gcc/config/i386/i386.md
    branches/gcc-5-branch/gcc/testsuite/ChangeLog
Comment 20 uros 2015-07-30 08:54:20 UTC
Author: uros
Date: Thu Jul 30 08:53:48 2015
New Revision: 226389

URL: https://gcc.gnu.org/viewcvs?rev=226389&root=gcc&view=rev
Log:
	Backport from mainline:
	2015-07-17  Uros Bizjak  <ubizjak@gmail.com>

	PR rtl-optimization/66891
	* calls.c (expand_call): Wrap precompute_register_parameters with
	NO_DEFER_POP/OK_DEFER_POP to prevent deferred pops.

	2015-07-15  Uros Bizjak  <ubizjak@gmail.com>

	PR target/58066
	* config/i386/i386.md (*tls_global_dynamic_64_<mode>): Depend on SP_REG.
	(*tls_local_dynamic_base_64_<mode>): Ditto.
	(*tls_local_dynamic_base_64_largepic): Ditto.
	(tls_global_dynamic_64_<mode>): Update expander pattern.
	(tls_local_dynamic_base_64_<mode>): Ditto.

	2015-07-15  Uros Bizjak  <ubizjak@gmail.com>

	PR rtl-optimization/58066
	* calls.c (expand_call): Precompute register parameters before stack
	alignment is performed.

	2014-05-08  Wei Mi  <wmi@google.com>

	PR target/58066
	* config/i386/i386.c (ix86_compute_frame_layout): Update
	preferred_stack_boundary for call, expanded from tls descriptor.
	* config/i386/i386.md (*tls_global_dynamic_32_gnu): Update RTX
	to depend on SP register.
	(*tls_local_dynamic_base_32_gnu): Ditto.
	(*tls_local_dynamic_32_once): Ditto.
	(tls_global_dynamic_64_<mode>): Set
	ix86_tls_descriptor_calls_expanded_in_cfun.
	(tls_local_dynamic_base_64_<mode>): Ditto.
	(tls_global_dynamic_32): Set
	ix86_tls_descriptor_calls_expanded_in_cfun. Update RTX
	to depend on SP register.
	(tls_local_dynamic_base_32): Ditto.

testsuite/ChangeLog:

	Backport from mainline:
	2015-07-17  Uros Bizjak  <ubizjak@gmail.com>

	PR target/66891
	* gcc.target/i386/pr66891.c: New test.

	2014-05-18  Wei Mi  <wmi@google.com>

	PR target/58066
	* gcc.target/i386/pr58066.c: Replace pattern matching of .cfi
	directive with rtl insns. Add effective-target fpic and
	tls_native.

	2014-05-08  Wei Mi  <wmi@google.com>

	PR target/58066
	* gcc.target/i386/pr58066.c: New test.


Added:
    branches/gcc-4_9-branch/gcc/testsuite/gcc.target/i386/pr58066.c
    branches/gcc-4_9-branch/gcc/testsuite/gcc.target/i386/pr66891.c
Modified:
    branches/gcc-4_9-branch/gcc/ChangeLog
    branches/gcc-4_9-branch/gcc/config/i386/i386.c
    branches/gcc-4_9-branch/gcc/config/i386/i386.md
    branches/gcc-4_9-branch/gcc/testsuite/ChangeLog
Comment 21 Uroš Bizjak 2015-07-30 09:00:04 UTC
Fixed everywhere.