Bug 89502 - Inefficient access to stack_guard in TCB
Summary: Inefficient access to stack_guard in TCB
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 9.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-02-26 00:22 UTC by H.J. Lu
Modified: 2019-02-26 23:32 UTC (History)
1 user (show)

See Also:
Host:
Target: x32
Build:
Known to work:
Known to fail:
Last reconfirmed: 2019-02-26 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description H.J. Lu 2019-02-26 00:22:07 UTC
[hjl@gnu-cfl-1 xx]$ cat x.i
typedef unsigned int uint32_t;
struct ctx
{
  uint32_t A;
};

void *
buffer_copy (const struct ctx *ctx, void *resbuf)
{
  uint32_t buffer[4];
  buffer[0] = (ctx->A);
  __builtin_memcpy (resbuf, buffer, sizeof (buffer));
  return resbuf;
}
[hjl@gnu-cfl-1 xx]$ make x.o
/export/build/gnu/tools-build/gcc-mmx-debug/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/tools-build/gcc-mmx-debug/build-x86_64-linux/gcc/ -fstack-protector-strong -O1 -frename-registers -mx32  -S x.i
/export/build/gnu/tools-build/gcc-mmx-debug/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/tools-build/gcc-mmx-debug/build-x86_64-linux/gcc/ -fstack-protector-strong -O1 -frename-registers -mx32  -c -o x.o x.s
x.s: Assembler messages:
x.s:11: Error: can't encode segment `%fs' with 32-bit address
x.s:18: Error: can't encode segment `%fs' with 32-bit address
make: *** [Makefile:21: x.o] Error 1
[hjl@gnu-cfl-1 xx]$ 

The problem is

	movl	%fs:(%edx), %ecx

In x32, we can't have 0x67 prefix (from %edx) with segment override
since final address will be

segment base + zero-extended (base + index * scale + disp)

Since assembler

https://sourceware.org/bugzilla/show_bug.cgi?id=24263

doesn't know %edx is always positive, it issues an error.
Comment 1 H.J. Lu 2019-02-26 00:43:25 UTC
We should do

	movl	$24, %edx
	movl	%fs:(%rdx), %ecx

instead of

	movl	$24, %edx
	movl	%fs:(%edx), %ecx

Ever better, we can use UNSPEC_TP to handle it:

	movl	%fs:24, %ecx

to save a register.
Comment 2 H.J. Lu 2019-02-26 01:26:13 UTC
(In reply to H.J. Lu from comment #1)
> 
> Ever better, we can use UNSPEC_TP to handle it:
> 
> 	movl	%fs:24, %ecx
> 

This is how TCB is accessed in glibc.
Comment 3 Uroš Bizjak 2019-02-26 10:20:26 UTC
(In reply to H.J. Lu from comment #0)
> x.s: Assembler messages:
> x.s:11: Error: can't encode segment `%fs' with 32-bit address
> x.s:18: Error: can't encode segment `%fs' with 32-bit address

Is this some new warning, since

GNU assembler version 2.31.1-17.fc29

happily assembles 

 	movl	%fs:(%edx), %ecx

to:

   8:   64 67 8b 0a             mov    %fs:(%edx),%ecx

Anyway, the following patch should fix this issue:

--cut here--
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index e77653d66a4..67872e1c15d 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -18542,6 +18542,10 @@ ix86_print_operand_address_as (FILE *file, rtx addr,
 
   if (!ADDR_SPACE_GENERIC_P (as))
     {
+      /* We can't have 0x67 prefix with segment override.  */
+      if (TARGET_64BIT)
+       code = 'q';
+
       if (ASSEMBLER_DIALECT == ASM_ATT)
        putc ('%', file);
 
--cut here--
Comment 4 Uroš Bizjak 2019-02-26 10:22:51 UTC
(In reply to H.J. Lu from comment #1)
> Ever better, we can use UNSPEC_TP to handle it:
> 
> 	movl	%fs:24, %ecx
> 
> to save a register.

We do, with -O2, but -O1 CSEs the constant into the register for some reason.
Comment 5 H.J. Lu 2019-02-26 13:43:26 UTC
(In reply to Uroš Bizjak from comment #3)
> (In reply to H.J. Lu from comment #0)
> > x.s: Assembler messages:
> > x.s:11: Error: can't encode segment `%fs' with 32-bit address
> > x.s:18: Error: can't encode segment `%fs' with 32-bit address
> 
> Is this some new warning, since

Yes. it will be in binutils 2.33.

> GNU assembler version 2.31.1-17.fc29
> 
> happily assembles 
> 
>  	movl	%fs:(%edx), %ecx
> 
> to:
> 
>    8:   64 67 8b 0a             mov    %fs:(%edx),%ecx
> 
> Anyway, the following patch should fix this issue:
> 
> --cut here--
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index e77653d66a4..67872e1c15d 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -18542,6 +18542,10 @@ ix86_print_operand_address_as (FILE *file, rtx addr,
>  
>    if (!ADDR_SPACE_GENERIC_P (as))
>      {
> +      /* We can't have 0x67 prefix with segment override.  */
> +      if (TARGET_64BIT)
> +       code = 'q';
> +
>        if (ASSEMBLER_DIALECT == ASM_ATT)
>         putc ('%', file);
>  
> --cut here--

I think we should avoid %fs:(%edx) address.  Is that possible to generate
%fs:24 even at -O0?  Glibc can do it at -O0:

[hjl@gnu-cfl-1 gcc]$ cat x.i
int
foo (void)
{
  int i;
  asm volatile ("movl %%fs:%P1, %0" : "=r" (i) : "i" (24)); 
  return i;
}
[hjl@gnu-cfl-1 gcc]$ gcc -S -O0 x.i 
[hjl@gnu-cfl-1 gcc]$ cat x.s
	.file	"x.i"
	.text
	.globl	foo
	.type	foo, @function
foo:
.LFB0:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset 6, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register 6
#APP
# 5 "x.i" 1
	movl %fs:24, %eax
# 0 "" 2
#NO_APP
	movl	%eax, -4(%rbp)
	movl	-4(%rbp), %eax
	popq	%rbp
	.cfi_def_cfa 7, 8
	ret
	.cfi_endproc
.LFE0:
	.size	foo, .-foo
	.ident	"GCC: (GNU) 8.3.1 20190223 (Red Hat 8.3.1-2)"
	.section	.note.GNU-stack,"",@progbits
[hjl@gnu-cfl-1 gcc]$
Comment 6 Uroš Bizjak 2019-02-26 13:58:47 UTC
(In reply to H.J. Lu from comment #5)

> I think we should avoid %fs:(%edx) address.  Is that possible to generate

It is what middle-end expands to, so nothing much that we can do here.
Comment 7 H.J. Lu 2019-02-26 23:32:47 UTC
rtx
memory_address_addr_space (machine_mode mode, rtx x, addr_space_t as)
{
  rtx oldx = x; 
  scalar_int_mode address_mode = targetm.addr_space.address_mode (as);

  x = convert_memory_address_addr_space (address_mode, x, as); 

  /* By passing constant addresses through registers
     we get a chance to cse them.  */
  if (! cse_not_expected && CONSTANT_P (x) && CONSTANT_ADDRESS_P (x)) 
    x = force_reg (address_mode, x);

cprop undoes it, which isn't enabled by O1.  For x86, using register
instead of a 32-bit constant usually is a performance loss.  Shouldn't
backend have a choice here?