Bug 86236 - Stack alignment prologue clobbers %edi for fastcall functions with global register variable
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 8.1.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2018-06-20 11:16 UTC by Florian Weimer
Modified: 2018-11-20 17:01 UTC

See Also:
Host:
Target: i?86-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed: 2018-11-20 00:00:00


Description Florian Weimer 2018-06-20 11:16:54 UTC
#include <xmmintrin.h>

void f1 (void *, int);

register int edi __asm__ ("edi");

__attribute__ ((fastcall))
void
f2 (void)
{ 
  // Force stack alignment.
  __m128i m;
  f1 (&m, edi);
}


Compile with “-m32 -O2 -march=x86-64 -msse2 -mfpmath=sse -mstackrealign”:

        .globl  f2
        .type   f2, @function
f2:
.LFB504:
        .cfi_startproc
        leal    4(%esp), %edi
        .cfi_def_cfa 7, 0
        andl    $-16, %esp
        pushl   -4(%edi)
        pushl   %ebp
        .cfi_escape 0x10,0x5,0x2,0x75,0
        movl    %esp, %ebp
        pushl   %edi
        .cfi_escape 0xf,0x3,0x75,0x7c,0x6
        leal    -24(%ebp), %eax
        subl    $28, %esp
        pushl   %edi
        pushl   %eax
        call    f1
        movl    -4(%ebp), %edi
        .cfi_def_cfa 7, 0
        addl    $16, %esp
        leave
        .cfi_restore 5
        leal    -4(%edi), %esp
        .cfi_def_cfa 4, 4
        ret
        .cfi_endproc


The value of %edi is clobbered before it is saved on the stack.  I think %edi is callee-saved even for fastcall functions, so this is wrong on multiple levels.
Comment 1 Florian Weimer 2018-06-20 11:29:50 UTC
Jakub observed that any stack realignment triggers this, e.g. this uses %edi even without -mstackrealign:

void f1 (void *, int);

register int edi __asm__ ("edi");

__attribute__ ((fastcall))
void
f2 (void)
{ 
  // Force stack alignment.
  char buf[256] __attribute__ ((aligned (256)));
  f1 (buf, edi);
}

And -ffixed-edi does not make a difference.
Comment 2 Jakub Jelinek 2018-06-20 11:37:55 UTC
/* Find an available register to be used as dynamic realign argument
   pointer register.  Such a register will be written in prologue and
   used in begin of body, so it must not be
        1. parameter passing register.
        2. GOT pointer.
   We reuse static-chain register if it is available.  Otherwise, we
   use DI for i386 and R13 for x86-64.  We chose R13 since it has
   shorter encoding.

   Return: the regno of chosen register.  */

static unsigned int
find_drap_reg (void)
...

Nothing checks whether the chosen registers are in fixed_regs (it doesn't work even with -ffixed-edi), or whether they are global registers.  The same goes for the static chain (though that one is part of the ABI, so if we can't use it we need to error out).
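
To make the missing check concrete, here is a small standalone model in C. It is only a sketch based on the quoted comment, not the actual find_drap_reg code: the register indices, the model_reg_state tables, and both function names are hypothetical stand-ins invented for illustration. It shows a chooser that never consults the fixed/global tables, next to the predicate that would detect the conflict.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register indices for this sketch; not GCC's numbering.  */
enum model_reg { MODEL_CX, MODEL_DI, MODEL_NREGS };

/* Stand-ins for the compiler's fixed-register and global-register
   tables (as affected by -ffixed-REG and by a global register
   variable declaration).  */
struct model_reg_state
{
  bool fixed[MODEL_NREGS];
  bool global[MODEL_NREGS];
};

/* Simplified model of the 32-bit choice described in the quoted
   comment: reuse the static-chain register when it is free, otherwise
   fall back to DI.  STATE is deliberately never consulted, which is
   the problem reported here.  */
static enum model_reg
model_pick_drap_unchecked (const struct model_reg_state *state,
                           bool static_chain_reg_free)
{
  (void) state;
  return static_chain_reg_free ? MODEL_CX : MODEL_DI;
}

/* True if REG is reserved by the user and so must not be written by
   the prologue.  */
static bool
model_reg_reserved (const struct model_reg_state *state, enum model_reg reg)
{
  assert ((unsigned) reg < MODEL_NREGS);
  return state->fixed[reg] || state->global[reg];
}
```

In this model, marking MODEL_DI as global and asking for a register while the static-chain register is unavailable still yields MODEL_DI, and model_reg_reserved reports that the result is reserved.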
Comment 3 Jakub Jelinek 2018-06-20 11:40:30 UTC
Though, obviously with -m32 we are getting -><- this close to running out of usable registers with fastcall, static chain and drap, especially if -fpic is used as well.
Comment 4 Jakub Jelinek 2018-11-20 16:56:22 UTC
H.J., any thoughts on this?  Can we try to pick other regs if %edi is taken (if there are any left, I'm afraid often there won't be any) or should we just error out?
Comment 5 H.J. Lu 2018-11-20 17:01:36 UTC
(In reply to Jakub Jelinek from comment #4)
> H.J., any thoughts on this?  Can we try to pick other regs if %edi is taken
> (if there are any left, I'm afraid often there won't be any) or should we
> just error out?

We need to fix find_drap_reg to check for fixed registers.  If we can't
find a register, we should issue an error.
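
A rough sketch of that approach, written as standalone C rather than as a patch against GCC: the register indices, the sketch_reg_state tables, and sketch_pick_drap are all hypothetical names used for illustration, and the candidate list is supplied by the caller because which registers are actually eligible depends on the calling convention. The function returns the first candidate that is neither fixed nor global, and a sentinel when none is left, at which point the caller would issue the error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical register indices for this sketch; not GCC's numbering.  */
enum sketch_reg { SKETCH_CX, SKETCH_DI, SKETCH_NREGS, SKETCH_NO_REG = -1 };

/* Stand-ins for the fixed-register and global-register tables.  */
struct sketch_reg_state
{
  bool fixed[SKETCH_NREGS];
  bool global[SKETCH_NREGS];
};

/* Return the first register in CANDIDATES that is neither fixed nor
   global.  Return SKETCH_NO_REG if every candidate is reserved; the
   caller is then expected to report an error instead of silently
   clobbering a reserved register.  */
static int
sketch_pick_drap (const struct sketch_reg_state *state,
                  const int *candidates, size_t n_candidates)
{
  for (size_t i = 0; i < n_candidates; i++)
    {
      int reg = candidates[i];
      assert (reg >= 0 && reg < SKETCH_NREGS);
      if (!state->fixed[reg] && !state->global[reg])
        return reg;
    }
  return SKETCH_NO_REG;
}
```

With a single-entry candidate list containing only SKETCH_DI, the function returns SKETCH_DI while the register is free and SKETCH_NO_REG once it is marked fixed or global, which corresponds to the error case discussed above.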