Bug 49503 - Incorrect stack alignment, produced by inline assembler in tests gcc.target/i386/cleanup-1.c and gcc.target/i386/cleanup-2.c
Summary: Incorrect stack alignment, produced by inline assembler in tests gcc.target/i...
Status: RESOLVED DUPLICATE of bug 70364
Alias: None
Product: gcc
Classification: Unclassified
Component: testsuite (show other bugs)
Version: 4.7.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-06-22 15:25 UTC by Michael Zolotukhin
Modified: 2016-04-02 00:25 UTC (History)
5 users (show)

See Also:
Host: x64
Target: x64
Build: 4.7.0 20110619
Known to work:
Known to fail:
Last reconfirmed: 2011-06-22 16:48:48


Attachments
Test showing that cleanup-* tests are not quite correct. (1.39 KB, text/plain)
2011-06-23 11:42 UTC, Michael Zolotukhin
Details
Fixed cleanup-regression.c (1.48 KB, text/plain)
2011-06-23 16:00 UTC, Michael Zolotukhin
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Zolotukhin 2011-06-22 15:25:47 UTC
The tests contain asm-listing like this:
__asm (
      testl  %0, %0
      jnz    1f
      .subsection 1
      .type  _L_mutex_lock_%=, @function
_L_mutex_lock_%=:
1:    leaq   %1, %%rdi
2:    subq   $128, %%rsp
3:    call   bar
4:    addq   $128, %%rsp
5:    jmp    21f
...

As _L_mutex_lock is a function, GCC generates a prologue and epilogue for it - in prologue stack alignment is performed (according to ABI64, stack should be aligned to 128-bit).
Before a call, SP is assumed to be a multiple of 16, at function entry, when return address is pushed to stack, (SP+8) becomes multiple of 16 - GCC uses these assumptions when generating prologue for stack alignment.
But if JUMP is used instead of CALL, 8-byte displacement doesn't take place, and stack becomes unaligned - that's violation of ABI.

To fix this error, JUMP should be replaced with CALL, or L_mutex_lock shouldn't be declared as a function.
Comment 1 H.J. Lu 2011-06-22 16:48:48 UTC
(In reply to comment #0)
> 
> As _L_mutex_lock is a function, GCC generates a prologue and epilogue for it -
> in prologue stack alignment is performed (according to ABI64, stack should be
> aligned to 128-bit).

I didn't see any prologue and epilogue for _L_mutex_lock. Do you have
a run-time testcase to show the problem?
Comment 2 Michael Zolotukhin 2011-06-22 17:50:34 UTC
(In reply to comment #1)
> (In reply to comment #0)
> > 
> > As _L_mutex_lock is a function, GCC generates a prologue and epilogue for it -
> > in prologue stack alignment is performed (according to ABI64, stack should be
> > aligned to 128-bit).
> 
> I didn't see any prologue and epilogue for _L_mutex_lock. Do you have
> a run-time testcase to show the problem?

I don't have run-time test for this fail, but here is a way to see the problem:

Firstly, you need to compile the test (it is in gcc/testsuite/gcc.target/i386):

gcc cleanup-1.c   -fexceptions -fnon-call-exceptions -fasynchronous-unwind-tables -O0  -lm -g -fno-inline

Then, in debugger one can see that before each callq before foo (here 'before' means 'upper in call stack') RSP contains aligned value, after foo (deeper in call stack) - unaligned.


Here is corresponding a assembler dump:

-Dump of assembler code for function foo:
   0x0000000000400886 <+0>:     push   %rbp
   0x0000000000400887 <+1>:     mov    %rsp,%rbp
   0x000000000040088a <+4>:     push   %r12
   0x000000000040088c <+6>:     push   %rbx
   0x000000000040088d <+7>:     sub    $0x98,%rsp         # It is the prologue,
   0x0000000000400894 <+14>:    mov    %edi,-0x114(%rbp)  # that I've spoken
   0x000000000040089a <+20>:    mov    -0x114(%rbp),%ebx  # about.
   0x00000000004008a0 <+26>:    lea    -0x110(%rbp),%r12  #
   0x00000000004008a7 <+33>:    test   %ebx,%ebx
   0x00000000004008a9 <+35>:    jne    0x400941 <_L_mutex_lock_216>
   0x00000000004008af <+41>:    add    $0x98,%rsp
   0x00000000004008b6 <+48>:    pop    %rbx
   0x00000000004008b7 <+49>:    pop    %r12
   0x00000000004008b9 <+51>:    leaveq
   0x00000000004008ba <+52>:    retq
-End of assembler dump.

RTL dumps showed that these instructions appear phase "197r.pro_and_epilogue".

I'm not sure, if prologue generation is incorrect - in my opinion the problem is with such untrivial call of _L_mutex_lock.
Comment 3 H.J. Lu 2011-06-22 18:34:32 UTC
(In reply to comment #2)
> (In reply to comment #1)
> > (In reply to comment #0)
> > > 
> > > As _L_mutex_lock is a function, GCC generates a prologue and epilogue for it -
> > > in prologue stack alignment is performed (according to ABI64, stack should be
> > > aligned to 128-bit).
> > 
> > I didn't see any prologue and epilogue for _L_mutex_lock. Do you have
> > a run-time testcase to show the problem?
> 
> I don't have run-time test for this fail, but here is a way to see the problem:
> 

It is very easy to check if stack alignment is correct at run-time.
Please see how it is done in testcases under gcc.dg/torture/stackalign.
Comment 4 Michael Zolotukhin 2011-06-23 11:42:03 UTC
Created attachment 24584 [details]
Test showing that cleanup-* tests are not quite correct.
Comment 5 Michael Zolotukhin 2011-06-23 11:44:17 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > (In reply to comment #1)
> > > (In reply to comment #0)
> > > > 
> > > > As _L_mutex_lock is a function, GCC generates a prologue and epilogue for it -
> > > > in prologue stack alignment is performed (according to ABI64, stack should be
> > > > aligned to 128-bit).
> > > 
> > > I didn't see any prologue and epilogue for _L_mutex_lock. Do you have
> > > a run-time testcase to show the problem?
> > 
> > I don't have run-time test for this fail, but here is a way to see the problem:
> > 
> 
> It is very easy to check if stack alignment is correct at run-time.
> Please see how it is done in testcases under gcc.dg/torture/stackalign.

Please find the testcase attached.
It passes on 32 bits and fails on 64 bits (4.7.0 and 4.5.1(RedHat) GCC was checked).
Comment 6 H.J. Lu 2011-06-23 12:45:31 UTC
(In reply to comment #4)
> Created attachment 24584 [details]
> Test showing that cleanup-* tests are not quite correct.

Wrong test.  You don't know the alignment of buf.
You should put an alignment attribute on buf.
Comment 7 Michael Zolotukhin 2011-06-23 16:00:36 UTC
Created attachment 24587 [details]
Fixed cleanup-regression.c
Comment 8 Michael Zolotukhin 2011-06-23 16:03:32 UTC
(In reply to comment #6)
> (In reply to comment #4)
> > Created attachment 24584 [details]
> > Test showing that cleanup-* tests are not quite correct.
> 
> Wrong test.  You don't know the alignment of buf.
> You should put an alignment attribute on buf.

I added new test, with align attribute.
Compiler sometimes could make stack aligned (accidentally), and the bug wouldn't show up.
To watch the fail, try use following command line:
gcc cleanup-regression-2.c -O0 -lm -fno-inline -fexceptions -fnon-call-exceptions -fasynchronous-unwind-tables && ./a.out
Comment 9 H.J. Lu 2011-06-23 20:15:44 UTC
(In reply to comment #0)
> The tests contain asm-listing like this:
> __asm (
>       testl  %0, %0
>       jnz    1f
>       .subsection 1
>       .type  _L_mutex_lock_%=, @function
> _L_mutex_lock_%=:
> 1:    leaq   %1, %%rdi
> 2:    subq   $128, %%rsp
> 3:    call   bar
> 4:    addq   $128, %%rsp
> 5:    jmp    21f
> ...
> 

The bug is in testcase.  When _L_mutex_lock_ calls bar, it didn't
align stack to 16byte.  But I don't think it matters here. Jakub,
do we need to fix it?
Comment 10 Jakub Jelinek 2011-06-23 20:26:06 UTC
The testcase tests what glibc does, so IMHO it should stay as is.  None of the functions that are called with misaligned stack actually really require it to be aligned.
Comment 11 H.J. Lu 2011-06-23 22:03:49 UTC
(In reply to comment #10)
> The testcase tests what glibc does, so IMHO it should stay as is.  None of the
> functions that are called with misaligned stack actually really require it to
> be aligned.

But gcc may generate SSE codes and assume stack is 16byte aligned,
which leads to segfault.
Comment 12 H.J. Lu 2016-04-02 00:25:36 UTC
Dup

*** This bug has been marked as a duplicate of bug 70364 ***