Bug 39118

Summary: [4.3/4.4 Regression] x86_64 red zone violation
Product: gcc Reporter: Ian Lance Taylor <ian>
Component: targetAssignee: Uroš Bizjak <ubizjak>
Status: RESOLVED FIXED    
Severity: normal CC: gcc-bugs, hjl.tools, rguenth
Priority: P2 Keywords: wrong-code
Version: 4.3.4   
Target Milestone: 4.3.4   
URL: http://gcc.gnu.org/ml/gcc-patches/2009-02/msg00512.html
Host: Target:
Build: Known to work:
Known to fail: Last reconfirmed: 2009-02-11 08:24:28
Attachments: Test case
Test case
Patch
gcc44-pr39118.patch

Description Ian Lance Taylor 2009-02-06 05:52:14 UTC
This is a bug report for gcc 4_3 branch.  I will attach a test case, slightly reduced from zlib code.  When compiling this test case for the x86_64-linux target with -O2 -fomit-frame-pointer, I see this at the start of the function:

adler32:
	pushq	%rbp
	movq	%rdi, %rax
	andl	$65535, %edi
	shrq	$16, %rax
	movq	%rsp, %rbp
	pushq	%r15
	andl	$65535, %eax
	movl	%edx, -140(%rbp)

After %rsp was copied to %rbp, %r15 was pushed.  So now %rsp is 8 bytes less than %rbp.  The red zone is 128 bytes, so this means that any reference down to -136(%rbp) is valid.  However, the code actually stores a value into -140(%rbp).  This is invalid, and can cause subtle and unpredictable bugs depending upon when the kernel interrupts this code.

This error happens because the scheduler moves this movl instruction ahead of some other pushq instructions.  If all the pushq instructions happened first, the stack pointer would have been decremented sufficiently that the reference to -140(%rbp) would be safely within the red zone.

4.4 generates completely different, and seemingly better, code for this test case.  However, I don't see anything in the 4.4 source code which would prevent this same bug from occurring given the right test case.  So this is potentially a serious error for the 4.4 release as well, unless it has been fixed by some scheduling dependency that I haven't uncovered.  However, I have not been able to find a test case.
Comment 1 Ian Lance Taylor 2009-02-06 05:53:14 UTC
Created attachment 17260 [details]
Test case
Comment 2 Uroš Bizjak 2009-02-06 13:45:35 UTC
(In reply to comment #0)
> This is a bug report for gcc 4_3 branch.  I will attach a test case, slightly
> reduced from zlib code.  When compiling this test case for the x86_64-linux
> target with -O2 -fomit-frame-pointer, I see this at the start of the function:

-fno-omit-frame-pointer

> adler32:
>         pushq   %rbp
>         movq    %rdi, %rax
>         andl    $65535, %edi
>         shrq    $16, %rax
>         movq    %rsp, %rbp
>         pushq   %r15
>         andl    $65535, %eax
>         movl    %edx, -140(%rbp)
> 
> After %rsp was copied to %rbp, %r15 was pushed.  So now %rsp is 8 bytes less
> than %rbp.  The red zone is 128 bytes, so this means that any reference down to
> -136(%rbp) is valid.  However, the code actually stores a value into
> -140(%rbp).  This is invalid, and can cause subtle and unpredictable bugs
> depending upon when the kernel interrupts this code.

We have several options here:

1. disable redzone with -fno-omit-frame-pointer
2. emit scheduling barrier just after the prologue for -fno-omit-frame-pointer
3. make instructions that access to redzone dependant on %rsp

Confirmed, but since -fomit-frame-pointer is the default (and works OK) it is not a critical issue.
Comment 3 Ian Lance Taylor 2009-02-06 14:14:44 UTC
Yes, -fno-omit-frame-pointer, sorry.

I don't see why this has anything to do with -fno-omit-frame-pointer per se.  As far as I can see so far the same problem can arise with any function which happens to require a frame pointer for some reason, such as a call to __builtin_return_address or if profiling is turned on.
Comment 4 Uroš Bizjak 2009-02-06 14:50:12 UTC
(In reply to comment #3)
> Yes, -fno-omit-frame-pointer, sorry.
> 
> I don't see why this has anything to do with -fno-omit-frame-pointer per se. 
> As far as I can see so far the same problem can arise with any function which
> happens to require a frame pointer for some reason, such as a call to
> __builtin_return_address or if profiling is turned on.

-fno-omit-frame-pointer just forces usage of frame pointer in x86_64 case to trigger this problem. When frame pointer is used, there is no connection between %ebp and %esp at the point where prologue ends. The offset between %ebp and %esp is calculated at this point, but scheduler can still move instructions referring to %ebp all the way to (insn 567).

(insn/f 567 566 568 2 pr39118.c:6 (set (reg/f:DI 6 bp)
        (reg/f:DI 7 sp)) -1 (nil))

(insn/f 568 567 569 2 pr39118.c:6 (set (mem:DI (pre_dec:DI (reg/f:DI 7 sp)) [0 S8 A8])
        (reg:DI 44 r15)) -1 (nil))

...

(insn/f 572 571 573 2 pr39118.c:6 (set (mem:DI (pre_dec:DI (reg/f:DI 7 sp)) [0 S8 A8])
        (reg:DI 3 bx)) -1 (nil))

(note 574 573 93 2 NOTE_INSN_PROLOGUE_END)

(insn:HI 93 574 94 2 pr39118.c:6 (set (mem/c:DI (plus:DI (reg/f:DI 6 bp)
                (const_int -136 [0xffffffffffffff78])) [14 buf+0 S8 A8])
        (reg:DI 4 si [ buf ])) 89 {*movdi_1_rex64} (nil))

(insn:HI 94 93 95 2 pr39118.c:6 (set (mem/c:SI (plus:DI (reg/f:DI 6 bp)
                (const_int -140 [0xffffffffffffff74])) [15 len+0 S4 A8])
        (reg:SI 1 dx [ len ])) 47 {*movsi_1} (nil))


This problem can be solved in the most elegant way by inserting some kind of artificial instruction at the end of prologue, perhaps

(set %rbp)(unspec [(%rsp)] UNSPEC_REDZONE_BLOCKAGE)
Comment 5 Uroš Bizjak 2009-02-06 15:05:17 UTC
This patch fixes wrong scheduling:

Index: i386.md
===================================================================
--- i386.md	(revision 143890)
+++ i386.md	(working copy)
@@ -67,6 +67,7 @@
    (UNSPEC_DEF_CFA		15)
    (UNSPEC_SET_RIP		16)
    (UNSPEC_SET_GOT_OFFSET	17)
+   (UNSPEC_REDZONE_BLOCKAGE	18)
 
    ; TLS support
    (UNSPEC_TP			18)
@@ -14885,6 +14886,14 @@
   ""
   [(set_attr "length" "0")])
 
+(define_insn "redzone_blockage"
+  [(set (match_operand 0 "" "")
+	(unspec [(match_operand 1 "" "")] UNSPEC_REDZONE_BLOCKAGE))]
+  ""
+  ""
+  [(set_attr "length" "0")])
+
+
 ;; Insn emitted into the body of a function to return from a function.
 ;; This is only done if the function's epilogue is known to be simple.
 ;; See comments for ix86_can_use_return_insn_p in i386.c.
Index: i386.c
===================================================================
--- i386.c	(revision 143890)
+++ i386.c	(working copy)
@@ -6492,6 +6492,10 @@
       emit_insn (gen_blockage ());
     }
 
+  if (frame_pointer_needed && frame.red_zone_size)
+    emit_insn (gen_redzone_blockage (hard_frame_pointer_rtx,
+				     stack_pointer_rtx));
+
   /* Emit cld instruction if stringops are used in the function.  */
   if (TARGET_CLD && ix86_current_function_needs_cld)
     emit_insn (gen_cld ());
Comment 6 H.J. Lu 2009-02-06 16:44:41 UTC
Gcc 4.1 -fno-omit-frame-pointer -O2 generates

---
adler32:
	pushq	%rbp
	movq	%rdi, %rax
	andl	$65535, %edi
	shrq	$16, %rax
	movq	%rsp, %rbp
	pushq	%r15
	andl	$65535, %eax
	pushq	%r14
	pushq	%r13
	pushq	%r12
	pushq	%rbx
	subq	$16, %rsp
	movl	%edx, -148(%rbp)
---

Is this a regression?
Comment 7 Ian Lance Taylor 2009-02-06 17:25:13 UTC
Yes, it's a regression for 4.3 relative to 4.2 for this test case.  I don't know if it is a general regression.
Comment 8 Uroš Bizjak 2009-02-08 21:00:58 UTC
We should plug this for 4.3 and 4.4, so due to comment 6 I think this is a regression.
Comment 9 uros 2009-02-10 16:12:55 UTC
Subject: Bug 39118

Author: uros
Date: Tue Feb 10 16:12:33 2009
New Revision: 144063

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=144063
Log:
	PR target/39118
	* config/i386/i386.c (expand_prologue): Emit blockage at the end
	of function prologue when frame pointer is used to access
	red zone area.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c

Comment 10 uros 2009-02-10 16:13:06 UTC
Subject: Bug 39118

Author: uros
Date: Tue Feb 10 16:12:47 2009
New Revision: 144064

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=144064
Log:
	PR target/39118
	* config/i386/i386.c (expand_prologue): Emit blockage at the end
	of function prologue when frame pointer is used to access
	red zone area.


Modified:
    branches/gcc-4_3-branch/gcc/ChangeLog
    branches/gcc-4_3-branch/gcc/config/i386/i386.c

Comment 11 Uroš Bizjak 2009-02-10 16:14:04 UTC
Fixed.
Comment 12 Ian Lance Taylor 2009-02-10 19:56:15 UTC
I didn't get around to commenting on the patch; I'll just note that it is conservative.  We don't have to block every instruction, just those which use memory.

Do we have to worry about the function epilogue?  I don't see any reason why the scheduler would move any of the pop instructions ahead of something which references off of %rbp.  But I also don't see anything which explicitly prevents that from happening.
Comment 13 Ian Lance Taylor 2009-02-10 21:03:50 UTC
Created attachment 17278 [details]
Test case

This test case is from Mark Heffernan.  When compiling with -O2 -fno-omit-frame-pointer with gcc 4.3, it shows a red zone violation in the epilogue:

	movl	-160(%rbp), %eax
	popq	%r15
	leave
	ret

It does work with mainline, though.
Comment 14 Ian Lance Taylor 2009-02-10 21:10:46 UTC
I've verified that the problem in the epilogue occurs using the current 4.3 sources, so reopening the bug report.  (There is no longer any problem in the prologue.)
Comment 15 Uroš Bizjak 2009-02-11 07:09:12 UTC
(In reply to comment #12)
> I didn't get around to commenting on the patch; I'll just note that it is
> conservative.  We don't have to block every instruction, just those which use
> memory.

True, but as described in the gcc-patches@ mailing list, this situation should not happen often. OTOTH, if there is some standard way to insert memory blockage no-ops, we can use these instead of unspec_volatile.

> Do we have to worry about the function epilogue?  I don't see any reason why
> the scheduler would move any of the pop instructions ahead of something which
> references off of %rbp.  But I also don't see anything which explicitly
> prevents that from happening.

We should also add blockage there.

Comment 16 Uroš Bizjak 2009-02-11 08:24:28 UTC
(In reply to comment #13)
> This test case is from Mark Heffernan.  When compiling with -O2
> -fno-omit-frame-pointer with gcc 4.3, it shows a red zone violation in the
> epilogue:
> 
>         movl    -160(%rbp), %eax
>         popq    %r15
>         leave
>         ret

This does not show redzone violation, but following does:

	popq	%rbx
	popq	%r12
	popq	%r13
	popq	%r14
	movl	-160(%rbp), %eax
	popq	%r15
	leave
	ret

Anyway, I have a patch.
Comment 17 Uroš Bizjak 2009-02-11 08:32:42 UTC
Created attachment 17280 [details]
Patch

This patch inserts _memory_ blockages at the end of function prologue and at the beginning of function epilogue.
Comment 18 Jakub Jelinek 2009-02-11 11:00:15 UTC
Created attachment 17281 [details]
gcc44-pr39118.patch

Patch I've bootstrapped/regtested on x86_64-linux.
Comment 19 uros 2009-02-11 11:43:40 UTC
Subject: Bug 39118

Author: uros
Date: Wed Feb 11 11:43:24 2009
New Revision: 144100

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=144100
Log:
	PR target/39118
	* config/i386/i386.md (UNSPEC_MEMORY_BLOCKAGE): New constant.
	(memory_blockage): New expander.
	(*memory_blockage): New insn pattern.
	* config/i386/i386.c (ix86_expand_prologue): Use memory_blockage
	instead of general blockage at the end of function prologue when
	frame pointer is used to access red zone area.  Do not emit blockage
	when profiling, it is emitted in generic code.
	(ix86_expand_epilogue): Emit memory_blockage at the beginning of
	function epilogue when frame pointer is used to access red zone area.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/i386.md

Comment 20 uros 2009-02-11 11:53:59 UTC
Subject: Bug 39118

Author: uros
Date: Wed Feb 11 11:53:47 2009
New Revision: 144101

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=144101
Log:
	PR target/39118
	* config/i386/i386.md (UNSPEC_MEMORY_BLOCKAGE): New constant.
	(memory_blockage): New expander.
	(*memory_blockage): New insn pattern.
	* config/i386/i386.c (ix86_expand_prologue): Use memory_blockage
	instead of general blockage at the end of function prologue when
	frame pointer is used to access red zone area.  Do not emit blockage
	when profiling, it is emitted in generic code.
	(ix86_expand_epilogue): Emit memory_blockage at the beginning of
	function epilogue when frame pointer is used to access red zone area.


Modified:
    branches/gcc-4_3-branch/gcc/ChangeLog
    branches/gcc-4_3-branch/gcc/config/i386/i386.c
    branches/gcc-4_3-branch/gcc/config/i386/i386.md

Comment 21 Uroš Bizjak 2009-02-11 11:55:22 UTC
Fixed again.
Comment 22 Ian Lance Taylor 2009-02-11 14:49:37 UTC
Thanks.
Comment 23 Richard Biener 2009-02-17 18:40:01 UTC
This may have caused bootstrap memory requirements for insn-recog to go up
significantly enough to cause

cc1: out of memory allocating 4064 bytes after a total of 660041728 bytes
make[3]: *** [insn-recog.o] Error 1

for Andreas.
Comment 24 H.J. Lu 2009-02-17 18:45:23 UTC
(In reply to comment #23)
> This may have caused bootstrap memory requirements for insn-recog to go up
> significantly enough to cause
> 
> cc1: out of memory allocating 4064 bytes after a total of 660041728 bytes
> make[3]: *** [insn-recog.o] Error 1
> 

FWIW, I have no problems with gcc 4.4 revision 144236 on Linux/ia32 with
1GB RAM/2GB swap as well as Linux/Intel64 with 4GB RAM/2GB swap.

Comment 25 rguenther@suse.de 2009-02-17 19:00:22 UTC
Subject: Re:  [4.3/4.4 Regression] x86_64 red zone
 violation

On Tue, 17 Feb 2009, hjl dot tools at gmail dot com wrote:

> ------- Comment #24 from hjl dot tools at gmail dot com  2009-02-17 18:45 -------
> (In reply to comment #23)
> > This may have caused bootstrap memory requirements for insn-recog to go up
> > significantly enough to cause
> > 
> > cc1: out of memory allocating 4064 bytes after a total of 660041728 bytes
> > make[3]: *** [insn-recog.o] Error 1
> > 
> 
> FWIW, I have no problems with gcc 4.4 revision 144236 on Linux/ia32 with
> 1GB RAM/2GB swap as well as Linux/Intel64 with 4GB RAM/2GB swap.

The above bails out at ~650MB, so I guess his machine has way less than
1GB ram.

Richard.
Comment 26 rguenther@suse.de 2009-02-17 19:50:13 UTC
Subject: Re:  [4.3/4.4 Regression] x86_64 red zone
 violation

On Tue, 17 Feb 2009, Richard Guenther wrote:

> On Tue, 17 Feb 2009, hjl dot tools at gmail dot com wrote:
> 
> > ------- Comment #24 from hjl dot tools at gmail dot com  2009-02-17 18:45 -------
> > (In reply to comment #23)
> > > This may have caused bootstrap memory requirements for insn-recog to go up
> > > significantly enough to cause
> > > 
> > > cc1: out of memory allocating 4064 bytes after a total of 660041728 bytes
> > > make[3]: *** [insn-recog.o] Error 1
> > > 
> > 
> > FWIW, I have no problems with gcc 4.4 revision 144236 on Linux/ia32 with
> > 1GB RAM/2GB swap as well as Linux/Intel64 with 4GB RAM/2GB swap.
> 
> The above bails out at ~650MB, so I guess his machine has way less than
> 1GB ram.

I can reproduce it with ulimit -v 500000 and --enable-checking=yes,rtl
--disable-multilib --enable-languages=c.

Richard.