Bug 15551 - [3.4/3.5? Regression] -mtune=pentium4 -O2 with sjlj EH breaks stack probe worker on windows32 targets
Summary: [3.4/3.5? Regression] -mtune=pentium4 -O2 with sjlj EH breaks stack probe wo...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 3.4.0
: P2 normal
Target Milestone: 3.4.1
Assignee: Kelley Cook
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2004-05-20 07:58 UTC by Danny Smith
Modified: 2004-06-22 22:16 UTC (History)
3 users (show)

See Also:
Host: i686-pc-mingw32
Target: i686-pc-mingw32
Build: i686-pc-mingw32
Known to work:
Known to fail:
Last reconfirmed: 2004-06-01 20:44:27


Attachments
A potential more correct fix (804 bytes, patch)
2004-06-07 20:00 UTC, Kelley Cook
Details | Diff
New patch using RTH's suggestion (669 bytes, patch)
2004-06-21 16:54 UTC, Kelley Cook
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Danny Smith 2004-05-20 07:58:04 UTC
When following testcase is compiled with

g++ -O2 -mtune=pentium4

for a cygwin or mingw target, the executable segfaults.

When tuning switch is set to pentium3 or lower,
the executable works correctly.

This is observed on 3.4.0 and on trunk, but not on gcc-3.3.3

/* pm.cpp */
/* Testcase submitted by Hans Horn to mingw bug tracker */

#include <cstring>
#include <fstream>
using namespace std;

ostream* logfile;

//char expList[20000];

int main () {

  logfile = new ofstream("bar", ios::out);

  char expList[20000];
  strcpy(expList, "foo");

  return 0;
}

/* end pm.cpp */



The problem appears to be in scheduling pass moving
sjlj exception handling insns across call to the allocate
stack probe worker (_alloca) in the function prologue.

Here is the assembler output for main() prologue from
g++ -S -O2 -mtune=pentium4 pm.cpp

_main:
	pushl	%ebp
	movl	$20092, %eax
	movl	%esp, %ebp
	pushl	%edi
	pushl	%esi
	movl	$___gxx_personality_sj0, -20052(%ebp) <<<<
	movl	$LLSDA1484, -20048(%ebp)              <<<< 
	pushl	%ebx                                     
	movl	$L4, -20040(%ebp)                     <<<< 
	call	__alloca
	movl	$16, %eax
	andl	$-16, %esp
	call	__alloca
	leal	-24(%ebp), %eax
	movl	%eax, -20044(%ebp)
	leal	-20076(%ebp), %eax
	movl	%esp, -20036(%ebp)
	movl	%eax, (%esp)
	call	__Unwind_SjLj_Register
	call	___main
<snip>

The following patch fixes by restoring the gen_blockage
barrier in i386.c:ix86_expand_prologue that was removed at revision 1.608

http://gcc.gnu.org/ml/gcc-cvs/2003-10/msg00685.html


The patch was tested on i386-pc-mingw32 on 3.4.1 and trunk.

	* config/i386/i386.c (ix86_expand_prologue):  Ensure that scheduling 
pass does
	not move insns across __alloca call


Index: i386.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/i386/i386.c,v
retrieving revision 1.668
diff -c -3 -p -r1.668 i386.c
*** i386.c	17 May 2004 15:23:12 -0000	1.668
--- i386.c	20 May 2004 07:30:16 -0000
*************** ix86_expand_prologue (void)
*** 5295,5300 ****
--- 5295,5304 ----
  	  rtx t = plus_constant (stack_pointer_rtx, allocate);
  	  emit_move_insn (eax, gen_rtx_MEM (SImode, t));
  	}
+ 
+       /* Ensure that scheduling pass does not move insns across __alloca
+          call.  */
+       emit_insn (gen_blockage (const0_rtx));
      }
  
    if (frame.save_regs_using_mov && !TARGET_RED_ZONE)
Comment 1 Mark Mitchell 2004-05-28 22:08:02 UTC
Kelley --

It looks like your patch is causing a problem.  Would you please look into this?

-- Mark
Comment 2 Kelley Cook 2004-06-01 20:44:24 UTC
I'll confirm that this is a new regression and that it is also present on Cygwin.

Furthermore I'll state that though installing a blocker insn is basically a
kludge, fixing the C++ sjlj 'feature' that fooled the compiler into thinking
that it was perfectly OK to write to memory that wasn't yet allocated is not
nearly as simple nor would it be as target specific.

So I would say go for putting the kludge back :)

I'll also confirm that Danny's patch fixes the same error under Cygwin.
Comment 3 Kelley Cook 2004-06-07 20:00:52 UTC
Created attachment 6488 [details]
A potential more correct fix

After deciphering the twisty passages.	I think this is a more correct fix. 
The problem was not really a scheduling problem, it was that the Win32 alloca
routine was marking that it clobbered memory, so gcc felt free to move stuff
above the call to __alloca.

I still need to regtest it (as that is REALLY slow on cygwin), but it survived
a c,c++ bootstrap on i686-pc-cygwin and fixed the bug (writing to unalloca
memory) as evidenced by below:

$ diff -u pm-gcc_35_orig.s pm-p4new.s
--- pm-gcc_35_orig.s	2004-06-01 14:25:47.170950500 -0400
+++ pm-p4new.s	2004-06-07 15:26:04.193903000 -0400
@@ -23,26 +23,26 @@
	movl	%esp, %ebp
	pushl	%edi
	pushl	%esi
-	movl	$___gxx_personality_sj0, -20052(%ebp)
-	movl	$LLSDA1422, -20048(%ebp)
	pushl	%ebx
-	movl	$L4, -20040(%ebp)
	call	__alloca
	movl	$16, %eax
	andl	$-16, %esp
	call	__alloca
	leal	-24(%ebp), %eax
+	movl	$___gxx_personality_sj0, -20052(%ebp)
	movl	%eax, -20044(%ebp)
	leal	-20076(%ebp), %eax
-	movl	%esp, -20036(%ebp)
+	movl	$LLSDA1422, -20048(%ebp)
	movl	%eax, (%esp)
+	movl	$L4, -20040(%ebp)
+	movl	%esp, -20036(%ebp)
	call	__Unwind_SjLj_Register
	call	___main
	movl	$256, (%esp)
	movl	$-1, -20072(%ebp)
	call	__Znwj
-	movl	%eax, -20080(%ebp)
	movl	$16, 8(%esp)
+	movl	%eax, -20080(%ebp)
	movl	$LC0, 4(%esp)
	movl	%eax, (%esp)
	movl	$1, -20072(%ebp)

This patch was against mainline, OK if it passes a complete regtest?

The patch also applies to 3.4 and I'll also regtest it there.
Mark, OK for 3.4.1 assuming that passes?
Comment 4 Kelley Cook 2004-06-10 12:47:41 UTC
Subject: Re:  [3.4/3.5? Regression] -mtune=pentium4  -O2

I previously wrote
> The patch also applies to 3.4 and passed an all languages bootstrap; 
> I'll also regtest it there.
> 
> Mark, after a week on mainline, would this be OK for 3.4.1 assuming that 
> it also passes?
> 
> 2004-06-07  Kelley Cook  <kcook@gcc.gnu.org>
> 
> 	* config/i386/i386.c (ix86_expand_prologue): Mark win32 allocate insn
> 	as prologue.
> 	* config/i386/i386.md (allocate_stack): Clobber alloca memory.
> 	(allocate_stack_worker_postreload): Likewise.
> 	(allocate_stack_worker_rex64): Likewise.
> 	(allocate_stack_worker_rex64_postreload): Likewise.
> 	(allocate_stack_worker_1): Likewise.

For what it is worth, on both mainline and 3.4.1 this patch, under 
pentium4-pc-cygwin, created no new regressions. Moreover, it fixed one of them:

g77.f-torture/execute/labug1.f execution

OK to install?


Comment 5 Mark Mitchell 2004-06-10 13:43:03 UTC
Kelley --

If this patch is approved for mainline, it's OK for 3.4.1.  Perhaps Richard
Henderson can review it for you?

-- Mark
Comment 6 cgf@gcc.gnu.org 2004-06-15 17:42:11 UTC
Subject: Re:  [3.4/3.5? Regression] -mtune=pentium4  -O2

On Thu, Jun 10, 2004 at 08:47:21AM -0400, Kelley Cook wrote:
>I previously wrote
>>The patch also applies to 3.4 and passed an all languages bootstrap; 
>>I'll also regtest it there.
>>
>>Mark, after a week on mainline, would this be OK for 3.4.1 assuming that 
>>it also passes?
>>
>>2004-06-07  Kelley Cook  <kcook@gcc.gnu.org>
>>
>>	* config/i386/i386.c (ix86_expand_prologue): Mark win32 allocate insn
>>	as prologue.
>>	* config/i386/i386.md (allocate_stack): Clobber alloca memory.
>>	(allocate_stack_worker_postreload): Likewise.
>>	(allocate_stack_worker_rex64): Likewise.
>>	(allocate_stack_worker_rex64_postreload): Likewise.
>>	(allocate_stack_worker_1): Likewise.
>
>For what it is worth, on both mainline and 3.4.1 this patch, under 
>pentium4-pc-cygwin, created no new regressions. Moreover, it fixed one of 
>them:
>
>g77.f-torture/execute/labug1.f execution
>
>OK to install?

This doesn't seem to have been added yet.

I think this particular problem has been kicked around quite a bit in
various incarnations.  I believe that this was code that Richard
Henderson added not so long ago.  Maybe it would be a good idea to have
him verify that it is correct?

cgf
Comment 7 Mark Mitchell 2004-06-19 17:47:44 UTC
Richard, there's a request for a review from you on this patch.
Comment 8 Richard Henderson 2004-06-19 19:18:20 UTC
The revised patch isn't correct.  The two correct changes I can think of are
(1) the gen_blockage as already written, or (2) change the alloc_stack_worker
unspec to be an unspec_volatile.
Comment 9 Kelley Cook 2004-06-21 16:54:31 UTC
Created attachment 6593 [details]
New patch using RTH's suggestion

I followed RTH's suggestion (2).  It also fixes the reported regression and
fixes g77.f-torture/execute/labug1.f (3.4 only since it is a G77 testcase).

If full regression passes, is it OK?
Comment 10 Richard Henderson 2004-06-21 18:38:40 UTC
Patch is ok.
Comment 11 Mark Mitchell 2004-06-21 21:26:20 UTC
Postponed until GCC 3.4.2.
Comment 12 CVS Commits 2004-06-21 22:52:08 UTC
Subject: Bug 15551

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	kcook@gcc.gnu.org	2004-06-21 22:52:04

Modified files:
	gcc            : ChangeLog 
	gcc/config/i386: i386.md 

Log message:
	2004-06-21  Kelley Cook  <kcook@gcc.gnu.org>
	
	PR target/15551
	* config/i386/i386.md: Change UNSPEC_STACK_PROBE to UNSPECV_STACK_PROBE.
	(allocate_stack_worker): Make unspec_volatile.
	(allocate_stack_worker_rex64):  Likewise.
	(allocate_stack_worker_postreload):  Likewise.
	(allocate_stack_worker_rex64_postreload):  Likewise.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.4067&r2=2.4068
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/i386.md.diff?cvsroot=gcc&r1=1.544&r2=1.545

Comment 13 CVS Commits 2004-06-22 21:47:51 UTC
Subject: Bug 15551

CVSROOT:	/cvs/gcc
Module name:	gcc
Branch: 	gcc-3_4-branch
Changes by:	kcook@gcc.gnu.org	2004-06-22 21:47:47

Modified files:
	gcc            : ChangeLog 
	gcc/testsuite  : ChangeLog 
	gcc/config/i386: i386.md 
Added files:
	gcc/testsuite/g++.dg/opt: pr15551.c 

Log message:
	2004-06-22 Kelley Cook <kcook@gcc.gnu.org>
	
	PR target/15551
	* config/i386/i386.md: Change UNSPEC_STACK_PROBE to UNSPECV_STACK_PROBE.
	(allocate_stack_worker): Make unspec_volatile.
	(allocate_stack_worker_rex64):  Likewise.
	(allocate_stack_worker_postreload):  Likewise.
	(allocate_stack_worker_rex64_postreload):  Likewise.
	
	[testsuite]
	2004-06-22 Kelley Cook <kcook@gcc.gnu.org>
	
	* g++.dg/opt/pr15551.C: New testcase.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=2.2326.2.518&r2=2.2326.2.519
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.3389.2.213&r2=1.3389.2.214
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/i386.md.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.502.2.4&r2=1.502.2.5
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/g++.dg/opt/pr15551.c.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=NONE&r2=1.1.2.1

Comment 14 Andrew Pinski 2004-06-22 22:16:07 UTC
Fixed.