When following testcase is compiled with g++ -O2 -mtune=pentium4 for a cygwin or mingw target, the executable segfaults. When tuning switch is set to pentium3 or lower, the executable works correctly. This is observed on 3.4.0 and on trunk, but not on gcc-3.3.3 /* pm.cpp */ /* Testcase submitted by Hans Horn to mingw bug tracker */ #include <cstring> #include <fstream> using namespace std; ostream* logfile; //char expList[20000]; int main () { logfile = new ofstream("bar", ios::out); char expList[20000]; strcpy(expList, "foo"); return 0; } /* end pm.cpp */ The problem appears to be in scheduling pass moving sjlj exception handling insns across call to the allocate stack probe worker (_alloca) in the function prologue. Here is the assembler output for main() prologue from g++ -S -O2 -mtune=pentium4 pm.cpp _main: pushl %ebp movl $20092, %eax movl %esp, %ebp pushl %edi pushl %esi movl $___gxx_personality_sj0, -20052(%ebp) <<<< movl $LLSDA1484, -20048(%ebp) <<<< pushl %ebx movl $L4, -20040(%ebp) <<<< call __alloca movl $16, %eax andl $-16, %esp call __alloca leal -24(%ebp), %eax movl %eax, -20044(%ebp) leal -20076(%ebp), %eax movl %esp, -20036(%ebp) movl %eax, (%esp) call __Unwind_SjLj_Register call ___main <snip> The following patch fixes by restoring the gen_blockage barrier in i386.c:ix86_expand_prologue that was removed at revision 1.608 http://gcc.gnu.org/ml/gcc-cvs/2003-10/msg00685.html The patch was tested on i386-pc-mingw32 on 3.4.1 and trunk. * config/i386/i386.c (ix86_expand_prologue): Ensure that scheduling pass does not move insns across __alloca call Index: i386.c =================================================================== RCS file: /cvs/gcc/gcc/gcc/config/i386/i386.c,v retrieving revision 1.668 diff -c -3 -p -r1.668 i386.c *** i386.c 17 May 2004 15:23:12 -0000 1.668 --- i386.c 20 May 2004 07:30:16 -0000 *************** ix86_expand_prologue (void) *** 5295,5300 **** --- 5295,5304 ---- rtx t = plus_constant (stack_pointer_rtx, allocate); emit_move_insn (eax, gen_rtx_MEM (SImode, t)); } + + /* Ensure that scheduling pass does not move insns across __alloca + call. */ + emit_insn (gen_blockage (const0_rtx)); } if (frame.save_regs_using_mov && !TARGET_RED_ZONE)
Kelley -- It looks like your patch is causing a problem. Would you please look into this? -- Mark
I'll confirm that this is a new regression and that it is also present on Cygwin. Furthermore I'll state that though installing a blocker insn is basically a kludge, fixing the C++ sjlj 'feature' that fooled the compiler into thinking that it was perfectly OK to write to memory that wasn't yet allocated is not nearly as simple nor would it be as target specific. So I would say go for putting the kludge back :) I'll also confirm that Danny's patch fixes the same error under Cygwin.
Created attachment 6488 [details] A potential more correct fix After deciphering the twisty passages. I think this is a more correct fix. The problem was not really a scheduling problem, it was that the Win32 alloca routine was marking that it clobbered memory, so gcc felt free to move stuff above the call to __alloca. I still need to regtest it (as that is REALLY slow on cygwin), but it survived a c,c++ bootstrap on i686-pc-cygwin and fixed the bug (writing to unalloca memory) as evidenced by below: $ diff -u pm-gcc_35_orig.s pm-p4new.s --- pm-gcc_35_orig.s 2004-06-01 14:25:47.170950500 -0400 +++ pm-p4new.s 2004-06-07 15:26:04.193903000 -0400 @@ -23,26 +23,26 @@ movl %esp, %ebp pushl %edi pushl %esi - movl $___gxx_personality_sj0, -20052(%ebp) - movl $LLSDA1422, -20048(%ebp) pushl %ebx - movl $L4, -20040(%ebp) call __alloca movl $16, %eax andl $-16, %esp call __alloca leal -24(%ebp), %eax + movl $___gxx_personality_sj0, -20052(%ebp) movl %eax, -20044(%ebp) leal -20076(%ebp), %eax - movl %esp, -20036(%ebp) + movl $LLSDA1422, -20048(%ebp) movl %eax, (%esp) + movl $L4, -20040(%ebp) + movl %esp, -20036(%ebp) call __Unwind_SjLj_Register call ___main movl $256, (%esp) movl $-1, -20072(%ebp) call __Znwj - movl %eax, -20080(%ebp) movl $16, 8(%esp) + movl %eax, -20080(%ebp) movl $LC0, 4(%esp) movl %eax, (%esp) movl $1, -20072(%ebp) This patch was against mainline, OK if it passes a complete regtest? The patch also applies to 3.4 and I'll also regtest it there. Mark, OK for 3.4.1 assuming that passes?
Subject: Re: [3.4/3.5? Regression] -mtune=pentium4 -O2 I previously wrote > The patch also applies to 3.4 and passed an all languages bootstrap; > I'll also regtest it there. > > Mark, after a week on mainline, would this be OK for 3.4.1 assuming that > it also passes? > > 2004-06-07 Kelley Cook <kcook@gcc.gnu.org> > > * config/i386/i386.c (ix86_expand_prologue): Mark win32 allocate insn > as prologue. > * config/i386/i386.md (allocate_stack): Clobber alloca memory. > (allocate_stack_worker_postreload): Likewise. > (allocate_stack_worker_rex64): Likewise. > (allocate_stack_worker_rex64_postreload): Likewise. > (allocate_stack_worker_1): Likewise. For what it is worth, on both mainline and 3.4.1 this patch, under pentium4-pc-cygwin, created no new regressions. Moreover, it fixed one of them: g77.f-torture/execute/labug1.f execution OK to install?
Kelley -- If this patch is approved for mainline, it's OK for 3.4.1. Perhaps Richard Henderson can review it for you? -- Mark
Subject: Re: [3.4/3.5? Regression] -mtune=pentium4 -O2 On Thu, Jun 10, 2004 at 08:47:21AM -0400, Kelley Cook wrote: >I previously wrote >>The patch also applies to 3.4 and passed an all languages bootstrap; >>I'll also regtest it there. >> >>Mark, after a week on mainline, would this be OK for 3.4.1 assuming that >>it also passes? >> >>2004-06-07 Kelley Cook <kcook@gcc.gnu.org> >> >> * config/i386/i386.c (ix86_expand_prologue): Mark win32 allocate insn >> as prologue. >> * config/i386/i386.md (allocate_stack): Clobber alloca memory. >> (allocate_stack_worker_postreload): Likewise. >> (allocate_stack_worker_rex64): Likewise. >> (allocate_stack_worker_rex64_postreload): Likewise. >> (allocate_stack_worker_1): Likewise. > >For what it is worth, on both mainline and 3.4.1 this patch, under >pentium4-pc-cygwin, created no new regressions. Moreover, it fixed one of >them: > >g77.f-torture/execute/labug1.f execution > >OK to install? This doesn't seem to have been added yet. I think this particular problem has been kicked around quite a bit in various incarnations. I believe that this was code that Richard Henderson added not so long ago. Maybe it would be a good idea to have him verify that it is correct? cgf
Richard, there's a request for a review from you on this patch.
The revised patch isn't correct. The two correct changes I can think of are (1) the gen_blockage as already written, or (2) change the alloc_stack_worker unspec to be an unspec_volatile.
Created attachment 6593 [details] New patch using RTH's suggestion I followed RTH's suggestion (2). It also fixes the reported regression and fixes g77.f-torture/execute/labug1.f (3.4 only since it is a G77 testcase). If full regression passes, is it OK?
Patch is ok.
Postponed until GCC 3.4.2.
Subject: Bug 15551 CVSROOT: /cvs/gcc Module name: gcc Changes by: kcook@gcc.gnu.org 2004-06-21 22:52:04 Modified files: gcc : ChangeLog gcc/config/i386: i386.md Log message: 2004-06-21 Kelley Cook <kcook@gcc.gnu.org> PR target/15551 * config/i386/i386.md: Change UNSPEC_STACK_PROBE to UNSPECV_STACK_PROBE. (allocate_stack_worker): Make unspec_volatile. (allocate_stack_worker_rex64): Likewise. (allocate_stack_worker_postreload): Likewise. (allocate_stack_worker_rex64_postreload): Likewise. Patches: http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.4067&r2=2.4068 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/i386.md.diff?cvsroot=gcc&r1=1.544&r2=1.545
Subject: Bug 15551 CVSROOT: /cvs/gcc Module name: gcc Branch: gcc-3_4-branch Changes by: kcook@gcc.gnu.org 2004-06-22 21:47:47 Modified files: gcc : ChangeLog gcc/testsuite : ChangeLog gcc/config/i386: i386.md Added files: gcc/testsuite/g++.dg/opt: pr15551.c Log message: 2004-06-22 Kelley Cook <kcook@gcc.gnu.org> PR target/15551 * config/i386/i386.md: Change UNSPEC_STACK_PROBE to UNSPECV_STACK_PROBE. (allocate_stack_worker): Make unspec_volatile. (allocate_stack_worker_rex64): Likewise. (allocate_stack_worker_postreload): Likewise. (allocate_stack_worker_rex64_postreload): Likewise. [testsuite] 2004-06-22 Kelley Cook <kcook@gcc.gnu.org> * g++.dg/opt/pr15551.C: New testcase. Patches: http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=2.2326.2.518&r2=2.2326.2.519 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.3389.2.213&r2=1.3389.2.214 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/i386.md.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.502.2.4&r2=1.502.2.5 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/g++.dg/opt/pr15551.c.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=NONE&r2=1.1.2.1
Fixed.