Bug 38130 - [4.4 regression]__builtin_alloca (vs IRA?) testsuite failures on mingw32
Summary: [4.4 regression]__builtin_alloca (vs IRA?) testsuite failures on mingw32
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 4.4.0
: P1 normal
Target Milestone: 4.4.0
Assignee: Jakub Jelinek
URL:
Keywords: ra, wrong-code
Depends on:
Blocks:
 
Reported: 2008-11-15 09:26 UTC by Danny Smith
Modified: 2008-11-18 12:37 UTC (History)
4 users (show)

See Also:
Host: i686-pc-mingw32
Target: i686-pc-mingw32
Build: i686-pc-mingw32
Known to work:
Known to fail:
Last reconfirmed:


Attachments
gcc44-pr38130.patch (715 bytes, patch)
2008-11-17 22:12 UTC, Jakub Jelinek
Details | Diff
gcc44-pr38130.patch (854 bytes, patch)
2008-11-17 22:28 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Danny Smith 2008-11-15 09:26:25 UTC
Execution testcases c-torture/execute/920929-1.c and
c-torture/execute/built-in-setjmp.c began failing on mingw32 in late August, coincident with merge of IRA into trunk. 

In both cases, the execution tests pass if -fno-ira is added to command line.

The problem appears to be in the call of the target stack-probing code
(__chkstk) in cygwin.asm from allocate_stack_worker instruction. __chkstk has
unusual calling convention, with the input argument as well as the
output passed in eax.

From i386.md:

(define_insn "allocate_stack_worker_32"
  [(set (match_operand:SI 0 "register_operand" "+a")
	(unspec_volatile:SI [(match_dup 0)] UNSPECV_STACK_PROBE))
   (set (reg:SI SP_REG) (minus:SI (reg:SI SP_REG) (match_dup 0)))
   (clobber (reg:CC FLAGS_REG))]
  "!TARGET_64BIT && TARGET_STACK_PROBE"
  "call\t___chkstk"
  [(set_attr "type" "multi")
   (set_attr "length" "5")])


The relevant part of the output of 
gcc  -S  -O -funroll-all-loops  920929-1.c -o 920929-1-IRA.s:

f:
	pushl	%ebp
	movl	%esp, %ebp
	pushl	%ebx
	subl	$4, %esp
	movl	8(%ebp), %edx
	call	___chkstk
	leal	15(%esp), %ecx
	andl	$-16, %ecx
	testl	%edx, %edx
        ...

__chkstk allocates only 1 byte and the code segfault on the first
attempt to assign a double to the allocated array.


The output, with -fno-ira 
gcc  -S  -O -funroll-all-loops -fno-ira 920929-1.c -o 920929-1-NOIRA.s:
_f:
	pushl	%ebp
	movl	%esp, %ebp
	pushl	%ebx
	subl	$4, %esp
	movl	8(%ebp), %ebx
	leal	30(,%ebx,8), %eax
	andl	$-16, %eax
	call	___chkstk
	leal	15(%esp), %eax
	movl	%eax, %edx
	andl	$-16, %edx
	testl	%ebx, %ebx

__chkstk allocates 816 bytes


A comparison of built-in-setjmp.c with amd without -fno-ira switch also shows
incorrect input to __chkstk

Probably related to these failure is miscompilation of the C++ compiler
code cp/pt.c, which segfaults following call to alloca in
process_partial_specialization when building libstdc++. If cp/pt.c is
compiled with -fno-ira, libstdc++ builds successfully.

Danny
Comment 1 Kai Tietz 2008-11-16 09:03:50 UTC
This problem exists also for x86_64 based mingw.
Comment 2 Jakub Jelinek 2008-11-17 09:21:27 UTC
Web pass changes:
(insn 14 13 16 2 920929-1.c:5 (parallel [
            (set (reg:SI 70)
                (unspec_volatile:SI [
                        (reg:SI 70)
                    ] 1))
            (set (reg/f:SI 7 sp)
                (minus:SI (reg/f:SI 7 sp)
                    (reg:SI 70)))
            (clobber (reg:CC 17 flags))
        ]) 672 {allocate_stack_worker_32} (nil))
into:
(insn 14 13 16 2 920929-1.c:5 (parallel [
            (set (reg:SI 80)
                (unspec_volatile:SI [
                        (reg:SI 70)
                    ] 1))
            (set (reg/f:SI 7 sp)
                (minus:SI (reg/f:SI 7 sp)
                    (reg:SI 70)))
            (clobber (reg:CC 17 flags))
        ]) 672 {allocate_stack_worker_32} (nil))
Is this correct (given that the pattern uses match_dup)?  Also, is there a reason why the allocate_stack_worker_{32,64} patterns use match_dup for between the output register and input register?  I'd say using
(match_operand:SI 0 "register_operand" "=a") for the output reg and
(match_operand:SI 1 "register_operand" "a") for the input reg (match_dup 1)ed into the sp adjustment would give the RA more freedom.
Comment 3 Jakub Jelinek 2008-11-17 22:12:29 UTC
Created attachment 16712 [details]
gcc44-pr38130.patch

This patch fixes the testcase from quick look at generated assembly.
Could anybody please bootstrap/regtest it on mingw?
Comment 4 Jakub Jelinek 2008-11-17 22:28:35 UTC
Created attachment 16713 [details]
gcc44-pr38130.patch

I've talked about this with Honza on IRC:
<honza> I would just use "0" constraint. At least in old times with regmove, regmove was able to discover that it should use same pseudo for input and output.  If you use pair of "=a" and "a" only reload sees this fact
<honza> so patch is OK with that change.
This is an updated patch which uses "0" instead of "a" constraint for operand 1.
So, can anyone please test it?  Thanks.
Comment 5 Danny Smith 2008-11-18 05:55:45 UTC
(In reply to comment #4)
> Created an attachment (id=16713) [edit]
> gcc44-pr38130.patch
> 
> I've talked about this with Honza on IRC:
> <honza> I would just use "0" constraint. At least in old times with regmove,
> regmove was able to discover that it should use same pseudo for input and
> output.  If you use pair of "=a" and "a" only reload sees this fact
> <honza> so patch is OK with that change.
> This is an updated patch which uses "0" instead of "a" constraint for operand
> 1.
> So, can anyone please test it?  Thanks.
> 

This patch  bootstraps on mingw32 (c,c++,gfortran) successfully,  libstcd++ now compiles and the two testcases reported as failing now pass.

A full regtest is in progress and so far looks good.

Thanks

Comment 6 Jakub Jelinek 2008-11-18 12:34:59 UTC
Subject: Bug 38130

Author: jakub
Date: Tue Nov 18 12:33:38 2008
New Revision: 141965

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=141965
Log:
	PR target/38130
	* config/i386/i386.md (allocate_stack_worker_32,
	allocate_stack_worker_64): Don't use match_dup between input and
	output operand.
	(allocate_stack): Adjust gen_stack_worker_{32,64} caller.
	* config/i386/i386.c (ix86_expand_prologue): Likewise.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/i386.md

Comment 7 Jakub Jelinek 2008-11-18 12:37:56 UTC
Fixed.