Bug 77333 - Incorrect stack adjust in epilogue when targeting i686-w64-mingw32
Summary: Incorrect stack adjust in epilogue when targeting i686-w64-mingw32
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: ipa (show other bugs)
Version: 6.1.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL: https://gcc.gnu.org/ml/gcc-patches/20...
Keywords:
Depends on:
Blocks:
 
Reported: 2016-08-22 23:03 UTC by Keno Fischer
Modified: 2017-04-11 13:35 UTC (History)
1 user (show)

See Also:
Host: x86_64-linux
Target: i686-w64-mingw32
Build:
Known to work:
Known to fail:
Last reconfirmed: 2017-01-27 00:00:00


Attachments
Dockerfile (504 bytes, text/plain)
2016-09-09 09:45 UTC, Tony Kelman
Details
bisect-run.sh (692 bytes, text/plain)
2016-09-09 09:47 UTC, Tony Kelman
Details
bisect-log.txt (1.35 KB, text/plain)
2016-09-09 09:48 UTC, Tony Kelman
Details
smaller copy of SLPVectorizer.cpp that reproduces issue (1.32 KB, text/plain)
2017-01-27 22:03 UTC, Tony Kelman
Details
A small self-contained testcase (455 bytes, text/plain)
2017-03-01 18:03 UTC, Martin Jambor
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Keno Fischer 2016-08-22 23:03:37 UTC
This is reduced from https://github.com/JuliaLang/julia/issues/18162.
It appears that GCC miscompiles LLVM when targeting i686-w64-mingw32.
This behavior was observed on at least GCC 5.4.0 and 6.1.0.
GCC 4.9 was confirmed to not have this problem.

To reproduce:
```
git clone https://github.com/llvm-mirror/llvm
(cd llvm && git checkout release_37)
mkdir llvm-build
(cd llvm-build && \
../llvm/configure --build=x86_64-linux-gnu --host=i686-w64-mingw32 --enable-optimized CC=i686-w64-mingw32-gcc CXX="i686-w64-mingw32-g++ -g1 -funwind-tables -fasynchronous-unwind-tables" --enable-shared && \
make -j8)
cp llvm-build/Release+Asserts/bin/opt.exe llvm-build/Release+Asserts/bin/LLVM-3.7.dll .
cp ~/cross-w32/i686-w64-mingw32/lib/libgcc_s_sjlj-1.dll ~/cross-w32/i686-w64-mingw32/lib/libstdc++-6.dll  .
wine ./opt.exe -slp-vectorizer -S ~/llvm/test/Transforms/SLPVectorizer/X86/vector.ll
```
(-g and unwind tables are only to make debugging easier, also happens without)

The problem appears to be the following:
```
Stopped in function __ZN12_GLOBAL__N_113SLPVectorizer18tryToVectorizePairEPN4llvm5ValueES3_RNS_7BoUpSLPE.part.774
   0x00000000025cf853<+51>:	leal	24(%esp), %eax
   0x00000000025cf857<+55>:	calll	-2284
=> 0x00000000025cf85c<+60>:	subl	$12, %esp
   0x00000000025cf85f<+63>:	addl	$44, %esp
   0x00000000025cf862<+66>:	retl	$12
```
This code jumps to an invalid address once it hits the `retl`.
I believe the `subl 12` is incorrect, perhaps supposed to clean up a corresponding retl $12` in the callee, when there actually isn't one [^1]. Moreover, stepping back one instruction (to the end of function being called here) shows the stack intact and a plausible backtrace, lending further credibility to this theory:
```
1|debug > rsi
Stopped in function __ZN12_GLOBAL__N_113SLPVectorizer18tryToVectorizeListEN4llvm8ArrayRefIPNS1_5ValueEEERNS_7BoUpSLPES5_b.constprop.793
   0x00000000025cf051<+225>:	popl	%edi
   0x00000000025cf052<+226>:	popl	%ebp
=> 0x00000000025cf053<+227>:	retl
   0x00000000025cf054<+228>:	addl	$4, %ebx
   0x00000000025cf057<+231>:	cmpl	%ebx, %esi
1|debug > bt
[1] __ZN12_GLOBAL__N_113SLPVectorizer18tryToVectorizeListEN4llvm8ArrayRefIPNS1_5ValueEEERNS_7BoUpSLPES5_b.constprop.793
[2] __ZN12_GLOBAL__N_113SLPVectorizer18tryToVectorizePairEPN4llvm5ValueES3_RNS_7BoUpSLPE.part.774
[3] __ZN12_GLOBAL__N_113SLPVectorizer22vectorizeChainsInBlockEPN4llvm10BasicBlockERNS_7BoUpSLPE
[4] __ZN12_GLOBAL__N_113SLPVectorizer13runOnFunctionERN4llvm8FunctionE.part.786
[5] __ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE
[6] __ZN4llvm13FPPassManager11runOnModuleERNS_6ModuleE
[7] __ZN4llvm6legacy15PassManagerImpl3runERNS_6ModuleE
[8] Most likely <Unknown Module>
```
(full assembly of that function is here: https://gist.github.com/Keno/cb45c3f1e925c6ae3484ca98f7300a4a)

Lastly, looking at the CFI of the crashing frame shows that it indeed expected the stack pointer to have been adjusted up by 12 bytes after the call:
```
Stopped in function __ZN12_GLOBAL__N_113SLPVectorizer18tryToVectorizePairEPN4llvm5ValueES3_RNS_7BoUpSLPE.part.774
   0x00000000025cf853<+51>:	leal	24(%esp), %eax
   0x00000000025cf857<+55>:	calll	-2284
=> 0x00000000025cf85c<+60>:	subl	$12, %esp
   0x00000000025cf85f<+63>:	addl	$44, %esp
   0x00000000025cf862<+66>:	retl	$12
1|debug > cfi
DW_CFA_def_cfa %esp 4
DW_CFA_offset %eip 1
DW_CFA_nop
DW_CFA_nop
--------------

DW_CFA_advance_loc 3 (=> 3)
DW_CFA_def_cfa_offset 48
DW_CFA_advance_loc 57 (=> 60)
DW_CFA_def_cfa_offset 36
DW_CFA_advance_loc 3 (=> 63)
--------------
DW_CFA_def_cfa_offset 48
DW_CFA_advance_loc 3 (=> 66)
DW_CFA_def_cfa_offset 4
```

[^1]: such cleanup happens when calling __ZN12_GLOBAL__N_113SLPVectorizer18tryToVectorizePairEPN4llvm5ValueES3_RNS_7BoUpSLPE.part.774
```
1|debug > f 3
Stopped in function __ZN12_GLOBAL__N_113SLPVectorizer22vectorizeChainsInBlockEPN4llvm10BasicBlockERNS_7BoUpSLPE
   0x00000000025d0727<+3319>:	movl	%ecx, 8(%esp)
   0x00000000025d072b<+3323>:	movl	40(%esp), %ecx
=> 0x00000000025d072f<+3327>:	calll	-3860
   0x00000000025d0734<+3332>:	subl	$12, %esp
   0x00000000025d0737<+3335>:	testb	%al, %al
```

Please let me know if I can provide any further information.
Comment 1 Richard Biener 2016-08-23 09:52:54 UTC
Try -fno-lifetime-dse and/or -fno-delete-null-pointer-checks.
Comment 2 Keno Fischer 2016-08-23 19:47:52 UTC
Still broken adding both options to the compile.
Comment 3 Tony Kelman 2016-09-09 09:45:43 UTC
Created attachment 39588 [details]
Dockerfile
Comment 4 Tony Kelman 2016-09-09 09:47:27 UTC
I managed to bisect this to SVN revision 217587 by
Martin Jambor <mjambor@suse.cz> using the attached
Dockerfile and bisect-run.sh. This was cross-compiling
from openSUSE 42.1 which happened to have most of the
build-deps easily available but should be reproducible
with a cross-compiler most places you can get wine.
Comment 5 Tony Kelman 2016-09-09 09:47:46 UTC
Created attachment 39589 [details]
bisect-run.sh
Comment 6 Tony Kelman 2016-09-09 09:48:41 UTC
Created attachment 39590 [details]
bisect-log.txt
Comment 7 Andrew Pinski 2016-09-09 09:55:05 UTC
Try -fno-derivization (I think that is the spelling).
Comment 8 Tony Kelman 2016-09-09 10:49:28 UTC
`git grep -n deriviz` returns nothing anywhere in the gcc source code
and -fno-derivization gives an error: unrecognized command line option.
Comment 9 Tony Kelman 2017-01-19 23:28:54 UTC
How can we help get this moving towards resolution? This has kept us stuck on GCC 4.9, which is getting increasingly problematic. We can attempt to reduce this to "minimal working piece of opt.exe with gcc 4.9" vs "broken equivalent piece with gcc 5+," but given the IPA nature of r217587 I'm not sure how that will go.
Comment 10 Martin Jambor 2017-01-24 17:40:25 UTC
Interesting to see stack issues after IPA-CP changes.

Please try compiling with -fno-devirtualize

If it does not help, try -fno-ipa-cp (or both).

If any of the above does help, adding it to just the file in which the
miscompiled function is should help, so please verify that.

If devirtualization causes the weird behavior, trying what
-fsanitize-undefined-trap-on-error does might also shed some light on
it.

After we have identified the miscompiled source file and confirmed
that an IPA option causes it, we may try to minimize the testcase or
inspect dumps (-fump-tree-release_ssa -fdump-ipa-all-details
-fdump-tree-fixup_cfg4) for suspect behavior.

Also, we will need to confirm this is indeed a compiler bug and not
source code with undefined behavior.
Comment 11 Tony Kelman 2017-01-25 09:14:11 UTC
Thank you!

-fno-devirtualize on its own did not help.

-fno-ipa-cp on its own did fix the problem.

Adding -fno-ipa-cp only when compiling lib/Transforms/Vectorize/SLPVectorizer.cpp was enough to fix the problem. Adding -fsanitize-undefined-trap-on-error as well for that same file didn't make a difference.

I can post dump output for compiling that same file, if that's what you meant.

In terms of whether or not this is caused by undefined behavior in the source being compiled, the commit https://github.com/llvm-mirror/llvm/commit/225dd82d634ca277 made a small modification to the file in question that prevented the problem from manifesting. If anything jumps out at you as being definitely undefined behavior prior to that commit, then we could look into carrying that patch, but it's very strange that this only happens with this one particular target. That commit was even intended to not be changing any behavior except when a newly-added feature is used.
Comment 12 Tony Kelman 2017-01-25 09:35:32 UTC
Output of dump options uploaded here (8.2 MB file): https://github.com/tkelman/docker-gcc-bisect/raw/07f6fa56e2f6d60ff90613b9c036f830fb8a422a/LLVMVectorize.tar.gz
Comment 13 Martin Jambor 2017-01-27 18:15:57 UTC
So although I have not gone as far building my own cross-compiler and
just used the gcc 6 based one from OpenSUSE, I managed to reproduce
the issue myself and can confirm that just marking method
SLPVectorizer::tryToVectorizeList with noclone attribute avoids the
issue.

But, from the logs I gather that the only reason why IPA-CP did that
was to remove the unused this pointer.  This makes me wonder whether
this somehow could push its callers and the callee out of sync
regarding what ABI they use for the call?

Apart from that, I have no idea what else could be wrong.  I have
quickly skimmed through the diff of the gimple dumps of the function
before and after cloning and did not see any meaningful difference
(lot's of UIDs were different so something might have escaped me but
since this confirmed my observations from the IPA-CP dump, I doubt
it).

One more thing, IPA-CP does this (clone to remove a parameter when
there are no real constants coming from all callers) only since
r231540 (i.e. gcc 6), so you should not really see the issue with gcc
5, at least not because of this.
Comment 14 Tony Kelman 2017-01-27 22:02:33 UTC
I'll look into whether the same flag changes the behavior in the same way on gcc 5.

Using the opensuse repo's current cross compiler version, I reduced SLPVectorizer.cpp down to the attached version. It's maybe so small that there's UB making it useless to look at, but replacing SLPVectorizer.cpp with this copy reproduces the known-good behavior with -fno-ipa-cp and the segfault without.
Comment 15 Tony Kelman 2017-01-27 22:03:51 UTC
Created attachment 40608 [details]
smaller copy of SLPVectorizer.cpp that reproduces issue
Comment 16 Martin Jambor 2017-02-03 10:57:34 UTC
Please check whether the following patch (it is against trunk should
apply on top of GCC 6 too) fixes the issue for you.

It can potentially open a whole can of worms so I want to make sure it
helps before taking it further.  Even if it does, I would also like to
ask you to run the whole gcc testsuite (as many languages as you can)
with and without the patch on i686-w64-mingw32 and compare the results
before I propose it on the mailing list.


diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index ef2dc50282c..29ec61b3d0d 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -1425,7 +1425,7 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
 	new_stmt = chkp_copy_call_skip_bounds (new_stmt);
 
       gimple_call_set_fndecl (new_stmt, e->callee->decl);
-      gimple_call_set_fntype (new_stmt, gimple_call_fntype (e->call_stmt));
+      gimple_call_set_fntype (new_stmt, TREE_TYPE (e->callee->decl));
 
       if (gimple_vdef (new_stmt)
 	  && TREE_CODE (gimple_vdef (new_stmt)) == SSA_NAME)
Comment 17 Martin Jambor 2017-02-03 16:25:42 UTC
OK, so it turns out I botched my testing and the patch from the
previous commit cause PR57330 and we would need something like the
patch below (with a big fat comment why the condition is necessary, if
we go this way at all, that is).  Still, I need someone to test this
really helps to fix the problem and to do a testsuite run.

[PR 77333] Set clone call fntype to callee type

2017-02-03  Martin Jambor  <mjambor@suse.cz>

        PR target/77333
        * cgraph.c (redirect_call_stmt_to_callee): Set call stmt fntype to
        type of the decl if lhs has compatible type.

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index ef2dc50282c..b91de6788c7 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -1425,7 +1425,15 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
 	new_stmt = chkp_copy_call_skip_bounds (new_stmt);
 
       gimple_call_set_fndecl (new_stmt, e->callee->decl);
-      gimple_call_set_fntype (new_stmt, gimple_call_fntype (e->call_stmt));
+
+      tree lhs = gimple_call_lhs (new_stmt);
+      tree decl_restype = TREE_TYPE (TREE_TYPE (e->callee->decl));
+      if (lhs
+	  && (VOID_TYPE_P (decl_restype)
+	      || !useless_type_conversion_p (TREE_TYPE (lhs), decl_restype)))
+	gimple_call_set_fntype (new_stmt, gimple_call_fntype (e->call_stmt));
+      else
+	gimple_call_set_fntype (new_stmt, TREE_TYPE (e->callee->decl));
 
       if (gimple_vdef (new_stmt)
 	  && TREE_CODE (gimple_vdef (new_stmt)) == SSA_NAME)
Comment 18 Tony Kelman 2017-02-18 00:38:33 UTC
Sorry for the delay in getting to trying out your patch. Yes, it looks like it does fix the test case here when applied to the gcc 6 branch.

Can running the test suite provide meaningful information from a cross-compile build?
Comment 19 Tony Kelman 2017-02-20 15:13:20 UTC
The patch also applies and fixes the problem on the gcc-5-branch. I haven't tried with trunk.
Comment 20 Martin Jambor 2017-03-01 18:03:54 UTC
Created attachment 40864 [details]
A small self-contained testcase

I have finally managed to put together a small self-contained
testcase.  Compile with -O2 -fno-ipa-sra to trigger the bug (under
Wine), add -fno-ipa-cp to make it disappear.
Comment 21 Martin Jambor 2017-03-10 15:30:42 UTC
I have proposed two ways to fix this on the mailing list:

https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00535.html

(I have not yet looked at how much back-portable they are but I do not
expect any issues.)
Comment 22 Martin Jambor 2017-03-30 13:51:34 UTC
Author: jamborm
Date: Thu Mar 30 13:51:02 2017
New Revision: 246589

URL: https://gcc.gnu.org/viewcvs?rev=246589&root=gcc&view=rev
Log:
[PR 77333] Fixup fntypes of gimple calls of clones

2017-03-30  Martin Jambor  <mjambor@suse.cz>

	PR ipa/77333
	* cgraph.h (cgraph_build_function_type_skip_args): Declare.
	* cgraph.c (redirect_call_stmt_to_callee): Set gimple fntype so that
	it reflects the signature changes performed at the callee side.
	* cgraphclones.c (build_function_type_skip_args): Make public, renamed
	to cgraph_build_function_type_skip_args.
	(build_function_decl_skip_args): Adjust call to the above function.

testsuite/
	* g++.dg/ipa/pr77333.C: New test.


Added:
    trunk/gcc/testsuite/g++.dg/ipa/pr77333.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cgraph.c
    trunk/gcc/cgraph.h
    trunk/gcc/cgraphclones.c
    trunk/gcc/testsuite/ChangeLog
Comment 23 Martin Jambor 2017-03-30 13:53:33 UTC
Fixed on trunk so far.  I will prepare & test backports.
Comment 24 Martin Jambor 2017-04-11 13:24:20 UTC
Author: jamborm
Date: Tue Apr 11 13:23:48 2017
New Revision: 246838

URL: https://gcc.gnu.org/viewcvs?rev=246838&root=gcc&view=rev
Log:
[PR 77333] Fixup fntypes of gimple calls of clones

2017-04-11  Martin Jambor  <mjambor@suse.cz>

	Backport from mainline
	2017-03-30  Martin Jambor  <mjambor@suse.cz>

        PR ipa/77333
        * cgraph.h (cgraph_build_function_type_skip_args): Declare.
        * cgraph.c (redirect_call_stmt_to_callee): Set gimple fntype so that
        it reflects the signature changes performed at the callee side.
        * cgraphclones.c (build_function_type_skip_args): Make public, renamed
        to cgraph_build_function_type_skip_args.
        (build_function_decl_skip_args): Adjust call to the above function.

testsuite/
        * g++.dg/ipa/pr77333.C: New test.



Added:
    branches/gcc-6-branch/gcc/testsuite/g++.dg/ipa/pr77333.C
Modified:
    branches/gcc-6-branch/gcc/ChangeLog
    branches/gcc-6-branch/gcc/cgraph.c
    branches/gcc-6-branch/gcc/cgraph.h
    branches/gcc-6-branch/gcc/cgraphclones.c
    branches/gcc-6-branch/gcc/testsuite/ChangeLog
Comment 25 Martin Jambor 2017-04-11 13:31:48 UTC
Author: jamborm
Date: Tue Apr 11 13:31:16 2017
New Revision: 246839

URL: https://gcc.gnu.org/viewcvs?rev=246839&root=gcc&view=rev
Log:
[PR 77333] Fixup fntypes of gimple calls of clones

2017-04-11  Martin Jambor  <mjambor@suse.cz>

	Backport from mainline
	2017-03-30  Martin Jambor  <mjambor@suse.cz>

        PR ipa/77333
        * cgraph.h (cgraph_build_function_type_skip_args): Declare.
        * cgraph.c (redirect_call_stmt_to_callee): Set gimple fntype so that
        it reflects the signature changes performed at the callee side.
        * cgraphclones.c (build_function_type_skip_args): Make public, renamed
        to cgraph_build_function_type_skip_args.
        (build_function_decl_skip_args): Adjust call to the above function.

testsuite/
        * g++.dg/ipa/pr77333.C: New test.



Added:
    branches/gcc-5-branch/gcc/testsuite/g++.dg/ipa/pr77333.C
Modified:
    branches/gcc-5-branch/gcc/ChangeLog
    branches/gcc-5-branch/gcc/cgraph.c
    branches/gcc-5-branch/gcc/cgraph.h
    branches/gcc-5-branch/gcc/cgraphclones.c
    branches/gcc-5-branch/gcc/testsuite/ChangeLog
Comment 26 Martin Jambor 2017-04-11 13:35:12 UTC
Fixed everywhere.