Bug 58372 - internal compiler error: ix86_compute_frame_layout
Summary: internal compiler error: ix86_compute_frame_layout
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: c++ (show other bugs)
Version: 4.8.1
: P3 normal
Target Milestone: 7.4
Assignee: Uroš Bizjak
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-09-09 15:41 UTC by sonoro
Modified: 2018-11-11 17:50 UTC (History)
5 users (show)

See Also:
Host:
Target: x86 sjlj
Build:
Known to work:
Known to fail:
Last reconfirmed: 2013-09-09 00:00:00


Attachments
ii file compressed with rar (165.51 KB, application/x-rar-compressed)
2013-09-12 11:40 UTC, sonoro
Details
Proposed patch (864 bytes, patch)
2018-10-30 11:56 UTC, Uroš Bizjak
Details | Diff
case to reproduce problem related to sanitize (74.82 KB, text/plain)
2018-10-31 02:59 UTC, Terry Guo
Details
Proposed patch (865 bytes, patch)
2018-11-01 10:12 UTC, Uroš Bizjak
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description sonoro 2013-09-09 15:41:31 UTC
I get this with mingw32-4.8.1-posix-sjlj but not with 
mingw32-4.8.1-posix-dwarf

C:\SupercolliderRepos\master4\supercollider\server\plugins\DelayUGens.cpp: 
In function 'void BufSampleRate_next(BufInfoUnit*,
 int)':
C:\SupercolliderRepos\master4\supercollider\server\plugins\DelayUGens.cpp:570:1: 
internal compiler error: in ix86_compute_frame_layout, at config/i386/i386.c:8981
 }

I post the ii file but I dont know if it will succed as it is bigger than 1000k

also happens with 4.8.0 and not with 4.7.2
Comment 1 Paolo Carlini 2013-09-09 16:12:33 UTC
In any case a self contained reproducer is a requirement. Please do your best to reduce it to a manageable size

  http://gcc.gnu.org/wiki/A_guide_to_testcase_reduction

and otherwise compress it (with, say, gzip or bzip2).
Comment 2 Richard Biener 2013-09-10 08:53:59 UTC
Please also specify how you configured the compiler(s).
Comment 3 sonoro 2013-09-12 11:40:04 UTC
Created attachment 30808 [details]
ii file compressed with rar
Comment 4 sonoro 2013-09-12 11:40:56 UTC
this are the compiler flags for the project

CXX_FLAGS = -save-temps   -msse -msse -mfpmath=sse -msse2 -std=c++11 -mstackrealign  -O2 -g -DNDEBUG @CMakeFiles/DelayUGens_supernova.dir/includes_CXX.rsp   -fschedule-insns2 -fomit-frame-pointer -Wreturn-type -fvisibility=hidden -fno-finite-math-only

CXX_DEFINES = -DBOOST_ALL_NO_LIB -DBOOST_CHRONO_HEADER_ONLY -DBOOST_DATE_TIME_NO_LIB -DBOOST_THREAD_USE_LIB -DDelayUGens_supernova_EXPORTS -DNOMINMAX -DNOVA_SIMD -DSC_FFT_FFTW -DSC_WIN32 -DSUPERNOVA -DWIN32_LEAN_AND_MEAN -D_WIN32_WINNT=0x0500
Comment 5 sonoro 2013-10-01 17:33:13 UTC
Is there anything else I must provide in order to solve this issue?
Comment 6 sonoro 2013-10-01 17:58:22 UTC
(In reply to Paolo Carlini from comment #1)
> In any case a self contained reproducer is a requirement. Please do your
> best to reduce it to a manageable size
> 
>   http://gcc.gnu.org/wiki/A_guide_to_testcase_reduction
> 
> and otherwise compress it (with, say, gzip or bzip2).

Done!!
Comment 7 Paolo Carlini 2013-10-01 18:43:17 UTC
Please, if you really want to see progress on this issue, do your best to reduce the reproducer to a manageable size, normally less than, say, 100 lines are more than enough.
Comment 8 Kai Tietz 2013-12-10 19:06:01 UTC
So assert trigger here is:
'gcc_assert (preferred_alignment <= stack_alignment_needed);'

Caused because because
(gdb) print preferred_alignment
$1 = 16
(gdb) print stack_alignment_needed
$2 = 4

So hacky variant to fix that would be to enforce that preferred and stack-alignment_needed are set identical

Index: i386.c
===================================================================
--- i386.c      (Revision 205860)
+++ i386.c      (Arbeitskopie)
@@ -9362,6 +9362,14 @@ ix86_compute_frame_layout (struct ix86_frame *fram
       crtl->stack_alignment_needed = 128;
     }

+  /* If preferred_alignment is bigger then stack_alignment_needed
+     make both sizes equal.  */
+  if (preferred_alignment > stack_alignment_needed)
+    {
+      stack_alignment_needed = preferred_alignment;
+      crtl->stack_alignment_needed = crtl->preferred_stack_boundary;
+    }
+
   gcc_assert (!size || stack_alignment_needed);
   gcc_assert (preferred_alignment >= STACK_BOUNDARY / BITS_PER_UNIT);
   gcc_assert (preferred_alignment <= stack_alignment_needed);
Comment 9 H.J. Lu 2013-12-10 19:19:38 UTC
My understanding is stack realignment doesn't work on Windows.
There was a attempt to do it at a time.  But we didn't know
enough about Windows to do it.
Comment 10 Kai Tietz 2013-12-10 19:22:25 UTC
Well, for x64 we can't realign stack due issue about prologue layout enforced by SEH stuff.

For x86 I see actually no good reason why this shouldn't work.  I checked generated assembly and it looks fine AFAICS.
Do you recall what the issue for x86 windows was?
Comment 11 sonoro 2013-12-14 09:45:55 UTC
So it seems you solved the problem in sjlj
Are you going to push it?
Thanks
victor
Comment 12 Bitterblue 2018-04-21 14:45:43 UTC
Hi.

This bug still exists in GCC 7.3.0. It comes up when cross-compiling Qt 5.10.1 from 64 bit Linux for 32 bit Windows.

Well, I assume it's the same bug because of comments seen online.

https://github.com/mxe/mxe/issues/2011
https://bugreports.qt.io/browse/QTBUG-64707

The offending function can be seen here: https://github.com/qt/qtbase/blob/6c6ace9d23f90845fd424e474d38fe30f070775e/src/corelib/global/qrandom.cpp#L104

% i686-w64-mingw32-gcc -v
Using built-in specs.
COLLECT_GCC=i686-w64-mingw32-gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/i686-w64-mingw32/7.3.0/lto-wrapper
Target: i686-w64-mingw32
Configured with: /build/mingw-w64-gcc/src/gcc/configure --prefix=/usr --libexecdir=/usr/lib --target=i686-w64-mingw32 --enable-languages=c,lto,c++,objc,obj-c++,fortran,ada --enable-shared --enable-static --enable-threads=posix --enable-fully-dynamic-string --enable-libstdcxx-time=yes --with-system-zlib --enable-cloog-backend=isl --enable-lto --disable-dw2-exceptions --enable-libgomp --disable-multilib --enable-checking=release
Thread model: posix
gcc version 7.3.0 (GCC) 

No problem with x86_64-w64-mingw32, of course.
Comment 13 martchus 2018-05-26 19:30:44 UTC
I also came across this issue when compiling Qt 5.11.0. The error message you get when compiling Qt looks very similar, indeed. I configured my compiler similar to  Bitterblue, but I'm already using GCC 8.1.0.

Since the official Qt binaries are apparently not affected, I assume that problem is only present when using SJLJ.
Comment 14 David Grayson 2018-10-26 15:46:26 UTC
To recap: there is an internal compiler error in ix86_compute_frame_layout when using rdrnd, C++, noexcept, and i686-w64-mingw32.  And it probably affects x86_64-w64-mingw32 too, but I am most interested in the i686 case so I will focus on that.

Here is a minimal test case the reproduces the issue:

    __attribute__((__target__("rdrnd")))
	void f(unsigned int * b) noexcept {
      __builtin_ia32_rdrand32_step(b);
    }

Here are some workarounds I found:

1) Remove the "noexcept" specifier from the function.
2) Don't use the rdrand built-ins.
3) Compile the function as C code.
4) I'm not sure if this is safe, but you can use the compiler option -mpreferred-stack-boundary=2, which tells GCC to prefer 4-aligning the stack.

I am running GCC as a cross-compiler on Linux, targeting i686-w64-mingw32.  I have seen this bug in GCC 7.3.0, 8.2.0, and the 8-20181019 snapshot.  For the remainder of this bug report, I will just focus on the behavior of the 8-20181019 snapshot.

Here is the command I use to compile the program above, and the full output of the compiler:

----

$ ../prefix/bin/i686-w64-mingw32-g++ -c ../david.cpp 
during RTL pass: ira
../david.cpp: In function 'void f(unsigned int*)':
../david.cpp:3:1: internal compiler error: in ix86_compute_frame_layout, at config/i386/i386.c:11734
 }
 ^
0xe0b414 ix86_compute_frame_layout
        ../../gcc-8-20181019/gcc/config/i386/i386.c:11734
0xaf65ed set_initial_elim_offsets
        ../../gcc-8-20181019/gcc/reload1.c:3858
0xb0131f calculate_elim_costs_all_insns()
        ../../gcc-8-20181019/gcc/reload1.c:1562
0x9e536f ira_costs()
        ../../gcc-8-20181019/gcc/ira-costs.c:2249
0x9decc6 ira_build()
        ../../gcc-8-20181019/gcc/ira-build.c:3427
0x9d64e3 ira
        ../../gcc-8-20181019/gcc/ira.c:5295
0x9d64e3 execute
        ../../gcc-8-20181019/gcc/ira.c:5606
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

----

The assertion that is failing in ix86_compute_frame_layout (i386.c, line 11734) is this one:

    gcc_assert (preferred_alignment <= stack_alignment_needed);

By adding my own fprintf statement, I determined that in the problematic case we have:

- preferred_alignment is 16
- stack_alignment_needed is 4

I did some experiments and I found that the problematic case (using an rdrand intrinsic in a noexcept C++ function) seems to be the only case where the preferred_alignment variable is affected by the -mpreferred-stack-boundary command-line option.  In every other case, preferred_alignment is always just 4.

Isn't that weird?  Just looking at variable names, I would expect -mpreferred-stack-boundary to always affect crtl->preferred_stack_boundary and crtl->stack_alignment_needed (which are where the values in the assertion above come from).  My guess is that someone decided to force the preferred_stack_boundary and stack_alignment_needed to 4 on Windows for some reason, and they succeeded in doing that for most cases, but they failed to do it for the rdrand/C++/noexcept case.

Does anyone have an idea of how to fix this bug for real?  What values should crtl->preferred_stack_boundary crtl->stack_alignment_needed really have  on Windows, and should they be controllable from the command-line?  Where do they get set?

--David
Comment 15 Uroš Bizjak 2018-10-29 09:11:42 UTC
(In reply to David Grayson from comment #14)

> Does anyone have an idea of how to fix this bug for real?  What values
> should crtl->preferred_stack_boundary crtl->stack_alignment_needed really
> have  on Windows, and should they be controllable from the command-line? 
> Where do they get set?
> 
> --David

Reading symbols from /ssd/uros/gcc-build-xxx/gcc/cc1plus...done.
(gdb) watch x_rtl.preferred_stack_boundary 

Hardware watchpoint 1: x_rtl.preferred_stack_boundary

(gdb) r
Starting program: /ssd/uros/gcc-build-xxx/gcc/cc1plus pr58372.C
Hardware watchpoint 1: x_rtl.preferred_stack_boundary

Old value = 0
New value = 32
0x0000000000b930dc in rtl_data::init_stack_alignment (this=0x23c8ac0 <x_rtl>) at /home/uros/gcc-svn/trunk/gcc/emit-rtl.c:6636
6636      preferred_stack_boundary = STACK_BOUNDARY;

Hardware watchpoint 1: x_rtl.preferred_stack_boundary

Old value = 32
New value = 128
emit_library_call_value_1(int, rtx_def*, rtx_def*, libcall_type, machine_mode, int, std::pair<rtx_def*, machine_mode>*) ()
    at /home/uros/gcc-svn/trunk/gcc/calls.c:4744
4744      if (outmode != VOIDmode)

(gdb) list
4739      if (crtl->preferred_stack_boundary < PREFERRED_STACK_BOUNDARY)
4740        crtl->preferred_stack_boundary = PREFERRED_STACK_BOUNDARY;
4741
4742      /* If this kind of value comes back in memory,
4743         decide where in memory it should come back.  */
4744      if (outmode != VOIDmode)
4745        {
4746          tfom = lang_hooks.types.type_for_mode (outmode, 0);
4747          if (aggregate_value_p (tfom, 0))
4748            {

This happen when building SjLj landing pads:

#0  emit_library_call_value_1(int, rtx_def*, rtx_def*, libcall_type, machine_mode, int, std::pair<rtx_def*, machine_mode>*)
    () at /home/uros/gcc-svn/trunk/gcc/calls.c:4744
#1  0x0000000000b9b36c in emit_library_call (arg1_mode=<optimized out>, arg1=<optimized out>, outmode=E_VOIDmode, 
    fn_type=LCT_NORMAL, fun=<optimized out>) at /home/uros/gcc-svn/trunk/gcc/rtl.h:4123
#2  sjlj_emit_function_enter(rtx_code_label*) () at /home/uros/gcc-svn/trunk/gcc/except.c:1202
#3  0x0000000000b9f20a in sjlj_build_landing_pads () at /home/uros/gcc-svn/trunk/gcc/except.c:1481
#4  finish_eh_generation() () at /home/uros/gcc-svn/trunk/gcc/except.c:1510

Adding -mpreferred-stack-boundary=2 "fixes" compilation.

If you want to experiment, try the following patch:

Index: config/i386/mingw32.h
===================================================================
--- config/i386/mingw32.h       (revision 265582)
+++ config/i386/mingw32.h       (working copy)
@@ -251,6 +251,10 @@
 #undef  CHECK_EXECUTE_STACK_ENABLED
 #define CHECK_EXECUTE_STACK_ENABLED flag_setstackexecutable
 
+#undef PREFERRED_STACK_BOUNDARY_DEFAULT
+#define PREFERRED_STACK_BOUNDARY_DEFAULT       \
+  (TARGET_64BIT ? 128 : MIN_STACK_BOUNDARY)
+
 /* This matches SHLIB_SONAME and SHLIB_SOVERSION in t-cygming. */
 /* This matches SHLIB_SONAME and SHLIB_SOVERSION in t-cygwin. */
 #if DWARF2_UNWIND_INFO

But I don't know the details of MS ABI, so I can't tell if the patch is correct.
Comment 16 Kai Tietz 2018-10-29 09:37:11 UTC
(In reply to Uroš Bizjak from comment #15)
> (In reply to David Grayson from comment #14)
> 
> > Does anyone have an idea of how to fix this bug for real?  What values
> > should crtl->preferred_stack_boundary crtl->stack_alignment_needed really
> > have  on Windows, and should they be controllable from the command-line? 
> > Where do they get set?
> > 
> > --David
> 
> Reading symbols from /ssd/uros/gcc-build-xxx/gcc/cc1plus...done.
> (gdb) watch x_rtl.preferred_stack_boundary 
> 
> Hardware watchpoint 1: x_rtl.preferred_stack_boundary
> 
> (gdb) r
> Starting program: /ssd/uros/gcc-build-xxx/gcc/cc1plus pr58372.C
> Hardware watchpoint 1: x_rtl.preferred_stack_boundary
> 
> Old value = 0
> New value = 32
> 0x0000000000b930dc in rtl_data::init_stack_alignment (this=0x23c8ac0
> <x_rtl>) at /home/uros/gcc-svn/trunk/gcc/emit-rtl.c:6636
> 6636      preferred_stack_boundary = STACK_BOUNDARY;
> 
> Hardware watchpoint 1: x_rtl.preferred_stack_boundary
> 
> Old value = 32
> New value = 128
> emit_library_call_value_1(int, rtx_def*, rtx_def*, libcall_type,
> machine_mode, int, std::pair<rtx_def*, machine_mode>*) ()
>     at /home/uros/gcc-svn/trunk/gcc/calls.c:4744
> 4744      if (outmode != VOIDmode)
> 
> (gdb) list
> 4739      if (crtl->preferred_stack_boundary < PREFERRED_STACK_BOUNDARY)
> 4740        crtl->preferred_stack_boundary = PREFERRED_STACK_BOUNDARY;
> 4741
> 4742      /* If this kind of value comes back in memory,
> 4743         decide where in memory it should come back.  */
> 4744      if (outmode != VOIDmode)
> 4745        {
> 4746          tfom = lang_hooks.types.type_for_mode (outmode, 0);
> 4747          if (aggregate_value_p (tfom, 0))
> 4748            {
> 
> This happen when building SjLj landing pads:
> 
> #0  emit_library_call_value_1(int, rtx_def*, rtx_def*, libcall_type,
> machine_mode, int, std::pair<rtx_def*, machine_mode>*)
>     () at /home/uros/gcc-svn/trunk/gcc/calls.c:4744
> #1  0x0000000000b9b36c in emit_library_call (arg1_mode=<optimized out>,
> arg1=<optimized out>, outmode=E_VOIDmode, 
>     fn_type=LCT_NORMAL, fun=<optimized out>) at
> /home/uros/gcc-svn/trunk/gcc/rtl.h:4123
> #2  sjlj_emit_function_enter(rtx_code_label*) () at
> /home/uros/gcc-svn/trunk/gcc/except.c:1202
> #3  0x0000000000b9f20a in sjlj_build_landing_pads () at
> /home/uros/gcc-svn/trunk/gcc/except.c:1481
> #4  finish_eh_generation() () at /home/uros/gcc-svn/trunk/gcc/except.c:1510
> 
> Adding -mpreferred-stack-boundary=2 "fixes" compilation.
> 
> If you want to experiment, try the following patch:
> 
> Index: config/i386/mingw32.h
> ===================================================================
> --- config/i386/mingw32.h       (revision 265582)
> +++ config/i386/mingw32.h       (working copy)
> @@ -251,6 +251,10 @@
>  #undef  CHECK_EXECUTE_STACK_ENABLED
>  #define CHECK_EXECUTE_STACK_ENABLED flag_setstackexecutable
>  
> +#undef PREFERRED_STACK_BOUNDARY_DEFAULT
> +#define PREFERRED_STACK_BOUNDARY_DEFAULT       \
> +  (TARGET_64BIT ? 128 : MIN_STACK_BOUNDARY)
> +
>  /* This matches SHLIB_SONAME and SHLIB_SOVERSION in t-cygming. */
>  /* This matches SHLIB_SONAME and SHLIB_SOVERSION in t-cygwin. */
>  #if DWARF2_UNWIND_INFO> 
> But I don't know the details of MS ABI, so I can't tell if the patch is
> correct.

The x64 ABI part is correct, as for x64 abit there is just 16-byte stack alignment. For x86 ms abi things aren't that strict AFAIK. The common stack alignment on function entry is the natural one, but there is in general no strict requirement for it. So the patch looks fine, but should be commented in the release notes, as there might be fallout seen caused by it.
Comment 17 Uroš Bizjak 2018-10-29 20:40:24 UTC
(In reply to Kai Tietz from comment #16)
> > If you want to experiment, try the following patch:
> > 
> > Index: config/i386/mingw32.h
> > ===================================================================
> > --- config/i386/mingw32.h       (revision 265582)
> > +++ config/i386/mingw32.h       (working copy)
> > @@ -251,6 +251,10 @@
> >  #undef  CHECK_EXECUTE_STACK_ENABLED
> >  #define CHECK_EXECUTE_STACK_ENABLED flag_setstackexecutable
> >  
> > +#undef PREFERRED_STACK_BOUNDARY_DEFAULT
> > +#define PREFERRED_STACK_BOUNDARY_DEFAULT       \
> > +  (TARGET_64BIT ? 128 : MIN_STACK_BOUNDARY)
> > +
> >  /* This matches SHLIB_SONAME and SHLIB_SOVERSION in t-cygming. */
> >  /* This matches SHLIB_SONAME and SHLIB_SOVERSION in t-cygwin. */
> >  #if DWARF2_UNWIND_INFO> 
> > But I don't know the details of MS ABI, so I can't tell if the patch is
> > correct.
> 
> The x64 ABI part is correct, as for x64 abit there is just 16-byte stack
> alignment. For x86 ms abi things aren't that strict AFAIK. The common stack
> alignment on function entry is the natural one, but there is in general no
> strict requirement for it. So the patch looks fine, but should be commented
> in the release notes, as there might be fallout seen caused by it.

Unfortunately, I have no way of testing the patch on windows targets.

Please also note that the testcase still ICEs with patched compiler when -mpreferred-stack-bonudary=4 option is used. Based on the comments in emit-rtl.h:

  /* The largest alignment needed on the stack, including requirement
     for outgoing stack alignment.  */
  unsigned int stack_alignment_needed;

  /* Preferred alignment of the end of stack frame, which is preferred
     to call other functions.  */
  unsigned int preferred_stack_boundary;

it looks that somewhere stack_alignment_needed should be set to at least the value of preferred_stack_boundary. Just above the assert in i386.c, there is:

  /* 64-bit MS ABI seem to require stack alignment to be always 16,
     except for function prologues, leaf functions and when the defult
     incoming stack boundary is overriden at command line or via
     force_align_arg_pointer attribute.  */
  if ((TARGET_64BIT_MS_ABI && crtl->preferred_stack_boundary < 128)
      && (!crtl->is_leaf || cfun->calls_alloca != 0
	  || ix86_current_function_calls_tls_descriptor
	  || ix86_incoming_stack_boundary < 128))
    {
      crtl->preferred_stack_boundary = 128;
      crtl->stack_alignment_needed = 128;
    }

so this maybe hints what to do in x86 case. Perhaps the patch in Comment #8 is the way to go.
Comment 18 Uroš Bizjak 2018-10-29 22:40:01 UTC
Digging a bit deeper: the testcase from Comment #14:

    __attribute__((__target__("rdrnd")))
	void f(unsigned int * b) noexcept {
      __builtin_ia32_rdrand32_step(b);
    }

also fails for all x86 targets when the compiler is configured with --enable-sjlj-exceptions, so the failure is not mingw specific.

Following is x86_64-linux-gnu target:

cc1plus -O2 -mpreferred-stack-boundary=4 -quiet -fpreprocessed pr58372.C
during RTL pass: ira
pr58372.C: In function ‘void f(unsigned int*)’:
pr58372.C:4:5: internal compiler error: in ix86_compute_frame_layout, at config/i386/i386.c:11159
    4 |     }
      |     ^
0x1821b24 ix86_compute_frame_layout
        /home/uros/gcc-svn/trunk/gcc/config/i386/i386.c:11159
0x128b447 set_initial_elim_offsets
...

-mpreferred-stack-boundary=3 (instead of 4) again "fixes" the ICE.

It looks that stack realignment doesn't work with SjLj exceptions. As shown in Comment #15, building the call to _Unwind_SjLj_Register sets crtl->preferred_stack_boundary to 128, but nothing updates crtl->stack_alignment_needed. This leads to the shown assert failure.

HJ, do you have an idea what is going wrong with realignment here?
Comment 19 H.J. Lu 2018-10-30 11:10:24 UTC
Xuepeng, we investigated stack realignment on Windows when working on
stack realignment.  Can we take another look?
Comment 20 Uroš Bizjak 2018-10-30 11:56:48 UTC
Created attachment 44928 [details]
Proposed patch

It turned out that functions, called directly through emit_library_call (as the above testcase, which builds call to _Unwind_SjLj_Register from sjlj_emit_function_enter) miss a whole lot of stack realignmnet setup. There is an update to crtl->preferred_stack_boundary, but several updates for SUPPORTS_STACK_ALIGNMENT targets are missing, including eventual DRAP setup.

Attached patch fixes the path through emit_library_call.

Can someone please test the patch on SJLJ target?
Comment 21 Terry Guo 2018-10-30 12:00:38 UTC
Thanks for fix. I am glad to help to test it out.
Comment 22 Terry Guo 2018-10-30 13:50:53 UTC
(In reply to Uroš Bizjak from comment #20)
> Created attachment 44928 [details]
> Proposed patch
> 
> It turned out that functions, called directly through emit_library_call (as
> the above testcase, which builds call to _Unwind_SjLj_Register from
> sjlj_emit_function_enter) miss a whole lot of stack realignmnet setup. There
> is an update to crtl->preferred_stack_boundary, but several updates for
> SUPPORTS_STACK_ALIGNMENT targets are missing, including eventual DRAP setup.
> 
> Attached patch fixes the path through emit_library_call.
> 
> Can someone please test the patch on SJLJ target?

On an x86_64 Linux platform, I simply configured gcc with command:
../gcc/configure --enable-sjlj-exceptions

Then with Uroš's patch, the gcc bootstrap has no problem and the case can be successfully compiled. I am doing the gcc regression test to make sure there is no regression with Uroš's patch, it will take some time to finish. I am also attempting to cross-build for i686-w64-mingw32 to verify for Comment #14.

If I am missing something, please let me know. Thanks very much.
Comment 23 Terry Guo 2018-10-31 02:57:24 UTC
Hi Uroš:

With your fix, I identified two regressions so far: one is that we should run the case you provided with c++ standard newer than c++11. The 'noexcept' was introduced in c++14. Guess we need a directive like "{ ! target c++14_down }".
Another regression is related to -fsanitize=address shown as below:

./gcc/cc1plus use-after-scope-types-5.ii  -quiet -m32 -O0 -fsanitize=address  -o use-after-scope-types-5.s
during RTL pass: expand
In file included from /export/users/xuepengg/58372-ice-stack-alignment/gcc/gcc/testsuite/g++.dg/asan/use-after-scope-types-5.C:4:
/export/users/xuepengg/58372-ice-stack-alignment/gcc/gcc/testsuite/g++.dg/asan/use-after-scope-types.h: In function ‘void test() [with T = char [1000]]’:
/export/users/xuepengg/58372-ice-stack-alignment/gcc/gcc/testsuite/g++.dg/asan/use-after-scope-types.h:22:51: internal compiler error: in safe_as_a, at is-a.h:210
   22 | template <class T> __attribute__((noinline)) void test() {
      |                                                   ^~~~
0x11d5b1f rtx_insn* safe_as_a<rtx_insn*, rtx_def>(rtx_def*)
        ../../gcc/gcc/is-a.h:210
0x11d5b1f NEXT_INSN(rtx_insn const*)
        ../../gcc/gcc/rtl.h:1461
0x11d5b1f ix86_get_drap_rtx
        ../../gcc/gcc/config/i386/i386.c:12050
0xa92e12 emit_library_call_value_1(int, rtx_def*, rtx_def*, libcall_type, machine_mode, int, std::pair<rtx_def*, machine_mode>*)
        ../../gcc/gcc/calls.c:4757
0xecc975 emit_library_call(rtx_def*, libcall_type, machine_mode, rtx_def*, machine_mode, rtx_def*, machine_mode, rtx_def*, machine_mode)
        ../../gcc/gcc/rtl.h:4149
0xecc975 asan_emit_stack_protection(rtx_def*, rtx_def*, unsigned int, long*, tree_node**, int)
        ../../gcc/gcc/asan.c:1500
0xaa53ff expand_used_vars
        ../../gcc/gcc/cfgexpand.c:2273
0xaa6d13 execute
        ../../gcc/gcc/cfgexpand.c:6268
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
Comment 24 Terry Guo 2018-10-31 02:59:14 UTC
Created attachment 44934 [details]
case to reproduce problem related to sanitize
Comment 25 Terry Guo 2018-10-31 11:49:02 UTC
Debugged the ICE further and found that below line in function ix86_get_drap_rtx is causing ICE:

12050         insn = emit_insn_before (seq, NEXT_INSN (entry_of_function ()));

It is called when generating call to __asan_stack_free_5 via emit_library_call_value_1. The entry_of_function() is returned something invalid.
Comment 26 Terry Guo 2018-10-31 14:08:32 UTC
Hi Uroš:

I think I found why your proposed patch causes problem in Comment 23. It is all about timing. The below code from patch is trying to set up DRAP reg in a rather early stage when the function is not fully expanded to RTL.

+      if (crtl->drap_reg == NULL_RTX)
+	{
+	  rtx drap_rtx = targetm.calls.get_drap_rtx ();

The targetm.calls.get_drap_rtx () will be hooked to ix86_get_drap_rtx () where we will have code:

12046         drap_vreg = copy_to_reg (arg_ptr);
(gdb) 
12047         seq = get_insns ();
(gdb) 
12048         end_sequence ();
(gdb) 
12050         insn = emit_insn_before (seq, NEXT_INSN (entry_of_function ()));

At this stage, what returned from (entry_of_function ()) is actually GIMPLE form of the function, not the RTL form we are expecting. Then NEXT_INSN (something_in_gimple) goes wrong.
Comment 27 David Grayson 2018-10-31 15:34:11 UTC
Thanks to everyone who is working on this.  I can confirm that the patch in comment #20 by Uroš Bizjak applies cleanly to GCC 7.3.0, and I successfully used the resulting toolchain targeting i686-w64-mingw32 to build Qt and several Qt GUI examples, all of which run correctly.

Just in case it helps you find more bugs: I noticed there are several other places in the code (of gcc-8-20181019) where ctrl->preferred_stack_boundary gets updated without any obvious update of ctrl->stack_alignment_needed:

gcc/explow.c:1247 in get_dynamic_stack_size
gcc/explow.c:1595 in get_dynamic_stack_base
gcc/calls.c:3811 in expand_call
gcc/config/i386/i386.c:12593 in ix86_update_stack_boundary
Comment 28 Uroš Bizjak 2018-10-31 17:16:54 UTC
(In reply to Terry Guo from comment #25)
> Debugged the ICE further and found that below line in function
> ix86_get_drap_rtx is causing ICE:
> 
> 12050         insn = emit_insn_before (seq, NEXT_INSN (entry_of_function
> ()));
> 
> It is called when generating call to __asan_stack_free_5 via
> emit_library_call_value_1. The entry_of_function() is returned something
> invalid.

I wonder if it is correct for asan to emit the call without setting RTL function framework first. The DRAP generation needs function in RTL form, so it is able to emit DRAP setup.

Let's ask Jakub about asan, if it is possible to move generation of the call after the function is already expanded to RTL.
Comment 29 Jakub Jelinek 2018-10-31 17:19:42 UTC
(In reply to Uroš Bizjak from comment #28)
> Let's ask Jakub about asan, if it is possible to move generation of the call
> after the function is already expanded to RTL.

I'm afraid no.
Comment 30 Uroš Bizjak 2018-10-31 18:13:03 UTC
(In reply to Jakub Jelinek from comment #29)
> > Let's ask Jakub about asan, if it is possible to move generation of the call
> > after the function is already expanded to RTL.
> 
> I'm afraid no.

Hm...

... maybe we could go with following patch:

+  if (SUPPORTS_STACK_ALIGNMENT)
+    {
+      if (preferred_stack_boundary > crtl->stack_alignment_estimated)
+	crtl->stack_alignment_estimated = preferred_stack_boundary;
+      if (preferred_stack_boundary > crtl->stack_alignment_needed)
+	crtl->stack_alignment_needed = preferred_stack_boundary;
+    }

This means that for functions, emitted through emit_library_call, stack won't be realigned. This would cure the assert (and would follow a bit more expand_stack_alignment from cfgrtl.c).
Comment 31 Terry Guo 2018-11-01 01:15:28 UTC
(In reply to Uroš Bizjak from comment #30)
> (In reply to Jakub Jelinek from comment #29)
> > > Let's ask Jakub about asan, if it is possible to move generation of the call
> > > after the function is already expanded to RTL.
> > 
> > I'm afraid no.
> 
> Hm...
> 
> ... maybe we could go with following patch:
> 
> +  if (SUPPORTS_STACK_ALIGNMENT)
> +    {
> +      if (preferred_stack_boundary > crtl->stack_alignment_estimated)
> +	crtl->stack_alignment_estimated = preferred_stack_boundary;
> +      if (preferred_stack_boundary > crtl->stack_alignment_needed)
> +	crtl->stack_alignment_needed = preferred_stack_boundary;
> +    }
> 
> This means that for functions, emitted through emit_library_call, stack
> won't be realigned. This would cure the assert (and would follow a bit more
> expand_stack_alignment from cfgrtl.c).

I have same thought. I will test this one.
Comment 32 Terry Guo 2018-11-01 01:20:18 UTC
(In reply to David Grayson from comment #27)
> Thanks to everyone who is working on this.  I can confirm that the patch in
> comment #20 by Uroš Bizjak applies cleanly to GCC 7.3.0, and I successfully
> used the resulting toolchain targeting i686-w64-mingw32 to build Qt and
> several Qt GUI examples, all of which run correctly.
> 
> Just in case it helps you find more bugs: I noticed there are several other
> places in the code (of gcc-8-20181019) where ctrl->preferred_stack_boundary
> gets updated without any obvious update of ctrl->stack_alignment_needed:
> 
> gcc/explow.c:1247 in get_dynamic_stack_size
> gcc/explow.c:1595 in get_dynamic_stack_base
> gcc/calls.c:3811 in expand_call
> gcc/config/i386/i386.c:12593 in ix86_update_stack_boundary

Hello David,
Do you have instructions about how to build toolchain targeting i686-w64-mingw32? I searched around and just found:
https://sourceforge.net/p/mingw-w64/code/HEAD/tree/trunk/mingw-w64-doc/howto-build/mingw-w64-howto-build.txt
Comment 33 David Grayson 2018-11-01 03:21:26 UTC
Hello, Terry.  I'd be happy to help.  I hope you have access to a Linux computer.  I've actually spent a lot of time working on build scripts for cross-compilers running on Linux and here's what I have come up with for you:

https://gist.github.com/DavidEGrayson/d5ca447cca1ea23d5adca2f353dbb67a
Comment 34 Terry Guo 2018-11-01 06:02:35 UTC
(In reply to David Grayson from comment #33)
> Hello, Terry.  I'd be happy to help.  I hope you have access to a Linux
> computer.  I've actually spent a lot of time working on build scripts for
> cross-compilers running on Linux and here's what I have come up with for you:
> 
> https://gist.github.com/DavidEGrayson/d5ca447cca1ea23d5adca2f353dbb67a

Thanks David. I will give it a try.
Comment 35 Uroš Bizjak 2018-11-01 09:01:54 UTC
(In reply to Terry Guo from comment #31)
> (In reply to Uroš Bizjak from comment #30)
> > (In reply to Jakub Jelinek from comment #29)
> > > > Let's ask Jakub about asan, if it is possible to move generation of the call
> > > > after the function is already expanded to RTL.
> > > 
> > > I'm afraid no.
> > 
> > Hm...
> > 
> > ... maybe we could go with following patch:
> > 
> > +  if (SUPPORTS_STACK_ALIGNMENT)
> > +    {
> > +      if (preferred_stack_boundary > crtl->stack_alignment_estimated)
> > +	crtl->stack_alignment_estimated = preferred_stack_boundary;
> > +      if (preferred_stack_boundary > crtl->stack_alignment_needed)
> > +	crtl->stack_alignment_needed = preferred_stack_boundary;
> > +    }
> > 
> > This means that for functions, emitted through emit_library_call, stack
> > won't be realigned. This would cure the assert (and would follow a bit more
> > expand_stack_alignment from cfgrtl.c).
> 
> I have same thought. I will test this one.

Actually, we can use crtl->stack_realign_processed to delay DRAP generation. The condition in the patch should be changed to:

      crtl->stack_realign_needed
	= INCOMING_STACK_BOUNDARY < crtl->stack_alignment_estimated;
      crtl->stack_realign_tried = crtl->stack_realign_needed;

--->  if (crtl->stack_realign_processed && crtl->drap_reg == NULL_RTX)
	{
	  rtx drap_rtx = targetm.calls.get_drap_rtx ();

Can you please test this change? The testcase from Comment #23 does not fail for me.
Comment 36 Terry Guo 2018-11-01 09:08:29 UTC
(In reply to Uroš Bizjak from comment #35)
> 
> Actually, we can use crtl->stack_realign_processed to delay DRAP generation.
> The condition in the patch should be changed to:
> 
>       crtl->stack_realign_needed
> 	= INCOMING_STACK_BOUNDARY < crtl->stack_alignment_estimated;
>       crtl->stack_realign_tried = crtl->stack_realign_needed;
> 
> --->  if (crtl->stack_realign_processed && crtl->drap_reg == NULL_RTX)
> 	{
> 	  rtx drap_rtx = targetm.calls.get_drap_rtx ();
> 
> Can you please test this change? The testcase from Comment #23 does not fail
> for me.

OK. Do it right now.
Comment 37 Uroš Bizjak 2018-11-01 10:12:39 UTC
Created attachment 44940 [details]
Proposed patch

I think that attached patch is close to final. We can use:

+  if (SUPPORTS_STACK_ALIGNMENT
+      && crtl->stack_realign_processed)

to avoid unnecessary updates before expand_stack_alignment is called.
Comment 38 Uroš Bizjak 2018-11-01 10:15:42 UTC
(In reply to Terry Guo from comment #36)

> OK. Do it right now.

I think that latest attachment is the one that should be tested. Functionally it is the same, but avoids unnecessary variable updates before expand_stack_alignment is called. expand_stack_alignment will do everything for us.
Comment 39 Terry Guo 2018-11-01 13:30:52 UTC
(In reply to Uroš Bizjak from comment #38)
> (In reply to Terry Guo from comment #36)
> 
> > OK. Do it right now.
> 
> I think that latest attachment is the one that should be tested.
> Functionally it is the same, but avoids unnecessary variable updates before
> expand_stack_alignment is called. expand_stack_alignment will do everything
> for us.

Yes. The latest one works perfectly. Bootstrap and regression test on x86_64 show no problem. I also managed to build a gcc for i686-w64-mingw32 with SJLJ enabled, the case can be compiled successfully.
Comment 40 Uroš Bizjak 2018-11-01 15:46:09 UTC
(In reply to Terry Guo from comment #39)
> (In reply to Uroš Bizjak from comment #38)
> > (In reply to Terry Guo from comment #36)
> > 
> > > OK. Do it right now.
> > 
> > I think that latest attachment is the one that should be tested.
> > Functionally it is the same, but avoids unnecessary variable updates before
> > expand_stack_alignment is called. expand_stack_alignment will do everything
> > for us.
> 
> Yes. The latest one works perfectly. Bootstrap and regression test on x86_64
> show no problem. I also managed to build a gcc for i686-w64-mingw32 with
> SJLJ enabled, the case can be compiled successfully.

Then we can just move the call to finish_eh_generation in pass_expand::execute in front of expand_stack_alignment:

--cut here--
Index: cfgexpand.c
===================================================================
--- cfgexpand.c (revision 265582)
+++ cfgexpand.c (working copy)
@@ -6510,6 +6510,12 @@ pass_expand::execute (function *fun)
   find_many_sub_basic_blocks (blocks);
   purge_all_dead_edges ();
 
+  /* After initial rtl generation, call back to finish generating
+     exception support code.  We need to do this before cleaning up
+     the CFG as the code does not expect dead landing pads.  */
+  if (fun->eh->region_tree != NULL)
+    finish_eh_generation ();
+
   expand_stack_alignment ();
 
   /* Fixup REG_EQUIV notes in the prologue if there are tailcalls in this
@@ -6517,12 +6523,6 @@ pass_expand::execute (function *fun)
   if (crtl->tail_call_emit)
     fixup_tail_calls ();
 
-  /* After initial rtl generation, call back to finish generating
-     exception support code.  We need to do this before cleaning up
-     the CFG as the code does not expect dead landing pads.  */
-  if (fun->eh->region_tree != NULL)
-    finish_eh_generation ();
-
   /* BB subdivision may have created basic blocks that are are only reachable
      from unlikely bbs but not marked as such in the profile.  */
   if (optimize)
--cut here--

And indeed, the above patch works without problems for me.
Comment 41 uros 2018-11-04 19:23:21 UTC
Author: uros
Date: Sun Nov  4 19:22:50 2018
New Revision: 265776

URL: https://gcc.gnu.org/viewcvs?rev=265776&root=gcc&view=rev
Log:
	PR middle-end/58372
	* cfgexpand.c (pass_expand::execute): Move the call to
	finish_eh_generation in front of the call to expand_stack_alignment.

testsuite/ChangeLog:

	PR middle-end/58372
	* g++.target/i386/pr58372.C: New test.


Added:
    trunk/gcc/testsuite/g++.target/i386/pr58372.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cfgexpand.c
    trunk/gcc/testsuite/ChangeLog
Comment 42 uros 2018-11-11 17:36:55 UTC
Author: uros
Date: Sun Nov 11 17:36:23 2018
New Revision: 266014

URL: https://gcc.gnu.org/viewcvs?rev=266014&root=gcc&view=rev
Log:
	Backport from mainline
	2018-11-04  Uros Bizjak  <ubizjak@gmail.com>

	PR middle-end/58372
	* cfgexpand.c (pass_expand::execute): Move the call to
	finish_eh_generation in front of the call to expand_stack_alignment.

testsuite/ChangeLog:

	Backport from mainline
	2018-11-04  Uros Bizjak  <ubizjak@gmail.com>

	PR middle-end/58372
	* g++.dg/pr58372.C: New test.


Added:
    branches/gcc-8-branch/gcc/testsuite/g++.dg/pr58372.C
Modified:
    branches/gcc-8-branch/gcc/ChangeLog
    branches/gcc-8-branch/gcc/cfgexpand.c
    branches/gcc-8-branch/gcc/testsuite/ChangeLog
Comment 43 uros 2018-11-11 17:45:23 UTC
Author: uros
Date: Sun Nov 11 17:44:43 2018
New Revision: 266015

URL: https://gcc.gnu.org/viewcvs?rev=266015&root=gcc&view=rev
Log:
	Backport from mainline
	2018-11-04  Uros Bizjak  <ubizjak@gmail.com>

	PR middle-end/58372
	* cfgexpand.c (pass_expand::execute): Move the call to
	finish_eh_generation in front of the call to expand_stack_alignment.

testsuite/ChangeLog:

	Backport from mainline
	2018-11-04  Uros Bizjak  <ubizjak@gmail.com>

	PR middle-end/58372
	* g++.dg/pr58372.C: New test.


Added:
    branches/gcc-7-branch/gcc/testsuite/g++.dg/pr58372.C
Modified:
    branches/gcc-7-branch/gcc/ChangeLog
    branches/gcc-7-branch/gcc/cfgexpand.c
    branches/gcc-7-branch/gcc/testsuite/ChangeLog
Comment 44 Uroš Bizjak 2018-11-11 17:50:48 UTC
Fixed everywhere.