The current gcc trunk miscompiles the following code on x86_64-linux-gnu at -O3 in the 32-bit mode (but not in the 64-bit mode). This is a regression from 5.3.x. -fno-tree-vrp makes the miscompilation disappear. $ gcc-trunk -v Using built-in specs. COLLECT_GCC=gcc-trunk COLLECT_LTO_WRAPPER=/usr/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/6.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../gcc-trunk/configure --prefix=/usr/local/gcc-trunk --enable-languages=c,c++ --disable-werror --enable-multilib Thread model: posix gcc version 6.0.0 20151204 (experimental) [trunk revision 231299] (GCC) $ $ gcc-trunk -m32 -O2 small.c; ./a.out -28479 -28479 1 $ gcc-trunk -m64 -O3 small.c; ./a.out -28479 -28479 1 $ gcc-5.3 -m32 -O3 small.c; ./a.out -28479 -28479 1 $ $ gcc-trunk -m32 -O3 -fno-tree-vrp small.c; ./a.out -28479 -28479 1 $ $ gcc-trunk -m32 -O3 small.c $ timeout -s 9 10 ./a.out -28479 -28479 -32319 Killed $ --------------------------------------------------- int printf (const char *, ...); int b, d, e; unsigned long long c = 4100543410106915; void fn1 () { short f, g = 4 % c; int h = c; if (h) { int i = ~c; if (~c) i = 25662; f = g = i; h = c - g + ~-f; c = ~(c * h - f); } f = g; unsigned long long k = g || c; short l = c ^ g ^ k; if (g > 25662 || c == 74074520320 || !(g < 2)) { k = c; l = g; c = ~((k && c) + ~l); f = ~(f * (c ^ k) | l); if (c > k) printf ("%d\n", f); } short m = -f; unsigned long long n = c; c = m * f | n % c; if (n) printf ("%d\n", f); while (f < -31807) ; c = ~(n | c) | f; if (n < c) printf ("%lld\n", (long long) f); for (; d;) for (; e;) for (;;) ; c = h; c = l % c; } int main () { for (; b < 2; b++) fn1 (); return 0; }
Confirmed. It ICEs with -fno-if-conversion > ./xgcc -B. t.c -O3 -m32 -fno-if-conversion t.c: In function ‘fn1’: t.c:48:1: error: dominator of 13 should be 36, not 35 } ^ t.c:48:1: error: dominator of 35 should be 38, not 37 t.c:48:1: error: dominator of 36 status unknown t.c:48:1: internal compiler error: Segmentation fault 0xdcdd12 crash_signal /space/rguenther/src/svn/trunk3/gcc/toplev.c:334 0x949327 verify_dominators(cdi_direction) /space/rguenther/src/svn/trunk3/gcc/dominance.c:1033 0x947120 checking_verify_dominators
Weird, can't reproduce either the hang (with -O3 -m32) nor the ICE (-O3 -m32 -fno-if-conversion).
And I can reproduce the hang but not the ICE ;).
And now I can only reproduce the ICE but not the hang. Eh.
Note that if you want to reproduce this by running cc1, add -march=x86-64: $ ./cc1 -quiet v.c -O2 -fno-if-conversion -m32 $ ./cc1 -quiet v.c -O2 -fno-if-conversion -m32 -march=x86-64 v.c: In function ‘fn1’: v.c:48:1: error: dominator of 13 should be 36, not 35 } ^ v.c:48:1: error: dominator of 28 should be 31, not 30 v.c:48:1: error: dominator of 31 status unknown v.c:48:1: internal compiler error: Segmentation fault 0xdc88c2 crash_signal /home/marek/src/gcc/gcc/toplev.c:334 0x95036a verify_dominators(cdi_direction) /home/marek/src/gcc/gcc/dominance.c:1032 0x94e111 checking_verify_dominators /home/marek/src/gcc/gcc/dominance.h:71 0x94f8a7 calculate_dominance_info(cdi_direction) /home/marek/src/gcc/gcc/dominance.c:664 0xb8811c ira /home/marek/src/gcc/gcc/ira.c:5155 0xb88a52 execute /home/marek/src/gcc/gcc/ira.c:5511 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <http://gcc.gnu.org/bugs.html> for instructions.
The dominator ICE started with r228231.
Created attachment 36980 [details] gcc6-pr68730.patch Untested fix for the dominator issue. The problem is that the stv pass added by i386 after combiner computes CDI_DOMINATORS, but does not free them, and various passes in between the combiner and ira just assume dominators are not computed (e.g. pass_outof_cfg_layout). So the fix can be either to do what I wrote in the patch, or use something like: bool dom_calculated_here = !dom_info_available_p (CDI_DOMINATORS); if (dom_calculated_here) calculate_dominance_info (CDI_DOMINATORS); ... if (dom_calculated_here) free_dominance_info (CDI_DOMINATORS); in pass_stv::execute, or track down all the passes between combiner and ira that could break dominators and free_dominance_info (CDI_DOMINATORS) in them. Any preferences?
As for the miscompilation, it looks like RA issue to me. Note, -m32 -O3 -march=x86-64 is needed on the testcase, and I've been actually looking at: int printf (const char *, ...); int b, d, e; unsigned long long c = 4100543410106915; void fn1 () { short f, g = 4 % c; int h = c; if (h) { int i = ~c; if (~c) i = 25662; f = g = i; h = c - g + ~-f; c = ~(c * h - f); } f = g; unsigned long long k = g || c; short l = c ^ g ^ k; if (g > 25662 || c == 74074520320 || !(g < 2)) { k = c; l = g; c = ~((k && c) + ~l); f = ~(f * (c ^ k) | l); if (c > k) printf ("X %d\n", f); } short m = -f; unsigned long long n = c; c = m * f | n % c; if (n) printf ("Y %d %d %d\n", f, g, l); while (f < -31807) ; c = ~(n | c) | f; if (n < c) printf ("Z %lld %d %d\n", (long long) f, h, l); for (; d;) for (; e;) for (;;) ; c = h; c = l % c; } int main () { for (; b < 2; b++) fn1 (); return 0; } testcase instead (following stuff is from that) with the same options (as makes it more clear what the bug is, for the code path taken in the first call to fn1, we end up with l being equal to f (i.e. -28479) instead of the right value, l being equal to g (i.e. 25662) as can be seen already at the Y printf. We have in *.ira: (code_label 294 125 127 16 24 "" [1 uses]) (note 127 294 128 16 [bb 16] NOTE_INSN_BASIC_BLOCK) (insn 128 127 13 16 (set (reg:SI 107 [ _52 ]) (sign_extend:SI (reg:HI 132 [ g.2_118 ]))) 147 {extendhisi2} (nil)) (insn 13 128 14 16 (set (reg:DI 130 [ c.0_114 ]) (reg:DI 100 [ _42 ])) pr68730-2.c:29 85 {*movdi_internal} (expr_list:REG_DEAD (reg:DI 100 [ _42 ]) (nil))) (insn 14 13 15 16 (set (reg/v:HI 88 [ l ]) (reg/v:HI 87 [ g ])) pr68730-2.c:29 88 {*movhi_internal} (expr_list:REG_DEAD (reg/v:HI 87 [ g ]) (nil))) (insn 15 14 348 16 (set (reg/v:HI 87 [ g ]) (reg:HI 132 [ g.2_118 ])) pr68730-2.c:28 88 {*movhi_internal} (nil)) (jump_insn 348 15 349 16 (set (pc) (label_ref 140)) 663 {jump} (nil) where (reg:HI 132 [ g.2_118 ]) is apparently the RTL for the partition including both g.2_118 and _50 at least: g.2_206 = (unsigned short) g_202; ... # g.2_118 = PHI <g.2_206(25), g.2_206(10), 0(7)> ... _39 = ~prephitmp_197; _40 = iftmp.3_7 + _39; _41 = ~_40; _42 = (long long unsigned int) _41; c = _42; _45 = (unsigned short) g_202; _46 = _42 ^ c.0_114; _47 = (unsigned short) _46; _48 = _45 * _47; _49 = _48 | g.2_118; _50 = ~_49; f_51 = (short int) _50; so while reg:HI 132 is initially equal to the current value of the g variable, it is reused for something else afterwards. This particular problematic RTL bb is likely coming just from PHIs: # f_1 = PHI <g_202(9), f_51(12), f_51(13), g_202(8)> # l_5 = PHI <l_203(9), g_202(12), g_202(13), l_110(8)> # prephitmp_222 = PHI <g.2_206(9), _219(12), _220(13), 0(8)> # prephitmp_224 = PHI <prephitmp_199(9), _42(12), pretmp_223(13), prephitmp_199(8)> # prephitmp_228 = PHI <prephitmp_197(9), _226(12), _52(13), prephitmp_197(8)> on certain edge. At this point reg:HI 132 should contain one value (the current value of f after f = ~(f * (c ^ k) | l);, and reg:HI 87 should contain the current value of g variable. But in *.reload dump this is already: (code_label 294 125 127 16 24 "" [1 uses]) (note 127 294 458 16 [bb 16] NOTE_INSN_BASIC_BLOCK) (insn 458 127 128 16 (set (reg:HI 0 ax [orig:132 g.2_118 ] [132]) (reg:HI 6 bp [orig:132 g.2_118 ] [132])) 88 {*movhi_internal} (nil)) (insn 128 458 419 16 (set (reg:SI 3 bx [orig:107 _52 ] [107]) (sign_extend:SI (reg:HI 0 ax [orig:132 g.2_118 ] [132]))) 147 {extendhisi2} (nil)) (insn 419 128 13 16 (set (mem/c:SI (plus:SI (reg/f:SI 7 sp) (const_int 16 [0x10])) [4 %sfp+-32 S4 A64]) (reg:SI 3 bx [orig:107 _52 ] [107])) 86 {*movsi_internal} (nil)) (insn 13 419 479 16 (set (reg:DI 2 cx [orig:130 c.0_114 ] [130]) (reg:DI 4 si [orig:100 _42 ] [100])) pr68730-2.c:29 85 {*movdi_internal} (nil)) (insn 479 13 474 16 (set (reg/v:HI 5 di [orig:87 g ] [87]) (reg:HI 6 bp [orig:132 g.2_118 ] [132])) pr68730-2.c:29 88 {*movhi_internal} (nil)) (note 474 479 14 16 NOTE_INSN_DELETED) (insn 14 474 15 16 (set (mem/c:HI (plus:SI (reg/f:SI 7 sp) (const_int 24 [0x18])) [4 %sfp+-24 S2 A32]) (reg/v:HI 5 di [orig:87 g ] [87])) pr68730-2.c:29 88 {*movhi_internal} (nil)) (insn 15 14 348 16 (set (mem/c:HI (plus:SI (reg/f:SI 7 sp) (const_int 10 [0xa])) [4 %sfp+-38 S2 A16]) (reg:HI 0 ax [orig:132 g.2_118 ] [132])) pr68730-2.c:28 88 {*movhi_internal} (nil)) which means that both l contains the g.2_118 value (aka f), instead of the desired one (g). Vlad, can you please have a look?
Author: jakub Date: Mon Dec 14 08:04:37 2015 New Revision: 231606 URL: https://gcc.gnu.org/viewcvs?rev=231606&root=gcc&view=rev Log: PR rtl-optimization/68730 * cfgrtl.c (cfg_layout_finalize): Free dominators. * gcc.dg/pr68730.c: New test. Added: trunk/gcc/testsuite/gcc.dg/pr68730.c Modified: trunk/gcc/ChangeLog trunk/gcc/cfgrtl.c trunk/gcc/testsuite/ChangeLog
The wrong-code issue went away with r232413. Was that a fix for this, or just something that made the issue latent?
Reverting r232500 and r232413 results in the miscompilation again. So if those changes are optimizations only we have a latent issue.
(In reply to Jakub Jelinek from comment #11) > Reverting r232500 and r232413 results in the miscompilation again. So if > those changes are optimizations only we have a latent issue. Those changes fix performance regression and were not supposed to resolve any stability issue.
Investigating a bit.
It does look like an issue with lra-remat...
Author: bernds Date: Mon Feb 8 15:31:08 2016 New Revision: 233215 URL: https://gcc.gnu.org/viewcvs?rev=233215&root=gcc&view=rev Log: Fix latent LRA remat issue (PR68730) PR rtl-optimization/68730 * lra-remat.c (insn_to_cand_activation): New static variable. (lra_remat): Allocate and free it. (create_cand): New arg activation. Initialize a field in insn_to_cand_activation if it is nonnull. (create_cands): Pass the activation insn to create_cand when making a candidate involving an output reload. Reorganize code a little. (do_remat): Keep track of active status of candidates in a separate bitmap. Modified: trunk/gcc/ChangeLog trunk/gcc/lra-remat.c
I verified Bernd's committed patch fixes the issue (by going back to a Dec compiler to reproduce the runtime failure, then applying Bernd's patch and verifying the runtime behaviour is correct). Closing...
I was going to keep it open and ask for a 5.0 backport if it doesn't show negative effects in a week or two...
Ah, then reopened with a gcc-5 regression marker.
Author: bernds Date: Tue Feb 16 21:13:59 2016 New Revision: 233475 URL: https://gcc.gnu.org/viewcvs?rev=233475&root=gcc&view=rev Log: Backport lra-remat fix from mainline, PR68730 PR rtl-optimization/68730 * lra-remat.c (insn_to_cand_activation): New static variable. (lra_remat): Allocate and free it. (create_cand): New arg activation. Initialize a field in insn_to_cand_activation if it is nonnull. (create_cands): Pass the activation insn to create_cand when making a candidate involving an output reload. Reorganize code a little. (do_remat): Keep track of active status of candidates in a separate bitmap. Modified: branches/gcc-5-branch/gcc/ChangeLog branches/gcc-5-branch/gcc/lra-remat.c
Fixed everywhere.