GCC fails with an internal compiler error. Sometimes the compilation takes about 12 minutes before it fails. The bug was originally reported in PR91178.

Reproducer:

extern int a[][1240092];
int b;
void c() {
  for (int d = 2; d <= 9; d++)
    for (int e = 32; e <= 41; e++)
      b += a[d][5];
}

Error:

>$ gcc -march=skylake-avx512 -c -O3 small.c
gcc: internal compiler error: Segmentation fault signal terminated program cc1

GCC version: gcc version 10.0.0 (rev. 274155)
Doesn't ICE for me, but SLP during vectorization goes wild. For a very short *.ifcvt dump:

  <bb 2> [local count: 12199019]:
  b_lsm.4_24 = b;

  <bb 3> [local count: 97603136]:
  # d_69 = PHI <2(2), d_9(5)>
  # b_lsm.4_17 = PHI <b_lsm.4_24(2), _79(5)>
  # ivtmp_76 = PHI <8(2), ivtmp_73(5)>
  _14 = a[d_69][5];
  _16 = _14 + b_lsm.4_17;
  _23 = _14 + _16;
  _30 = _14 + _23;
  _37 = _14 + _30;
  _44 = _14 + _37;
  _51 = _14 + _44;
  _58 = _14 + _51;
  _65 = _14 + _58;
  _72 = _14 + _65;
  _79 = _14 + _72;
  d_9 = d_69 + 1;
  ivtmp_73 = ivtmp_76 - 1;
  if (ivtmp_73 != 0)
    goto <bb 5>; [87.50%]
  else
    goto <bb 4>; [12.50%]

  <bb 5> [local count: 85404116]:
  goto <bb 3>; [100.00%]

  <bb 4> [local count: 12199019]:
  # _80 = PHI <_79(3)>
  b = _80;
  return;

it creates a *.vect dump with 1860297 lines, containing:

  vect__14.10_5 = MEM <vector(8) int> [(int *)vectp_a.8_7];
  vectp_a.8_4 = vectp_a.8_7 + 32;
  vect__14.11_3 = MEM <vector(8) int> [(int *)vectp_a.8_4];
  vectp_a.8_2 = vectp_a.8_4 + 32;
  vect__14.12_1 = MEM <vector(8) int> [(int *)vectp_a.8_2];
  vectp_a.8_10 = vectp_a.8_2 + 32;
  vect__14.13_15 = MEM <vector(8) int> [(int *)vectp_a.8_10];
  vectp_a.8_31 = vectp_a.8_10 + 32;
  ...

where there are 620046 of those MEM loads that nothing consumes, and 620046 additions of 32.
Crashes here in predcom:

Program received signal SIGSEGV, Segmentation fault.
...
(gdb) bt -100
...
#662173 0x00000000012f7c09 in follow_ssa_edge (loop=0x7ffff78cd000, def=0x7fffd8be5160, halting_phi=0x7ffff78d4200, evolution_of_loop=0x7fffffffdf00, limit=0) at ../../gcc/gcc/tree-scalar-evolution.c:1350
#662174 0x00000000012f6f22 in follow_ssa_edge_binary (loop=0x7ffff78cd000, at_stmt=0x7fffd8be51b8, type=0x7ffff78c31f8, rhs0=0x7fffd8be6120, code=POINTER_PLUS_EXPR, rhs1=0x7ffff7775d38, halting_phi=0x7ffff78d4200, evolution_of_loop=0x7fffffffdf00, limit=0) at ../../gcc/gcc/tree-scalar-evolution.c:947
#662175 0x00000000012f769e in follow_ssa_edge_in_rhs (loop=0x7ffff78cd000, stmt=0x7fffd8be51b8, halting_phi=0x7ffff78d4200, evolution_of_loop=0x7fffffffdf00, limit=0) at ../../gcc/gcc/tree-scalar-evolution.c:1135
#662176 0x00000000012f7c09 in follow_ssa_edge (loop=0x7ffff78cd000, def=0x7fffd8be51b8, halting_phi=0x7ffff78d4200, evolution_of_loop=0x7fffffffdf00, limit=0) at ../../gcc/gcc/tree-scalar-evolution.c:1350
#662177 0x00000000012f6f22 in follow_ssa_edge_binary (loop=0x7ffff78cd000, at_stmt=0x7ffff78ccf20, type=0x7ffff78c31f8, rhs0=0x7fffd8be61b0, code=POINTER_PLUS_EXPR, rhs1=0x7ffff7775d38, halting_phi=0x7ffff78d4200, evolution_of_loop=0x7fffffffdf00, limit=0) at ../../gcc/gcc/tree-scalar-evolution.c:947
#662178 0x00000000012f769e in follow_ssa_edge_in_rhs (loop=0x7ffff78cd000, stmt=0x7ffff78ccf20, halting_phi=0x7ffff78d4200, evolution_of_loop=0x7fffffffdf00, limit=0) at ../../gcc/gcc/tree-scalar-evolution.c:1135
#662179 0x00000000012f7c09 in follow_ssa_edge (loop=0x7ffff78cd000, def=0x7ffff78ccf20, halting_phi=0x7ffff78d4200, evolution_of_loop=0x7fffffffdf00, limit=0) at ../../gcc/gcc/tree-scalar-evolution.c:1350
#662180 0x00000000012f812d in analyze_evolution_in_loop (loop_phi_node=0x7ffff78d4200, init_cond=0x7ffff78ce0d8) at ../../gcc/gcc/tree-scalar-evolution.c:1467
#662181 0x00000000012f864d in interpret_loop_phi (loop=0x7ffff78cd000, loop_phi_node=0x7ffff78d4200) at ../../gcc/gcc/tree-scalar-evolution.c:1630
#662182 0x00000000012fa1bd in analyze_scalar_evolution_1 (loop=0x7ffff78cd000, var=0x7ffff777fee8) at ../../gcc/gcc/tree-scalar-evolution.c:2044
#662183 0x00000000012fa393 in analyze_scalar_evolution (loop=0x7ffff78cd000, var=0x7ffff777fee8) at ../../gcc/gcc/tree-scalar-evolution.c:2108
#662184 0x00000000012fa49e in analyze_scalar_evolution_in_loop (wrto_loop=0x7ffff78cd000, use_loop=0x7ffff78cd000, version=0x7ffff777fee8, folded_casts=0x7fffffffe0df) at ../../gcc/gcc/tree-scalar-evolution.c:2210
#662185 0x00000000012fd082 in simple_iv_with_niters (wrto_loop=0x7ffff78cd000, use_loop=0x7ffff78cd000, op=0x7ffff777fee8, iv=0x7fffffffe280, iv_niters=0x0, allow_nonconstant_step=true) at ../../gcc/gcc/tree-scalar-evolution.c:3288
#662186 0x00000000012fd8e0 in simple_iv (wrto_loop=0x7ffff78cd000, use_loop=0x7ffff78cd000, op=0x7ffff777fee8, iv=0x7fffffffe280, allow_nonconstant_step=true) at ../../gcc/gcc/tree-scalar-evolution.c:3413
#662187 0x000000000207e0ef in dr_analyze_innermost (drb=0x3166a50, ref=0x7ffff78e11b8, loop=0x7ffff78cd000, stmt=0x7ffff78d8dc0) at ../../gcc/gcc/tree-data-ref.c:950
#662188 0x000000000207f342 in create_data_ref (nest=0x7ffff78d0f00, loop=0x7ffff78cd000, memref=0x7ffff78e11b8, stmt=0x7ffff78d8dc0, is_read=true, is_conditional_in_stmt=false) at ../../gcc/gcc/tree-data-ref.c:1255
#662189 0x0000000002089d1c in find_data_references_in_stmt (nest=0x7ffff78cd000, stmt=0x7ffff78d8dc0, datarefs=0x7fffffffe7f8) at ../../gcc/gcc/tree-data-ref.c:5149
#662190 0x0000000002089f0e in find_data_references_in_bb (loop=0x7ffff78cd000, bb=0x7ffff78da410, datarefs=0x7fffffffe7f8) at ../../gcc/gcc/tree-data-ref.c:5203
#662191 0x0000000002089fce in find_data_references_in_loop (loop=0x7ffff78cd000, datarefs=0x7fffffffe7f8) at ../../gcc/gcc/tree-data-ref.c:5236
#662192 0x000000000208a604 in compute_data_dependences_for_loop (loop=0x7ffff78cd000, compute_self_and_read_read_dependences=true, loop_nest=0x7fffffffe750, datarefs=0x7fffffffe7f8, dependence_relations=0x7fffffffe7f0) at ../../gcc/gcc/tree-data-ref.c:5411
#662193 0x00000000012d7b7e in tree_predictive_commoning_loop (loop=0x7ffff78cd000) at ../../gcc/gcc/tree-predcom.c:3192
#662194 0x00000000012d8134 in tree_predictive_commoning () at ../../gcc/gcc/tree-predcom.c:3314
#662195 0x00000000012d81b4 in run_tree_predictive_commoning (fun=0x7ffff78c6000) at ../../gcc/gcc/tree-predcom.c:3339
#662196 0x00000000012d8222 in (anonymous namespace)::pass_predcom::execute (this=0x2fc7cb0, fun=0x7ffff78c6000) at ../../gcc/gcc/tree-predcom.c:3368
#662197 0x000000000102b1c3 in execute_one_pass (pass=0x2fc7cb0) at ../../gcc/gcc/passes.c:2474
...

Bisected to r256634, reverting manually "fixes" the problem.
Yeah, I know. I tried to fix this for PR91178 but needed to revert. The vectorizer issue uncovers places where we run into compile-time issues with these large increment chains and I've fixed some but in the end a vectorizer fix is required. My mind tells me that at some point we limited the group gap but I didn't find records of that. In the end it should be "easy" to avoid the bad codegen, but vectorizable_load is kind-of a mess and hairy to adjust ... I'll give it another try.
OK, so I have a patch to fix the recursion depth in SCEV analysis but then we hit the next one in SLSR, in my case because with -O0 there's no tailcall performed but even with -O2 we don't tailcall it.

#8  0x0000000001fbb1d7 in replace_unconditional_candidate (c=0x200a1b60) at /space/rguenther/src/svn/trunk2/gcc/gimple-ssa-strength-reduction.c:2223
#9  0x0000000001fbc47a in replace_uncond_cands_and_profitable_phis (c=0x200a1b60) at /space/rguenther/src/svn/trunk2/gcc/gimple-ssa-strength-reduction.c:2625
#10 0x0000000001fbc4bc in replace_uncond_cands_and_profitable_phis (c=0x200a1ae0) at /space/rguenther/src/svn/trunk2/gcc/gimple-ssa-strength-reduction.c:2631
...
#599120 0x0000000001fbc4bc in replace_uncond_cands_and_profitable_phis (c=0x33ba1b0) at /space/rguenther/src/svn/trunk2/gcc/gimple-ssa-strength-reduction.c:2631
2631        replace_uncond_cands_and_profitable_phis (lookup_cand (c->dependent));

Given the structure, a worklist would be necessary to fix things there.
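For illustration, here is a minimal sketch of what a worklist version of such a walk could look like. This is not the SLSR code: struct cand, replace_one, replace_all and walk_chain are hypothetical names, and the only thing taken from the backtrace above is the shape of the problem (one recursive call per dependent candidate, hundreds of thousands deep). The sibling link is an assumption added to show why a plain loop is not enough when a node has two successors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a candidate: each node may have a sibling
   and a dependent, so the structure is a tree that can degenerate into
   a very long chain.  */
struct cand
{
  struct cand *sibling;
  struct cand *dependent;
  int replaced;
};

/* Hypothetical per-candidate work; a real pass would rewrite a stmt.  */
static void
replace_one (struct cand *c)
{
  c->replaced = 1;
}

/* Visit C and everything reachable from it without recursing.
   Returns the number of candidates visited.  */
static size_t
replace_all (struct cand *c)
{
  size_t cap = 16, top = 0, visited = 0;
  struct cand **stack = malloc (cap * sizeof *stack);
  assert (stack != NULL);
  if (c)
    stack[top++] = c;

  while (top > 0)
    {
      struct cand *cur = stack[--top];
      replace_one (cur);
      visited++;

      /* Each pop pushes at most two entries, so make room for two.  */
      if (top + 2 > cap)
        {
          cap *= 2;
          stack = realloc (stack, cap * sizeof *stack);
          assert (stack != NULL);
        }
      /* Push the dependent first so the sibling is popped first.  */
      if (cur->dependent)
        stack[top++] = cur->dependent;
      if (cur->sibling)
        stack[top++] = cur->sibling;
    }

  free (stack);
  return visited;
}

/* Build a chain of N candidates linked through `dependent`, walk it
   with the worklist, and return how many candidates were visited.  */
static size_t
walk_chain (size_t n)
{
  if (n == 0)
    return 0;
  struct cand *nodes = calloc (n, sizeof *nodes);
  assert (nodes != NULL);
  for (size_t i = 0; i + 1 < n; i++)
    nodes[i].dependent = &nodes[i + 1];
  size_t visited = replace_all (&nodes[0]);
  free (nodes);
  return visited;
}
```

Calling walk_chain (600000) walks a chain about as long as the one in the backtrace while keeping the pending work on the heap, so the call stack stays at a fixed depth.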
Author: rguenth
Date: Mon Aug 19 14:45:38 2019
New Revision: 274672

URL: https://gcc.gnu.org/viewcvs?rev=274672&root=gcc&view=rev
Log:
2019-08-19  Richard Biener  <rguenther@suse.de>

    PR tree-optimization/91403
    * tree-scalar-evolution.c (follow_ssa_edge_binary): Inline
    cases we can handle with tail-recursion...
    (follow_ssa_edge_expr): ... here.  Do so.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-scalar-evolution.c
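As a rough illustration of why handling the tail-recursive cases in place helps, the sketch below walks a chain of "pointer + constant" definitions, first recursively and then as a loop. It is not the tree-scalar-evolution.c code: struct def, total_increment_recursive, total_increment_iterative and chain_total are hypothetical names, and the chain only models the shape of the vectp_a.8_N = vectp_a.8_M + 32 statements from the *.vect dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of one "ptr = prev + increment" definition.  The
   chain ends at a node with prev == NULL, standing in for the loop PHI.  */
struct def
{
  const struct def *prev;
  long increment;
};

/* Recursive form: one stack frame per link unless the compiler happens
   to turn the tail call into a jump.  */
static long
total_increment_recursive (const struct def *d, long acc)
{
  if (d->prev == NULL)
    return acc;
  return total_increment_recursive (d->prev, acc + d->increment);
}

/* Same computation with the tail call written as a loop, so stack use
   is constant no matter how long the chain is.  */
static long
total_increment_iterative (const struct def *d)
{
  long acc = 0;
  while (d->prev != NULL)
    {
      acc += d->increment;
      d = d->prev;
    }
  return acc;
}

/* Build a chain of N increments of 32 and sum them with either form.  */
static long
chain_total (size_t n, int use_recursive)
{
  struct def *defs = calloc (n + 1, sizeof *defs);
  assert (defs != NULL);
  /* defs[0] is the PHI stand-in (prev == NULL).  */
  for (size_t i = 1; i <= n; i++)
    {
      defs[i].prev = &defs[i - 1];
      defs[i].increment = 32;
    }
  long total = use_recursive
               ? total_increment_recursive (&defs[n], 0)
               : total_increment_iterative (&defs[n]);
  free (defs);
  return total;
}
```

The iterative form handles chain_total (620046, 0) with a fixed stack depth, whereas the recursive form at that length depends on the optimizer performing the tail call.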
*** Bug 96834 has been marked as a duplicate of this bug. ***
We can put in a band-aid like the one for PR78699.
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:84684e0f78c20c51492722a5b95cda778ad77073

commit r11-6589-g84684e0f78c20c51492722a5b95cda778ad77073
Author: Richard Biener <rguenther@suse.de>
Date:   Mon Jan 11 12:04:32 2021 +0100

    tree-optimization/91403 - avoid excessive code-generation

    The vectorizer, for large permuted grouped loads, generates
    inefficient intermediate code (cleaned up only later) that
    runs into complexity issues in SCEV analysis and elsewhere.

    For the non-single-element interleaving case we already put
    a hard limit in place, this applies the same limit to the
    missing case.

    2021-01-11  Richard Biener  <rguenther@suse.de>

            PR tree-optimization/91403
            * tree-vect-data-refs.c (vect_analyze_group_access_1): Cap
            single-element interleaving group size at 4096 elements.

            * gcc.dg/vect/pr91403.c: New testcase.
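To make the effect of the cap concrete, here is a small sketch of the check applied to the reproducer. It is not the code from tree-vect-data-refs.c: group_size and single_element_interleaving_ok are hypothetical helpers, and it assumes the group size is the access step divided by the element size. Only the 4096 limit comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Limit named in the commit message above.  */
#define MAX_SINGLE_ELEMENT_GROUP 4096

/* Hypothetical helper: group size in elements for a load whose address
   advances by STEP_BYTES per iteration and reads TYPE_BYTES each time.
   Returns 0 when the step is not a whole number of elements.  */
static uint64_t
group_size (int64_t step_bytes, uint64_t type_bytes)
{
  assert (type_bytes != 0);
  uint64_t abs_step = step_bytes < 0
                      ? 0 - (uint64_t) step_bytes
                      : (uint64_t) step_bytes;
  if (abs_step % type_bytes != 0)
    return 0;
  return abs_step / type_bytes;
}

/* Accept the access as single-element interleaving only when the group
   is non-empty and no larger than the cap.  */
static bool
single_element_interleaving_ok (int64_t step_bytes, uint64_t type_bytes)
{
  uint64_t n = group_size (step_bytes, type_bytes);
  return n > 0 && n <= MAX_SINGLE_ELEMENT_GROUP;
}
```

In the reproducer, a[d][5] moves by one row of 1240092 ints per iteration of the outer loop, so under the assumption above the group size is 1240092 and single_element_interleaving_ok (1240092LL * 4, 4) is false. A step of 4096 ints would still be accepted.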
Fixed on trunk so far.