Created attachment 40265
preprocessed source

seen with r243219 on powerpc64le-linux-gnu

$ g++ -std=gnu++11 -c -g -O3 -fPIE -mcpu=power8 uiuc_coef_drag.ii

eats all memory and then segfaults. hitting ^C after some time:

(gdb) bt
#0  vectorizable_load (stmt=0x3fffb3ddf910, gsi=0x3fffffffe250, vec_stmt=0x3fffffffe190, slp_node=0x11bb7720, slp_node_instance=<optimized out>)
    at ../../src/gcc/tree-vect-stmts.c:7394
#1  0x0000000010afcab4 in vect_transform_stmt (stmt=0x3fffb3ddf910, gsi=0x3fffffffe250, grouped_store=0x3fffffffe268, slp_node=0x11bb7720, slp_node_instance=<optimized out>)
    at ../../src/gcc/tree-vect-stmts.c:8562
#2  0x0000000010b19e40 in vect_schedule_slp_instance (node=0x11bb7720, instance=0x11c8bc30, vectorization_factor=1)
    at ../../src/gcc/tree-vect-slp.c:3742
#3  0x0000000010b19b74 in vect_schedule_slp_instance (node=0x11bc0010, instance=0x11c8bc30, vectorization_factor=1)
    at ../../src/gcc/tree-vect-slp.c:3620
#4  0x0000000010b19b74 in vect_schedule_slp_instance (node=0x11bc0050, instance=0x11c8bc30, vectorization_factor=1)
    at ../../src/gcc/tree-vect-slp.c:3620
#5  0x0000000010b19b74 in vect_schedule_slp_instance (node=0x11bc0090, instance=0x11c8bc30, vectorization_factor=1)
    at ../../src/gcc/tree-vect-slp.c:3620
#6  0x0000000010b19b74 in vect_schedule_slp_instance (node=0x11bc00d0, instance=0x11c8bc30, vectorization_factor=1)
    at ../../src/gcc/tree-vect-slp.c:3620
#7  0x0000000010b1d9b8 in vect_schedule_slp (vinfo=0x1190d7e0)
    at ../../src/gcc/tree-vect-slp.c:3814
#8  0x0000000010b1e26c in vect_slp_bb (bb=0x3fffb3beb050)
    at ../../src/gcc/tree-vect-slp.c:2816
#9  0x0000000010b2027c in (anonymous namespace)::pass_slp_vectorize::execute (this=<optimized out>, fun=0x3fffb3290420)
    at ../../src/gcc/tree-vectorizer.c:841
#10 0x00000000107cf2d8 in execute_one_pass (pass=0x11931430)
    at ../../src/gcc/passes.c:2370
#11 0x00000000107cfba4 in execute_pass_list_1 (pass=0x11931430)
    at ../../src/gcc/passes.c:2459
#12 0x00000000107cfbbc in execute_pass_list_1 (pass=0x11930b80)
    at ../../src/gcc/passes.c:2460
#13 0x00000000107cfbbc in execute_pass_list_1 (pass=0x1192f7f0)
    at ../../src/gcc/passes.c:2460
#14 0x00000000107cfc48 in execute_pass_list (fn=<optimized out>, pass=<optimized out>)
    at ../../src/gcc/passes.c:2470
#15 0x000000001048041c in cgraph_node::expand (this=0x3fffb366c0a0)
    at ../../src/gcc/cgraphunit.c:2001
#16 0x0000000010481f28 in expand_all_functions ()
    at ../../src/gcc/cgraphunit.c:2137
#17 symbol_table::compile (this=0x3fffb58c0000)
    at ../../src/gcc/cgraphunit.c:2494
#18 0x00000000104843ec in symbol_table::compile (this=0x3fffb58c0000)
    at ../../src/gcc/cgraphunit.c:2554
#19 symbol_table::finalize_compilation_unit (this=0x3fffb58c0000)
    at ../../src/gcc/cgraphunit.c:2584
#20 0x00000000108c4724 in compile_file ()
    at ../../src/gcc/toplev.c:488
#21 0x00000000101aa440 in do_compile ()
    at ../../src/gcc/toplev.c:1983
#22 toplev::main (this=0x3fffffffefc0, argc=<optimized out>, argv=<optimized out>)
    at ../../src/gcc/toplev.c:2117
#23 0x00000000101ac4e8 in main (argc=<optimized out>, argv=0x3ffffffff3e8)
    at ../../src/gcc/main.c:39
Moving to tree-optimization, SLP vectorizer issue.
I will have a looksee.
#12 0x00000000013e3c0a in vectorizable_load (
    stmt=<gimple_assign 0x2aaaaec39780>, gsi=0x7fffffffd140,
    vec_stmt=0x7fffffffd058, slp_node=0x296b970, slp_node_instance=0x278b4e0)
    at /space/rguenther/src/svn/trunk/gcc/tree-vect-stmts.c:7455
7455                                                   stmt, NULL_TREE);
(gdb) l
7450            {
7451              for (i = 0; i < vec_num; i++)
7452                {
7453                  if (i > 0)
7454                    dataref_ptr = bump_vector_ptr (dataref_ptr, ptr_incr, gsi,
7455                                                   stmt, NULL_TREE);
7456
7457                  /* 2. Create the vector-load in the loop.  */
7458                  switch (alignment_support_scheme)
7459                    {
(gdb) p vec_num
$1 = 7406755

ah, I thought we have fixed all those instances... (ah, no, I fixed cost
calculation!)

(gdb) p stmt_info->gap
$7 = 14716900

So the ultimate issue is that we are kind-of stupid when generating code for
SLP permutations. Maybe it's time to fix that...
Ok, not easily (that's even an understatement...). It's going to be

Index: gcc/tree-vect-data-refs.c
===================================================================
--- gcc/tree-vect-data-refs.c   (revision 243474)
+++ gcc/tree-vect-data-refs.c   (working copy)
@@ -2390,7 +2416,7 @@ vect_analyze_group_access_1 (struct data
       if (groupsize == 0)
         groupsize = count + gaps;
 
-      if (groupsize > UINT_MAX)
+      if (groupsize > 4096)
         {
           if (dump_enabled_p ())
             dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
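For readers who want the effect of the new limit in isolation, here is a minimal standalone sketch. It is not GCC code: `group_size_ok` and `MAX_GROUP_SIZE` are hypothetical names made up for illustration, and the sketch assumes the check simply rejects the group when it trips. Only the `groupsize = count + gaps` computation and the literal 4096 are taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the literal 4096 used in the patch.  */
#define MAX_GROUP_SIZE 4096

/* Hypothetical helper: return true if a group made of COUNT accesses
   plus GAPS skipped elements is small enough to be analyzed further.
   The real check lives inline in vect_analyze_group_access_1.  */
static bool
group_size_ok (uint64_t count, uint64_t gaps)
{
  uint64_t groupsize = count + gaps;
  return groupsize <= MAX_GROUP_SIZE;
}
```

With a gap in the range seen in the gdb session (14716900), such a check rejects the group up front instead of letting code generation loop millions of times; a small group such as 2 accesses with 6 gaps still passes.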
Fixed.
Author: rguenth
Date: Tue Dec 13 09:19:19 2016
New Revision: 243599

URL: https://gcc.gnu.org/viewcvs?rev=243599&root=gcc&view=rev
Log:
2016-12-13  Richard Biener  <rguenther@suse.de>

        PR tree-optimization/78699
        * tree-vect-data-refs.c (vect_analyze_group_access_1): Limit
        group size.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-vect-data-refs.c