Created attachment 31000 [details] Delta reduced C++ test case That's with gcc version 4.9.0 20131014 (experimental) [trunk revision 203511] (GCC) on an x86-64-gnu-linux system. $ g++ -std=c++11 -O1 -flto -fopenmp testcase20.ii testcase20.ii:40:1: internal compiler error: in lto_output_edge, at lto-cgraph.c:300 } // namespace std ^ Please submit a full bug report, It works when compiled without -fopenmp. The only OpenMP code is: #pragma omp simd reduction(+:area) for (int yi = 0; yi < count; ++yi) ... (where delta has removed the "area" assignment.)
The failure is for the assert: 295 /* Flags that should not appear on indirect calls. */ 296 gcc_assert (!(flags & (ECF_LOOPING_CONST_OR_PURE 297 | ECF_MAY_BE_ALLOCA 298 | ECF_SIBCALL 299 | ECF_LEAF 300 | ECF_NOVOPS))); Here, (flags & ECF_LEAF) != 0. The edge->caller is: (gdb) p dump_cgraph_node(stdout, edge->caller) _ZStL17DoFinalizeBooleanSt8XPBoolOpiPKSt5XPBoxPSt6vectorIS0_SaIS0_EE/0 (std::DoFinalizeBoolean(std::XPBoolOp, int, std::XPBox const*, std::vector<std::XPBox, std::allocator<std::XPBox> >*)) @0x2aaaac1f1e40 Type: function definition analyzed Visibility: prevailing_def_ironly References: Referring: Availability: local Function flags: body local Called by: _ZNSt7XPClass4finiEPSt6vectorISt5XPBoxSaIS1_EE/1 (1.00 per call) Calls: Has 3 outgoing edges for indirect calls. My suspicion is that this is due to: DEF_INTERNAL_FN (GOMP_SIMD_VF, ECF_CONST | ECF_LEAF | ECF_NOTHROW)
Confirmed. Even shorter testcase (compile with "-O -flto -fopenmp"): ======================================= void foo(int n) { int i; #pragma omp simd reduction(+:n) for (i = 0; i < 99; ++i) ; } ======================================= The regression appeared between 4.9-20131005 and 4.9.0-20131012.
Looks like cgraph doesn't know about internal functions and treats them as indirect calls ... I suppose we shouldn't record any call edges for internal fns which all should expand to non-calls (and should all be 'leaf')? Runs into interesting fallout though ... The other variant is to forcefully build non-indirect edges, also with interesting fallout. Bah. Honza? Internal functions are like builtins just we don't have a decl (and thus a cgraph node) for them. They are just special instructions, not really calls to "functions".
Yeah, I hit exactly this as well, with DEF_INTERNAL_FN (UBSAN_NULL, ECF_LEAF | ECF_NOTHROW). The UBSAN_NULL is just like a macro, it expands to a GIMPLE_COND later on. It has to be leaf - otherwise it wouldn't be possible to have it in the middle of a BB. Temporarily, I disabled the tests with -flto.
Created attachment 31299 [details] patch Patch fixing the testcase (but otherwise untested - we don't have too many internal fn testcases).
(In reply to Richard Biener from comment #5) > Created attachment 31299 [details] > Patch fixing the testcase (but otherwise untested - we don't have too many > internal fn testcases). Doesn't work for me. With the test case of comment 0 (attachment 31000 [details]) I get with g++ -std=c++11 -fopenmp -flto -O1 test.ii the - new - ICE cc1plus: internal compiler error: in expand_call_inline, at tree-inline.c:4076 0xc046a9 expand_call_inline ../../gcc/tree-inline.c:4076 0xc046a9 gimple_expand_calls_inline ../../gcc/tree-inline.c:4430 0xc046a9 optimize_inline_calls(tree_node*) ../../gcc/tree-inline.c:4569 0x109bbdb inline_transform(cgraph_node*) ../../gcc/ipa-inline-transform.c:435
(In reply to Tobias Burnus from comment #6) > (In reply to Richard Biener from comment #5) > > Created attachment 31299 [details] > > Patch fixing the testcase (but otherwise untested - we don't have too many > > internal fn testcases). > Doesn't work for me. If one dumps the bb, one finds a call to GOMP_SIMD_VF. Using the following patch on top of yours fixes the issue for the reduced test case and for the compilation to the *.o files in the real-world code: --- a/gcc/tree-inline.c +++ b/gcc/tree-inline.c @@ -4066,5 +4070,5 @@ expand_call_inline (basic_block bb, gimple stmt, copy_body_data *id) /* From here on, we're only interested in CALL_EXPRs. */ - if (gimple_code (stmt) != GIMPLE_CALL) + if (gimple_code (stmt) != GIMPLE_CALL || gimple_call_internal_p (stmt)) goto egress; * * * Unfortunately, the lto1 phase still fails for the big code. For that one, I get: xpboolean.cc:3744:8: internal compiler error: in expand_GOMP_SIMD_VF, at internal-fn.c:137 0x7b48b7 expand_GOMP_SIMD_VF ../../gcc/internal-fn.c:137 0x5fe497 expand_call_stmt ../../gcc/cfgexpand.c:2164 0x5fe497 expand_gimple_stmt_1 ../../gcc/cfgexpand.c:3143 0x5fe497 expand_gimple_stmt ../../gcc/cfgexpand.c:3295
Created attachment 31302 [details] Small test case: test.tar.gz I managed to create a test case. The attached test case is essentially attachment 31000 [details], except of changing it to a .cc file and adding #include<vector> - and writing a simple main() function for it. $ g++ -std=c++11 -flto -fopenmp-simd -Ofast test.cc test-main.cc test.cc: In member function 'fini': test.cc:31:6: internal compiler error: in expand_GOMP_SIMD_LANE, at internal-fn.c:129 void XPClass::fini(std::vector<XPBox> *__restrict out) noexcept { ^ 0x7b48d7 expand_GOMP_SIMD_LANE ../../gcc/internal-fn.c:129
We fail to stream cfun->has_force_vect_loops.
Created attachment 31305 [details] updated patch lto1 also does not get -fopenmp[-simd] passed, not sure if that is needed though. The simple testcase from comment #2 already fails with > ./xgcc -B. -B ../x86_64-unknown-linux-gnu/libgomp/ -B ../x86_64-unknown-linux-gnu/libgomp/.libs -O -flto -fopenmp t.i -r -nostdlib t.i: In function 'foo': t.i:1:6: internal compiler error: in expand_GOMP_SIMD_VF, at internal-fn.c:137 void foo(int n) ^ 0x8c4d19 expand_GOMP_SIMD_VF updated patch attached that fixes the streaming and thus runs the vectorizer at -O1 during LTRANs (but it does nothing).
Created attachment 31306 [details] updated patch And there is also has_simduid_loops and loop->safelen and loop->force_vect. Also loop->simduid but 1) I don't know how to stream it, 2) it seems to work without. But I see the vectorizer looks at DECL_UID (loop->simduid)), so "work without" may mean it just doesn't vectorize the thing but cleans up.
Created attachment 31307 [details] updated patch Indeed. t.i:1:6: note: === vect_analyze_data_refs === t.i:1:6: note: not vectorized: loop contains function calls or data references that cannot be analyzed t.i:1:6: note: bad data references. t.i:1:6: note: vectorized 0 loops in function. Ok, with simply streaming simduid as tree it "works". Not sure if we want to stream trees into the CFG section though ... (or if it even really works). What's the reason that simduid is a tree?
Eh. make check-gcc RUNTESTFLAGS="--target_board=unix/-flto/-ffat-lto-objects gomp.exp" FAIL: gcc.dg/gomp/declare-simd-1.c (internal compiler error) FAIL: gcc.dg/gomp/declare-simd-1.c (test for excess errors) But otherwise clean. /space/rguenther/src/svn/trunk/gcc/testsuite/gcc.dg/gomp/declare-simd-1.c:100:1: internal compiler error: tree code 'omp_clause' is not supported in LTO streams^M 0xa6dbe4 lto_write_tree^M /space/rguenther/src/svn/trunk/gcc/lto-streamer-out.c:376^M seems like FUNCTION_DECLs now refer to OMP_CLAUSE? Ah... via attributes :/ Jakub - please add tree streaming support for the relevant required stuff.
(In reply to Richard Biener from comment #9) > We fail to stream cfun->has_force_vect_loops. Note we don't necessarily have to stream that (and has_simduid_loops), the alternative is to compute it during streaming in the IL - if it sees any force_vect loop, it can set fun->has_force_vect_loops for the loop, similarly for non-zero simdlen and fun->has_simduid_loops. Those two flags are just (pessimistically correct) flags to avoid scanning the whole IL for it, but LTO scans the whole IL anyway...
(In reply to Richard Biener from comment #13) > Eh. > > make check-gcc RUNTESTFLAGS="--target_board=unix/-flto/-ffat-lto-objects > gomp.exp" > > FAIL: gcc.dg/gomp/declare-simd-1.c (internal compiler error) > FAIL: gcc.dg/gomp/declare-simd-1.c (test for excess errors) > > But otherwise clean. > > /space/rguenther/src/svn/trunk/gcc/testsuite/gcc.dg/gomp/declare-simd-1.c: > 100:1: internal compiler error: tree code 'omp_clause' is not supported in > LTO streams^M > 0xa6dbe4 lto_write_tree^M > /space/rguenther/src/svn/trunk/gcc/lto-streamer-out.c:376^M > > seems like FUNCTION_DECLs now refer to OMP_CLAUSE? Ah... via attributes :/ > > Jakub - please add tree streaming support for the relevant required stuff. Yeah, that is the current reason why the new elementals tests to be committed also fail with LTO. I'll hack on that.
Author: rguenth Date: Wed Nov 27 15:18:23 2013 New Revision: 205447 URL: http://gcc.gnu.org/viewcvs?rev=205447&root=gcc&view=rev Log: 2013-11-27 Richard Biener <rguenther@suse.de> PR middle-end/58723 * cgraphbuild.c (build_cgraph_edges): Do not build edges for internal calls. (rebuild_cgraph_edges): Likewise. * ipa-inline-analysis.c (estimate_function_body_sizes): Skip internal calls. * tree-inline.c (estimate_num_insns): Estimate size of internal calls as 0. (gimple_expand_calls_inline): Do not try inline-expanding internal calls. * lto-streamer-in.c (input_cfg): Stream loop safelen, force_vect and simduid. (input_struct_function_base): Stream has_force_vect_loops and has_simduid_loops. (input_function): Adjust. * lto-streamer-out.c (output_cfg): Stream loop safelen, force_vect and simduid. (output_struct_function_base): Stream has_force_vect_loops and has_simduid_loops. Modified: trunk/gcc/ChangeLog trunk/gcc/cgraphbuild.c trunk/gcc/ipa-inline-analysis.c trunk/gcc/lto-streamer-in.c trunk/gcc/lto-streamer-out.c trunk/gcc/tree-inline.c
Fixed. The OMP_CLAUSE issue remains but is really a separate thing.