Started with the same revision as PR87830, I see segfault of the benchmark.
Would be possible to analyze this a bit? The patch does have effect on optimizers because we produce a lot fewer MEM_REFs on type mismatches. Of course this should not trigger wrong code but it also may be some bug in benchmark or so.
(In reply to Jan Hubicka from comment #1) > Would be possible to analyze this a bit? I would leave it to you. Note that the same happens for SPEC2006 403.gcc benchmark: $ gdb --args /home/marxin/Programming/cpu2006/benchspec/CPU2006/403.gcc/run/run_peak_ref_amd64-m64-mine.0001/gcc_peak.amd64-m64-mine /home/marxin/Programming/cpu2006/benchspec/CPU2006/403.gcc/data/ref/input/scilab.in $ Breakpoint 1, remove_useless_values () at cselib.c:394 394 abort (); (gdb) bt #0 remove_useless_values () at cselib.c:394 #1 cselib_process_insn (insn=0x1200a40) at cselib.c:1377 #2 0x000000000046fd48 in reload_cse_regs_1 (first=<optimized out>) at reload1.c:8172 #3 0x00000000004700db in reload_cse_regs (first=0xd928c0) at reload1.c:8186 #4 0x000000000044a3ec in rest_of_compilation (decl=0x954000) at toplev.c:3254 #5 0x00000000005edd56 in c_expand_body.part.1.lto_priv.1473 (fndecl=fndecl@entry=0x954000, nested_p=nested_p@entry=0, can_defer_p=can_defer_p@entry=1) at c-decl.c:7119 #6 0x00000000005ee710 in c_expand_body (can_defer_p=1, nested_p=0, fndecl=0x954000) at c-decl.c:7024 #7 finish_function (nested=0, can_defer_p=1) at c-decl.c:6986 #8 0x00000000006072c0 in yyparse_1 () at c-parse.c:2186 #9 0x0000000000402fec in yyparse () at c-lex.c:164 #10 compile_file () at toplev.c:2126 #11 do_compile () at toplev.c:5221 #12 toplev_main (argv=<optimized out>, argc=<optimized out>) at toplev.c:5255 #13 main (argc=<optimized out>, argv=<optimized out>) at main.c:35 The patch does have effect on > optimizers because we produce a lot fewer MEM_REFs on type mismatches. Of > course this should not trigger wrong code but it also may be some bug in > benchmark or so. I doubt that.
For 403.gcc one only needs: -O2 -g -flto=8
It indeed looks interesting. - Crash indeed goes away if one disables type merging. - Fat LTO objects are OK so it does not seem to be free lang data confusing early opts. - remove_useless_values looks identical in both binaries. So it indeed seems to be consequence of having fewer MEM_REFs in the stmt stream. I will try to localize it better. Honza
I think this is caused by misoptimizing void **x; void *info ATTRIBUTE_UNUSED; { cselib_val *v = (cselib_val *)*x; struct elt_loc_list **p = &v->locs; int had_locs = v->locs != 0; while (*p) { if (references_value_p ((*p)->loc, 1)) unchain_one_elt_loc_list (p); else p = &(*p)->next; } if (had_locs && v->locs == 0) { n_useless_values++; values_became_useless = 1; } return 1; } where fre1 after type cleaning concludes that v->locs is unchanged while it is unchanged in: static void unchain_one_elt_loc_list (pl) struct elt_loc_list **pl; { struct elt_loc_list *l = *pl; *pl = l->next; l->next = empty_elt_loc_lists; empty_elt_loc_lists = l; } So it seems that in some cases alias analysis gets lost in pointer dereferences while we are still in early optimization. Perhaps we want to produce alias sets earlier? Honza
If we have less MEM_REFs then we probably strip them because we think they reference equal types. I think I already told you that given that MEM_REFs use pointer types to carry alignment info _those_ may not become incomplete! But I didn't expect that to cause wrong-code but missed optimizations.
> If we have less MEM_REFs then we probably strip them because we think they > reference equal types. > > I think I already told you that given that MEM_REFs use pointer types > to carry alignment info _those_ may not become incomplete! But I didn't > expect that to cause wrong-code but missed optimizations. We do not make them incomplete. The problem actually seems to be in early optimization where we optimize out the if conditional above. Not sure why -ffat-lto-objects worked in this context.
(In reply to Jan Hubicka from comment #7) > > If we have less MEM_REFs then we probably strip them because we think they > > reference equal types. > > > > I think I already told you that given that MEM_REFs use pointer types > > to carry alignment info _those_ may not become incomplete! But I didn't > > expect that to cause wrong-code but missed optimizations. > > We do not make them incomplete. The problem actually seems to be in > early optimization where we optimize out the if conditional above. > Not sure why -ffat-lto-objects worked in this context. So can you attach preprocessed source for the affected file? And name the affected function? (dump is stripped too early)
Created attachment 44946 [details] reproducer I am attaching the preprocessed file and will be away till 2pm. What seems to be wrong is that we optimize out decrease of n_useless_values in discard_useless_locs. fre1 already differs: discard_useless_locs (void * * x, void * info) { struct elt_loc_list * l; @@ -8886,9 +8895,6 @@ struct rtx_def * _4; int _5; struct elt_loc_list * _7; - struct elt_loc_list * _8; - int n_useless_values.142_9; - int _10; struct elt_loc_list * _25; struct elt_loc_list * empty_elt_loc_lists.98_26; @@ -8947,27 +8953,6 @@ <bb 7> : # DEBUG BEGIN_STMT - if (_1 != 0B) - goto <bb 8>; [INV] - else - goto <bb 10>; [INV] - - <bb 8> : - _8 = v_16->locs; - if (_8 == 0B) - goto <bb 9>; [INV] - else - goto <bb 10>; [INV] - - <bb 9> : - # DEBUG BEGIN_STMT - n_useless_values.142_9 = n_useless_values; - _10 = n_useless_values.142_9 + 1; - n_useless_values = _10; - # DEBUG BEGIN_STMT - values_became_useless = 1; - - <bb 10> : # DEBUG BEGIN_STMT
I can only see that v->locs might be affected by fld because the type of the FIELD_DECL changes but the (alias) type of *p_11 remains the same. Thus we have get_alias_set (ptr-to-incomplete) and get_alias_set (ptr-to-complete) not agreeing. But of course they have to. I guess we can make a two-unit LTO testcase like the following - but it doesn't miscompile since we end up substituting the MEM_REF base type for the store in foo() somehow so even with more fiddling I always get *p_5 = &a; py ={v} &y; _1 ={v} py; MEM[(struct Y *)_1].p = &b; ^^^ will not use the alias set of .p _2 = *p_5; struct X; struct Y { struct X *p; }; void foo (struct Y *p, struct X *v) { p->p = v; } --- struct X { int i; }; struct Y { struct X *p; }; void foo (struct Y *, struct X *); struct X ** volatile px; struct X a, b; int main() { struct Y y; px = &y.p; struct X **p = px; *p = &a; foo (&y, &b); if (*p != &b) __builtin_abort (); return 0; }
That said - we used to give all pointer types the same alias-set but you somehow convinced yourself that not doing that is safe. Even when considering pointer-to-complete and pointer-to-incomplete types. Do you remember any details?
OK, so in GCC 8 at least pointer-to-incomplete type gets the alias set of void * and that conflicts with any other pointer. So that works. Not sure what breaks here now...
So the alias machinery disambiguates them at static bool indirect_refs_may_alias_p (tree ref1 ATTRIBUTE_UNUSED, tree base1, poly_int64 offset1, poly_int64 max_size1, ... /* Do type-based disambiguation. */ if (base1_alias_set != base2_alias_set && !alias_sets_conflict_p (base1_alias_set, base2_alias_set)) return false; where base2_alias_set == ref2_alias_set from *p_11 and ref1_alias_set == ref2_alias_set but base1_alias_set == 22 (from v_16->locs). And somehow the alias-set for *v_16 doesn't have v_16->locs as child (well, it probably has the pointer-to-complete one as child since we built the alias-set for *v_16 _before_ adjusting the FIELD_DECLs type!?)
The following does _not_ fix it (but an assert that the alias-set is -1 does trigger). We probably have to adjust all types the record parent is embedded into as well for which there's no easy way. Well. Not compute any alias-sets before free-lang-data .... -Wstrict-aliasing computes it for example, so does folding, for example in make_bit_field_ref (in fact that seems to be the only caller...). diff --git a/gcc/tree.c b/gcc/tree.c index 069d62d51be..47cbbaab9b5 100644 --- a/gcc/tree.c +++ b/gcc/tree.c @@ -5515,7 +5515,10 @@ free_lang_data_in_decl (tree decl, struct free_lang_data_d *fld) } else if (TREE_CODE (decl) == FIELD_DECL) { + tree orig = TREE_TYPE (decl); TREE_TYPE (decl) = fld_simplified_type (TREE_TYPE (decl), fld); + if (TREE_TYPE (decl) != orig) + TYPE_ALIAS_SET (DECL_CONTEXT (decl)) = -1; DECL_INITIAL (decl) = NULL_TREE; } else if (TREE_CODE (decl) == TRANSLATION_UNIT_DECL
But the following fixes it: diff --git a/gcc/alias.c b/gcc/alias.c index 7963ece291a..4c88c0980d3 100644 --- a/gcc/alias.c +++ b/gcc/alias.c @@ -1235,14 +1235,14 @@ record_component_aliases (tree type) Accesses to it conflicts with accesses to any other pointer type. */ tree t = TREE_TYPE (field); - if (in_lto_p) + if (1) { /* VECTOR_TYPE and ARRAY_TYPE share the alias set with their element type and that type has to be normalized to void *, too, in the case it is a pointer. */ while (!canonical_type_used_p (t) && !POINTER_TYPE_P (t)) { - gcc_checking_assert (TYPE_STRUCTURAL_EQUALITY_P (t)); + gcc_checking_assert (!in_lto_p || TYPE_STRUCTURAL_EQUALITY_P (t)); t = TREE_TYPE (t); } if (POINTER_TYPE_P (t))
I don't see the miscompilation any longer, may I close it?
> I don't see the miscompilation any longer, may I close it? Yes, it was fixed by * tree.c (fld_type_variant): Copy canonical type. (fld_incomplete_type_of): Check that canonical types looks sane; copy canonical type. (verify_type): Accept when incomplete type has complete canonical type.
Fixed.