struct S { union F { struct T { #define A(n) unsigned n:1; #define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) \ A(n##5) A(n##6) A(n##7) A(n##8) A(n##9) B(f) B(f1) B(f2) B(f3) B(f4) B(f5) A(f60) A(f61) A(f62) A(f63) A(f64) A(f65) A(f66) } q; unsigned int i[3]; } f; }; struct S s = { .f.q.f0 = 1, .f.q.f1 = 1, .f.q.f2 = 1, .f.q.f5 = 1, .f.q.f6 = 1, .f.q.f7 = 1, .f.q.f8 = 1, .f.q.f9 = 1, .f.q.f13 = 1, .f.q.f14 = 1, .f.q.f15 = 1, .f.q.f16 = 1, .f.q.f17 = 1, .f.q.f19 = 1, .f.q.f21 = 1, .f.q.f66 = 1 }; __attribute__((noipa)) void bar (unsigned *x) { if (x[0] != s.f.i[0] || x[1] != s.f.i[1] || x[2] != s.f.i[2]) __builtin_abort (); } __attribute__((noipa)) void foo (unsigned char *state) { struct S i; i.f.i[0] = 0; i.f.i[1] = 0; i.f.i[2] = 0; i.f.q.f7 = 1; i.f.q.f2 = 1; i.f.q.f21 = 1; i.f.q.f19 = 1; i.f.q.f14 = 1; i.f.q.f5 = 1; i.f.q.f0 = 1; i.f.q.f15 = 1; i.f.q.f16 = 1; i.f.q.f6 = 1; i.f.q.f9 = 1; i.f.q.f17 = 1; i.f.q.f1 = 1; i.f.q.f8 = 1; i.f.q.f13 = 1; i.f.q.f66 = 1; if (*state) { i.f.q.f37 = 1; i.f.q.f38 = 1; i.f.q.f39 = 1; i.f.q.f40 = 1; i.f.q.f41 = 1; i.f.q.f36 = 1; } bar (i.f.i); } int main () { unsigned char z = 0; foo (&z); return 0; } started failing with r264232.
So, the bug is that in the code introduced in the PR86844 fix, if we skip any stores because their order was > last_order, we should have marked the merged_store with some flag that prevents merging that with any further stores. Plus, there is obviously a missed optimization (and regression on that), because in this testcase there is no reason why any of the INTEGER_CST stores should be skipped. We have stores (bitposition, bitsize): 0 32 0 1 1 1 ... 21 1 32 32 64 32 66 1 where 0 32 has order 0, 0 1 has order 9, 1 1 has order 15, 32 32 has order 1, 64 32 has order 2 and 66 1 has the highest order. All the stores at offsets 0 to 21 are overlapping. We go and merge as overlapping store 0 32, 0 1 and all stores with order in between those (but that means skipping 1 1 and various others). If we have to do that, e.g. because the 1 1 store would be not INTEGER_CST store, then we need to arrange not to merge with it anymore stores with order above that problematic store. If all the to be skipped stores are INTEGER_CSTs, then we should obviously try to merge them in all (as was the case before the PR86844 fix on this testcase).
Created attachment 44948 [details] gcc9-pr87859.patch WIP patch.
Author: jakub Date: Mon Nov 5 10:28:19 2018 New Revision: 265794 URL: https://gcc.gnu.org/viewcvs?rev=265794&root=gcc&view=rev Log: PR tree-optimization/87859 * gimple-ssa-store-merging.c (struct merged_store_group): Add only_constants and first_nonmergeable_order members. (merged_store_group::merged_store_group): Initialize them. (merged_store_group::do_merge): Clear only_constants member if adding something other than INTEGER_CST store. (imm_store_chain_info::coalesce_immediate_stores): Don't merge stores with order >= first_nonmergeable_order. Use merged_store->only_constants instead of always recomputing it. Set merged_store->first_nonmergeable_order if we've skipped any stores. Attempt to merge overlapping INTEGER_CST stores that we would otherwise skip. * gcc.dg/store_merging_24.c: New test. * gcc.dg/store_merging_25.c: New test. Added: trunk/gcc/testsuite/gcc.dg/store_merging_24.c trunk/gcc/testsuite/gcc.dg/store_merging_25.c Modified: trunk/gcc/ChangeLog trunk/gcc/gimple-ssa-store-merging.c trunk/gcc/testsuite/ChangeLog
Just out of curiosity, Jakub how did you get to the miscompilation?
It got reported in http://bugzilla.redhat.com/1645400
(In reply to Jakub Jelinek from comment #5) > It got reported in http://bugzilla.redhat.com/1645400 Good, thus you nicely reduced the original source files!
I didn't have access to the preprocessed source and was lazy to prepare it myself, so I just copied the corresponding structure definition without C++ ctors/dtors, figured out what I thought would be a bitset_t, and tried if I can reproduce it with that. As I could, I have cleaned that up and replaced the fields so that they are named in a way to make it more readable what is going on. Now that I think about it, I think I could fix this differently by adding a flag to the store structures that it has been merged already, but I'll postpone that to next week, need to work on stage1 stuff this week.
> Now that I think about it, I think I could fix this differently by adding a > flag to the store structures that it has been merged already, but I'll > postpone that to next week, need to work on stage1 stuff this week. Do you mean a flag that will enable to dump how a merge to struct looks like before and after store merging. Btw. do you also have a debug counter for store merging? If not, I would consider adding one.
Author: jakub Date: Mon Nov 5 14:12:15 2018 New Revision: 265807 URL: https://gcc.gnu.org/viewcvs?rev=265807&root=gcc&view=rev Log: PR tree-optimization/87859 * gimple-ssa-store-merging.c (struct merged_store_group): Add first_nonmergeable_order member. (merged_store_group::merged_store_group): Initialize them. (imm_store_chain_info::coalesce_immediate_stores): Don't merge stores with order >= first_nonmergeable_order. Set merged_store->first_nonmergeable_order if we've skipped any stores. Attempt to merge overlapping INTEGER_CST stores that we would otherwise skip. * gcc.dg/store_merging_24.c: New test. * gcc.dg/store_merging_25.c: New test. Added: branches/gcc-8-branch/gcc/testsuite/gcc.dg/store_merging_24.c branches/gcc-8-branch/gcc/testsuite/gcc.dg/store_merging_25.c Modified: branches/gcc-8-branch/gcc/ChangeLog branches/gcc-8-branch/gcc/gimple-ssa-store-merging.c branches/gcc-8-branch/gcc/testsuite/ChangeLog
Created attachment 44991 [details] gcc9-pr87859.patch This is the approach I had my mind. For *_24.c it makes no difference, but for *_25.c, instead of: New sequence of 2 stores to replace old one of 14 stores New sequence of 1 stores to replace old one of 6 stores it now emits: New sequence of 1 stores to replace old one of 8 stores New sequence of 2 stores to replace old one of 10 stores New sequence of 1 stores to replace old one of 6 stores Resulting assembly is one insn larger. So, something that would need to be analyzed.
In any case, the regression is fixed now.
Fixed. If we want to track an improvement we can do let's do it in a non-wrong-code bug.