Created attachment 53568 [details] Reproducer for maybe uninitialized warning with gcc 12 In our production code when upgrading to gcc 12 from gcc 11 we have faced bogus warning "may be used uninitialized" on bit complex use of boost::optional vector of strings used in a loop with unrelated string stream redirect. > In file included from <..>/include/c++/12.0.0/vector:67, > from reproduce.cpp:13: > In destructor ‘std::vector<_Tp, _Alloc>::~vector() [with _Tp = std::__cxx11::basic_string<char>; _Alloc = std::allocator<std::__cxx11::basic_string<char> >]’, > inlined from ‘void boost::optional_detail::optional_base<T>::destroy_impl() [with T = std::vector<std::__cxx11::basic_string<char> >]’ at /usr/include/boost/optional/optional.hpp:771:50, > inlined from ‘void boost::optional_detail::optional_base<T>::destroy() [with T = std::vector<std::__cxx11::basic_string<char> >]’ at /usr/include/boost/optional/optional.hpp:757:21, > inlined from ‘void boost::optional_detail::optional_base<T>::assign(const boost::optional_detail::optional_base<T>&) [with T = std::vector<std::__cxx11::basic_string<char> >]’ at /usr/include/boost/optional/optional.hpp:264:21, > inlined from ‘boost::optional_detail::optional_base<T>& boost::optional_detail::optional_base<T>::operator=(const boost::optional_detail::optional_base<T>&) [with T = std::vector<std::__cxx11::basic_string<char> >]’ at /usr/include/boost/optional/optional.hpp:241:19, > inlined from ‘boost::optional<T>& boost::optional<T>::operator=(const boost::optional<T>&) [with T = std::vector<std::__cxx11::basic_string<char> >]’ at /usr/include/boost/optional/optional.hpp:1040:15, > inlined from ‘void test()’ at reproduce.cpp:55:13: > <..>/include/c++/12.0.0/bits/stl_vector.h:680:22: error: ‘*(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >*)((char*)&external + offsetof(boost::Optional<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >,boost::optional<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >::<unnamed>.boost::optional_detail::optional_base<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >::m_storage.boost::optional_detail::aligned_storage<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >::dummy_)).std::vector<std::__cxx11::basic_string<char> >::<anonymous>.std::_Vector_base<std::__cxx11::basic_string<char>, std::allocator<std::__cxx11::basic_string<char> > >::_M_impl.std::_Vector_base<std::__cxx11::basic_string<char>, std::allocator<std::__cxx11::basic_string<char> > >::_Vector_impl::<anonymous>.std::_Vector_base<std::__cxx11::basic_string<char>, std::allocator<std::__cxx11::basic_string<char> > >::_Vector_impl_data::_M_finish’ may be used uninitialized [-Werror=maybe-uninitialized] > 680 | std::_Destroy(this->_M_impl._M_start, this->_M_impl._M_finish, > | ~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > 681 | _M_get_Tp_allocator()); > | ~~~~~~~~~~~~~~~~~~~~~~ > reproduce.cpp: In function ‘void test()’: > reproduce.cpp:45:40: note: ‘external’ declared here > 45 | Optional<std::vector<std::string>> external; > | ^~~~~~~~ I have tested this with gcc 12 (12.1.1, 12.2.1 and release/gcc-12 commit https://gcc.gnu.org/git/?p=gcc.git&a=commit;h=4ce316ca54c863cf0fd4257ba0ab768ab83c62e5") with boost 1.69.0, 1.75.0, 1.76.0 and 1.80.0. All fails. With any boost version and gcc 11 (11.3.1 and release/gcc-11 tip commit https://gcc.gnu.org/git/?p=gcc.git&a=commit;h=7e356c3083c79473c941bc92d61f755e923bc86c) the warning doesn't appear. To reproduce: > $ g++ -Werror -std=c++20 -O2 -Wall -o reproduce.o -c reproduce.cpp I was not able to bump the reproducer to simpliear case. In all modification where loop is removed the warning disappears. The problem doesn't occur when std::optional is used. Only workaround I found is to add following pragma at top of the module: > #pragma GCC diagnostic push > #pragma GCC diagnostic ignored "-Wmaybe-uninitialized" > #include <vector> > #pragma GCC diagnostic pop I have bisected the problem to start with https://gcc.gnu.org/git/?p=gcc.git&a=commit;h=5b8b1522e04adc20980f396571be1929a32d148a commit 5b8b1522e04adc20980f396571be1929a32d148a (HEAD, refs/bisect/bad) Author: Richard Biener <rguenther@suse.de> Date: Mon Sep 27 12:01:38 2021 +0200 tree-optimization/100112 - VN last_vuse and redundant store elimination This avoids the last_vuse optimization hindering redundant store elimination by always also recording the original VUSE that was in effect on the load. In stage3 gcc/*.o we have 3182752 times recorded a single entry and 903409 times two entries (that's ~20% overhead). With just recording a single entry the number of hashtable lookups done when walking the vuse->vdef links to find an earlier access is 28961618. When recording the second entry this makes us find that earlier for donwnstream redundant accesses, reducing the number of hashtable lookups to 25401052 (that's a ~10% reduction). 2021-09-27 Richard Biener <rguenther@suse.de> PR tree-optimization/100112 * tree-ssa-sccvn.c (visit_reference_op_load): Record the referece into the hashtable twice in case last_vuse is different from the original vuse on the stmt. * gcc.dg/tree-ssa/ssa-fre-95.c: New testcase.
Confirmed also on trunk and also with -O1 and the usual suspects -fno-ivopts -fno-thread-jumps. The question is whether external = internal; is problematic. For example struct MyStruct { }; std::vector<MyStruct> getMyStructs(); void test() { Optional<std::vector<std::string>> external; Optional<std::vector<std::string>> internal; external = internal; } is diagnosed with -std=c++20 -O -Wuninitialized -fno-tree-fre -fno-tree-dominator-opts that's because we then no longer forward the uninit state of the Optionals and thus diagnose the conditional (on initialized) <bb 6> [local count: 118702158]: _34 = MEM[(struct vector *)&external + 8B].D.88471._M_impl.D.87778._M_finish; _35 = MEM[(struct vector *)&external + 8B].D.88471._M_impl.D.87778._M_start; if (_34 != _35) goto <bb 41>; [89.00%] ... the same happens with the unreduced testcase, there we have similar code conditional on external.m_initialized where we are not able to forward the m_initialized state. It might be the bisected rev. causes a missed optimization here, I will have to check. At least I don't see a good reason why we don't forward it here.
The rev. in question shows changes in what PRE does. Before the change disabling PRE also leads to the diagnostic so there's a missed optimization there. I see less fully redundant values discovered. I'm trying to reduce a testcase.
Created attachment 53576 [details] testcase for missed PRE here's a reduction that works around r12-3918-g5b8b1522e04adc but not on the tip of the GCC 12 branch or trunk. I'm going to reduce another for current trunk.
Created attachment 53577 [details] testcase for missed PRE on trunk Here's one. So the issue is that PRE EXP_GEN now contains exp_gen[10] := { {component_ref<m_initialized>,mem_ref<0B>,addr_expr<&external>}@.MEM_31 (0009) } for the block <bb 10> [local count: 97603128]: # .MEM_31 = VDEF <.MEM_55> internal ={v} {CLOBBER}; # VUSE <.MEM_32> _10 = MEM[(struct optional_base *)&external].m_initialized; if (_10 != 0) instead of the one with last_vuse .MEM_45. That in turn causes prune_clobbered_mems to prune the expression from ANTIC_OUT of BB 9 which has a 9->10 edge removing the incentive to perform FRE on this value inside the loop. prune_clobbered_mems does this because if (ref->vuse) { gimple *def_stmt = SSA_NAME_DEF_STMT (ref->vuse); if (!gimple_nop_p (def_stmt) && ((gimple_bb (def_stmt) != block && !dominated_by_p (CDI_DOMINATORS, block, gimple_bb (def_stmt))) || (gimple_bb (def_stmt) == block && value_dies_in_block_x (expr, block)))) to_remove = i; where block == BB9 so the dominance test fails. The dominance test should be redundant (unless we have translated something over a backedge!?) but it really shows the issue that we fail to record the memory expression with the correct VUSE into EXP_GEN and fail to adjust the VUSE when translating from ANTIC_OUT to ANTIC_IN as well. When we'd do all this "correctly", then we'd get a) a ton more hashtable entries with questionable value, b) the dominance test will never fail. By that reasoning a cheaper fix should be to instead perform the value_dies_in_block_x when a dominance check cannot elide it. There's a possible latent issue with virtual operands PHI translation over backedges then though. The safest thing would be to eventually force a set of the virtual operand when any virtual PHI is on the edge we are translating over - we are actually doing this but the no-PHI-nodes optimization in phi_translate_set defeats the intent of adjusting to BB_LIVE_VOP_ON_EXIT that translate_vuse_through_block performs and its simple check of BB equality means we don't fixup later either. Fixing that spot in the simplistic way as well (rather than consistently handling the virtual operand everywhere) fixes the observed bogus diagnostic.
Created attachment 53578 [details] CFG of the second testcase With -m32 one can see the full redundancy discovery is a bit fragile since it requires an out-of-loop use to make the cross loop redundancy anticipated. See the CFG graph for illustration. It should be possible to create a smaller testcase from this, when the use of external.m_initialized is removed then the PRE does no longer work. That somehow happens with -m32 but not -m64 ...
Anyhow - I'm testing a (half-way) fix for the issue in PRE, the testcase proved quite useful. Thanks a lot for opening this bugreport!
> Thanks a lot for opening this bugreport! You are welcome! :-) Unfortunately I'm not familiar with gcc internals that much in order to comment on the analysis :-(. But once there will be patch for this issue I can quickly try that on our production code.
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>: https://gcc.gnu.org/g:5edf02ed2b6de024f83a023d046a6a18f645bc83 commit r13-2683-g5edf02ed2b6de024f83a023d046a6a18f645bc83 Author: Richard Biener <rguenther@suse.de> Date: Thu Sep 15 13:33:23 2022 +0200 tree-optimization/106922 - PRE and virtual operand translation PRE implicitely keeps virtual operands at the blocks incoming version but the explicit updating point during PHI translation fails to trigger when there are no PHIs at all in a block. Later lazy updating then fails because of a too lose block check. A similar issues plagues reference invalidation when checking the ANTIC_OUT to ANTIC_IN translation. The following fixes both and makes the lazy updating work. The diagnostic testcase unfortunately requires boost so the testcase is the one I reduced for a missed optimization in PRE. The testcase fails with -m32 on x86_64 because we optimize too much before PRE which causes PRE to not trigger so we fail to eliminate a full redundancy. I'm going to open a separate bug for this. Hopefully the !lp64 selector is good enough. PR tree-optimization/106922 * tree-ssa-pre.cc (translate_vuse_through_block): Only keep the VUSE if its def dominates PHIBLOCK. (prune_clobbered_mems): Rewrite logic so we check whether a value dies in a block when the VUSE def doesn't dominate it. * g++.dg/tree-ssa/pr106922.C: New testcase.
I verified the fix cherry-picks to GCC 12 and fixes the testcase there as well (on x86_64-linux, that is).
Created attachment 53581 [details] Second reproducer failing with 5edf02ed2b6de024f83a023d046a6a18f645bc83 I have cherry-picked the fix on top of gcc 12 and I confirm that my original reproducer passes. But our production code still fails. I have created new reproducer, the change is that the optional is inside a struct and seems there need to be two elements. With first reproducer I have eliminated this additional struct as that was also failing. Please try with: g++ -Werror -std=c++20 -O2 -Wall -o reproduce1.o -c reproduce1.cpp
So there's a similar missed optimization but it's not caused by the bisected revision. The situation is like float bar, baz; void foo (int *p, int n) { *p = 0; do { bar = 1.; if (*p) { baz = 2.; *p = 0; } } while (--n); } where we fail to realize that in the first if (*p), *p stays zero. I have a patch that fixes the above but that miscompiles genrecog.cc merge_into_decision (but no testcase in the testsuite ...).
(In reply to Richard Biener from comment #11) > So there's a similar missed optimization but it's not caused by the bisected > revision. Ah I see. I didn't try to bisect this again. I can do that if that would help anything, let me know if I should try. I have bisect script so I can run it overnight ... Also is it better then to open new BZ for this case then? > I have > a patch that fixes the above but that miscompiles genrecog.cc > merge_into_decision (but no testcase in the testsuite ...). I could at least try to test the patch with our production code. Unfortunately can't help with genrecog.cc ..
Created attachment 53597 [details] candidate patch For reference this is the patch I was talking about. I'm sure I've made a mistake in reasoning but I've not yet nailed it ;) Let's leave the issue in this PR. If you have spare cycles to do another bisection that might help as well. If your real code doesn't use LTO then bisecting preprocessed source from the affected TU might be also interesting (but I suppose the issue might affect multiple TUs?)
Created attachment 53599 [details] candidate patch 2 OK, so I think I know what's going wrong eventually. The following implements the trick differently. We might eventually be able to use dominance of the virtual operands to check for the cross-iteration case as further improvement.
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>: https://gcc.gnu.org/g:9baee6181b4e427e0b5ba417e51424c15858dce7 commit r13-2772-g9baee6181b4e427e0b5ba417e51424c15858dce7 Author: Richard Biener <rguenther@suse.de> Date: Wed Sep 21 13:52:56 2022 +0200 tree-optimization/106922 - missed FRE/PRE The following enhances the store-with-same-value trick in vn_reference_lookup_3 by not only looking for a = val; *ptr = val; .. = a; but also *ptr = val; other = x; .. = a; where the earlier store is more than one hop away. It does this by queueing the actual value to compare until after the walk but as disadvantage only allows a single such skipped store from a constant value. Unfortunately we cannot handle defs from non-constants this way since we're prone to pick up values from the past loop iteration this way and we have no good way to identify values that are invariant in the currently iterated cycle. That's why we keep the single-hop lookup for those cases. gcc.dg/tree-ssa/pr87126.c would be a testcase that's un-XFAILed when we'd handle those as well. PR tree-optimization/106922 * tree-ssa-sccvn.cc (vn_walk_cb_data::same_val): New member. (vn_walk_cb_data::finish): Perform delayed verification of a skipped may-alias. (vn_reference_lookup_pieces): Likewise. (vn_reference_lookup): Likewise. (vn_reference_lookup_3): When skipping stores of the same value also handle constant stores that are more than a single VDEF away by delaying the verification. * gcc.dg/tree-ssa/ssa-fre-100.c: New testcase. * g++.dg/tree-ssa/pr106922.C: Adjust.
This addressed the other missed optimization (which isn't a regression), it should make optimizing the m_initialized flag status more consistent.
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:a0de11d0d22054b6fd76a0730a3ec807542379d0 commit r13-2806-ga0de11d0d22054b6fd76a0730a3ec807542379d0 Author: Jakub Jelinek <jakub@redhat.com> Date: Fri Sep 23 09:46:59 2022 +0200 testsuite: Fix up pr106922.C test On Thu, Sep 22, 2022 at 01:10:08PM +0200, Richard Biener via Gcc-patches wrote: > * g++.dg/tree-ssa/pr106922.C: Adjust. > --- a/gcc/testsuite/g++.dg/tree-ssa/pr106922.C > +++ b/gcc/testsuite/g++.dg/tree-ssa/pr106922.C > @@ -87,5 +87,4 @@ void testfunctionfoo() { > } > } > > -// { dg-final { scan-tree-dump-times "Found fully redundant value" 4 "pre" { xfail { ! lp64 } } } } > -// { dg-final { scan-tree-dump-not "m_initialized" "cddce3" { xfail { ! lp64 } } } } > +// { dg-final { scan-tree-dump-not "m_initialized" "dce3" } } I've noticed +UNRESOLVED: g++.dg/tree-ssa/pr106922.C -std=gnu++20 scan-tree-dump-not dce3 "m_initialized" +UNRESOLVED: g++.dg/tree-ssa/pr106922.C -std=gnu++2b scan-tree-dump-not dce3 "m_initialized" with this change, both on x86_64 and i686. The dump is still cddce3, additionally as the last reference to the pre dump is gone, not sure it is worth creating that dump. With the following patch, there aren't FAILs nor UNRESOLVED tests with GXX_TESTSUITE_STDS=98,11,14,17,20,2b make check-g++ RUNTESTFLAGS="--target_board=unix\{-m32,-m64\} dg.exp='pr106922.C'" 2022-09-23 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/106922 * g++.dg/tree-ssa/pr106922.C: Scan in cddce3 dump rather than dce3. Remove -fdump-tree-pre-details from dg-options.
Created attachment 53617 [details] Third reproducer failing with 9baee6181b4e427e0b5ba417e51424c15858dce7 I did cherry-pick 9baee6181b4e427e0b5ba417e51424c15858dce7 on top of 5edf02ed2b6de024f83a023d046a6a18f645bc83 and gcc 12 and I confirm that the second reproducer works with this. But unfortunately our production code still fails. Sorry that I cannot provide better more reliable reproducer, but our production code is complex and I try to bump it to smallest snippet which still fails. I can try to refactor our production code somehow to remove all proprietary stuff but that might take some time. I have created new reproducer, this time it fails both with boost::optional and with std::optional. In the reproducer I have commented lines which will cause the warning to disappear when removed. There is additional int in the structure which needs to be there and also the assignment needs to be repeated in the loop.
(In reply to Jan Žižka from comment #18) > Created attachment 53617 [details] > Third reproducer failing with 9baee6181b4e427e0b5ba417e51424c15858dce7 > > I did cherry-pick 9baee6181b4e427e0b5ba417e51424c15858dce7 on top of > 5edf02ed2b6de024f83a023d046a6a18f645bc83 and gcc 12 and I confirm that the > second reproducer works with this. But unfortunately our production code > still fails. > > Sorry that I cannot provide better more reliable reproducer, but our > production code is complex and I try to bump it to smallest snippet which > still fails. I can try to refactor our production code somehow to remove all > proprietary stuff but that might take some time. > > I have created new reproducer, this time it fails both with boost::optional > and with std::optional. In the reproducer I have commented lines which will > cause the warning to disappear when removed. There is additional int in the > structure which needs to be there and also the assignment needs to be > repeated in the loop. Ah, thanks - a testcase that breaks with std::optional is good to have. Aha, the new trick fails because Value numbering stmt = _271 = MEM[(struct _Optional_payload_base *)&externals + 48B]._M_engaged; Skipping possible redundant definition MEM[(struct _Optional_payload_base *)&externals + 48B]._M_engaged = 0; Cannot skip second possible redundant definition MEM[(struct _Optional_payload_base *)&externals + 48B]._M_engaged = 0; Setting value number of _271 to _271 (changed) of course a simple change can handle this - stupid me of not thinking of that!
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>: https://gcc.gnu.org/g:af611afe5fcc908a6678b5b205fb5af7d64fbcb2 commit r13-2817-gaf611afe5fcc908a6678b5b205fb5af7d64fbcb2 Author: Richard Biener <rguenther@suse.de> Date: Fri Sep 23 14:28:52 2022 +0200 tree-optimization/106922 - extend same-val clobber FRE The following extends the skipping of same valued stores to handle an arbitrary number of them as long as they are from the same value (which we now record). That's an obvious extension which allows to optimize the m_engaged member of std::optional more reliably. PR tree-optimization/106922 * tree-ssa-sccvn.cc (vn_reference_lookup_3): Allow an arbitrary number of same valued skipped stores. * g++.dg/torture/pr106922.C: New testcase.
try again!
Great, our production code builds just fine with af611afe5fcc908a6678b5b205fb5af7d64fbcb2 :-) thanks a lot.
The releases/gcc-12 branch has been updated by Richard Biener <rguenth@gcc.gnu.org>: https://gcc.gnu.org/g:e364e27b6636ba09755790358910f199d07194b3 commit r12-8820-ge364e27b6636ba09755790358910f199d07194b3 Author: Richard Biener <rguenther@suse.de> Date: Thu Sep 15 13:33:23 2022 +0200 tree-optimization/106922 - PRE and virtual operand translation PRE implicitely keeps virtual operands at the blocks incoming version but the explicit updating point during PHI translation fails to trigger when there are no PHIs at all in a block. Later lazy updating then fails because of a too lose block check. A similar issues plagues reference invalidation when checking the ANTIC_OUT to ANTIC_IN translation. The following fixes both and makes the lazy updating work. The diagnostic testcase unfortunately requires boost so the testcase is the one I reduced for a missed optimization in PRE. The testcase fails with -m32 on x86_64 because we optimize too much before PRE which causes PRE to not trigger so we fail to eliminate a full redundancy. I'm going to open a separate bug for this. Hopefully the !lp64 selector is good enough. PR tree-optimization/106922 * tree-ssa-pre.cc (translate_vuse_through_block): Only keep the VUSE if its def dominates PHIBLOCK. (prune_clobbered_mems): Rewrite logic so we check whether a value dies in a block when the VUSE def doesn't dominate it. * g++.dg/tree-ssa/pr106922.C: New testcase. (cherry picked from commit 5edf02ed2b6de024f83a023d046a6a18f645bc83)
Note I'm still pondering whether to backport the VN enhancement, for now I've backported the VN/PRE optimization regression fix.
I have backported all three patches but true that I didn't try to test without VN enhancement. Would it help if I'd try that with our production code and the reproducers? Or anything else I could try so that you'd know if the VM enhancement should be backported also?
On Tue, 11 Oct 2022, jan.zizka at nokia dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106922 > > --- Comment #25 from Jan ?i?ka <jan.zizka at nokia dot com> --- > I have backported all three patches but true that I didn't try to test without > VN enhancement. Would it help if I'd try that with our production code and the > reproducers? Or anything else I could try so that you'd know if the VM > enhancement should be backported also? They are clearly necessary to fix this bug. What I'm unsure yet about is the risk of generally enhancing VN for this diagnostic regression. The enhancement itself is quite small and self-contained which is why I'm still considering it ;)
(In reply to rguenther@suse.de from comment #26) > > They are clearly necessary to fix this bug. What I'm unsure yet about > is the risk of generally enhancing VN for this diagnostic regression. > The enhancement itself is quite small and self-contained which is > why I'm still considering it ;) Understood. And I didn't report this, maybe I should have, with these fixes we had another uninitialized value warning ... it was actually bug so I did fix it ... but the interesting was that same/similar bug was couple lines below and this one the compiler didn't complain about. I'm not sure if all that could be related to your considerations but if it would help I could create a reproducer for this particular case. I guess it is false negative :-) of some sorts.
The releases/gcc-12 branch has been updated by Richard Biener <rguenth@gcc.gnu.org>: https://gcc.gnu.org/g:e8d5f3a1b5a5839cb1db57d6f6766469cc4f8747 commit r12-8835-ge8d5f3a1b5a5839cb1db57d6f6766469cc4f8747 Author: Richard Biener <rguenther@suse.de> Date: Wed Sep 21 13:52:56 2022 +0200 tree-optimization/106922 - missed FRE/PRE The following enhances the store-with-same-value trick in vn_reference_lookup_3 by not only looking for a = val; *ptr = val; .. = a; but also *ptr = val; other = x; .. = a; where the earlier store is more than one hop away. It does this by queueing the actual value to compare until after the walk but as disadvantage only allows a single such skipped store from a constant value. Unfortunately we cannot handle defs from non-constants this way since we're prone to pick up values from the past loop iteration this way and we have no good way to identify values that are invariant in the currently iterated cycle. That's why we keep the single-hop lookup for those cases. gcc.dg/tree-ssa/pr87126.c would be a testcase that's un-XFAILed when we'd handle those as well. PR tree-optimization/106922 * tree-ssa-sccvn.cc (vn_walk_cb_data::same_val): New member. (vn_walk_cb_data::finish): Perform delayed verification of a skipped may-alias. (vn_reference_lookup_pieces): Likewise. (vn_reference_lookup): Likewise. (vn_reference_lookup_3): When skipping stores of the same value also handle constant stores that are more than a single VDEF away by delaying the verification. * gcc.dg/tree-ssa/ssa-fre-100.c: New testcase. * g++.dg/tree-ssa/pr106922.C: Adjust. (cherry picked from commit 9baee6181b4e427e0b5ba417e51424c15858dce7)
The releases/gcc-12 branch has been updated by Richard Biener <rguenth@gcc.gnu.org>: https://gcc.gnu.org/g:641369e29f57c508e6316d5d221c1a92900163f9 commit r12-8836-g641369e29f57c508e6316d5d221c1a92900163f9 Author: Jakub Jelinek <jakub@redhat.com> Date: Fri Sep 23 09:46:59 2022 +0200 testsuite: Fix up pr106922.C test On Thu, Sep 22, 2022 at 01:10:08PM +0200, Richard Biener via Gcc-patches wrote: > * g++.dg/tree-ssa/pr106922.C: Adjust. > --- a/gcc/testsuite/g++.dg/tree-ssa/pr106922.C > +++ b/gcc/testsuite/g++.dg/tree-ssa/pr106922.C > @@ -87,5 +87,4 @@ void testfunctionfoo() { > } > } > > -// { dg-final { scan-tree-dump-times "Found fully redundant value" 4 "pre" { xfail { ! lp64 } } } } > -// { dg-final { scan-tree-dump-not "m_initialized" "cddce3" { xfail { ! lp64 } } } } > +// { dg-final { scan-tree-dump-not "m_initialized" "dce3" } } I've noticed +UNRESOLVED: g++.dg/tree-ssa/pr106922.C -std=gnu++20 scan-tree-dump-not dce3 "m_initialized" +UNRESOLVED: g++.dg/tree-ssa/pr106922.C -std=gnu++2b scan-tree-dump-not dce3 "m_initialized" with this change, both on x86_64 and i686. The dump is still cddce3, additionally as the last reference to the pre dump is gone, not sure it is worth creating that dump. With the following patch, there aren't FAILs nor UNRESOLVED tests with GXX_TESTSUITE_STDS=98,11,14,17,20,2b make check-g++ RUNTESTFLAGS="--target_board=unix\{-m32,-m64\} dg.exp='pr106922.C'" 2022-09-23 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/106922 * g++.dg/tree-ssa/pr106922.C: Scan in cddce3 dump rather than dce3. Remove -fdump-tree-pre-details from dg-options. (cherry picked from commit a0de11d0d22054b6fd76a0730a3ec807542379d0)
The releases/gcc-12 branch has been updated by Richard Biener <rguenth@gcc.gnu.org>: https://gcc.gnu.org/g:b9f58edfc2ccb0fb3840751a2fb4268ce5dd9b3d commit r12-8837-gb9f58edfc2ccb0fb3840751a2fb4268ce5dd9b3d Author: Richard Biener <rguenther@suse.de> Date: Fri Sep 23 14:28:52 2022 +0200 tree-optimization/106922 - extend same-val clobber FRE The following extends the skipping of same valued stores to handle an arbitrary number of them as long as they are from the same value (which we now record). That's an obvious extension which allows to optimize the m_engaged member of std::optional more reliably. PR tree-optimization/106922 * tree-ssa-sccvn.cc (vn_reference_lookup_3): Allow an arbitrary number of same valued skipped stores. * g++.dg/torture/pr106922.C: New testcase. (cherry picked from commit af611afe5fcc908a6678b5b205fb5af7d64fbcb2)
Fixed.