Similar to bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90082.

GCC 12.1.0 ICEs when compiling this code with:
```
g++ -DROCKSDB_PLATFORM_POSIX -isystem rocksdb-cloud -isystem rocksdb-cloud/include -O3 -march=haswell -fnon-call-exceptions -c rocksdb-cloud/db/db_impl/db_impl_compaction_flush.cc
```

All three flags are important: the ICE doesn't happen with -O2, nor without -march, nor with -march=skylake, but it does happen with microarchs older than haswell. The ICE doesn't happen without -fnon-call-exceptions either.

Version:
```
$ g++ --version
g++ (GCC) 12.1.0
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
```

The minimized repro (couldn't do better for the moment; preprocessed sources of both the minimal repro and the full file are attached):
```
#include "db/db_impl/db_impl.h"

namespace ROCKSDB_NAMESPACE {

void DBImpl::InstallSuperVersionAndScheduleWork(
    ColumnFamilyData* cfd, SuperVersionContext* sv_context,
    const MutableCFOptions& mutable_cf_options) {
  if (UNLIKELY(sv_context->new_superversion == nullptr)) {
    sv_context->NewSuperVersion();
  }

  bottommost_files_mark_threshold_ = kMaxSequenceNumber;
  for (auto* my_cfd : *versions_->GetColumnFamilySet()) {
    bottommost_files_mark_threshold_ = std::min(
        bottommost_files_mark_threshold_,
        my_cfd->current()->storage_info()->bottommost_files_mark_threshold());
  }
}

}  // namespace ROCKSDB_NAMESPACE
```
Created attachment 52965 [details] Preprocessed source of the full reproducer
Created attachment 52966 [details] Preprocessed source of the minimal reproducer
Confirmed, reducing right now..
Created attachment 52967 [details]
A slightly reduced case

A bit more reduced reproducer.
Not sure it helps.
(In reply to Curdeius Curdeius from comment #4)
> Created attachment 52967 [details]
> A slightly reduced case
>
> A bit more reduced reproducer.
> Not sure it helps.

No, we would need a pre-processed source file reproducer.
With -fdelete-dead-exceptions, it started with r12-248-gb58dc0b803057c0e. The reduction is pretty slow..
(In reply to Martin Liška from comment #6)
> With -fdelete-dead-exceptions, it started with r12-248-gb58dc0b803057c0e.
> The reduction is pretty slow..

That just exposed the issue, I think, since the failure is at the RTL level while that change affects things way earlier, in GIMPLE.
(In reply to Andrew Pinski from comment #7)
> (In reply to Martin Liška from comment #6)
> > With -fdelete-dead-exceptions, it started with r12-248-gb58dc0b803057c0e.
> > The reduction is pretty slow..
>
> That just exposed the issue I think since the failure is at the rtl level
> while that change affects things way before in gimple.

So the insn removed that triggers the must_clean is

(insn/v 27 23 30 3 (set (reg:V2DI 107)
        (const_vector:V2DI [
                (const_int 0 [0]) repeated x2
            ])) "/usr/local/include/c++/12.1.0/bits/shared_ptr_base.h":1463:9 1700 {movv2di_internal}
     (nil))

we first remove that and then call purge_dead_edges, which then runs into the newly(!) last insn:

(call_insn 23 22 30 3 (set (reg:DI 0 ax)
        (call (mem:QI (symbol_ref:DI ("memset") [flags 0x41] <function_decl 0x7ffff65ebc00 __builtin_memset>) [0 __builtin_memset S1 A8])
            (const_int 0 [0]))) "../../thirdparty/rocksdb-cloud/db/job_context.h":49:29 909 {*call_value}
     (expr_list:REG_DEAD (reg:DI 5 di)
        (expr_list:REG_DEAD (reg:SI 4 si)
            (expr_list:REG_DEAD (reg:DI 1 dx)
                (expr_list:REG_UNUSED (reg:DI 0 ax)
                    (expr_list:REG_CALL_DECL (symbol_ref:DI ("memset") [flags 0x41] <function_decl 0x7ffff65ebc00 __builtin_memset>)
                        (expr_list:REG_EH_REGION (const_int 0 [0])
                            (nil)))))))
    (expr_list:DI (set (reg:DI 0 ax)
            (reg:DI 5 di))
        (expr_list:DI (use (reg:DI 5 di))
            (expr_list:SI (use (reg:SI 4 si))
                (expr_list:DI (use (reg:DI 1 dx))
                    (nil))))))

which cannot throw.  But we still have an EH edge out of this block, which is the real issue here.  Somebody forgot to clean the EH edge earlier.
In fact before DSE we have

(insn 27 23 28 3 (set (reg:V2DI 107)
        (const_vector:V2DI [
                (const_int 0 [0]) repeated x2
            ])) "/usr/local/include/c++/12.1.0/bits/shared_ptr_base.h":1463:9 1700 {movv2di_internal}
     (nil))
(insn 28 27 29 3 (set (mem:V2DI (plus:DI (reg/f:DI 94 [ _34 ])
                (const_int 96 [0x60])) [0 MEM <vector(2) long unsigned int> [(void *)_34 + 96B]+0 S16 A64])
        (reg:V2DI 107)) "/usr/local/include/c++/12.1.0/bits/shared_ptr_base.h":1463:9 1700 {movv2di_internal}
     (expr_list:REG_DEAD (reg:V2DI 107)
        (expr_list:REG_EH_REGION (const_int -15 [0xfffffffffffffff1])
            (nil))))
(insn 29 28 30 3 (set (mem:QI (plus:DI (reg/f:DI 94 [ _34 ])
                (const_int 112 [0x70])) [26 MEM[(struct MutableCFOptions *)_34 + 32B].disable_auto_compactions+0 S1 A64])
        (const_int 0 [0])) "../../thirdparty/rocksdb-cloud/options/cf_options.h":173:9 83 {*movqi_internal}
     (expr_list:REG_EH_REGION (const_int 3 [0x3])
        (nil)))
;;  succ:       4 [always]  count:1459806 (estimated locally) (FALLTHRU)
;;              49 [never]  count:0 (precise) (ABNORMAL,EH)

so DSE removes insn 28 and insn 29 but forgets to clean EH.
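For readers less familiar with the flag: per the GCC manual, -fnon-call-exceptions generates code that allows trapping instructions, such as memory references, to throw exceptions. That is presumably why plain stores like insn 28 and insn 29 above carry REG_EH_REGION notes and why the block has an EH successor. A minimal source-level sketch of that situation, where `store_or_report` is a hypothetical name and not taken from RocksDB or GCC:

```cpp
#include <cassert>

// Hypothetical illustration only; not code from the reproducer.
// Under -fnon-call-exceptions the compiler has to assume that the store
// through `p` may trap and throw, so the store is tied to the handler below.
// Without the flag, a plain store is not treated as a throwing operation.
bool store_or_report(int *p, int value) {
    try {
        *p = value;    // potentially-trapping memory reference
        return true;   // reached when the store completes normally
    } catch (...) {
        return false;  // only reachable if the store itself throws
    }
}
```

If an optimization pass later deletes such a store, the edge to the handler is no longer justified and has to be removed along with it, which is the step that was missed here.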
So DSE does

  /* DSE can eliminate potentially-trapping MEMs.
     Remove any EH edges associated with them.  */
  if ((locally_deleted || globally_deleted)
      && cfun->can_throw_non_call_exceptions
      && purge_all_dead_edges ())
    {
      free_dominance_info (CDI_DOMINATORS);
      cleanup_cfg (0);

which should do the trick, but the fast-DCE is invoked via

  dse_step0 ();
  dse_step1 ();
  dse_step2_init ();
  if (dse_step2 ())
    {
      df_set_flags (DF_LR_RUN_DCE);
      df_analyze ();

and dse_step0/1 already removed the stores, exposing the bad IL.  One way to fix this might be to run cleanup_cfg after dse_step1 already, or just remove_unreachable_blocks.  I'm going to test

diff --git a/gcc/dse.cc b/gcc/dse.cc
index b8914a3ae24..bb658a85959 100644
--- a/gcc/dse.cc
+++ b/gcc/dse.cc
@@ -3682,6 +3682,16 @@ rest_of_handle_dse (void)
   dse_step0 ();
   dse_step1 ();
+  /* DSE can eliminate potentially-trapping MEMs.
+     Remove any EH edges associated with them, since otherwise
+     DF_LR_RUN_DCE will complain later.  */
+  if ((locally_deleted || globally_deleted)
+      && cfun->can_throw_non_call_exceptions
+      && purge_all_dead_edges ())
+    {
+      free_dominance_info (CDI_DOMINATORS);
+      delete_unreachable_blocks ();
+    }
   dse_step2_init ();
   if (dse_step2 ())
     {
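The ordering problem described above can be sketched with a toy model. This is only an illustration under simplifying assumptions: every type and function name below (`ToyBlock`, `toy_delete_trailing_stores`, `toy_purge_dead_eh_edge`, `toy_dce_accepts`, `toy_run_dse`) is hypothetical and does not correspond to GCC's real data structures or APIs. The model treats a block's EH edge as justified only while its last insn can throw.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy block: one "can throw" flag per remaining insn, in order,
// plus a flag standing in for an (ABNORMAL,EH) successor edge.
struct ToyBlock {
    std::vector<bool> insn_can_throw;
    bool has_eh_edge = false;
};

// True when the block's last insn can still throw.
static bool toy_last_insn_throws(const ToyBlock &bb) {
    return !bb.insn_can_throw.empty() && bb.insn_can_throw.back();
}

// Stand-in for the early DSE steps: delete the last `n` insns (the dead
// stores) without touching the edges. Returns true if anything was deleted.
bool toy_delete_trailing_stores(ToyBlock &bb, std::size_t n) {
    if (n == 0 || n > bb.insn_can_throw.size()) return false;
    bb.insn_can_throw.resize(bb.insn_can_throw.size() - n);
    return true;
}

// Stand-in for purging dead EH edges: drop the edge once nothing justifies
// it. Returns true if an edge was removed.
bool toy_purge_dead_eh_edge(ToyBlock &bb) {
    if (bb.has_eh_edge && !toy_last_insn_throws(bb)) {
        bb.has_eh_edge = false;
        return true;
    }
    return false;
}

// Stand-in for the later fast DCE: it only copes with blocks whose EH edge
// is still justified. A false result models the ICE.
bool toy_dce_accepts(const ToyBlock &bb) {
    return !bb.has_eh_edge || toy_last_insn_throws(bb);
}

// Runs the toy pipeline; `purge_before_dce` toggles the proposed ordering.
bool toy_run_dse(bool purge_before_dce) {
    ToyBlock bb;
    // Roughly: a call, a vector set, a store, and a throwing store.
    bb.insn_can_throw = {false, false, false, true};
    bb.has_eh_edge = true;

    bool deleted = toy_delete_trailing_stores(bb, 2);
    if (deleted && purge_before_dce)
        toy_purge_dead_eh_edge(bb);
    return toy_dce_accepts(bb);
}
```

In this sketch `toy_run_dse(false)` leaves a stale EH edge for the DCE stand-in to trip over, while `toy_run_dse(true)` purges it first, mirroring the idea of the patch above.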
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:dfda40f8147412328f699628a54b0aaa584776e7

commit r13-373-gdfda40f8147412328f699628a54b0aaa584776e7
Author: Richard Biener <rguenther@suse.de>
Date:   Thu May 12 14:03:32 2022 +0200

    rtl-optimization/105577 - RTL DSE and non-call EH

    When one of the first two stages of DSE removes a throwing stmt
    we have to purge dead EH edges before the DF re-analyze fires
    off a fast DCE since that cannot cope with the situation.

    2022-05-12  Richard Biener  <rguenther@suse.de>

            PR rtl-optimization/105577
            * dse.cc (rest_of_handle_dse): Make sure to purge dead EH
            edges before running fast DCE via df_analyze.
Fixed on trunk so far.
There's a reduced test case; can you please include it in the testsuite?

namespace {
typedef long size_t;
}
typedef char uint8_t;
typedef long uint64_t;
namespace {
template <typename _Tp, _Tp __v> struct integral_constant {
  static constexpr _Tp value = __v;
};
template <bool __v> using __bool_constant = integral_constant<bool, __v>;
template <bool> struct __conditional {
  template <typename _Tp, typename> using type = _Tp;
};
template <bool _Cond, typename _If, typename _Else>
using __conditional_t = typename __conditional<_Cond>::type<_If, _Else>;
template <typename...> struct __and_;
template <typename _B1, typename _B2>
struct __and_<_B1, _B2> : __conditional_t<_B1::value, _B2, _B1> {};
template <typename> struct __not_ : __bool_constant<!bool()> {};
template <typename _Tp>
struct __is_constructible_impl : __bool_constant<__is_constructible(_Tp)> {};
template <typename _Tp>
struct is_default_constructible : __is_constructible_impl<_Tp> {};
template <typename _Tp> struct remove_extent { typedef _Tp type; };
template <bool> struct enable_if;
} // namespace
namespace std {
template <typename _Tp> struct allocator_traits { using pointer = _Tp; };
template <typename _Alloc> struct __alloc_traits : allocator_traits<_Alloc> {};
template <typename, typename _Alloc> struct _Vector_base {
  typedef typename __alloc_traits<_Alloc>::pointer pointer;
  struct {
    pointer _M_finish;
    pointer _M_end_of_storage;
  };
};
template <typename _Tp, typename _Alloc = _Tp>
class vector : _Vector_base<_Tp, _Alloc> {
public:
  _Tp value_type;
  typedef size_t size_type;
};
template <typename _Tp, typename _Dp> class __uniq_ptr_impl {
  template <typename _Up, typename> struct _Ptr { using type = _Up *; };

public:
  using _DeleterConstraint =
      enable_if<__and_<__not_<_Dp>, is_default_constructible<_Dp>>::value>;
  using pointer = typename _Ptr<_Tp, _Dp>::type;
};
template <typename _Tp, typename _Dp = _Tp> class unique_ptr {
public:
  using pointer = typename __uniq_ptr_impl<_Tp, _Dp>::pointer;
  pointer operator->();
};
enum _Lock_policy { _S_atomic } const __default_lock_policy = _S_atomic;
template <_Lock_policy = __default_lock_policy> class _Sp_counted_base;
template <typename, _Lock_policy = __default_lock_policy> class __shared_ptr;
template <_Lock_policy> class __shared_count { _Sp_counted_base<> *_M_pi; };
template <typename _Tp, _Lock_policy _Lp> class __shared_ptr {
  using element_type = typename remove_extent<_Tp>::type;
  element_type *_M_ptr;
  __shared_count<_Lp> _M_refcount;
};
template <typename _Tp> class shared_ptr : __shared_ptr<_Tp> {
public:
  shared_ptr() noexcept : __shared_ptr<_Tp>() {}
};
enum CompressionType : char;
class SliceTransform;
enum Temperature : uint8_t;
struct MutableCFOptions {
  MutableCFOptions()
      : soft_pending_compaction_bytes_limit(),
        hard_pending_compaction_bytes_limit(level0_file_num_compaction_trigger),
        level0_slowdown_writes_trigger(level0_stop_writes_trigger),
        max_compaction_bytes(target_file_size_base),
        target_file_size_multiplier(max_bytes_for_level_base),
        max_bytes_for_level_multiplier(ttl), compaction_options_fifo(),
        min_blob_size(blob_file_size), blob_compression_type(),
        enable_blob_garbage_collection(blob_garbage_collection_age_cutoff),
        max_sequential_skip_in_iterations(check_flush_compaction_key_order),
        paranoid_file_checks(bottommost_compression), bottommost_temperature(),
        sample_for_compression() {}
  shared_ptr<SliceTransform> prefix_extractor;
  uint64_t soft_pending_compaction_bytes_limit;
  uint64_t hard_pending_compaction_bytes_limit;
  int level0_file_num_compaction_trigger;
  int level0_slowdown_writes_trigger;
  int level0_stop_writes_trigger;
  uint64_t max_compaction_bytes;
  uint64_t target_file_size_base;
  int target_file_size_multiplier;
  uint64_t max_bytes_for_level_base;
  double max_bytes_for_level_multiplier;
  uint64_t ttl;
  vector<int> compaction_options_fifo;
  uint64_t min_blob_size;
  uint64_t blob_file_size;
  CompressionType blob_compression_type;
  bool enable_blob_garbage_collection;
  double blob_garbage_collection_age_cutoff;
  uint64_t max_sequential_skip_in_iterations;
  bool check_flush_compaction_key_order;
  bool paranoid_file_checks;
  CompressionType bottommost_compression;
  Temperature bottommost_temperature;
  uint64_t sample_for_compression;
};
template <class T, size_t kSize = 8> class autovector {
  using value_type = T;
  using size_type = typename vector<T>::size_type;
  size_type buf_[kSize * sizeof(value_type)];
};
class MemTable;
class ColumnFamilyData;
struct SuperVersion {
  MutableCFOptions write_stall_condition;
  autovector<MemTable *> to_delete;
};
class ColumnFamilySet {
public:
  class iterator {
  public:
    iterator operator++();
    bool operator!=(iterator);
    ColumnFamilyData *operator*();
    ColumnFamilyData *current_;
  };
  iterator begin();
  iterator end();
};
class VersionSet {
public:
  ColumnFamilySet *GetColumnFamilySet();
};
struct SuperVersionContext {
  void NewSuperVersion() { new SuperVersion(); }
};
class DBImpl {
  unique_ptr<VersionSet> versions_;
  void InstallSuperVersionAndScheduleWork(ColumnFamilyData *,
                                          SuperVersionContext *,
                                          const MutableCFOptions &);
};
void DBImpl::InstallSuperVersionAndScheduleWork(ColumnFamilyData *,
                                                SuperVersionContext *sv_context,
                                                const MutableCFOptions &) {
  sv_context->NewSuperVersion();
  for (auto my_cfd : *versions_->GetColumnFamilySet())
    ;
}
} // namespace std
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:ef7b8976b9143aa78dd9cf5cfdaa02552d6e18a0

commit r13-506-gef7b8976b9143aa78dd9cf5cfdaa02552d6e18a0
Author: Richard Biener <rguenther@suse.de>
Date:   Mon May 16 12:07:31 2022 +0200

    rtl-optimization/105577 - testcase for the PR

    2022-05-16  Richard Biener  <rguenther@suse.de>

            PR rtl-optimization/105577
            * g++.dg/torture/pr105577.C: New testcase.
The releases/gcc-12 branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:b251f8be6b018966edad5daeb45c42fd193b24b4

commit r12-8401-gb251f8be6b018966edad5daeb45c42fd193b24b4
Author: Richard Biener <rguenther@suse.de>
Date:   Thu May 12 14:03:32 2022 +0200

    rtl-optimization/105577 - RTL DSE and non-call EH

    When one of the first two stages of DSE removes a throwing stmt
    we have to purge dead EH edges before the DF re-analyze fires
    off a fast DCE since that cannot cope with the situation.

    2022-05-12  Richard Biener  <rguenther@suse.de>

            PR rtl-optimization/105577
            * dse.cc (rest_of_handle_dse): Make sure to purge dead EH
            edges before running fast DCE via df_analyze.

    (cherry picked from commit dfda40f8147412328f699628a54b0aaa584776e7)
The releases/gcc-12 branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:25d7a7381099b46b6554c5e20b00b19d460c2123

commit r12-8402-g25d7a7381099b46b6554c5e20b00b19d460c2123
Author: Richard Biener <rguenther@suse.de>
Date:   Mon May 16 12:07:31 2022 +0200

    rtl-optimization/105577 - testcase for the PR

    2022-05-16  Richard Biener  <rguenther@suse.de>

            PR rtl-optimization/105577
            * g++.dg/torture/pr105577.C: New testcase.

    (cherry picked from commit ef7b8976b9143aa78dd9cf5cfdaa02552d6e18a0)
Fixed.
Thanks a lot for fixing this quickly!