Consider the following code:

#include <vector>

const std::size_t N = 1'000'000;
std::vector<int> x(N);

int main() {}

Then:

$ hyperfine 'g++ -std=c++20 -O2 x.cpp' 'g++ -std=c++17 -O2 x.cpp'
Benchmark 1: g++ -std=c++20 -O2 x.cpp
  Time (mean ± σ):      4.945 s ±  0.116 s    [User: 4.676 s, System: 0.229 s]
  Range (min … max):    4.770 s …  5.178 s    10 runs

Benchmark 2: g++ -std=c++17 -O2 x.cpp
  Time (mean ± σ):     491.3 ms ±  24.0 ms    [User: 440.9 ms, System: 46.3 ms]
  Range (min … max):   465.6 ms … 538.0 ms    10 runs

Summary
  g++ -std=c++17 -O2 x.cpp ran
   10.07 ± 0.55 times faster than g++ -std=c++20 -O2 x.cpp

If you remove the `const` from `N`, the compile time drops back to roughly the C++17 level. `-ftime-report` points at "constant expression evaluation" as the reason. I imagine this is related to C++20 making std::vector constexpr.
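For concreteness, the non-const variant mentioned above is presumably just:

#include <vector>

std::size_t N = 1'000'000;   // no longer const, so x's initializer is not a constant expression
std::vector<int> x(N);

int main() {}

With N no longer usable in a constant expression, the trial constant evaluation of x's initializer should give up almost immediately, which would explain the return to C++17-level compile times.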
Confirmed with -std=c++20 -fsyntax-only:

 constant expression evaluation     :   1.80 ( 85%)   0.03 ( 14%)   1.84 ( 78%)  220M ( 88%)
 TOTAL                              :   2.13          0.22          2.36         250M

Samples: 8K of event 'cycles', Event count (approx.): 9294971478
Overhead  Samples  Command  Shared Object  Symbol
  16.33%     1385  cc1plus  cc1plus        [.] cxx_eval_constant_expression
   4.35%      369  cc1plus  cc1plus        [.] cxx_eval_call_expression
   3.90%      331  cc1plus  cc1plus        [.] cxx_eval_store_expression
   3.16%      268  cc1plus  cc1plus        [.] hash_table<hash_map<tree_node*, tree_node*, simple_has
   2.55%      216  cc1plus  cc1plus        [.] hash_table<hash_map<tree_node*, tree_node*, simple_has
   2.15%      183  cc1plus  cc1plus        [.] ggc_internal_alloc
   2.04%      173  cc1plus  cc1plus        [.] hash_table<int_cst_hasher, false, xcallocator>::find_s
   1.98%      168  cc1plus  cc1plus        [.] tree_operand_check

GCC 12 was fast (possibly std::vector wasn't constexpr there?)
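For reproducibility, the numbers above can be regenerated with invocations along these lines (my guess at the exact commands used):

$ g++ -std=c++20 -fsyntax-only -ftime-report x.cpp
$ perf record -- g++ -std=c++20 -fsyntax-only x.cpp
$ perf report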
(In reply to Richard Biener from comment #1)
> GCC 12 was fast (possibly std::vector wasn't constexpr there?)

It has been constexpr since 12.1.0, so this might be related to Jason's implicit constexpr changes instead.
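A minimal check of what "constexpr since 12.1.0" means in practice (my own illustration, not from this thread; the variable name is made up): with -std=c++20, GCC 12 and later accept std::vector inside a constant expression as long as the allocation is transient.

#include <vector>

constexpr bool vector_usable_in_constexpr = [] {
    std::vector<int> v(10);   // constexpr allocation ...
    return v.size() == 10;    // ... freed before the evaluation ends
}();
static_assert(vector_usable_in_constexpr);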
The most significant slowdown (only approximate; I've been bisecting with a mostly -O0-built compiler) is r13-6421-gcbaa1d9c218d9c0b5e34e510a462ff4e299a0f3f. Though I have no idea what could be done against it, because the standard requires that...
r13-6421 just makes us use mce_true and mce_unknown during trial constant evaluation of x's initialization, so my guess is that before that commit the evaluation was quickly aborted as soon as it saw a std::is_constant_evaluated() call, and afterwards we evaluate is_constant_evaluated() to true and keep going until we hit constexpr_loop_limit / constexpr_ops_limit. So perhaps we should have reduced limits for trial constant evaluation?
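Not libstdc++'s actual code, but a sketch of the pattern in question (assuming a trivially default-constructible element type): once trial constant evaluation treats is_constant_evaluated() as true, the evaluator has to walk the element-by-element branch for all 1'000'000 elements instead of giving up at the first call.

#include <cstddef>
#include <cstring>
#include <memory>
#include <type_traits>

template <class T>
constexpr void default_init(T* p, std::size_t n) {
    if (std::is_constant_evaluated()) {
        // constant evaluation has to construct each element individually,
        // so the evaluator performs n steps here (eventually hitting the ops limit)
        for (std::size_t i = 0; i < n; ++i)
            std::construct_at(p + i);
    } else {
        // the runtime path can bulk-zero the storage instead
        std::memset(p, 0, n * sizeof(T));
    }
}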
Relatedly, if we actually make x constexpr:

#include <vector>

const std::size_t N = 1'000'000;
constexpr std::vector<int> x(N);

int main() {}

we end up evaluating its expensive initialization (and hitting the constexpr loop/ops limit) four times:

 1. expand_default_init -> maybe_constant_init
 2. store_init_value -> maybe_constant_value
 3. store_init_value -> maybe_constant_init
 4. store_init_value -> cxx_constant_init

The first three silently fail due to hitting the ops limit, and the fourth actually diagnoses the failure. This seems wasteful; we should silently evaluate such an initializer only once.

For the original testcase where x is not constexpr, we constant-evaluate the initialization just once (which silently fails due to the ops limit), but that single evaluation unfortunately still takes a few seconds.
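As an aside (not a proposed fix): the internal limits mentioned above are user-visible as -fconstexpr-ops-limit= and -fconstexpr-loop-limit=, so the point at which the doomed trial evaluation gives up can be lowered from the command line, e.g.

$ g++ -std=c++20 -O2 -fconstexpr-ops-limit=100000 x.cpp

at the cost of possibly rejecting legitimate constexpr code elsewhere that needs more evaluation steps.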
Should we really be reaching the limits for cases like this? I mean, all the default ctors produce the same value here, and we could figure out that they are in no way dependent on the exact position of the element in the vector, nor allocating heap memory during constexpr evaluation, so ideally we'd optimize and just fill in the whole initializer with the same value (ideally using a range for the ctor index to even save compile-time memory). I guess the hard part is finding out the cases where the constexpr evaluation of an element would depend on something that prevents such an optimization.
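A rough source-level illustration of that idea (my sketch, not a patch; element_init is a hypothetical stand-in for the per-element constructor): if the initializer provably depends neither on the element index nor on allocation state, it could be evaluated once and the result reused for every element.

#include <cstddef>
#include <vector>

struct Value { int v; };

constexpr Value element_init() { return Value{0}; }    // index-independent

// roughly what the evaluator does today: n separate evaluations
constexpr std::vector<Value> fill_naive(std::size_t n) {
    std::vector<Value> out;
    for (std::size_t i = 0; i < n; ++i)
        out.push_back(element_init());
    return out;
}

// the suggested shortcut: evaluate once, replicate the result
constexpr std::vector<Value> fill_collapsed(std::size_t n) {
    return std::vector<Value>(n, element_init());
}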
GCC 13.3 is being released, retargeting bugs to GCC 13.4.
Changing component to c++; this is really a front-end issue.