Bug 113835 - [13/14/15 Regression] compiling std::vector with const size in C++20 is slow
Summary: [13/14/15 Regression] compiling std::vector with const size in C++20 is slow
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: c++
Version: 13.2.1
Importance: P2 normal
Target Milestone: 13.4
Assignee: Not yet assigned to anyone
URL:
Keywords: compile-time-hog
Depends on:
Blocks: constexpr std::vector
 
Reported: 2024-02-08 17:11 UTC by Barnabás Pőcze
Modified: 2025-02-10 15:14 UTC
CC: 6 users

See Also:
Host:
Target:
Build:
Known to work: 12.2.1
Known to fail: 13.2.1, 14.0
Last reconfirmed: 2024-02-09 00:00:00


Description Barnabás Pőcze 2024-02-08 17:11:36 UTC
Consider the following code:

    #include <vector>
    const std::size_t N = 1'000'000;
    std::vector<int> x(N);
    int main() {}

Then:

    $ hyperfine 'g++ -std=c++20 -O2 x.cpp' 'g++ -std=c++17 -O2 x.cpp'
    Benchmark 1: g++ -std=c++20 -O2 x.cpp
      Time (mean ± σ):      4.945 s ±  0.116 s    [User: 4.676 s, System: 0.229 s]
      Range (min … max):    4.770 s …  5.178 s    10 runs
     
    Benchmark 2: g++ -std=c++17 -O2 x.cpp
      Time (mean ± σ):     491.3 ms ±  24.0 ms    [User: 440.9 ms, System: 46.3 ms]
      Range (min … max):   465.6 ms … 538.0 ms    10 runs
     
    Summary
      g++ -std=c++17 -O2 x.cpp ran
       10.07 ± 0.55 times faster than g++ -std=c++20 -O2 x.cpp

If you remove the `const` from `N`, the compile time is much closer to C++17 levels.
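That is, the following variant compiles quickly under both standards:

    #include <vector>
    std::size_t N = 1'000'000;
    std::vector<int> x(N);
    int main() {}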

`-ftime-report` suggests that "constant expression evaluation" is the reason. I imagine this is related to C++20 making std::vector constexpr.
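The timing breakdown can be reproduced with, e.g.:

    $ g++ -std=c++20 -O2 -ftime-report x.cpp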
Comment 1 Richard Biener 2024-02-09 06:58:31 UTC
Confirmed with -std=c++20 -fsyntax-only

 constant expression evaluation     :   1.80 ( 85%)   0.03 ( 14%)   1.84 ( 78%)   220M ( 88%)
 TOTAL                              :   2.13          0.22          2.36          250M


Samples: 8K of event 'cycles', Event count (approx.): 9294971478                                             
Overhead       Samples  Command  Shared Object     Symbol                                                    
  16.33%          1385  cc1plus  cc1plus           [.] cxx_eval_constant_expression
   4.35%           369  cc1plus  cc1plus           [.] cxx_eval_call_expression
   3.90%           331  cc1plus  cc1plus           [.] cxx_eval_store_expression
   3.16%           268  cc1plus  cc1plus           [.] hash_table<hash_map<tree_node*, tree_node*, simple_has
   2.55%           216  cc1plus  cc1plus           [.] hash_table<hash_map<tree_node*, tree_node*, simple_has
   2.15%           183  cc1plus  cc1plus           [.] ggc_internal_alloc
   2.04%           173  cc1plus  cc1plus           [.] hash_table<int_cst_hasher, false, xcallocator>::find_s
   1.98%           168  cc1plus  cc1plus           [.] tree_operand_check

GCC 12 was fast (possibly std::vector wasn't constexpr there?)
Comment 2 Jonathan Wakely 2024-02-09 13:42:58 UTC
(In reply to Richard Biener from comment #1)
> GCC 12 was fast (possibly std::vector wasn't constexpr there?)

It was constexpr since 12.1.0

So this might be related to Jason's implicit constexpr changes instead.
Comment 3 Jakub Jelinek 2024-02-12 10:45:02 UTC
The most significant slowdown (approximate only, as I've been bisecting with a mostly -O0 built compiler) comes from r13-6421-gcbaa1d9c218d9c0b5e34e510a462ff4e299a0f3f.
Though I have no idea what could be done about it, because the standard requires that...
Comment 4 Patrick Palka 2024-02-12 15:00:26 UTC
r13-6421 just makes us use mce_true and mce_unknown during trial constant evaluation of x's initialization, so my guess is that before that commit the evaluation was quickly aborted as soon as it saw a std::is_constant_evaluated() call, whereas afterwards we evaluate is_constant_evaluated() to true and keep going until we hit constexpr_loop_limit / constexpr_ops_limit.  So perhaps we should use reduced limits for trial constant evaluation?
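A minimal self-contained sketch of that mechanism (a hypothetical stand-in, not libstdc++'s actual code): once std::is_constant_evaluated() is treated as true during trial constant evaluation of a non-constexpr variable's initializer, the evaluator no longer bails out early and instead walks the whole per-element initialization.

    // Compile with -std=c++20. The global `x` is not constexpr, but GCC
    // still attempts trial constant evaluation of its initializer; with
    // is_constant_evaluated() evaluated as true, that trial runs the
    // element loop until it hits the constexpr loop/ops limit and only
    // then gives up, falling back to dynamic initialization.
    #include <cstddef>
    #include <type_traits>

    struct toy_vector {
        int* data = nullptr;
        std::size_t size = 0;

        constexpr toy_vector(std::size_t n) : size(n) {
            if (std::is_constant_evaluated()) {
                data = new int[n];               // constexpr allocation (C++20)
                for (std::size_t i = 0; i < n; ++i)
                    data[i] = 0;                 // one evaluation per element
            } else {
                data = new int[n]();             // runtime path
            }
        }
        constexpr ~toy_vector() { delete[] data; }
    };

    const std::size_t N = 1'000'000;
    toy_vector x(N);
    int main() {}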
Comment 5 Patrick Palka 2024-02-12 15:28:24 UTC
Relatedly, if we actually make x constexpr

#include <vector>
const std::size_t N = 1'000'000;
constexpr std::vector<int> x(N);
int main() {}

we end up evaluating its expensive initialization (and hitting the constexpr loop/ops limit) four times:

1. expand_default_init -> maybe_constant_init
2. store_init_value -> maybe_constant_value
3. store_init_value -> maybe_constant_init
4. store_init_value -> cxx_constant_init

The first three silently fail due to hitting the ops limit, and the fourth actually diagnoses the failure.  This seems wasteful; we should silently evaluate such an initializer only once.

For the original testcase, where x is not constexpr, we constant-evaluate the initialization just once (it silently fails due to the ops limit), but that single evaluation unfortunately still takes a few seconds.
Comment 6 Jakub Jelinek 2024-02-12 15:38:32 UTC
Should we really be reaching the limits for cases like this?
I mean, all the default ctors produce the same value here, and we could figure out that they depend in no way on the exact position of the element in the vector and do not allocate heap memory during constexpr evaluation. So ideally we would optimize and just fill in the whole initializer with the same value (ideally using a range for the ctor index, to save compile-time memory as well).
I guess the hard part is finding out the cases where the constexpr evaluation of an element would depend on something that prevents such an optimization.
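A conceptual sketch of the property that optimization would rely on (hypothetical illustration only, not GCC's implementation): the per-element initializer yields the same value regardless of the index, so a single evaluation could stand in for all N elements.

    // Value-initialization of int produces 0 no matter which element is
    // being constructed, so a constant evaluator could evaluate it once
    // and record "0 for indices [0, N)" instead of performing N separate
    // constructor evaluations.
    #include <cstddef>

    constexpr int make_element(std::size_t /*index*/) {
        return int{};                    // independent of the index
    }

    constexpr bool same_for_all_indices() {
        constexpr std::size_t n = 8;
        int once = make_element(0);      // evaluate a single representative
        for (std::size_t i = 0; i < n; ++i)
            if (make_element(i) != once) // every index produces that value
                return false;
        return true;
    }

    static_assert(same_for_all_indices());
    int main() {}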
Comment 7 Jakub Jelinek 2024-05-21 09:19:00 UTC
GCC 13.3 is being released, retargeting bugs to GCC 13.4.
Comment 8 Patrick Palka 2025-02-10 15:14:20 UTC
Changing the component to c++; this is really a front-end issue.