Bug 113835 - [13 Regression] compiling std::vector with const size in C++20 is slow
Summary: [13 Regression] compiling std::vector with const size in C++20 is slow
Status: ASSIGNED
Alias: None
Product: gcc
Classification: Unclassified
Component: c++ (show other bugs)
Version: 13.2.1
: P2 normal
Target Milestone: 13.5
Assignee: Jason Merrill
URL:
Keywords: compile-time-hog
: 118809 (view as bug list)
Depends on:
Blocks: constexpr std::vector
  Show dependency treegraph
 
Reported: 2024-02-08 17:11 UTC by Barnabás Pőcze
Modified: 2025-06-05 17:10 UTC (History)
9 users (show)

See Also:
Host:
Target:
Build:
Known to work: 12.2.1
Known to fail: 13.2.1, 14.0
Last reconfirmed: 2024-02-09 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Barnabás Pőcze 2024-02-08 17:11:36 UTC
Consider the following code:

    #include <vector>
    const std::size_t N = 1'000'000;
    std::vector<int> x(N);
    int main() {}

Then:

    $ hyperfine 'g++ -std=c++20 -O2 x.cpp' 'g++ -std=c++17 -O2 x.cpp'
    Benchmark 1: g++ -std=c++20 -O2 x.cpp
      Time (mean ± σ):      4.945 s ±  0.116 s    [User: 4.676 s, System: 0.229 s]
      Range (min … max):    4.770 s …  5.178 s    10 runs
     
    Benchmark 2: g++ -std=c++17 -O2 x.cpp
      Time (mean ± σ):     491.3 ms ±  24.0 ms    [User: 440.9 ms, System: 46.3 ms]
      Range (min … max):   465.6 ms … 538.0 ms    10 runs
     
    Summary
      g++ -std=c++17 -O2 x.cpp ran
       10.07 ± 0.55 times faster than g++ -std=c++20 -O2 x.cpp

If you remove the `const` from `N`, the runtime will be closer to C++17 levels.

`-ftime-report` suggests that "constant expression evaluation" is the reason. I imagine this is related to C++20 making std::vector constexpr.
Comment 1 Richard Biener 2024-02-09 06:58:31 UTC
Confirmed with -std=c++20 -fsyntax-only

 constant expression evaluation     :   1.80 ( 85%)   0.03 ( 14%)   1.84 ( 78%)   220M ( 88%)
 TOTAL                              :   2.13          0.22          2.36          250M


Samples: 8K of event 'cycles', Event count (approx.): 9294971478                                             
Overhead       Samples  Command  Shared Object     Symbol                                                    
  16.33%          1385  cc1plus  cc1plus           [.] cxx_eval_constant_expression
   4.35%           369  cc1plus  cc1plus           [.] cxx_eval_call_expression
   3.90%           331  cc1plus  cc1plus           [.] cxx_eval_store_expression
   3.16%           268  cc1plus  cc1plus           [.] hash_table<hash_map<tree_node*, tree_node*, simple_has
   2.55%           216  cc1plus  cc1plus           [.] hash_table<hash_map<tree_node*, tree_node*, simple_has
   2.15%           183  cc1plus  cc1plus           [.] ggc_internal_alloc
   2.04%           173  cc1plus  cc1plus           [.] hash_table<int_cst_hasher, false, xcallocator>::find_s
   1.98%           168  cc1plus  cc1plus           [.] tree_operand_check

GCC 12 was fast (possibly std::vector wasn't constexpr there?)
Comment 2 Jonathan Wakely 2024-02-09 13:42:58 UTC
(In reply to Richard Biener from comment #1)
> GCC 12 was fast (possibly std::vector wasn't constexpr there?)

It was constexpr since 12.1.0

So this might be related to Jason's implicit constexpr changes instead.
Comment 3 Jakub Jelinek 2024-02-12 10:45:02 UTC
The most significant slowdown (though just approximate, I've been bisecting that in an mostly -O0 built compiler) is r13-6421-gcbaa1d9c218d9c0b5e34e510a462ff4e299a0f3f
Though, no idea what could be done against it, because the standard requires that...
Comment 4 Patrick Palka 2024-02-12 15:00:26 UTC
r13-6421 just makes us use mce_true and mce_uknown during trial constant evaluation of x's initialization, so my guess is before that commit the evaluation was quickly aborted as soon as it saw a std::is_constant_evaluated() call, and afterwards we evaluate is_constant_evaluated() to true and keep going until we hit constexpr_loop_limit / constexpr_ops_limit.  So perhaps we should have reduced limits for trial constant evaluation?
Comment 5 Patrick Palka 2024-02-12 15:28:24 UTC
Relatedly, if we actually make x constexpr

#include <vector>
const std::size_t N = 1'000'000;
constexpr std::vector<int> x(N);
int main() {}

we end up evaluating its expensive initialization (and hitting the constexpr loop/ops limit) four times:

1. expand_default_init -> maybe_constant_init
2. store_init_value -> maybe_constant_value
3. store_init_value -> maybe_constant_init
4. store_init_value -> cxx_constant_init

The first three silently fail due to hitting the ops limit, and the fourth actually diagnoses the failure.  This seems wasteful, we should silently evaluate such an initializer once.

For the original testcase where x is not constexpr, we constant evaluate the initialization just once (which silently fails due to the ops limit), which still takes a few seconds unfortunately.
Comment 6 Jakub Jelinek 2024-02-12 15:38:32 UTC
Should we really be reaching the limits for cases like this?
I mean, all the default ctors produce the same value here and we could figure out that
they are in no way dependent on the exact position of the element in the vector, nor allocating heap memory during constexpr evaluation, so ideally we'd optimize and just fill in the whole initializer with the same value (ideally using range for the ctor index to even save compile time memory).
Guess the hard part is find out the cases where the constexpr evaluation for element would be dependent on something that would prevent such an optimization.
Comment 7 Jakub Jelinek 2024-05-21 09:19:00 UTC
GCC 13.3 is being released, retargeting bugs to GCC 13.4.
Comment 8 Patrick Palka 2025-02-10 15:14:20 UTC
Changing component to c++, this is really a front end issue.
Comment 9 Jason Merrill 2025-04-11 15:05:02 UTC
Removing the maybe_constant_init from expand_default_init speeds up the testcase by more than 10x, but also regresses a bunch of testcases, so it's not that simple.
Comment 10 Jason Merrill 2025-04-11 16:02:43 UTC
(In reply to Jason Merrill from comment #9)
> Removing the maybe_constant_init from expand_default_init speeds up the
> testcase by more than 10x, but also regresses a bunch of testcases, so it's
> not that simple.

Yeah, the speedup was from never trying to do constant evaluation at all; properly doing it in store_init_value instead provides no improvement.
Comment 11 Jason Merrill 2025-04-12 00:28:43 UTC
One issue is this in stl_uninitialized.h:

    __uninitialized_default_n(_ForwardIterator __first, _Size __n)
    {
#ifdef __cpp_lib_is_constant_evaluated
      if (std::is_constant_evaluated())
        return __uninitialized_default_n_1<false>::
                 __uninit_default_n(__first, __n);
#endif

The <false> forces us to go take the non-trivial initialization path even for a range of ints.  Disabling this 'if' so we can take the trivial path speeds up compilation by about 35%.  We still hit constexpr_loop_limit, since __fill_a1 is also a loop (since we don't support constexpr __builtin_memset yet) but we get there a lot faster.
Comment 12 Jonathan Wakely 2025-04-12 09:39:01 UTC
The __uninitialized_default_n_1<true> specialization assumes that we can use std::fill_n to assign to objects outside their lifetime. I don't think that's valid during constant evaluation, is it? That's why we use the <false> specialization which begins the lifetime of every element using std::_Construct.

GCC accepts this, but Clang rejects it:

#include <algorithm>
#include <memory>

consteval bool f(int n)
{
  int* p = std::allocator<int>().allocate(n);
  std::fill_n(p, n, 99);
  std::allocator<int>().deallocate(p, n);
  return true;
}

static_assert(f(10));


uninit.cc:12:15: error: static assertion expression is not an integral constant expression
   12 | static_assert(f(10));
      |               ^~~~~
/usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/stl_algobase.h:952:11: note: assignment to object outside its lifetime is not allowed in a constant expression
  952 |         *__first = __tmp;
      |         ~~~~~~~~~^~~~~~~



If that is supposed to work, we could remove the __cpp_lib_is_constant_evaluated group in __uninitialized_default_n and just let std::fill_n decide whether to do things differently for constant eval ... but I don't think it's valid.
Comment 13 Jonathan Wakely 2025-04-12 09:41:59 UTC
Hmm, __uninitialized_default doesn't have a std::is_constant_evaluated check, but I think it needs it for exactly the same reasons.
Comment 14 Jonathan Wakely 2025-04-12 10:32:31 UTC
(In reply to Jonathan Wakely from comment #13)
> Hmm, __uninitialized_default doesn't have a std::is_constant_evaluated
> check, but I think it needs it for exactly the same reasons.

I've opened Bug 119754 for that.
Comment 15 Jason Merrill 2025-04-12 13:35:15 UTC
(In reply to Jonathan Wakely from comment #12)
> The __uninitialized_default_n_1<true> specialization assumes that we can use
> std::fill_n to assign to objects outside their lifetime. I don't think
> that's valid during constant evaluation, is it?

No, you're right; I guess I was thinking that for a type with a trivial default constructor, assignment could start the lifetime like it does for a union member, or that allocate would vacuously initialize the elements of an array of such type.

One of those being needed because we don't get implicit object creation in constant evaluation.
Comment 16 Jason Merrill 2025-04-12 14:08:06 UTC
(In reply to Patrick Palka from comment #4)
> r13-6421 just makes us use mce_true and mce_uknown during trial constant
> evaluation of x's initialization, so my guess is before that commit the
> evaluation was quickly aborted as soon as it saw a
> std::is_constant_evaluated() call, and afterwards we evaluate
> is_constant_evaluated() to true and keep going until we hit
> constexpr_loop_limit / constexpr_ops_limit.  So perhaps we should have
> reduced limits for trial constant evaluation?

The issue is that as I mentioned in comment #10, the trial mce_unknown evaluation was the only evaluation, so before that commit the evaluation was quickly aborted and then we never try again.

So it's not clear to me that this is a regression: now we are actually trying to do manifest constant-initialization as specified by the standard, and failing due to implementation limits.  Previously we wrongly didn't try.  Admittedly, we end up with the same result after taking much longer, so that's definitely a regression in the user experience.

Incidentally, perhaps we want to give a diagnostic about hitting implementation limits even in contexts like this that allow non-constant results.
Comment 17 Jason Merrill 2025-04-12 14:17:28 UTC
Clang shows the same behavior, taking a long time to give up and do dynamic initialization.
Comment 18 Jason Merrill 2025-04-12 15:20:37 UTC
If I increase the limits (or reduce N) enough that we get through the whole evaluation, constant-evaluation still fails right at the end because the result refers to the result of operator new and we don't have non-transient constexpr allocation (P3554) yet.  I'm not sure how we could shortcut that failure in general, but I suppose we could recognize std::vector and bail out early.
Comment 19 GCC Commits 2025-04-15 03:26:33 UTC
The trunk branch has been updated by Jason Merrill <jason@gcc.gnu.org>:

https://gcc.gnu.org/g:764f02327f7b2dc6ac5abaf89038e51cf0ee6d13

commit r15-9474-g764f02327f7b2dc6ac5abaf89038e51cf0ee6d13
Author: Jason Merrill <jason@redhat.com>
Date:   Sat Apr 12 11:35:18 2025 -0400

    c++: shortcut constexpr vector ctor [PR113835]
    
    Since std::vector became usable in constant evaluation in C++20, a vector
    variable with static storage duration might be manifestly
    constant-evaluated, so we properly try to constant-evaluate its initializer.
    But it can never succeed since the result will always refer to the result of
    operator new, so trying is a waste of time.  Potentially a large waste of
    time for a large vector, as in the testcase in the PR.
    
    So, let's recognize this case and skip trying constant-evaluation.  I do
    this only for the case of an integer argument, as that's the case that's
    easy to write but slow to (fail to) evaluate.
    
    In the test, I use dg-timeout-factor to lower the default timeout from 300
    seconds to 15; on my laptop, compilation without the patch takes about 20
    seconds versus about 2 with the patch.
    
            PR c++/113835
    
    gcc/cp/ChangeLog:
    
            * constexpr.cc (cxx_eval_outermost_constant_expr): Bail out early
            for std::vector(N).
    
    gcc/testsuite/ChangeLog:
    
            * g++.dg/cpp2a/constexpr-vector1.C: New test.
Comment 20 Jason Merrill 2025-04-15 14:58:18 UTC
*** Bug 118809 has been marked as a duplicate of this bug. ***
Comment 21 GCC Commits 2025-04-28 20:13:40 UTC
The releases/gcc-14 branch has been updated by Jason Merrill <jason@gcc.gnu.org>:

https://gcc.gnu.org/g:8dce1aa0579ab86a626e24c0af29455f30305595

commit r14-11691-g8dce1aa0579ab86a626e24c0af29455f30305595
Author: Jason Merrill <jason@redhat.com>
Date:   Sat Apr 12 11:35:18 2025 -0400

    c++: shortcut constexpr vector ctor [PR113835]
    
    Since std::vector became usable in constant evaluation in C++20, a vector
    variable with static storage duration might be manifestly
    constant-evaluated, so we properly try to constant-evaluate its initializer.
    But it can never succeed since the result will always refer to the result of
    operator new, so trying is a waste of time.  Potentially a large waste of
    time for a large vector, as in the testcase in the PR.
    
    So, let's recognize this case and skip trying constant-evaluation.  I do
    this only for the case of an integer argument, as that's the case that's
    easy to write but slow to (fail to) evaluate.
    
    In the test, I use dg-timeout-factor to lower the default timeout from 300
    seconds to 15; on my laptop, compilation without the patch takes about 20
    seconds versus about 2 with the patch.
    
    is_std_class comes from r15-4953.
    
            PR c++/113835
    
    gcc/cp/ChangeLog:
    
            * cp-tree.h (is_std_class): Declare.
            * constexpr.cc (is_std_class): New function.
            (is_std_allocator): Use it.
            (cxx_eval_outermost_constant_expr): Bail out early
            for std::vector(N).
    
    gcc/testsuite/ChangeLog:
    
            * g++.dg/cpp2a/constexpr-vector1.C: New test.
    
    (cherry picked from commit 764f02327f7b2dc6ac5abaf89038e51cf0ee6d13)
Comment 22 Jakub Jelinek 2025-06-05 17:10:37 UTC
GCC 13.4 is being released, retargeting bugs to GCC 13.5.