#include <string>
#include <vector>
#define T10 "0", "1", "2", "3", "4", "5", "6", "7", "8", "9"
#define T100 T10, T10, T10, T10, T10, T10, T10, T10, T10, T10
#define T1000 T100, T100, T100, T100, T100, T100, T100, T100, T100, T100
#define T10000 T1000, T1000, T1000, T1000, T1000, T1000, T1000, T1000, T1000, T1000
#define T100000 T10000, T10000, T10000, T10000, T10000, T10000, T10000, T10000, T10000, T10000
int main () { std::vector<std::string> v { T100000, T100000 }; }

This ICEs during GC; 200000 nested try {} finally {} gimple stmts are just too much for typical stack size limits. Perhaps the compiler should notice that all initializers (over a certain threshold, perhaps a --param?) have the same type and need to be constructed the same way (or at least a significant range of them, e.g. if there are 1000 constant string literals, then one other initializer that needs another kind of constructor, then 10000 further initializers that need yet another kind), and expand it as a loop over some const POD array instead of emitting huge spaghetti code?
I only see a single try {} finally {} generated by the FE. The remaining ones look like they are generated by gimplification, to be able to output CLOBBERs. I think for large initializer lists it would be good to emit a sorry () and impose a compile-time limit on their element count; is something like this allowed by the standard? Because I can't see how emitting more optimal code is possible in general at all...
It's allowed by Annex B in the standard (although I don't see a specific mention of a suggested minimum for the number of initializers in a braced-init-list):

  "Because computers are finite, C++ implementations are inevitably limited in the size of the programs they can successfully process. Every implementation shall document those limitations where known. This documentation may cite fixed limits where they exist, say how to compute variable limits as a function of available resources, or say that fixed limits do not exist or are unknown."
Let's look at

#include <string>
#include <vector>
int main () { std::vector<std::string> v { "0", "1", "2" }; }

which is IMHO below the limit for the proposed optimization, but on the other hand easier to explain. We already have a const struct basic_string D.26903[3]; array there; the proposal is that if the C++ FE finds a long enough range of initializers with the same type, it can also add

static const char *D.xxxxx[3] = { "0", "1", "2" };

and emit

{
  struct vector v;
  try
    {
      std::allocator<char>::allocator (&D.26895);
      int D.yyyyy;
      D.yyyyy = 0;
      try
        {
          while (D.yyyyy < 3)
            {
              std::basic_string<char>::basic_string (&D.26903[D.yyyyy], D.xxxxx[D.yyyyy], &D.26895);
              D.yyyyy = D.yyyyy + 1;
            }
          ...
        }
      finally
        {
          // Destruct only what has been constructed already, in reverse order; use D.yyyyy for that.
        }
    }
  ...

It doesn't have to be just char literals; it can be any other kind of constants. It will certainly result in better generated code: the unroller can unroll it later if beneficial, and it will help inlining etc., compared to having hundreds of thousands of statements just to construct something.
GCC 7.1 has been released.
GCC 7.3 is being released, adjusting target milestone.