Bug 4131 - The C++ compiler doesn't place a const class object to ".rodata" section with non trivial constructor
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 2.95
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Duplicates: 22575 30023 31785 44638 53829 66355 78773 84103 110925
Depends on: 92538
Blocks: 79189 93666 102876
Reported: 2001-08-26 09:46 UTC by k_satoda
Modified: 2024-02-04 04:04 UTC (History)

Last reconfirmed: 2023-08-04 00:00:00


Description k_satoda 2001-08-26 09:46:01 UTC
I want to use a fixed-point value class as a replacement for floating-point values.
But a const instance of my class is never placed in the ".rodata" section.
Such objects seem to need ".ctors"; however, the constructing code consists only of constant instructions that store a fixed value.
I think it doesn't need any code, and needs only a few bytes of the ".rodata" section.
I tried the same test on several compilers, but none of them generates the code I want.
This can be a headache on machines that have little RAM.

Please excuse my poor English typing. (I am Japanese.)

Release:
unknown-2.9

Environment:
various

How-To-Repeat:
// compile the following with maximum optimization.
class T
{
	int raw; 
public:
	enum { BASE = (1<< 8) };
	template<class value_type> T(value_type opr) : raw(static_cast<int>(opr * BASE)) {}
	template<class value_type> operator value_type() const { return static_cast<value_type>(raw) / BASE; }
};

const T t1 = 1.0;
const int i1 = static_cast<int>(1.0 * T::BASE);

int main(void)
{
	return i1 ^ static_cast<int>(t1);
}
Comment 1 Richard Henderson 2002-10-06 20:08:19 UTC
State-Changed-From-To: open->analyzed
State-Changed-Why: Because it has a non-trivial constructor.  This could 
    probably still be optimized, but it's up to the front end
    to tell us that; changed to category c++.
Comment 2 Andrew Pinski 2003-12-28 22:40:55 UTC
Suspending as this will take a major change in the C++ front-end.
Comment 3 Andrew Pinski 2005-07-20 18:44:37 UTC
*** Bug 22575 has been marked as a duplicate of this bug. ***
Comment 4 Pawel Sikora 2005-07-20 18:53:53 UTC
hmm, i think someone should reopen this bug. 
4.1 is a good place for major changes in c++ front-end ;) 
 
Comment 5 Andrew Pinski 2005-07-20 18:58:31 UTC
(In reply to comment #4)
> hmm, i think someone should reopen this bug. 
> 4.1 is a good place for major changes in c++ front-end ;) 
Not any more since we are in stage3 already.
Comment 6 Paul Schlie 2005-07-20 19:03:13 UTC
(In reply to comment #5)
> (In reply to comment #4)
> > hmm, i think someone should reopen this bug. 
> > 4.1 is a good place for major changes in c++ front-end ;) 
> Not any more since we are in stage3 already.

- given that 4.1's front end has already evolved from that in 2.95,
   it's not clear that a conclusion based on 2.95 is even valid for 4.1
   (so it should not be assumed to be).
Comment 7 Andrew Pinski 2005-07-20 19:11:56 UTC
Subject: Re:  Why the C++ compiler don't place a const class object to ".rodata" section?


On Jul 20, 2005, at 3:03 PM, schlie at comcast dot net wrote:

> - given that 4.1's front end has already evolved from that in 2.95,
>    it's not clear that a conclusion based on 2.95 is even valid for 4.1
>    (so it should not be assumed to be).

It is still true, since the front end still does exactly what it did in 2.95 for
this testcase, and there have been no real changes in this area.

The mainline is in stage3, which means no more improvements; only fixes for
regressions or for bugs that are not enhancement requests are accepted.

-- Pinski

Comment 8 Wolfgang Bangerth 2005-07-20 19:47:11 UTC
It may be true that this bug isn't going to be fixed in this cycle,
but there's no reason not to keep it open instead of suspending it.
The "suspend" state is meant for PRs where we need external things to
happen, such as a defect report to be accepted. This clearly isn't the
case here.
 
I'll close this PR and reopen 22575 instead. 
 
W. 

*** This bug has been marked as a duplicate of 22575 ***
Comment 9 Andrew Pinski 2005-11-18 00:51:54 UTC
Reopening this bug since it is the correct one to keep open.
Comment 10 Andrew Pinski 2005-11-18 00:55:22 UTC
*** Bug 22575 has been marked as a duplicate of this bug. ***
Comment 11 Bjoern Haase 2006-08-10 12:11:40 UTC
Hi,

here is a much simpler test case for this issue.

Bjoern.



#include <complex>

using namespace std;

const complex<char> should_be_in_rodata (42,-42);
complex<char> should_be_in_data (42,-42);
complex<char> should_be_in_bss;

Comment 12 Mike Stump 2006-08-10 16:54:02 UTC
Trivially, one could use Turing completeness at compile time to achieve the desired result.  :-)  Not that I think that is better than `fixing' this bug.
Comment 13 Andrew Pinski 2006-08-10 17:52:58 UTC
Here is a real reduced testcase:

struct f
{
  f(int a) { t = a; }
  int t;
};

const f x(1);  // the object needs a name for the declaration to be valid

-----
Comment 14 Björn Haase 2006-08-10 19:33:58 UTC
I have already had a look at the code in the cp directory. Unfortunately, the documentation of the C++ front end seems to be even worse than the docs on the back ends (i.e. RTL). Either it is virtually nonexistent, or I didn't find any hint on where to find it.

Meanwhile I had a look at the tree dumps. Unfortunately, I didn't succeed in finding the initialization data for global "plain old built-in type" variables in the tree dumps. I have so far only seen the constructor code for initialization of class objects.


I'd at least like to have an idea of the complexity of the task. I have the impression that it might be way too difficult for me. But at least I'd like to try my very best to fix it before giving up.

So any hint on a starting point for code reading and analysis would be appreciated.

Bjoern.
Comment 15 Bjoern Haase 2006-08-11 07:48:52 UTC
I just realized that the subject line was changed yesterday.

I'd like to suggest that this new subject line is misleading:

The compiler doesn't place ANY object in .rodata . It's not necessary to have
a "non-trivial" constructor.
E.g. have a look at the constructors of the complex class template. There isn't any statement in the constructor. There is only the initialization of the member POD for the real and imaginary parts.

If one changes the subject line, I think that

"the compiler doesn't place any const class object in .rodata"

would be appropriate.

Bjoern.

Comment 16 Andrew Pinski 2006-08-11 08:04:29 UTC
Non trivial is the wording used by the C++ standard which is why I used it.  (it is also called user defined constructor).
Comment 17 Björn Haase 2006-08-17 14:36:18 UTC
I have made a superficial analysis of the issue and would like to discuss, at
the end of this post, a possible approach for resolving PR4131.

The first observation is that, given a code segment like

/* Start */
#include <complex>
using namespace std;
typedef complex<int> ci;

ci fa [3] = {ci(1,2), ci(-1,-2), ci(-42,+42)};
/* End */

the gimple optimizers will yield a very simple code sequence like

/* Start of gimple code */
void __static_initialization_and_destruction_0(int, int) (__initialize_p, __priority)
{
<bb 2>:
  if (__initialize_p == 1) goto <L0>; else goto <L2>;

<L0>:;
  if (__priority == 65535) goto <L1>; else goto <L2>;

<L1>:;
  fa[0]._M_real = 1;
  fa[0]._M_imag = 2;
  fa[1]._M_real = -1;
  fa[1]._M_imag = -2;
  fa[2]._M_real = -42;
  fa[2]._M_imag = 42;

<L2>:;
  return;
}
/* End of gimple code */

for the constructor function. I think there is hope that one would
cover the most important cases by replacing expressions of the form
 some_direct_address_in_data_member = const_immediate_integer;
in the constructors with values stored in the .data initializers.
That is, one would place the values in the initialization memory region and
delete the assignment expressions.
If, at the end of this process, the constructor function no longer contains
references to the data structure, a "const"-qualified VAR_DECL could even be
placed in ".rodata".

Thus, to fix PR4131 I'd like to suggest the following:

1.) Change the definition of the VAR_DECL so that DECL_INITIAL always points
    to a memory region holding initialization data, i.e. also in the case
    where there is constructor code. Initially the memory region would
    be initialized to 0.
2.) In order to do this, one would need to replace the tests
    "DECL_INITIAL(decl) == error_mark_node" by tests against one of the unused
    flags in tree_decl_common, which would be assigned a new meaning.
    E.g., one might take "decl_flag_0", which seems to be unused so far
    for VAR_DECL.
3.) One would then add a new tree optimization pass located somewhere
    close to the end of tree optimization. It would look for
    expressions like "static_direct_address = const_immediate_value", as
    in the sample gimple code above.
    It would insert the values into the DECL_INITIAL(decl) memory region and
    delete the corresponding expression statements in the constructor function.
    After making all the possible replacements, it would revisit the
    code of the constructor function.
    If more complex references to the VAR_DECL remain within the constructor
    code and cannot be removed easily, it would set a second flag
    in "tree_decl_common" stating that the VAR_DECL needs to reside in RAM,
    even if it's a const object.
4.) In "varasm.c" one would check whether DECL_INITIAL(decl) is completely
    zero. In this case the object would go into .bss. If the initialization
    memory region is not entirely zero, one would place the object in .data.
    If it's a const object without the flag
    "needs_to_reside_in_ram_even_if_const", it would be placed into .rodata.


IMO, the most complicated part would be the new tree pass in 3.).

Namely, one would need to find the appropriate branch in

void __static_initialization_and_destruction_0(int, int) (__initialize_p, __priority);

look for expression statements of the form
"direct_address_expression = immediate_integer_value;" that do not reside
inside loops or other complicated structures, and delete them if possible.
Then one would check whether any reference to the VAR_DECL remains
in the FUNCTION_DECL of the constructor function. If there is still a reference,
one would set the "needs_to_reside_in_ram_even_if_const" flag.
Otherwise one would clear it.

I would be willing to start with implementing 1, 2 and 4, but I am quite sure
that I would need help with 3.

Bjoern.
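
The four steps above can be sketched as a toy model. This is only an illustration under stated assumptions, not GCC code: Var, Stmt, fold_constant_stores and choose_section are hypothetical names, the image vector stands in for the DECL_INITIAL memory region, and needs_ram stands in for the proposed "needs_to_reside_in_ram_even_if_const" flag. The model also assumes that folding stops at the first statement it cannot handle, which the proposal does not spell out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only; none of these types exist in GCC.
enum class StmtKind { StoreConst, Other };
enum class Section { Bss, Data, Rodata };

struct Stmt {
    StmtKind kind;
    std::size_t index;  // which word of the object is written (StoreConst only)
    int value;          // immediate being stored (StoreConst only)
};

struct Var {
    bool is_const;
    std::vector<int> image;       // stands in for DECL_INITIAL, zero-filled at first
    std::vector<Stmt> init_code;  // statements of the init function that touch this object
    bool needs_ram;               // "needs_to_reside_in_ram_even_if_const"
};

// Step 3: move "direct_address = immediate" stores into the image and delete them.
void fold_constant_stores(Var& v)
{
    std::vector<Stmt> remaining;
    for (const Stmt& s : v.init_code) {
        // Assumption: once one statement has to stay, every later one stays too,
        // because it may depend on what the remaining code did.
        const bool foldable = remaining.empty()
                           && s.kind == StmtKind::StoreConst
                           && s.index < v.image.size();
        if (foldable)
            v.image[s.index] = s.value;
        else
            remaining.push_back(s);
    }
    v.init_code = remaining;
    v.needs_ram = !remaining.empty();
}

// Step 4: pick the output section from the flags and the image contents.
Section choose_section(const Var& v)
{
    if (v.is_const && !v.needs_ram)
        return Section::Rodata;
    for (int word : v.image)
        if (word != 0)
            return Section::Data;
    return Section::Bss;
}

// One const two-word object initialized by two constant stores.
Section place_example()
{
    Var c{true, {0, 0},
          {{StmtKind::StoreConst, 0, 1}, {StmtKind::StoreConst, 1, 2}},
          false};
    fold_constant_stores(c);
    assert(c.image[0] == 1 && c.image[1] == 2 && c.init_code.empty());
    return choose_section(c);
}
```

In this model a const object whose stores all fold away ends up in Section::Rodata, while any statement left behind keeps it in writable memory. Treating an all-zero const object as .rodata rather than .bss is a choice made for the sketch.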
Comment 18 Andrew Pinski 2006-11-29 23:22:14 UTC
*** Bug 30023 has been marked as a duplicate of this bug. ***
Comment 19 Andrew Pinski 2007-05-02 14:55:43 UTC
*** Bug 31785 has been marked as a duplicate of this bug. ***
Comment 20 Jorg Brown 2008-06-11 00:07:37 UTC
Interesting, but I'm not sure this can legally be done.

Consider:

struct POD {
  int x;
  int y;
};

struct nonPOD {
  int x;
  int y;
  nonPOD(int xx, int yy) : x(xx), y(yy) { }
};

I, for one, would love to see "nonPOD foo(1, 2);" be treated as efficiently as "POD foo = {1, 2};", and I would argue that the case that should be optimized is when the arguments to the constructor are known at compile time, the body of the constructor is completely empty, and all the member variables are POD.

However let us consider this example program, starting with the definitions above and the requisite header files and then:

extern const POD    pod;
extern const nonPOD nonpod;

std::string podStr(pod.x, '*');
std::string nonpodStr(nonpod.x, '*');

const POD pod = {1, 2};
const nonPOD nonpod(1, 2);

int main(int argc, char *argv[]) {
  std::cout << "podStr = '" << podStr << "'\n";
  std::cout << "nonpodStr = '" << nonpodStr << "'\n";
  return 0;
}

Now, the order of construction is well-defined, and that is why the program produces:

podStr = '*'
nonpodStr = ''

That is, the nonPOD is still zero-filled when the constructor for nonpodStr runs, so nonpodStr ends up empty.

I believe that if you change nonPOD so that it sits in .rodata, then it has to be initialized prior to nonpodStr.  This changes the behavior of this perfectly valid program.  No?

(Just for the record, I would whole-heartedly endorse a change to the C++ standard to allow this optimization)
Comment 21 Andrew Pinski 2008-06-11 02:17:35 UTC
Well if done correctly the compiler would see that nonpod.x was used for the initialization and inline it as zero :).  So really this can be still done.
Comment 22 Jeffrey Yasskin 2008-06-11 18:05:18 UTC
This is related to generalized constant expressions (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2235.pdf) in C++0x. Those will be marked by the explicit 'constexpr' keyword and will require the initialization to be done at static rather than dynamic initialization time, while this bug is about the optional optimization of moving some extra objects from dynamic to static time.

If I understand it correctly, in C++0x, the following code will require f to be placed in either the .rodata or .data sections, rather than .bss as it's placed now.

struct Foo
{
  constexpr Foo(int a) : t(a) {}
  int t;
};

constexpr Foo f(1);


I'd also like to point out that with the extra optimization described here, the following code could also place f in the .data section:

struct Foo
{
  constexpr Foo(int a) : t(a) {}
  int t;
};

Foo f(1);  // Note that f is non-const.

This would be useful for getting atomic variables initialized before anything else starts up, but it may well belong in a separate feature request.
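
A minimal sketch of what static initialization of a non-const object looks like to the rest of the program, written with a mem-initializer so that it is valid C++11. The names seen_before_definition and f_was_ready_early are made up for this illustration. Because f is initialized by a constexpr constructor with a constant argument, it is initialized statically, so an initializer that comes earlier in the same translation unit already sees the final value. Which section the object lands in remains up to the compiler and target.

```cpp
#include <cassert>

struct Foo
{
    constexpr Foo(int a) : t(a) {}
    int t;
};

extern Foo f;

// f is not const, so f.t is not a constant expression here; this initializer
// comes before the definition of f in the translation unit.
int seen_before_definition = f.t;

Foo f(1);  // non-const, but constant-initialized

// Returns true if f already held its final value when the earlier
// initializer read it.
bool f_was_ready_early()
{
    assert(f.t == 1);
    return seen_before_definition == 1;
}
```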
Comment 23 Andrew Pinski 2010-06-27 22:09:47 UTC
*** Bug 44638 has been marked as a duplicate of this bug. ***
Comment 24 Jakub Jelinek 2010-06-28 11:05:10 UTC
On:
extern "C" void abort ();

struct S
{
  int x;
  int y;
};

struct T
{
  int x;
  int y;
  T (int u, int v) : x (u), y (v) {}
};

extern const S s;
extern const T t, u;

int sx = s.x;
int tx = t.x;
const S s = { 1, 2 };
const T t (1, 2);
const T u (1, 2);
int ux = u.x;

int
main ()
{
  if (sx != 1 || tx != 0 || ux != 1)
    abort ();
  if (s.x != 1 || s.y != 2)
    abort ();
  if (t.x != 1 || t.y != 2)
    abort ();
  if (u.x != 1 || u.y != 2)
    abort ();
  return 0;
}

it is easy to spot whether this optimization would be possible or not by looking at TREE_USED of the decl at check_initializer time.  It is set for t (and s), but cleared for u.
Comment 25 Jakub Jelinek 2010-06-28 11:22:26 UTC
I guess the best approach would be to wait for the constexpr work, then use that as infrastructure to discover ctors that aren't marked constexpr but could be, and use that bit together with !TREE_USED during check_initializer to do this optimization.
Comment 26 Thiago Macieira 2011-07-08 08:03:50 UTC
Using GCC 4.6, which does support constexpr in C++0x mode, it turns out that the compiler does place initialised variables in the .data section. However, const variables are still in .data, not in .rodata.

I don't think they are the same issues, so I reported Bug 49673.
Comment 27 Andrew Pinski 2012-07-02 07:12:27 UTC
*** Bug 53829 has been marked as a duplicate of this bug. ***
Comment 28 Kenneth Almquist 2015-06-10 01:04:22 UTC
*** Bug 66355 has been marked as a duplicate of this bug. ***
Comment 29 Kenneth Almquist 2015-06-10 07:52:02 UTC
(In reply to Jorg Brown from comment #20)
> Now, the order of construction is well-defined, and that is why the program
> produces:
> 
> podStr = '*'
> nonpodStr = ''
> 
> That is, the nonPOD is still zero-filled when the constructor for nonpodStr
> runs, so nonpodStr ends up empty.

Actually, looking at paragraph 2 of section 3.6.2 of the 1997 draft standard, it appears that the order of construction is not entirely well defined.  Consider:

     extern int x, y, z;
     int z = y + 1;
     int y = x + 1;
     int x = 10;

As I understand it, the compiler must use static initialization for x, but has the option of using either static or dynamic initialization for y and z.  Normally, variables are initialized in program order, so that z would be initialized before y.  However, if the compiler chooses dynamic initialization for z and static initialization for y, then z will be initialized after y.  Similarly, in your example, nonpodStr could be either '' or '*'.

There is one special rule in evaluating the initial value of variables
when static initialization is used:

  If the expression contains a variable
  and the compiler is allowed to use dynamic initialization for that variable
  and that variable is initialized later in the program
  then zero is used as the value of the variable.

In the above example, if static initialization is used for z, the initial
value will be 1, because zero will be used as the value of y when computing
y + 1.  On the other hand, if static initialization is used for y, the
initial value of y will be 11, because the compiler is required to use
static initialization for x and the rule given above doesn't apply.

In short, computing the values of initializers at compile time for C++ involves a special rule which, as far as I know, doesn't apply to any other language.  For that reason, it might make sense to do this calculation in the C++ front end.  I think that the front end already does a bit of this, but only for expressions that are formally constant expressions.
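
The x, y, z example can be made observable with a small check. This is a sketch based on the reading of the rule given above: z is expected to end up as either 1 or 12, depending on which kind of initialization the compiler picks, while x and y come out the same either way. The helper name z_is_a_permitted_value is invented for the illustration.

```cpp
#include <cassert>

extern int x, y, z;
int z = y + 1;  // 1 if y is still zero at this point, 12 if y was initialized statically
int y = x + 1;  // 11 either way, because x is always initialized statically
int x = 10;     // constant initializer: static initialization is required

// True for both outcomes that the rule allows for z.
bool z_is_a_permitted_value()
{
    assert(x == 10 && y == 11);
    return z == 1 || z == 12;
}
```

Printing z at different optimization levels is one way to see which choice a given compiler makes.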
Comment 30 Andrew Pinski 2016-12-13 07:42:12 UTC
*** Bug 78773 has been marked as a duplicate of this bug. ***
Comment 31 Andrew Pinski 2018-01-29 20:04:48 UTC
*** Bug 84103 has been marked as a duplicate of this bug. ***
Comment 32 Jason Merrill 2018-06-25 17:37:01 UTC
As previously pointed out, since GCC 4.7, for many classes this is a simple matter of adding "constexpr" to the constructor.

With the testcase for bug 84103, I notice that clang does dynamic initialization at -O0 and static initialization at -O1.  We could do something similar: if the optimizers know the value of a variable at the end of a global initialization function, they could put that value into that variable statically instead.
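
As a sketch of the constexpr route, here is the class from the original description with constexpr added to the constructor and the conversion operator (C++11). The class is renamed Fixed and the helper fixed_demo is made up for this example. The static_assert shows that the value is known at compile time; whether the object is then emitted into .rodata still depends on the compiler version and target.

```cpp
#include <cassert>

// Fixed-point class from the description, with constexpr added.
class Fixed
{
    int raw;
public:
    enum { BASE = (1 << 8) };
    template<class value_type>
    constexpr Fixed(value_type opr) : raw(static_cast<int>(opr * BASE)) {}
    template<class value_type>
    constexpr operator value_type() const { return static_cast<value_type>(raw) / BASE; }
};

// Initialized by a constant expression, so no run-time constructor call is required.
constexpr Fixed one = 1.0;
static_assert(static_cast<int>(one) == 1, "value is available at compile time");

// Compares the class-based constant with the plain integer equivalent.
int fixed_demo()
{
    assert(static_cast<double>(one) == 1.0);
    const int i1 = static_cast<int>(1.0 * Fixed::BASE);
    return i1 ^ (static_cast<int>(one) * Fixed::BASE);  // both operands are 256
}
```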
Comment 33 Andrew Pinski 2023-08-06 23:36:37 UTC
*** Bug 110925 has been marked as a duplicate of this bug. ***