Bug 51571

Summary: No named return value optimization while adding a dummy scope
Product: gcc
Reporter: Prasoon <prasoonsaurav.nit>
Component: c++
Assignee: Not yet assigned to anyone <unassigned>
Status: NEW
Severity: normal
CC: akim.demaille, guillaume.melquiond, jengelh, marc, thomas.braun, webrown.cpp
Priority: P3
Keywords: missed-optimization
Version: 4.6.1
Target Milestone: ---
Host:
Target:
Build:
Known to work:
Known to fail: 4.2.3, 4.7.0
Last reconfirmed: 2018-05-30 00:00:00
Bug Depends on:
Bug Blocks: 58055

Description Prasoon 2011-12-15 16:39:05 UTC
Simple code snippet

#include <iostream>
int global;
struct A
{
   A(){}
   A(const A&x){
       ++global;
   }
   ~A(){}
};
A foo()
{  
     A a;
     return a;  
}
int main()
{
   A x = foo();
   std::cout << global;
}
Output: 0
 
When the definition of foo is changed to

A foo()
{ 
  { 
     A a;
     return a;  
  }
}
I get 1 as the output, i.e. the copy constructor gets called once.

The compiler is not eliding the call to the copy constructor in this case.
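
To compare the two forms in a single translation unit, something like the following sketch may help. The names foo_plain, foo_scoped and copies_made_by are illustrative and not part of the report, and the counter is moved into a static member so it can be reset per call. Each measurement should come out as 0 when the copy is elided and as a larger number when it is not.

```cpp
// Sketch: both variants of foo() from the report, sharing one copy counter.
// foo_plain, foo_scoped and copies_made_by are hypothetical names.
struct A
{
    static int copies;            // copy-constructor calls since the last reset
    A() {}
    A(const A&) { ++copies; }     // observable side effect
    ~A() {}
};
int A::copies = 0;

A foo_plain()
{
    A a;
    return a;                     // local returned from the outermost block
}

A foo_scoped()
{
    {                             // the dummy scope from the report
        A a;
        return a;                 // elision is still permitted here
    }
}

// Calls one of the factories and returns how many copies that call made.
int copies_made_by(A (*factory)())
{
    A::copies = 0;
    A x = factory();
    (void)x;                      // silence unused-variable warnings
    return A::copies;
}
```

Calling copies_made_by(foo_plain) and copies_made_by(foo_scoped) and printing both values shows directly whether a given compiler treats the two definitions differently.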
Comment 1 Andrew Pinski 2011-12-15 21:59:31 UTC
Confirmed, this is just a missed optimization and not very critical really.
Comment 2 Paolo Carlini 2011-12-18 10:55:08 UTC
Adding Jason, I seem to remember he did NRVO on trees.
Comment 3 Guillaume Melquiond 2013-06-02 07:36:33 UTC
I have recently encountered a similar problem, but in a much more general case.

struct A {
	A(int);
	A(A const &);
	~A();
};

A f(bool b)
{
	if (b) return A(0);
	A a(1);
	return a;
}

All the return statements dominated by variable "a" return "a", so its construction should happen in-place, hence eliding copy-construction and destruction. Unfortunately, this is not what happens with g++ 4.8.0.

Interestingly enough, if one uninlines the code by hand, g++ actually generates the optimal code, so it is possible though cumbersome to work around the missed optimization:

inline A f2()
{
	A a(1);
	return a;
}

A f1(bool b)
{
	if (b) return A(0);
	return f2();
}

produces

A f1(bool) (bool b)
{
  <bb 2>:
  if (b_2(D) != 0)
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 3>:
  A::A (_4(D), 0);
  goto <bb 5>;

  <bb 4>:
  A::A (_4(D), 1);

  <bb 5>:
  return _4(D);
}
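
For anyone who wants to check the workaround at run time rather than in the GIMPLE dump, here is a minimal instrumented sketch. The static counter and the value member are assumptions added for illustration; the original A only declares its members.

```cpp
// Sketch of the workaround above with an instrumented A.
// `copies` and `value` are hypothetical additions used only for observation.
struct A {
    static int copies;            // copy-constructor calls so far
    int value;
    A(int v) : value(v) {}
    A(A const& other) : value(other.value) { ++copies; }
    ~A() {}
};
int A::copies = 0;

// Helper with a single named local and a single return statement.
inline A f2()
{
    A a(1);
    return a;
}

// No named local is returned here; both returns yield temporaries.
A f1(bool b)
{
    if (b) return A(0);
    return f2();
}
```

After A r = f1(false), r.value is 1, and A::copies stays at 0 whenever the compiler elides the copy inside f2.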
Comment 4 marc 2015-02-04 22:40:49 UTC
I would like to strongly oppose the notion that this is "just a missed optimisation and not very critical, really".

NRVO is not just "an optimisation". It's actually one that is explicitly permitted to change observable behaviour of the program and it's extremely powerful.

And it is _required_ for performant C++. Just try to return a std::vector by value to see the importance of this optimisation. This is not missed optimisation. This is premature pessimisation.

You could just as well stop all optimisation work for the C++ frontend until this is implemented, because any other optimisation efforts are dwarfed by the overhead when NRVO is expected by the developer but not applied.

Please make this a top priority. Every C++ program will benefit both in text size and in runtime performance - dramatically.
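
As a point of reference for the std::vector case, assuming C++11 or later: when the copy is not elided, return v; on a local falls back to the move constructor before the copy constructor is considered, so the heap buffer is handed over rather than duplicated. Types that only provide a copy constructor, like A in the original report, still pay for a full copy. The sketch below makes the buffer hand-over observable; buffer_inside, make_values and buffer_was_reused are hypothetical names.

```cpp
#include <vector>

// Address of the local vector's buffer, recorded just before returning.
// Hypothetical global, used only to make the behaviour observable.
const int* buffer_inside = nullptr;

std::vector<int> make_values()
{
    {   // dummy scope, as in the original report
        std::vector<int> v(1000, 42);
        buffer_inside = v.data();
        return v;   // elided, or otherwise move-constructed (C++11 and later)
    }
}

// True when the caller's vector owns the buffer allocated inside make_values().
bool buffer_was_reused()
{
    std::vector<int> result = make_values();
    return result.data() == buffer_inside && result.size() == 1000;
}
```

buffer_was_reused() holds both with and without NRVO, which suggests the element data is not deep-copied for movable types; what a missed NRVO still costs there is the extra move construction and destruction.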
Comment 5 Jonathan Wakely 2018-05-30 15:08:06 UTC
Comment 3 may be a different issue. Clang elides the copy for the original report with the dummy scope, but doesn't for comment 3.

extern "C" int puts(const char*);

struct A {
  int i;
  A(int i) : i(i) { puts("cons"); }
  A(A const &) { puts("copy"); }
};

A f(bool b)
{
  puts("f(bool)");
  if (b) return A(0);
  A a(1);
  return a;
}

A g()
{
  puts("g()");
  if (false) { }
  A a(1);
  return a;
}

A h()
{
  puts("h()");
  {
    A a(0);
    return a;
  }
}

int main()
{
  A a = f(false);
  A b = g();
  A c = h();
  return a.i - b.i + c.i;
}

For this code GCC prints:

f(bool)
cons
copy
g()
cons
h()
cons
copy

But Clang prints:

f(bool)
cons
copy
g()
cons
h()
cons

(Neither result is affected by the optimization level.)