Bug 16987 - [3.3/3.4/4.0 Regression] Excessive stack allocation (totally unused)
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 4.0.0
Importance: P2 normal
Target Milestone: 4.0.0
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2004-08-11 18:10 UTC by Giovanni Bajo
Modified: 2004-10-30 21:11 UTC
CC List: 3 users

See Also:
Host:
Target:
Build:
Known to work: 2.95.3
Known to fail: 3.3.3 3.4.0 4.0.0
Last reconfirmed: 2004-08-11 18:19:14


Description Giovanni Bajo 2004-08-11 18:10:09 UTC
-------------------------------------------------------------------
typedef int Type;

void check(int);

template <typename To, typename From>
inline int assign(To& a, From b) {
	a = b;
	return 0;
}

template <typename To, typename From>
inline int add(To& a, From b, From c) {
	a = b + c;
	return 0;
}

template <typename T>
class Template;

template <typename T>
class Template {
private:
	T v;
public:
	Template()
		: v(0) {
	}
#if 0
	Template(const Template& y)
		: v(y.v) {
	}
#endif
	Template(const T y)
		: v(y) {
	}
	template <typename T1>
	Template(const T1 y) {
		check(assign(v, y));
	}
	const T& value() const {
		return v;
	}
};

template <typename T>
inline Template<T>
operator+(const Template<T> x, const Template<T> y) {
	T r;
	check(add(r, x.value(), y.value()));
	return r;
}

template <typename T, typename T1>
inline Template<T>
operator+(const T1 x, const Template<T> y)
{
	return Template<T>(x) + y;
}

Type s(int v)
{
	Template<Type> a = v;
	Template<Type> c = 3 + a;
	return c.value();
}
-------------------------------------------------------------------

Compile this with -O3, once as is and once after changing #if 0 to #if 1, and
compare the outputs. The .s diff is the following (excluding changes related to
a label whose name changes):


--- bind.s      2004-08-11 06:56:43.000000000 +0200
+++ bind_ctor.s 2004-08-11 06:56:23.000000000 +0200
@@ -6,25 +6,27 @@
 .LCFI2:
-       subl    $16, %esp
+       subl    $64, %esp
 .LCFI3:
-       movl    8(%ebp), %ebx
-       addl    $3, %ebx
+       movl    8(%ebp), %eax
+       leal    3(%eax), %ebx
        pushl   $0
+       movl    %eax, -40(%ebp)
+       movl    $3, -56(%ebp)
 .LCFI4:
        call    _Z5checki
        movl    %ebx, %eax
        movl    -4(%ebp), %ebx
        leave
        ret



(- is #if 0,  + is #if 1).

There are two problems here. First, there is a slight code pessimization, which
should not happen because the code is semantically identical in the two
cases. Second, there is a *big* increase in stack allocation for no good
reason (16 -> 64 bytes). It seems that tree-ssa fails to clean something up.
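
To illustrate the "semantically identical" claim, here is a minimal sketch, not taken from the test case itself. ImplicitCopy, UserCopy and add_three are hypothetical names; the only difference between the two wrappers is whether the memberwise copy constructor is written out by hand, which is what the #if 0 / #if 1 switch toggles.

```cpp
#include <cassert>

// Hypothetical reduced wrappers; the names are illustrative only.
struct ImplicitCopy {
    int v;
    explicit ImplicitCopy(int y) : v(y) {}
    // Copy constructor is implicitly generated (the "#if 0" case).
};

struct UserCopy {
    int v;
    explicit UserCopy(int y) : v(y) {}
    // Hand-written memberwise copy constructor (the "#if 1" case).
    UserCopy(const UserCopy& y) : v(y.v) {}
};

// Takes its argument by value, as operator+ in the test case does,
// so the copy constructor is involved in the call.
template <typename W>
inline int add_three(W a) {
    W three(3);
    W copy = a;  // explicit use of the copy constructor
    return three.v + copy.v;
}

// Hypothetical helper: both wrappers must give the same answer.
inline void verify_same_result(int v) {
    assert(add_three(ImplicitCopy(v)) == add_three(UserCopy(v)));
}
```

Both instantiations compute the same value, so any difference in the generated code comes from the optimizers and not from the language semantics.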
Comment 1 Andrew Pinski 2004-08-11 18:19:13 UTC
Confirmed. I could not figure out which variables would have to overlap to save the stack space in 3.4.0,
but there is a missed optimization here for 3.5.0:
  x.v = 3;
  y.v = v;
  T.3 = (const Type *)&y;
  T.5 = (const Type *)&x;
  y = *T.3 + *T.5;
  check (0);
  c.v = y;
  T.1 = (const Type *)&c;
  return *T.1;

This is another place where the C++ (and C) front ends lower &a->b (here to (const Type *)&y)
too soon.
Comment 2 Andrew Pinski 2004-08-11 18:21:06 UTC
I should note that, once that optimization is done, SRA should kick in so that the structs are no longer
there.
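
As a rough sketch of what SRA (scalar replacement of aggregates) is expected to achieve here, the two functions below compute the same thing. Box, sum_with_structs and sum_scalarized are hypothetical names, and the second function is a by-hand rendering of the transformation, not compiler output.

```cpp
#include <cassert>

// Hypothetical stand-in for Template<Type>: an aggregate with one member.
struct Box {
    int v;
};

// Before SRA: every value lives inside a single-member struct,
// mirroring the x.v / y.v / c.v accesses in the dump above.
inline int sum_with_structs(int v) {
    Box x; x.v = 3;
    Box y; y.v = v;
    Box c; c.v = x.v + y.v;
    return c.v;
}

// After SRA, written by hand: each struct member becomes its own scalar,
// so no aggregate needs a stack slot any more.
inline int sum_scalarized(int v) {
    int x_v = 3;
    int y_v = v;
    int c_v = x_v + y_v;
    return c_v;
}

// Hypothetical helper: the two forms must agree for any input.
inline void verify_equivalent(int v) {
    assert(sum_with_structs(v) == sum_scalarized(v));
}
```

Once the members are plain scalars, the usual scalar optimizations can fold the whole body down to v + 3.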
Comment 3 Giovanni Bajo 2004-08-11 19:44:54 UTC
The stack allocation problem has been present in GCC since at least 3.2.2, and
the tree optimizers didn't fix it.

2.95.3 allocated more stack space without the constructor but less stack space 
with the constructor:

-       subl $20,%esp
+       subl $36,%esp

So we now allocate 64 bytes instead of 36. I am flagging this as a regression, which
surely matters for any kind of embedded target.

The missed optimization which Andrew speaks about in comment #1 is a known 
problem already tracked elsewhere (as he said).
Comment 4 Mark Mitchell 2004-08-29 18:20:37 UTC
Will not be fixed in GCC 3.4.x.
Comment 5 Steven Bosscher 2004-09-03 22:05:29 UTC
Is this related to Bug 9997?
Comment 6 Richard Henderson 2004-09-03 22:58:05 UTC
Who can tell?  The optimization problem has been addressed, which means that
we're left with

Type s(int) (v)
{
<bb 0>:
  check (0);
  return v + 3; 
}

at the end of tree optimization.  Which I guess is "fixed", by one definition.
File a new bug report with a less reduced test case if you see this again.
Comment 7 Andrew Pinski 2004-09-07 07:31:38 UTC
Yes, this was fixed by no longer lowering &a.b into &a + offsetof(b).
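
For readers unfamiliar with the two forms, here is a small sketch of what "&a.b" versus "&a + offsetof(b)" means at the source level. Wrapper, member_address and lowered_address are hypothetical names; the compiler does this lowering on its internal trees, and the code below only mimics the two shapes by hand.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical standard-layout struct; pad exists only so that v
// sits at a nonzero offset.
struct Wrapper {
    int pad;
    int v;
};

// High-level form: the member access is still visible, so an optimizer
// can tell exactly which field is being addressed.
inline const int* member_address(const Wrapper& w) {
    return &w.v;
}

// Lowered form: base address plus byte offset. The field name is gone,
// which is what made the structs harder to scalarize.
inline const int* lowered_address(const Wrapper& w) {
    const char* base = reinterpret_cast<const char*>(&w);
    return reinterpret_cast<const int*>(base + offsetof(Wrapper, v));
}

// Both forms name the same object.
inline bool same_address(const Wrapper& w) {
    return member_address(w) == lowered_address(w);
}
```

The two functions return the same pointer; the difference is only in how much structure the optimizer can still see.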