optimization/8952: failure to optimize away trivial C++ object creation
martin@xemacs.org
Sun Dec 15 12:26:00 GMT 2002
>Number: 8952
>Category: optimization
>Synopsis: failure to optimize away trivial C++ object creation
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: unassigned
>State: open
>Class: pessimizes-code
>Submitter-Id: net
>Arrival-Date: Sun Dec 15 12:26:01 PST 2002
>Closed-Date:
>Last-Modified:
>Originator: martin@xemacs.org
>Release: gcc-3.2.1
>Organization:
>Environment:
x86 Linux
>Description:
g++ 3.2.1 on x86 fails to perform some easy optimizations. Here I look
at "class literal constant folding". This might be an important part
of the remaining "abstraction penalty".
Consider this very simple, optimizer-friendly class with two data members:
class Complex
{
private:
  const int real_;
  const int imag_;
public:
  inline Complex (int real, int imag) : real_ (real), imag_ (imag) {}
  inline int Real() const { return real_; }
  inline int Imag() const { return imag_; }
  inline friend Complex operator+ (Complex z1, Complex z2)
  { return Complex (z1.real_ + z2.real_, z1.imag_ + z2.imag_); }
};
If we now use "Complex Literals" like Complex(3,4), the compiler
should be able to do the obvious optimizations like constant-folding
just as with builtin types.
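For comparison, here is a small sketch of the builtin-type analogue of that expectation. The names foo_int, bar_int, sum_is_folded and check_builtin_analogue are hypothetical and not part of the original test case. With plain int, the sum of two literals is an integral constant expression, so the compiler has to evaluate it itself, and one would expect both functions to reduce to returning 9.

```cpp
#include <cassert>

// Builtin-type analogue of foo() and bar(); both names are hypothetical.
int foo_int () { return 9; }
int bar_int () { return 1 + 8; }  // 1 + 8 is an integral constant expression

// An array bound must be a constant expression, so this typedef compiles
// only because the compiler itself evaluates 1 + 8 == 9.
typedef char sum_is_folded[(1 + 8 == 9) ? 1 : -1];

// Runtime sanity check that the two functions agree.
void check_builtin_analogue ()
{
  assert (foo_int () == bar_int ());
}
```

The report asks for the same treatment when the literals are wrapped in a trivial class with inline constructors.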
Now let's look at the x86 assembly code for two functions:
Complex foo () { return Complex(9,11); }
==> generates obviously optimal code:
movl 4(%esp), %eax
movl $9, (%eax)
movl $11, 4(%eax)
ret $4
On the other hand,
Complex bar () { return Complex(1,2) + Complex(8,9); }
==> generates suboptimal code:
subl $20, %esp
movl 24(%esp), %eax
movl $1, 8(%esp)
movl $8, (%esp)
movl $2, 12(%esp)
movl $9, 4(%esp)
movl $9, (%eax)
movl $11, 4(%eax)
addl $20, %esp
ret $4
The two functions should generate identical code. There seem to be
two simple optimizer bugs here:
- The addition operand objects above are created, but never used
(since the result is computed at compile time). So the code to
generate them can simply be discarded (the constructors have no side
effects).
- There seems to be no need to adjust %esp, since this is a leaf function.
Note that these bugs are sufficiently simple that I might be able to
write an easy optimizer pass as a postprocessor on the .s files.
But the gcc maintainers should fix the deeper problems. Stores to
never-used stack slots should be easy to optimize away.
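To make the postprocessor idea concrete, here is a toy sketch of such a pass. It is not how GCC works and would not be safe on real compiler output. It assumes a single leaf function, AT&T syntax with operands separated by ", ", no leading whitespace, no labels or calls, and an %esp that does not change between the stack accesses. All function names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Returns the destination slot of "movl $IMM, OFF(%esp)", or "" otherwise.
static std::string imm_store_slot (const std::string &line)
{
  const std::string suffix = "(%esp)";
  std::size_t comma = line.find (", ");
  if (line.compare (0, 6, "movl $") != 0 || comma == std::string::npos)
    return "";
  std::string dest = line.substr (comma + 2);
  if (dest.size () < suffix.size ()
      || dest.compare (dest.size () - suffix.size (), suffix.size (),
                       suffix) != 0)
    return "";
  return dest;
}

// Adds every operand of a line such as "movl 24(%esp), %eax" to `used`.
static void collect_operands (const std::string &line,
                              std::set<std::string> &used)
{
  std::size_t start = line.find (' ');
  if (start == std::string::npos)
    return;
  ++start;
  for (;;)
    {
      std::size_t comma = line.find (", ", start);
      if (comma == std::string::npos)
        {
          used.insert (line.substr (start));
          return;
        }
      used.insert (line.substr (start, comma - start));
      start = comma + 2;
    }
}

// Drops immediate stores to %esp slots that no other instruction mentions.
std::vector<std::string>
drop_dead_imm_stores (const std::vector<std::string> &body)
{
  std::set<std::string> used;
  for (std::size_t i = 0; i < body.size (); ++i)
    if (imm_store_slot (body[i]).empty ())
      collect_operands (body[i], used);

  std::vector<std::string> out;
  for (std::size_t i = 0; i < body.size (); ++i)
    {
      std::string slot = imm_store_slot (body[i]);
      if (slot.empty () || used.count (slot) != 0)
        out.push_back (body[i]);
    }
  return out;
}

// The body of bar() exactly as listed in this report.
std::vector<std::string> bar_body ()
{
  static const char *const lines[] = {
    "subl $20, %esp", "movl 24(%esp), %eax",
    "movl $1, 8(%esp)", "movl $8, (%esp)",
    "movl $2, 12(%esp)", "movl $9, 4(%esp)",
    "movl $9, (%eax)", "movl $11, 4(%eax)",
    "addl $20, %esp", "ret $4"
  };
  return std::vector<std::string> (lines,
                                   lines + sizeof lines / sizeof lines[0]);
}

// Usage example: the four operand stores are removed, six lines remain.
void check_on_bar ()
{
  assert (drop_dead_imm_stores (bar_body ()).size () == 6);
}
```

On the bar() listing this removes only the four stores of 1, 8, 2 and 9 into the local frame. The subl/addl pair is left alone, because deleting it would also require rewriting 24(%esp) to 4(%esp), which this sketch does not attempt.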
Could this be part of the reason Intel C++ dramatically outperforms
g++ on Scott Ladd's "Complex" benchmark?
The analogous problem does NOT appear to occur with classes containing
only one data member.
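The single-member test case is not included in this report. The following is only a guess at what such a counterpart might look like. The class name Real and the functions foo1, bar1 and check_single_member are hypothetical.

```cpp
#include <cassert>

// Hypothetical single-member counterpart of Complex; the report does not
// show the class that was actually tested.
class Real
{
private:
  const int value_;
public:
  inline Real (int value) : value_ (value) {}
  inline int Value() const { return value_; }
  inline friend Real operator+ (Real x1, Real x2)
  { return Real (x1.value_ + x2.value_); }
};

Real foo1 () { return Real(9); }
Real bar1 () { return Real(1) + Real(8); }

// Both functions must at least return the same value; whether they also
// compile to the same instructions has to be checked in the .s file.
void check_single_member ()
{
  assert (foo1 ().Value () == bar1 ().Value ());
}
```

Comparing the assembly generated for foo1 and bar1 with the same flags as above would be one way to confirm the observation. No output is shown here because the report gives none.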
Details: g++ 3.2.1 on Linux x86; g++ -Wall -S -O3 -fomit-frame-pointer
Disclaimer: The only assembly code I've ever written was IBM System/370.
>How-To-Repeat:
Compile the following on x86 Linux using g++ -S -O3, then examine the
generated .s file.
class Complex
{
private:
  const int real_;
  const int imag_;
public:
  inline Complex (int real, int imag) : real_ (real), imag_ (imag) {}
  inline int Real() const { return real_; }
  inline int Imag() const { return imag_; }
  inline friend Complex operator+ (Complex z1, Complex z2)
  { return Complex (z1.real_ + z2.real_, z1.imag_ + z2.imag_); }
};
Complex foo () { return Complex(9,11); }
Complex bar () { return Complex(1,2) + Complex(8,9); }
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted: