This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: Crucial C++ inlining broken under -Os


On Thu, Jul 1, 2010 at 11:36 PM, Taras Glek <tglek@mozilla.com> wrote:
> On 07/01/2010 02:27 PM, Richard Guenther wrote:
>>
>> On Thu, Jul 1, 2010 at 10:29 PM, Taras Glek <tglek@mozilla.com> wrote:
>>>
>>> On 06/30/2010 03:06 PM, Jan Hubicka wrote:
>>>>
>>>> If you can find actual simple examples where -Os is losing size and
>>>> speed we can try to do something about them.
>>>>
>>> According to our code size reports, inlining is completely screwed for
>>> C++ wrapper classes like the ones often used for smart pointers, arrays,
>>> etc. See
>>> http://people.mozilla.com/~tglek/codesize45.txt
>>>
>>> Would be really nice if this could be fixed in 4.5. It's tricky for us to
>>> switch to 4.5 otherwise.
>>>
>>> The following code inlines as expected under -Os in 4.4. It also inlines
>>> properly with -O1+ in 4.5. But it generates giant bloated code under -Os
>>> in 4.5.
>>>
>>> class Container {
>>> public:
>>>   Container() {
>>>     member = new int;
>>>   }
>>>
>>>   void cleanup() {
>>>     delete member;
>>>     member = 0;
>>>   }
>>>
>>>   int value() {
>>>     return *member;
>>>   }
>>>
>>>   ~Container() {
>>>     cleanup();
>>>   }
>>> private:
>>>   int *member;
>>> };
>>>
>>>
>>>
>>> int gimme() {
>>>   Container m;
>>>   return m.value();
>>> }
>>
>> Without looking I bet the issue here is call_cost at -Os (which is 1).
>> In the above example we are only not inlining the constructor, which
>> is estimated as size 2 (a function call with one (constant) parameter).
>> Inlining that enlarges the caller as we'd replace a call without an
>> argument with one with an argument.
>>
>> So the inlining decision isn't too bad for -Os here (which means
>> your testcase isn't a good representative of what is the issue).
>
> You are right. I tried a -finline-limit=50 flag that we used for gcc 4.1 &
> 4.2 and that appears to bring performance to slightly above 4.3 levels with
> -Os.
>
> However, this testcase was the most obvious regression from reading the
> above code size report. It seems like a pretty serious regression given that
>   size inline.o
> returns 158 with -Os and 93 with -O1.
> This doesn't get "fixed" by -finline-limit=50.

That's because we eliminate the out-of-line copy of the constructor at -O1.
But that hardly counts, as it is in a comdat section and will be shared
with other uses in different units.

If you use -ffunction-sections and look at the size of the gimme()
function you'll see that gimme is 2 bytes smaller with -Os (on i?86)
compared to -O1.  As there may be calls to the non-inlined constructor
from other units that we cannot see, it is not fair to complain about
its size when not using -fwhole-program.

That is, in 4.5 we no longer optimistically assume that comdat functions
can be eliminated if there are no callers in the local TU (previous
releases did make that assumption).
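
A minimal sketch of the situation, for illustration only. It is the testcase
from above with two assumptions added that are not in the original: the int
is initialized to 42 so the result is well defined, and a hypothetical second
caller, gimme_twice(), stands in for "other uses" of the constructor. Member
functions defined in the class body are implicitly inline, so any out-of-line
copy is emitted as a comdat function that the linker can share between units.

```cpp
// Sketch only: the initial value 42 and gimme_twice() are illustrative
// additions, not part of the original testcase.
class Container {
public:
  // Defined inside the class body, hence implicitly inline. If the compiler
  // decides not to inline it, the out-of-line copy is a comdat function.
  Container() {
    member = new int(42);
  }

  void cleanup() {
    delete member;
    member = 0;
  }

  int value() {
    return *member;
  }

  ~Container() {
    cleanup();
  }
private:
  int *member;
};

// The function whose size is being compared between -Os and -O1.
int gimme() {
  Container m;
  return m.value();
}

// Hypothetical second user of the constructor; in a real program this would
// typically live in a different translation unit.
int gimme_twice() {
  Container a;
  Container b;
  return a.value() + b.value();
}
```

To compare per-function sizes rather than the whole object, one option is to
compile with "g++ -Os -ffunction-sections -c inline.cpp" (and again with -O1)
and inspect the individual sections with "size -A inline.o". The file name is
an assumption, and the numbers will depend on target and compiler version.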

Richard.

