This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/49865] New: Unnecessary reload causes small size regression from 4.6.1


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49865

           Summary: Unnecessary reload causes small size regression from
                    4.6.1
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: sgunderson@bigfoot.com
            Target: i?86-*-*


Comparing 4.6.1 with gcc-snapshot from Debian:

gcc version 4.7.0 20110709 (experimental) [trunk revision 176106] (Debian
20110709-1) 

Given this code:

fugl:~> cat test.cpp 
#include <string.h>

class MyClass {
    void func();

    float f[1024];
    int i;
};

void MyClass::func()
{
    memset(f, 0, sizeof(f));
    i = 0;
}

and compiling with

fugl:~> /usr/lib/gcc-snapshot/bin/g++ -Os -c test.cpp

g++ produces, according to objdump:

00000000 <_ZN7MyClass4funcEv>:
   0:    55                       push   %ebp
   1:    31 c0                    xor    %eax,%eax
   3:    89 e5                    mov    %esp,%ebp
   5:    b9 00 04 00 00           mov    $0x400,%ecx
   a:    57                       push   %edi
   b:    8b 7d 08                 mov    0x8(%ebp),%edi
   e:    f3 ab                    rep stos %eax,%es:(%edi)
  10:    8b 45 08                 mov    0x8(%ebp),%eax
  13:    c7 80 00 10 00 00 00     movl   $0x0,0x1000(%eax)
  1a:    00 00 00 
  1d:    5f                       pop    %edi
  1e:    5d                       pop    %ebp
  1f:    c3                       ret    

while 4.6.1 has a more efficient sequence:

00000000 <_ZN7MyClass4funcEv>:
   0:    55                       push   %ebp
   1:    b9 00 04 00 00           mov    $0x400,%ecx
   6:    89 e5                    mov    %esp,%ebp
   8:    31 c0                    xor    %eax,%eax
   a:    8b 55 08                 mov    0x8(%ebp),%edx
   d:    57                       push   %edi
   e:    89 d7                    mov    %edx,%edi
  10:    f3 ab                    rep stos %eax,%es:(%edi)
  12:    c7 82 00 10 00 00 00     movl   $0x0,0x1000(%edx)
  19:    00 00 00 
  1c:    5f                       pop    %edi
  1d:    5d                       pop    %ebp
  1e:    c3                       ret   

It seems 4.6 keeps a copy of the "this" pointer in a register (edx) across the
"rep stos" operation. The register-to-register copy into edi is one byte
smaller than reloading "this" from the stack when it needs to clear "i".

Of course, the _most_ efficient code sequence here would be doing the i = 0
before the memset, but I'm not sure if this is legal. However, eax still
contains zero after the rep stos (in the 4.6.1 sequence, where it is not
overwritten), so the store could take eax as its source instead of a constant,
which would save the four-byte immediate.
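
For illustration, the reordering can be written by hand at the source level.
This is only a sketch: MyClassReordered and reordered_func_clears_object are
hypothetical names that are not part of the test case, and the members are
made public so the result can be inspected. Since f and i are distinct members
that do not overlap, both orders leave the object in the same final state.
Whether the compiler may perform this reordering on its own is the open
question above.

```cpp
#include <cstring>

// Hypothetical variant of the test case: same layout, but members are
// public so the small check below can inspect them.
struct MyClassReordered {
    void func();

    float f[1024];
    int i;
};

void MyClassReordered::func()
{
    i = 0;                          // store first, before the memset
    std::memset(f, 0, sizeof(f));   // then clear the array
}

// Fills an object with non-zero bytes, calls func(), and reports whether
// every element of f and the member i ended up as zero.
bool reordered_func_clears_object()
{
    static MyClassReordered obj;
    std::memset(&obj, 0xff, sizeof(obj));
    obj.func();
    for (int k = 0; k < 1024; ++k) {
        if (obj.f[k] != 0.0f)
            return false;
    }
    return obj.i == 0;
}
```

Compiling this variant with the same -Os flags and comparing the objdump
output against the listings above would show whether the reload disappears.
No such output is included here.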

