Hi, Given the following test C++ file: class Class { public: void func(); float *buf; int size; }; void Class::func() { for (int i = 0; i < size; ++i) { buf[i] = 0; } } 4.6 (see below for exact version) will generate larger code (36 vs. 30 bytes) than 4.5.1 (Debian 4.5.1-6) given -Os. The output is 00000000 <Class::func()>: 0: 55 push %ebp 1: 31 c0 xor %eax,%eax 3: 89 e5 mov %esp,%ebp 5: 8b 4d 08 mov 0x8(%ebp),%ecx 8: 53 push %ebx 9: 8b 59 04 mov 0x4(%ecx),%ebx c: eb 10 jmp 1e <Class::func()+0x1e> e: 8d 14 85 00 00 00 00 lea 0x0(,%eax,4),%edx 15: 40 inc %eax 16: 03 11 add (%ecx),%edx 18: c7 02 00 00 00 00 movl $0x0,(%edx) 1e: 39 d8 cmp %ebx,%eax 20: 7c ec jl e <Class::func()+0xe> 22: 5b pop %ebx 23: 5d pop %ebp 24: c3 ret Basically the problem is that the lea is large (due to the zero immediate taking up 32 bits); 4.5 uses a variation where the address calculation takes both a base and an index register, which has a shorter form not requiring to store the zero. (The joys of x86; lea edx, [eax*4 + ecx] takes less space then lea edx, [eax*4]...) === Configured with: ../src/configure -v --with-pkgversion='Debian 20100828-1' --with-bugurl=file:///usr/share/doc/gcc-snapshot/README.Bugs --enable-languages=c,ada,c++,fortran,objc,obj-c++ --prefix=/usr/lib/gcc-snapshot --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --disable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin --enable-gold --with-plugin-ld=ld.gold --enable-objc-gc --enable-targets=all --with-arch-32=i586 --with-tune=generic --disable-werror --enable-checking=yes --build=i486-linux-gnu --host=i486-linux-gnu --target=i486-linux-gnu Thread model: posix gcc version 4.6.0 20100828 (experimental) [trunk revision 163616] (Debian 20100828-1)
It is caused by revision 162618: http://gcc.gnu.org/ml/gcc-cvs/2010-07/msg00972.html
> It is caused by revision 162618: > > http://gcc.gnu.org/ml/gcc-cvs/2010-07/msg00972.html No, it isn't, this commit reverted an incorrect change done in revision 161907.
Well, it is actually very much related to that reversion. While it didn't regress in between 161907 and 162617, it regressed from the MEM_REF merge till 161906 and from 162618 onwards. The problem is that due to the MEM load combiner isn't able to fix this up, so if we don't get it right during expansion, we are out of luck. Looking at it now...
Created attachment 22291 [details] gcc46-pr45670.patch Untested fix.
> Well, it is actually very much related to that reversion. While it didn't > regress in between 161907 and 162617, it regressed from the MEM_REF merge till > 161906 and from 162618 onwards. Yes, that isn't different from what I said, it had been caused by something else. > The problem is that due to the MEM load combiner isn't able to fix this up, so > if we don't get it right during expansion, we are out of luck. Looking at it > now... Thanks!
Author: jakub Date: Fri Nov 5 19:00:27 2010 New Revision: 166371 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=166371 Log: PR target/45670 * expr.c (expand_expr_real_1) <case MEM_REF>: Use EXPAND_SUM instead of EXPAND_NORMAL for base expansion. * gcc.target/i386/pr45670.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr45670.c Modified: trunk/gcc/ChangeLog trunk/gcc/expr.c trunk/gcc/testsuite/ChangeLog
Fixed.