# gcc -v
Reading specs from /.share/usr/app/gcc-3.4.3/bin/../lib/gcc/i386-pc-linux-gnu/3.4.3/specs
Configured with: ../gcc-3.4.3/configure --prefix=/usr/app/gcc-3.4.3 --exec-prefix=/usr/app/gcc-3.4.3 --bindir=/usr/bin --sbindir=/usr/sbin --libexecdir=/usr/app/gcc-3.4.3/libexec --datadir=/usr/app/gcc-3.4.3/share --sysconfdir=/etc --sharedstatedir=/usr/app/gcc-3.4.3/var/com --localstatedir=/usr/app/gcc-3.4.3/var --libdir=/usr/lib --includedir=/usr/include --infodir=/usr/info --mandir=/usr/man --with-slibdir=/usr/app/gcc-3.4.3/lib --with-local-prefix=/usr/local --with-gxx-include-dir=/usr/app/gcc-3.4.3/include/g++-v3 --enable-languages=c,c++ --with-system-zlib --disable-nls --enable-threads=posix i386-pc-linux-gnu
Thread model: posix
gcc version 3.4.3

Does not happen with -Os.
Does not happen with 3.4.1.
I have a testcase.
Created attachment 8695: testcase
Use gcc -O2 -S t.c
4.1.0/4.0.0 gives:
        subl    $268, %esp
which is better than 3.4.x.
3.4.0 gives:
        subl    $732, %esp
Only a 3.4 regression, confirmed:
        subl    $3516, %esp
Note that your testcase uses uninitialized variables (the arrays are uninitialized). After fixing that, it gets worse:
        subl    $3532, %esp
On 4.0.0/4.1.0, though, it gets better:
        subl    $260, %esp
which is funny, but whatever.
Other versions, for comparison:
        3.3.3:  subl    $636, %esp
        3.4.0:  subl    $748, %esp
        3.2.3:  subl    $444, %esp
        3.0.4:  subl    $556, %esp
        2.95.3: subl    $508, %esp
So 4.0.0/4.1.0 gives the best results.
Hmm, with -fomit-frame-pointer, the stack usage goes back to an okay value:
        subl    $604, %esp
I think this is just another case where the stack slots used for spills are not reused, see PR 17838. -fomit-frame-pointer frees up %ebp as a general register, so there is less spilling.
4.0.0 reduces the stack usage by effectively turning each array into 8 separate variables.
Created attachment 8699: testcase without use of uninitialized data
> Though on 4.0.0/4.1.0, we get better:
>         subl    $260, %esp

It's way too good. Declared locals should take 512 bytes, plus any temporaries for spills.
Please find fixed testcase. My fault.
Whoops.... no, locals are 256 bytes only. (/me is looking for some coffee)
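For reference, one layout that adds up to the 256-byte figure is four scratch arrays of eight 64-bit words each. This is only an assumption, since the attachment is not reproduced in this thread; the struct, its member names, and locals_bytes() are hypothetical and serve only to show the arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t u64;

/* Hypothetical stand-in for the function's locals (not taken from the
   attachment): four arrays of eight 64-bit words each. */
struct locals {
    u64 K[8];
    u64 block[8];
    u64 state[8];
    u64 L[8];
};

/* 4 arrays * 8 elements * 8 bytes = 256 bytes, with no padding. */
size_t locals_bytes(void)
{
    return sizeof(struct locals);
}
```

Under that assumption, a frame of 260 or 268 bytes leaves almost no room for spill slots, which fits the later observation that 4.0.0 no longer keeps the arrays in memory as arrays.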
won't fix for 3.4.6
Current gcc seems to be doing fine:

$ grep 'sub.*,%esp' *.asm; size *.o
whirlpool-4.2.1-O2.asm:   81 ec 84 01 00 00    sub    $0x184,%esp
whirlpool-4.2.1-O3.asm:   81 ec 4c 01 00 00    sub    $0x14c,%esp
whirlpool-4.2.1-Os.asm:   81 ec 84 01 00 00    sub    $0x184,%esp
whirlpool-4.6.3-O2.asm:   81 ec 4c 01 00 00    sub    $0x14c,%esp
whirlpool-4.6.3-O3.asm:   81 ec 4c 01 00 00    sub    $0x14c,%esp
whirlpool-4.6.3-Os.asm:   81 ec 4c 01 00 00    sub    $0x14c,%esp
   text    data     bss     dec     hex filename
   6223       0       0    6223    184f whirlpool-4.2.1-O2.o
   5663       0       0    5663    161f whirlpool-4.2.1-O3.o
   6194       0       0    6194    1832 whirlpool-4.2.1-Os.o
   5655       0       0    5655    1617 whirlpool-4.6.3-O2.o
   5703       0       0    5703    1647 whirlpool-4.6.3-O3.o
   5570       0       0    5570    15c2 whirlpool-4.6.3-Os.o
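The sub immediates in this listing are hexadecimal, while the earlier comments quote frame sizes in decimal. A small helper makes the two comparable; frame_bytes() is a hypothetical name used only for this sketch.

```c
#include <stdlib.h>

/* Convert a hex immediate as it appears in the disassembly
   (e.g. "0x184") into a byte count. With base 16, strtoul
   accepts an optional "0x" prefix. */
unsigned long frame_bytes(const char *hex_imm)
{
    return strtoul(hex_imm, NULL, 16);
}
```

frame_bytes("0x184") is 388 and frame_bytes("0x14c") is 332, so both compilers are far below the 3516 bytes reported for the 3.4 branch.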
BTW, testcase needs a small fix:

-static const u64 C0[256];
+u64 C0[256];

or else gcc will optimize it almost to nothing :)
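A minimal sketch of why that change matters, assuming u64 is a 64-bit unsigned type: a static const array with no initializer is all zeros and cannot be modified from anywhere, so the compiler may replace every lookup with the constant 0. The names C0_folded, lookup_folded and lookup_real are hypothetical and are not part of the testcase.

```c
#include <stdint.h>

typedef uint64_t u64;

/* Internal linkage, const, no initializer: every element is known to
   be 0, so an optimizing compiler is free to fold lookups away. */
static const u64 C0_folded[256];

/* External linkage, non-const: other translation units may store into
   it, so the compiler has to emit real loads. */
u64 C0[256];

u64 lookup_folded(unsigned i) { return C0_folded[i & 0xff]; }
u64 lookup_real(unsigned i)   { return C0[i & 0xff]; }
```

Comparing the gcc -O2 -S output for the two lookup functions should show whether the table access survives optimization; the exact code generated depends on the compiler version.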