This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
assembly optimizations...
- To: gcc at gcc dot gnu dot org
- Subject: assembly optimizations...
- From: tranx nouvel <tranx at cybergaia dot org>
- Date: Wed, 27 Sep 2000 08:41:55 +0100
- Organization: CyberGaia
I was looking at the assembly code generated from a C++ program that uses
inline calls, and I noticed that a simple optimization was missing:
main:
subl $40,%esp
pushl %ebx
movl $0,16(%esp)
movl $0,20(%esp)
movl $0,28(%esp)
addl $-12,%esp
pushl $4
call __builtin_new
addl $16,%esp
movl $2,(%eax)
addl $-8,%esp
pushl %eax
leal 24(%esp),%ebx
pushl %ebx
call probe__t3avl1Z7integerP7integer
addl $16,%esp
addl $-12,%esp ; this stack adjustment can be merged with the previous one
pushl $4 ; we could even think of folding this push in too, doing movl $4,-4(%esp),
; but I admit that would be a much harder algorithm
call __builtin_new
addl $16,%esp
movl $3,(%eax)
addl $-8,%esp ; this one can be optimized with the previous one
pushl %eax
pushl %ebx
call probe__t3avl1Z7integerP7integer
addl $16,%esp
addl $-12,%esp ; once again ...
pushl $4
call __builtin_new
addl $16,%esp
movl $-2,(%eax)
addl $-8,%esp ; again ...
pushl %eax
pushl %ebx
call probe__t3avl1Z7integerP7integer
...
I was compiling with gcc 2.95.2.
I tried a lot of optimization options (at least all those that seemed related
to the problem) and did not manage to get the two addl instructions merged.
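To make the requested transformation concrete, here is a minimal sketch of a peephole pass that folds adjacent `addl $N,%esp` instructions into one. It is not GCC code: the `Insn` record and the `merge_stack_adjusts` function are hypothetical names made up for illustration, and the sketch only handles adjustments that are strictly adjacent (it would not move an adjustment across the `movl $3,(%eax)` in the listing).

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified instruction record. This is NOT GCC's internal
// representation, just enough structure to illustrate the idea.
struct Insn {
    std::string op;   // mnemonic, e.g. "addl", "pushl", "call"
    std::string arg;  // operand text other than the immediate, e.g. "%esp"
    long imm;         // immediate, meaningful only for "addl $imm,%esp"
};

// True when the instruction has the form "addl $imm,%esp".
static bool is_stack_adjust(const Insn& i) {
    return i.op == "addl" && i.arg == "%esp";
}

// Fold each run of adjacent stack adjustments into a single instruction.
// A run whose immediates sum to zero is dropped entirely.
std::vector<Insn> merge_stack_adjusts(const std::vector<Insn>& in) {
    std::vector<Insn> out;
    for (const Insn& i : in) {
        if (is_stack_adjust(i) && !out.empty() && is_stack_adjust(out.back())) {
            out.back().imm += i.imm;            // combine with previous adjust
            if (out.back().imm == 0) out.pop_back();
        } else {
            out.push_back(i);
        }
    }
    return out;
}
```

Applied to the sequence `addl $16,%esp` / `addl $-12,%esp` from the listing, this sketch would leave a single `addl $4,%esp`.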
The assembly listing was generated with: gcc -I ../../../include/
-fno-implicit-templates -w -O3 -fno-rtti -fno-exceptions -fno-builtin
-fstrength-reduce -fomit-frame-pointer -fexpensive-optimizations
-fschedule-insns -fthread-jumps -felide-constructors -fdelayed-branch
-o test_avl.S test_avl.cc -S
The corresponding part of the source code was:
int main() {
avl<integer> avl;
integer * x;
avl.insert(new integer(2));
avl.insert(new integer(3));
...
avl.insert is an inline wrapper implemented this way:
template <class t>
inline t * avl<t>::insert(t * item) {t **p = probe (item);return (*p == item) ? NULL : *p;}
and integer is an inline wrapper class for C integers.
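For anyone who wants to try this, a self-contained approximation of the test case might look like the sketch below. Only the body of `insert` comes from the post; the layout of `integer`, the single-slot `probe`, and the `slot` member are stand-ins invented here, since the real AVL implementation is not shown. The explicit instantiation is there because `-fno-implicit-templates` requires one.

```cpp
#include <cstddef>

// Inline wrapper class for a C int. The exact shape is an assumption;
// the post does not show this class.
class integer {
public:
    integer(int v) : value(v) {}
    int value;
};

template <class t>
class avl {
public:
    avl() : slot(NULL) {}
    t** probe(t* item);   // stand-in for the real AVL probe (not in the post)
    t* insert(t* item);
private:
    t* slot;              // hypothetical single-slot storage, NOT a real tree
};

// Stand-in probe: keeps the first item it sees and returns the slot address.
template <class t>
t** avl<t>::probe(t* item) {
    if (slot == NULL) slot = item;
    return &slot;
}

// The inline wrapper as given in the post.
template <class t>
inline t* avl<t>::insert(t* item) {
    t** p = probe(item);
    return (*p == item) ? NULL : *p;
}

// Needed when compiling with -fno-implicit-templates.
template class avl<integer>;
```

With this stand-in, the first `insert` returns NULL and later inserts return the item already stored, which is enough to exercise the call sequence in `main`.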
It seems that the CSE algorithm has removed the branch for the return value
in avl::insert.
I saw on the page about misoptimized GCC assembly output that the same kind
of mergeable operations shows up when constants are assigned to contiguous
memory locations: those stores could be merged as well.
Is an optimization that detects such mergeable assembly operations
planned for GCC?
Hope this is useful to you.
Thanks,
tranx