This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



assembly optimizations...


I was looking at the assembly code generated from a C++ program that uses
inline calls,
and I noticed that a simple optimization was missing:

main:
        subl $40,%esp
        pushl %ebx
        movl $0,16(%esp)
        movl $0,20(%esp)
        movl $0,28(%esp)
        addl $-12,%esp
        pushl $4
        call __builtin_new
        addl $16,%esp
        movl $2,(%eax)
        addl $-8,%esp
        pushl %eax
        leal 24(%esp),%ebx
        pushl %ebx
        call probe__t3avl1Z7integerP7integer
        addl $16,%esp
        addl $-12,%esp          ; this stack adjustment could be merged with the previous one
        pushl $4                ; we could even think of merging the push into it as well,
                                ; doing movl $4,-4(%esp), but I admit that would be
                                ; a much harder algorithm
        call __builtin_new
        addl $16,%esp
        movl $3,(%eax)
        addl $-8,%esp           ; this one could be merged with the previous one
        pushl %eax
        pushl %ebx
        call probe__t3avl1Z7integerP7integer
        addl $16,%esp
        addl $-12,%esp          ; once again ...
        pushl $4
        call __builtin_new
        addl $16,%esp
        movl $-2,(%eax)
        addl $-8,%esp           ; again ...
        pushl %eax
        pushl %ebx
        call probe__t3avl1Z7integerP7integer
...

I was compiling with gcc 2.95.2.
I tried a lot of optimization options (at least all those that seemed
related to the problem) and I did not manage to get the two addl merged.
The code you see was compiled with:

gcc -I ../../../include/ -fno-implicit-templates -w -O3 -fno-rtti
-fno-exceptions -fno-builtin -fstrength-reduce -fomit-frame-pointer
-fexpensive-optimizations -fschedule-insns -fthread-jumps
-felide-constructors -fdelayed-branch -o test_avl.S test_avl.cc -S

The part of the source that produced this assembly was:

int main() {
  avl<integer> avl;
  integer * x;

  avl.insert(new integer(2));
  avl.insert(new integer(3));
...

avl.insert is an inline wrapper implemented this way:

template <class t>
inline t * avl<t>::insert(t * item) {
  t **p = probe(item);
  return (*p == item) ? NULL : *p;
}

and integer is an inline wrapper class for C integers.
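
For anyone who wants to reproduce the output, here is a minimal self-contained sketch of the two classes. Only insert comes from my program; the body of integer and the one-slot probe below are hypothetical stand-ins (the real probe walks an AVL tree), kept just detailed enough that main compiles and probe stays an out-of-line call.

```cpp
#include <cstddef>

// Assumed shape of the wrapper around a C int; the real class may differ.
class integer {
public:
    integer(int v) : value_(v) {}
    operator int() const { return value_; }
private:
    int value_;
};

// Hypothetical stand-in for the tree: it keeps a single slot instead of
// a balanced tree, so only the call interface matches the original.
template <class t>
class avl {
public:
    avl() : slot_(NULL) {}
    t **probe(t *item);          // placeholder, defined out of line below
    inline t *insert(t *item);   // the wrapper quoted in this message
private:
    t *slot_;
};

// Placeholder logic: remember the first item and return its slot.
template <class t>
t **avl<t>::probe(t *item) {
    if (slot_ == NULL)
        slot_ = item;
    return &slot_;
}

// Returns NULL when item itself was stored, otherwise the item already there.
template <class t>
inline t *avl<t>::insert(t *item) {
    t **p = probe(item);
    return (*p == item) ? NULL : *p;
}
```

With this stand-in, the first insert returns NULL and any later insert returns the first item, which is not what a real tree would do for distinct keys.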

It seems that the CSE algorithm has removed the branch for the return value
in avl insert.

I saw on the page about misoptimized assembly output from gcc that
the same kind of mergeable operations shows up when constants are
assigned to contiguous memory locations: those accesses could be
merged as well.

Is an optimization that checks for mergeable assembly operations
planned in gcc?
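
To make the idea concrete, here is a rough sketch of the merge I have in mind, written as a text-level peephole pass in C++. It is only an illustration under simplifying assumptions, not how gcc works internally: the function names are made up, it only merges adjustments that are directly adjacent, and it ignores the fact that addl also sets the flags.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: true if line is exactly "addl $N,%esp" (surrounding
// whitespace allowed), in which case N is stored in delta.
static bool parse_esp_adjust(const std::string &line, long &delta) {
    int consumed = -1;
    // %n is only reached when every literal before it matched.
    if (std::sscanf(line.c_str(), " addl $%ld,%%esp %n", &delta, &consumed) != 1)
        return false;
    return consumed == static_cast<int>(line.size());
}

// Replace each run of adjacent %esp adjustments by a single addl whose
// immediate is the sum of the run; a run that sums to zero is dropped.
std::vector<std::string> merge_esp_adjusts(const std::vector<std::string> &in) {
    std::vector<std::string> out;
    bool pending = false;   // are we inside a run of adjustments?
    long sum = 0;           // accumulated adjustment of the current run

    auto flush = [&]() {
        if (pending && sum != 0)
            out.push_back("        addl $" + std::to_string(sum) + ",%esp");
        pending = false;
        sum = 0;
    };

    for (const std::string &line : in) {
        long delta = 0;
        if (parse_esp_adjust(line, delta)) {
            sum += delta;
            pending = true;
        } else {
            flush();              // any other instruction ends the run
            out.push_back(line);
        }
    }
    flush();

    assert(out.size() <= in.size());   // the pass never adds instructions
    return out;
}
```

Given addl $16,%esp followed by addl $-12,%esp, this emits a single addl $4,%esp. The addl $-8,%esp cases in my listing are separated from the previous adjustment by a movl, so this simple version leaves them alone; merging those would need to check that the instructions in between do not use %esp.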

I hope this is useful to you.

thanks
tranx


