This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: optimizations


Bonzini wrote:
> 
> > > Could you please also tell me if 3.3 and 3.4 remove the extra mov's in and out
> > > of %eax. Ideally, there should be no more than 4 instructions in the critical
> > > loop.
> >
> > .L2:
> > movl -4(%ebp), %eax <== still does the load
> > cmpl $16, %eax
> > je .L7
> > incl %eax
> > movl %eax, -4(%ebp) <== and store
> > jmp .L2
> > .L7:
> >
> > For some reason it is not (even with -fnew-ra), but on PPC there
> > is no extra load/store.
> 
> Instruction counts do not tell the whole story; gcc is simply putting more
> pressure on the decoding unit but less pressure on the execution unit (which
> otherwise would execute two loads in the `taken' case).  Things might be

Would you please elaborate on that?  I don't understand what you mean by the
"taken case."  The suggested optimization is:

CHANGE:
-------
.L2:
movl -4(%ebp), %eax <== still does the load
cmpl $16, %eax
je .L7
incl %eax
movl %eax, -4(%ebp) <== and store
jmp .L2
.L7:

TO:
-------
movl -4(%ebp), %eax
.L2:
cmpl $16, %eax
je .L7
incl %eax
jmp .L2
.L7:
movl %eax, -4(%ebp)

The mov's have moved _outside_ of the critical loop, and the register allocator
may still be able to remove the extra mov at entry to the loop.

The total number of instructions, and hence the total program size, will remain
the same even in the worst possible case.
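
In C terms, the change I am suggesting corresponds roughly to the sketch below.
The function names are hypothetical, and the volatile qualifier is only my
assumption for mimicking the per-iteration load and store that gcc emits; it is
not what the original test program contained.

```c
/* Sketch only: the function names are hypothetical, not from the thread. */

/* Mimics the CHANGE version: the counter lives in memory, so every
   iteration performs a load and a store (volatile forces this here). */
int count_in_memory(int limit)
{
    volatile int i = 0;
    while (i != limit)
        i++;
    return i;
}

/* Mimics the TO version: one load before the loop, one store after it. */
int count_in_register(int limit)
{
    int slot = 0;       /* stands in for the stack slot -4(%ebp) */
    int i = slot;       /* movl -4(%ebp), %eax   (hoisted above .L2) */
    while (i != limit)  /* cmpl / je .L7 */
        i++;            /* incl %eax */
    slot = i;           /* movl %eax, -4(%ebp)   (sunk below .L7) */
    return slot;
}
```

Both functions return the same value for any non-negative limit; only the
number of memory accesses inside the loop differs.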

Furthermore, an extra jump can be removed from the critical loop. If you
compile:
i=0;
for(;i<10;i++);
write(1,&i,4);   //make i volatile

then you will see that gcc optimizes away even this redundant jump, hence
producing a loop of only _three_ instructions. But when a while() loop is used
instead of the equivalent for() loop, that does not happen.
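
For reference, the two loop forms I am comparing can be written as in the
minimal sketch below. The function names are hypothetical, the bound 10 is
taken from the snippet above, and the write() call is left to the caller so
that the functions stay free of side effects.

```c
/* Sketch only: hypothetical names; both loops compute the same result. */

/* The for() form from the snippet above. */
int loop_for(void)
{
    int i = 0;
    for (; i < 10; i++)
        ;               /* empty body: only the counter advances */
    return i;
}

/* The equivalent while() form. */
int loop_while(void)
{
    int i = 0;
    while (i < 10)
        i++;
    return i;
}
```

Since the two functions are semantically identical, any difference in the
generated loop comes from the compiler, not from the source. Comparing the
output of gcc -S for each should show whether the extra jump is still there.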

This seems like a crystal clear case for optimization, unless I am missing
something, in which case please explain it to me in more detail.

Thanks, Reza.


> different if gcc is given other options like -mtune=i386.

