assembler code generated by gcc

Mon Dec 29 02:47:00 GMT 2003

Stefan Müller <mail@s-mueller.ch> writes:

> if I compile the following function (with -O2):
> 
> func(char *sm){
>         char buffer[256];
>         int i;
>         for(i=0; i<=255; i++) buffer[i] = sm[i];
> }
> 
> it gives me this assembler code:
> 
> 0x08048358 <func+0>:    push   %ebp
> 0x08048359 <func+1>:    mov    %esp,%ebp
> 0x0804835b <func+3>:    sub    $0x108,%esp
> 0x08048361 <func+9>:    mov    0x8(%ebp),%ecx
> 0x08048364 <func+12>:   xor    %edx,%edx
> 0x08048366 <func+14>:   mov    %esi,%esi
> 0x08048368 <func+16>:   mov    (%edx,%ecx,1),%al
> 0x0804836b <func+19>:   mov    %al,0xfffffef8(%edx,%ebp,1)
> 0x08048372 <func+26>:   inc    %edx
> 0x08048373 <func+27>:   cmp    $0xff,%edx
> 0x08048379 <func+33>:   jle    0x8048368 <func+16>
> 0x0804837b <func+35>:   leave
> 0x0804837c <func+36>:   ret
> 0x0804837d <func+37>:   lea    0x0(%esi),%esi
> 
> gcc allocates 264 (sub $0x108,%esp) bytes on the stack. But only the lower 256 
> are used.  Without the gcc parameter "-O2" even 280 Bytes on the stack are 
> reserved.
> For what are those unused bytes?

This happens because gcc arranges to align the buffer array on a 128
bit (16 byte) boundary.  If we assume that the stack pointer is aligned
to a 16 byte boundary just before the call, then the return address and
the saved %ebp take up 8 bytes, and gcc needs to adjust by another 8
bytes to put the array on a 16 byte boundary.  That gives 256 + 8 = 264
(0x108) bytes.

Why does gcc want to align the buffer array?  Because
assign_stack_temp_for_type() in function.c sets the alignment for a
variable of type BLKmode to be BIGGEST_ALIGNMENT, which in i386.h is
defined to be 128 (bits).  Why does assign_stack_temp_for_type() use
BIGGEST_ALIGNMENT instead of just using TYPE_ALIGN (type)?  I don't
know.  It's worked that way since the function was introduced here:

http://gcc.gnu.org/ml/gcc-patches/1999-02n/msg00098.html

Thu Feb 11 00:08:17 1999  John Wehle  (john@feith.com)

	* i386.h (LOCAL_ALIGNMENT): Define.
	* function.c (assign_stack_local, assign_outer_stack_local): Use it.
	(assign_stack_temp_for_type): New function based on assign_stack_temp.
	(assign_stack_temp): Call it.
	(assign_temp): Use assign_stack_temp_for_type, not assign_stack_temp.
	* stmt.c: Use assign_temp, not assign_stack_temp.
	* tm.texi: Document LOCAL_ALIGNMENT.

There is a hook for the backend to override the alignment based on the
type (LOCAL_ALIGNMENT, from the patch above), but the i386 backend
never uses that hook to decrease the alignment.

> And what does "mov %esi,%esi" do? Nothing?

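Correct.  It does nothing.  If you look at the generated assembler
code, using -S or --save-temps, you will see why it is there: because
gcc wants to align the top of the loop (func+16, the target of the
jle).  That is what the i386.c backend prefers when optimizing.  The
boundary it asks for depends on the processor being tuned for; it can
be as much as 16 bytes, but in your listing the 2 byte filler moves the
loop from 0x08048366 to 0x08048368, which is an 8 byte boundary rather
than a 16 byte one.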

> When will the "lea 0x0(%esi),%esi" instruction be executed?

Never.  That instruction was not even generated by the compiler.  It
was inserted by the assembler to align for the next function.  When
the assembler aligns in the .text section, it uses nop instructions of
the appropriate size.

> There's another small program:
> 
> long getesp() {
> __asm__("movl %esp,%eax");
> }
> 
> void main() {
>         printf("%08X\n",getesp());
> }
> 
> Every time I execute it, it gives me a slightly different value. Shouldn't the 
> esp register be the same value every time?

It depends on your operating system, which you neglected to mention.
Some operating systems these days randomize the stack location, to
make it more difficult to attack programs which have stack overflow
bugs.

Ian