Here is a 20 % increase in code size, because the compiler tries too hard to jump to UI_plotHline() instead of just calling the function and then doing a standard return at the end of drawbutton()...

[etienne@localhost projet]$ cat tmp.c
typedef struct {
    unsigned short x, y; /* x should be the easiest to read */
    } __attribute__ ((packed)) coord;

extern inline void
UI_plotHline (coord xy, unsigned short xend, unsigned color)
  {
  extern UI_function_plotHline (coord xy, unsigned xend, unsigned color);
  UI_function_plotHline (xy, xend, color);
  }

extern inline void
UI_setpixel (coord xy, unsigned color)
  {
  extern UI_function_setpixel (coord xy, unsigned color);
  UI_function_setpixel (xy, color);
  }

extern inline void bound_stack (void)
  {
  /*
   * limit included - but add 2 to high limit for reg16, and 4 for reg32
   * if not in bound, exception #BR generated (INT5).
   * iret from INT5 will retry the bound instruction.
   */
  extern unsigned STATE_stack_limit;
  asm volatile (" bound %%esp,%0 " : : "m" (STATE_stack_limit) );
  }

void
drawbutton (coord upperleft, coord lowerright,
            unsigned upperleftcolor, unsigned lowerrightrcolor,
            unsigned fillcolor, unsigned drawbackground)
  {bound_stack();{
  /* Enlarge the button by few pixels: */
  upperleft.x -= 2;
  lowerright.x += 2;
  lowerright.y -= 1; /* do not overlap two consecutive lines */

  UI_plotHline (upperleft, lowerright.x, upperleftcolor); /* top line */

  /* do not change VESA1 banks too often, process horizontally,
     left to right, line per line */
  for (;;) {
      upperleft.y += 1;
      if (upperleft.y >= lowerright.y)
          break;
      UI_setpixel (upperleft, upperleftcolor);
      if (drawbackground)
          UI_plotHline (((coord) { .x = upperleft.x + 1, .y = upperleft.y }),
                        lowerright.x - 1, fillcolor);
      UI_setpixel (((coord) { .x = lowerright.x - 1, .y = upperleft.y }),
                   lowerrightrcolor);
      }
  UI_plotHline (upperleft, lowerright.x, lowerrightrcolor); /* bottom line */
  }}

[etienne@localhost projet]$ gcc -v
Using built-in specs.
Target: i386-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-libgcj-multifile --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --with-cpu=generic --host=i386-redhat-linux
Thread model: posix
gcc version 4.1.0 20060304 (Red Hat 4.1.0-3)
[etienne@localhost projet]$ gcc -Os -c tmp.c && size *.o
   text    data     bss     dec     hex filename
    276       0       0     276     114 tmp.o
[etienne@localhost projet]$ toolchain-3.4.5/bin/gcc -v
Reading specs from /home/etienne/projet/toolchain-3.4.5/bin/../lib/gcc/i686-pc-linux-gnu/3.4.5/specs
Configured with: ../configure --prefix=/home/etienne/projet/toolchain --enable-languages=c
Thread model: posix
gcc version 3.4.5
[etienne@localhost projet]$ toolchain-3.4.5/bin/gcc -Os -c tmp.c && size *.o
   text    data     bss     dec     hex filename
    227       0       0     227      e3 tmp.o
[etienne@localhost projet]$
This has nothing to do with tail return optimization as 3.4.0 also did it.
etienne@cygne:~/projet/gujin$ /home/etienne/projet/toolchain/bin/gcc -v
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: ../configure --prefix=/home/etienne/projet/toolchain --enable-languages=c
Thread model: posix
gcc version 4.1.1 20060517 (prerelease)
etienne@cygne:~/projet/gujin$ /home/etienne/projet/toolchain/bin/gcc tmp1.c -Os -c -o tmp.o && size tmp.o
   text    data     bss     dec     hex filename
    279       0       0     279     117 tmp.o
etienne@cygne:~/projet/gujin$ /home/etienne/projet/toolchain/bin/gcc tmp1.c -Os -c -o tmp.o -fno-optimize-sibling-calls && size tmp.o
   text    data     bss     dec     hex filename
    251       0       0     251      fb tmp.o

So the "tail return optimization" that -fno-optimize-sibling-calls disables has at least something to do with the size increase.

Note that this new GCC-4.1.1 prerelease also produces code such as:
        addl    $12, %esp
        leal    -12(%ebp), %esp
        pop     allregs
        ret
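
For readers unfamiliar with the option, here is a minimal sketch of the kind of call that -foptimize-sibling-calls targets. The functions helper() and wrapper() are hypothetical and not part of tmp.c; the comments describe what GCC may emit, not what it is guaranteed to emit.

```c
/* Hypothetical example, not taken from tmp.c. */

/* Kept out of line so that the call in wrapper() stays a real call. */
__attribute__ ((noinline)) int helper (int x)
{
  return x * 2 + 1;
}

/* The call to helper() is the last thing wrapper() does and its result
   is returned unchanged.  With sibling-call optimization enabled, GCC
   may replace "call helper; ret" with a plain "jmp helper".  With
   -fno-optimize-sibling-calls it keeps the call followed by a return. */
int wrapper (int x)
{
  return helper (x + 1);
}
```

Compiling this file twice with -Os -c, once with and once without -fno-optimize-sibling-calls, and comparing the output of size or objdump -d is one way to see the difference on a given compiler version. In drawbutton() the final UI_plotHline() call is in the same position as helper() here.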
rguenther@murzim:/tmp> gcc-3.3 -Os -c t.i -m32
rguenther@murzim:/tmp> size t.o
   text    data     bss     dec     hex filename
    222       0       0     222      de t.o
rguenther@murzim:/tmp> gcc-4.1 -Os -c t.i -m32
rguenther@murzim:/tmp> size t.o
   text    data     bss     dec     hex filename
    280       0       0     280     118 t.o
rguenther@murzim:/tmp> gcc-4.3 -Os -c t.i -m32
rguenther@murzim:/tmp> size t.o
   text    data     bss     dec     hex filename
    269       0       0     269     10d t.o
rguenther@murzim:/tmp> gcc-4.4 -Os -c t.i -m32
rguenther@murzim:/tmp> size t.o
   text    data     bss     dec     hex filename
    290       0       0     290     122 t.o
rguenther@murzim:/tmp> gcc-4.5 -Os -c t.i -m32
rguenther@murzim:/tmp> size t.o
   text    data     bss     dec     hex filename
    237       0       0     237      ed t.o
With current mainline
jh@gcc17:~/trunk/build/gcc$ size pa.o
   text    data     bss     dec     hex filename
    211       0       0     211      d3 pa.o

I get gcc-4.1 also producing 480 bytes. I do, however, need -fno-asynchronous-unwind-tables. Otherwise I get:
jh@gcc17:~/trunk/build/gcc$ size pa.o
   text    data     bss     dec     hex filename
    347       0       0     347     15b pa.o
jh@gcc17:~/trunk/build/gcc$ objdump -h pa.o

pa.o:     file format elf32-i386

Sections:
Idx Name             Size      VMA       LMA       File off  Algn
  0 .text            000000d3  00000000  00000000  00000034  2**2
                     CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data            00000000  00000000  00000000  00000108  2**2
                     CONTENTS, ALLOC, LOAD, DATA
  2 .bss             00000000  00000000  00000000  00000108  2**2
                     ALLOC
  3 .eh_frame        00000088  00000000  00000000  00000108  2**2
                     CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
  4 .comment         0000002a  00000000  00000000  00000190  2**0
                     CONTENTS, READONLY
  5 .note.GNU-stack  00000000  00000000  00000000  000001ba  2**0
                     CONTENTS, READONLY

Why is .eh_frame counted as text?
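
The numbers above are consistent with size's default (Berkeley) output adding every allocated section that is code or read-only into the "text" column: 0xd3 (.text) + 0x88 (.eh_frame) = 211 + 136 = 347. The sketch below reproduces that arithmetic under this assumption. The flag names and the berkeley_text() helper are illustrative only and are not binutils identifiers; the section sizes and flags are copied from the objdump -h listing.

```c
#include <stddef.h>

/* Illustrative flag bits, named after the objdump -h column entries. */
enum { F_ALLOC = 1, F_CODE = 2, F_READONLY = 4, F_CONTENTS = 8 };

struct section {
  const char *name;
  unsigned long size;
  unsigned flags;
};

/* Section table of pa.o as shown by objdump -h. */
static const struct section pa_sections[] = {
  { ".text",           0xd3, F_CONTENTS | F_ALLOC | F_READONLY | F_CODE },
  { ".data",           0x00, F_CONTENTS | F_ALLOC },
  { ".bss",            0x00, F_ALLOC },
  { ".eh_frame",       0x88, F_CONTENTS | F_ALLOC | F_READONLY },
  { ".comment",        0x2a, F_CONTENTS | F_READONLY },
  { ".note.GNU-stack", 0x00, F_CONTENTS | F_READONLY },
};

#define PA_SECTION_COUNT (sizeof pa_sections / sizeof pa_sections[0])

/* Assumed rule: an allocated section counts as "text" when it is
   marked as code or as read-only.  Unallocated sections such as
   .comment are ignored. */
unsigned long berkeley_text (const struct section *s, size_t n)
{
  unsigned long total = 0;
  for (size_t i = 0; i < n; i++) {
    if (!(s[i].flags & F_ALLOC))
      continue;
    if (s[i].flags & (F_CODE | F_READONLY))
      total += s[i].size;
  }
  return total;
}
```

Under that rule, berkeley_text (pa_sections, PA_SECTION_COUNT) gives 347, and dropping .eh_frame (which is what -fno-asynchronous-unwind-tables does to the object) leaves 211. size -A pa.o lists each section separately and avoids the ambiguity.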
Seems to be fixed for 4.6. Now on to a way to add object-size testcases ...