This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: RFC: Idea for code size reduction
On Fri, Mar 07, 2008 at 01:05:03PM +0100, Philipp Marek wrote:
> When wouldn't that be possible? My script currently splits on an
> instruction-level -- although I would see no problem that some branch
> jumps into a "half" opcode of another branch, if the byte sequence
> matches.
Consider:
00000000 <bar>:
       0:   b8 a4 00 00 00          mov    $0xa4,%eax
       5:   ba fc 04 00 00          mov    $0x4fc,%edx
       a:   f7 e2                   mul    %edx
       c:   05 d2 04 00 00          add    $0x4d2,%eax
      11:   c3                      ret
...
00020012 <foo>:
   20012:   39 d2                   cmp    %edx,%edx
   20014:   75 07                   jne    2001d <foo+0xb>
   20016:   ba fc 04 00 00          mov    $0x4fc,%edx
   2001b:   f7 e2                   mul    %edx
   2001d:   05 d2 04 00 00          add    $0x4d2,%eax
   20022:   c3                      ret
If you merge the mov/mul/add/ret sequences by replacing the foo tail
sequence with jmp bar+5, then the jne will branch to the wrong place, and
if you try to adjust it, the target is too far away for its 8-bit
displacement to reach.
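The reach problem can be checked with a little arithmetic. The following is only a sketch, assuming the 2-byte short form of the conditional jump (as in the `75 07` encoding above), whose signed 8-bit displacement is counted from the end of the instruction. The helper names are made up for illustration.

```python
# Sketch: can a short conditional jump (opcode + rel8, 2 bytes) reach a target?
# Helper names are hypothetical; addresses are taken from the listing above.
JCC_SHORT_LEN = 2  # one opcode byte plus one signed displacement byte


def rel8_displacement(branch_addr, target_addr):
    """Displacement measured from the end of the branch instruction."""
    return target_addr - (branch_addr + JCC_SHORT_LEN)


def fits_rel8(branch_addr, target_addr):
    """True if the displacement fits in a signed 8-bit field."""
    return -128 <= rel8_displacement(branch_addr, target_addr) <= 127


# Original foo: the jne at 0x20014 targets the add at 0x2001d.
print(rel8_displacement(0x20014, 0x2001D), fits_rel8(0x20014, 0x2001D))

# After merging, the only remaining copy of the add is in bar, at 0xc.
print(rel8_displacement(0x20014, 0xC), fits_rel8(0x20014, 0xC))
```

The first case fits easily; the second needs a displacement of roughly -128K, so the jne would have to be widened to a longer encoding, which eats into the bytes the merge was supposed to save.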
> > but even jmp argument is relative, not absolute.
> That's why I take only jumps with 32bit arguments - these are absolute.
No, they are relative.
00000000 <bar>:
       0:   e9 05 00 02 00          jmp    2000a <foo>
       5:   e9 00 00 02 00          jmp    2000a <foo>
...
0002000a <foo>:
   2000a:   90                      nop
See how they are encoded.
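To make the encoding explicit, here is a small sketch that decodes the two jmp instructions above. It assumes the 5-byte `e9` form with a little-endian signed 32-bit displacement added to the address of the next instruction. The function name is made up for illustration, and address wrap-around is ignored.

```python
# Sketch: decode the target of a 5-byte "jmp rel32" (opcode 0xe9).
# decode_jmp_rel32 is a hypothetical helper, not part of any tool.
JMP_REL32_LEN = 5  # one opcode byte plus four displacement bytes


def decode_jmp_rel32(addr, encoding):
    """Return the branch target of the jmp located at addr."""
    if len(encoding) != JMP_REL32_LEN or encoding[0] != 0xE9:
        raise ValueError("not a 5-byte jmp rel32 encoding")
    disp = int.from_bytes(encoding[1:5], "little", signed=True)
    # The displacement is relative to the end of the instruction.
    return addr + JMP_REL32_LEN + disp


first = decode_jmp_rel32(0x0, bytes.fromhex("e905000200"))
second = decode_jmp_rel32(0x5, bytes.fromhex("e900000200"))
print(hex(first), hex(second))
```

Both instructions jump to the same address, yet their displacement bytes differ by exactly the 5 bytes that separate them, which is what a relative encoding implies. Two jumps to the same target therefore do not produce identical byte sequences.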
> Yes ... I think doing a second compile pass might be the easiest way, and
> not much slower than other solutions.
> (We could always remember which object files could be optimized, and
> only recompile *those*. After all, it's just an additional
> optimization.)
BTW, have you tried compiling the whole kernel with --combine, or at least
e.g. each kernel directory with --combine? I guess that will give you
bigger savings than 30K. Also, stop defining inline as inline
__attribute__((always_inline)); I think Ingo also added such a patch
recently and it saved 120K.
Jakub