Serious code size regression from 3.0.2 to now

tm tm@mail.kloo.net
Tue Jul 23 16:50:00 GMT 2002



On 23 Jul 2002, Geoff Keating wrote:

> 
> It sounds like at least one of the problems is that very short
> sequences of code are being made to each align at the start of a cache line,
> but if you have (for example, not any particular real processor)
> 
>         .align 5
>         mov r0,#1
>         b L123
>         .align 5
>         mov r0,#2
>         b L124
> ...
> 
> you don't need the second '.align', since the next chunk will be fully
> contained in a cache line.
> 
> If this is done properly, most short code sequences won't need extra
> alignment...
> 
> One tricky bit is that sometimes multiple alignments are helpful, for instance
> 
>         .align 5
>         b L123
>         .align 5
>         mov r0,#2
>         b L124
> 
> the middle '.align' might need to be a '.align 3' (rather than
> deleted) if it will help to keep code aligned to two-instruction
> boundaries.  This is still much better than the original, though.
> 
> This would be a helpful project.  Perhaps it should go on the projects page?
> 
> 

Doh, I forgot to "Reply All" when sending my previous message, so I'm
resending.

The culprit appears to be the basic block reordering (BBRO) pass. These
cache-aligned blocks with only two instructions appear only after
BBRO. BBRO seems to determine that these instructions are unlikely to be
executed, and pulls them out of line into their own separate code block.

Unfortunately, these code blocks later become cache-aligned.
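
To put rough numbers on the cost, here is a minimal sketch. It assumes
4-byte instructions and that ".align 5" means a 32-byte boundary, as in
Geoff's made-up example above; neither figure describes a particular real
target, and the helper names are hypothetical.

```python
def padding_bytes(offset, align_log2):
    """Bytes of padding needed to bring `offset` up to a 2**align_log2 boundary."""
    align = 1 << align_log2
    return (-offset) % align


def fits_in_current_line(offset, size, align_log2):
    """True if `size` bytes placed at `offset` stay within one aligned line."""
    align = 1 << align_log2
    return offset // align == (offset + size - 1) // align


INSN_BYTES = 4              # assumed fixed instruction width
block = 2 * INSN_BYTES      # "mov" + "b": an 8-byte block

# The first block starts aligned at offset 0 and ends at offset 8.
print(padding_bytes(8, 5))                # padding forced by a second .align 5
print(fits_in_current_line(8, block, 5))  # would the next block fit unaligned?
```

Under these assumptions the second ".align 5" inserts 24 bytes of padding
in front of an 8-byte block that would have fit in the same line anyway.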

I can see two possible solutions for this problem:

1) Tweak the BBRO heuristics so that code sequences shorter than n insns
   are not moved out of line

2) Suppress cache alignment for BBRO-generated blocks
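
As a rough illustration only, the two options can be written as two
predicates. This is a hypothetical sketch, not GCC code: the Block record,
its fields, and the min_insns threshold are all made-up names.

```python
from dataclasses import dataclass


@dataclass
class Block:
    n_insns: int                 # number of instructions in the block
    unlikely: bool               # predicted to be rarely executed
    moved_by_bbro: bool = False  # set once the block has been moved out of line


def should_move_out_of_line(block, min_insns):
    """Option 1: only move unlikely blocks that have at least min_insns insns."""
    return block.unlikely and block.n_insns >= min_insns


def alignment_allowed(block):
    """Option 2: never cache-align a block that BBRO moved out of line."""
    return not block.moved_by_bbro


# A two-insn unlikely block stays in line when the threshold is, say, 4.
tiny = Block(n_insns=2, unlikely=True)
print(should_move_out_of_line(tiny, min_insns=4))

# A block that was moved anyway is exempt from cache alignment.
moved = Block(n_insns=2, unlikely=True, moved_by_bbro=True)
print(alignment_allowed(moved))
```

Either check alone would avoid the padded two-instruction blocks; the right
value of n would have to be found by measurement.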

Are there any other possible solutions?

Also, I propose BBRO be disabled when optimizing for size (-Os).

Toshi




