Combine four insns

Richard Guenther richard.guenther@gmail.com
Fri Aug 6 15:04:00 GMT 2010


On Fri, Aug 6, 2010 at 4:48 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> I was slightly bored while waiting for some SPEC runs, so I played with
> the combiner a little.  The following extends it to do four-insn
> combinations.
>
> Conceptually, the following is one motivation for the change: consider a
> RISC target (probably not very prevalent when the combiner was written),
> where arithmetic ops don't allow constants, only registers.  Then, to
> combine multiple such operations into one (or rather, two), you need a
> four-instruction window.  This is what happens e.g. on Thumb-1; PR42172
> is such an example.  We have
>
>        ldrb    r3, [r0]
>        mov     r2, #7
>        bic     r3, r2
>        add     r2, r2, #49
>        bic     r3, r2
>        sub     r2, r2, #48
>        orr     r3, r2
>        add     r2, r2, #56
>        bic     r3, r2
>        add     r2, r2, #63
>        and     r3, r2
>        strb    r3, [r0]
>
> which can be optimized into
>
>        mov     r3, #8
>        strb    r3, [r0]
>
> by the patch below.  I'm attaching a file with a few more examples I
> found.  The same patterns occur quite frequently - several times e.g. in
> Linux TCP code.
>
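
As a quick sanity check of the folding claimed above, the quoted Thumb-1 sequence can be simulated for every possible byte value. This is only a sketch: the function name thumb1_sequence is made up for illustration, and it models just the register arithmetic of the quoted instructions, not the combiner itself.

```python
def thumb1_sequence(byte):
    """Model the quoted Thumb-1 sequence on one loaded byte (hypothetical helper)."""
    r3 = byte & 0xFF   # ldrb r3, [r0]
    r2 = 7             # mov  r2, #7
    r3 &= ~r2          # bic  r3, r2
    r2 += 49           # add  r2, r2, #49   (r2 = 56)
    r3 &= ~r2          # bic  r3, r2
    r2 -= 48           # sub  r2, r2, #48   (r2 = 8)
    r3 |= r2           # orr  r3, r2
    r2 += 56           # add  r2, r2, #56   (r2 = 64)
    r3 &= ~r2          # bic  r3, r2
    r2 += 63           # add  r2, r2, #63   (r2 = 127)
    r3 &= r2           # and  r3, r2
    return r3 & 0xFF   # strb r3, [r0] stores only the low byte


# Collect the stored value for every possible input byte.
results = {thumb1_sequence(b) for b in range(256)}
print(results)
```

Every input byte yields the same stored value, 8, so the load and all of the bit operations can be replaced by storing a constant, as in the two-instruction version.
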
> The downside is a compile-time cost, which appears to be about 1% user
> time on a full bootstrap.  To put that in perspective, it's 12s of real
> time.
>
> real 16m13.446s user 103m3.607s sys 3m2.235s
> real 16m25.534s user 104m0.686s sys 3m4.158s
>
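
The quoted figures can be cross-checked with a little arithmetic. The sketch below assumes the first timing line is the unpatched bootstrap and the second the patched one (the mail does not label them); the names baseline, patched, to_seconds and slowdown are illustrative only.

```python
def to_seconds(stamp):
    """Convert a `time`-style stamp such as '16m13.446s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# Assumption: first quoted line = without the patch, second = with it.
baseline = {"real": "16m13.446s", "user": "103m3.607s", "sys": "3m2.235s"}
patched = {"real": "16m25.534s", "user": "104m0.686s", "sys": "3m4.158s"}


def slowdown(kind):
    """Return (extra seconds, percent increase) for one timing category."""
    before = to_seconds(baseline[kind])
    after = to_seconds(patched[kind])
    return after - before, 100.0 * (after - before) / before


for kind in ("real", "user", "sys"):
    extra, percent = slowdown(kind)
    print(f"{kind}: +{extra:.3f}s ({percent:.2f}%)")
```

Under that assumption the real-time difference comes to about 12.1 seconds and the user-time increase to roughly 0.92%, consistent with the "about 1%" and "12s" figures quoted.
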
> I'd argue that compile-time shouldn't be our top priority, as it's one
> of the few things that still benefits from Moore's Law, while the same
> may not be true for the programs we compile.  I expect people will argue
> a 1% slowdown is unacceptable, but in that case I think we need to
> discuss whether users are complaining about slow compiles more often
> than they complain about missed optimizations - in my experience the
> reverse is true.
>
> Bootstrapped on i686-linux, a slightly earlier version also
> regression-tested.

Do you have statistics on how many two-, three- and four-insn combinations
a) are tried, b) can be validated, and c) are used in the end, for example
during a GCC bootstrap?

It might make sense to restrict four-insn combinations to
-fexpensive-optimizations (that is, not enable them at -O1).

Richard.

>
> Bernd
>


