Re: Local optimization on i386 ?

You do have to be careful with some of these simple peephole optimizations.
CodeWarrior has a peephole phase right before the internal assembler
produces object code, and we've had several released code generation bugs
because of what seemed harmless... consider the sequence mentioned already:

    subl $40, %eax
    subl $56, %eax

It may seem obvious at first to combine these into a single "subl $96, %eax",
but it isn't harmless: the carry and overflow flags can end up set
differently than after the original pair of instructions.  For example, if
you start with eax == 32, the second subl leaves CF clear, while the
combined subl sets it.  When flags are involved, you need to follow the
flow of control after the instructions and ensure that the flags are killed
by a succeeding instruction before they are consumed.
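
To make the flag difference concrete, here is a rough Python model of the
32-bit subtract.  The helper names (subl, two_subs, one_sub) are made up for
this illustration, and only CF and OF are modelled; it is a sketch of the
arithmetic, not of anything CodeWarrior actually does.

```python
MASK = 0xFFFFFFFF  # model a 32-bit register


def subl(imm, eax):
    """Model `subl $imm, %eax`; return (result, CF, OF)."""
    imm &= MASK
    eax &= MASK
    result = (eax - imm) & MASK
    cf = int(eax < imm)  # unsigned borrow out of bit 31
    # Signed overflow: operands have different signs and the result's
    # sign differs from the sign of the original eax.
    of = ((eax ^ imm) & (eax ^ result)) >> 31
    return result, cf, of


def two_subs(eax):
    """Flags left by the original pair: subl $40 then subl $56."""
    r, _, _ = subl(40, eax)
    return subl(56, r)


def one_sub(eax):
    """Flags left by the combined `subl $96, %eax`."""
    return subl(96, eax)


if __name__ == "__main__":
    for start in (32, 0x80000014):
        print(hex(start), two_subs(start), one_sub(start))
```

Both versions always produce the same value in eax.  Starting from 32 they
disagree on CF, and starting from 0x80000014 they disagree on OF, so the
combination is only safe when nothing downstream reads those flags.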

(Hmmm... this kind of coding could be useful in a switch block to test for
a range of values -- do the two subs, and you know the original value was
in the range [40, 96) if SF=1 and CF=1 afterwards)
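
The range-check idea can be checked with a small Python model of the sign
flag (SF on x86) and carry flag after the two subtracts.  The function names
are hypothetical and the constants 40 and 56 are just the ones from the
example above; treat this as a sketch of the flag arithmetic only.

```python
MASK = 0xFFFFFFFF  # model a 32-bit register


def sub_flags(eax, imm):
    """Model a 32-bit `subl $imm, %eax`; return (result, CF, SF)."""
    eax &= MASK
    result = (eax - imm) & MASK
    cf = int(eax < imm)   # unsigned borrow
    sf = result >> 31     # top bit of the result
    return result, cf, sf


def in_range_via_flags(x):
    """True when SF=1 and CF=1 after `subl $40` followed by `subl $56`."""
    r, _, _ = sub_flags(x, 40)
    _, cf, sf = sub_flags(r, 56)
    return cf == 1 and sf == 1


if __name__ == "__main__":
    hits = [x for x in range(200) if in_range_via_flags(x)]
    print(hits[0], hits[-1])
```

In this model the test succeeds exactly for 40 <= x < 96.  CF alone is
already enough to decide it, since the first subtract maps everything below
40 to a huge unsigned value that the second subtract cannot borrow from.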

Most mov peepholes are OK, however, since the mov instructions tend to leave
the processor flags alone.
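
As an illustration of a flag-safe mov peephole, here is a minimal sketch
that drops the second instruction of a "movl %a,%b; movl %b,%a" pair.  It
assumes the input is a list of AT&T-syntax strings forming one basic block
(no labels or jump targets between the two movs), and the function name is
invented for this example.

```python
import re

# Matches only register-to-register movl, e.g. "movl %eax,%edx".
REG_MOV = re.compile(r"^\s*movl\s+(%\w+)\s*,\s*(%\w+)\s*$")


def drop_redundant_movs(block):
    """Remove `movl %b,%a` when it directly follows `movl %a,%b`."""
    out = []
    for insn in block:
        cur = REG_MOV.match(insn)
        prev = REG_MOV.match(out[-1]) if out else None
        if (cur and prev
                and cur.group(1) == prev.group(2)
                and cur.group(2) == prev.group(1)):
            # %a already holds the value being copied back; mov does not
            # touch the flags, so dropping it changes nothing observable.
            continue
        out.append(insn)
    return out


if __name__ == "__main__":
    print(drop_redundant_movs(["movl %eax,%edx", "movl %edx,%eax", "ret"]))
```

Memory operands are deliberately left alone here, since a load after a store
is not always redundant (volatile or memory-mapped locations, for example).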

Ben Combee, x86/Win32/Novell/Linux CompilerWarrior
Jamie Lokier
Richard Henderson
Marc Espie
Tuesday, November 30, 1999

> I've examined GCC output for many years.  I've even done a little
> proprietary work on the x86 machine description.  GCC does many very
> clever optimisations.  Yet I have always been amazed that it still emits
> silly sequences, like "movl %eax,%edx; movl %edx,%eax", and stores
> followed by no loads to stack slots at the end of a function, for
> example.
> Such simple things that GAS, or even sed could peephole away.
> But I also know that after everything has been done, there are /obvious/
> redundancies that could be cleaned up by a peephole and aren't.  Really
> obvious things like redundant stores followed by no uses.

