[PATCH, i386]: Generate xlat insn

Jakub Jelinek jakub@redhat.com
Tue Sep 25 10:50:00 GMT 2007


On Tue, Sep 25, 2007 at 08:05:43AM +0200, Uros Bizjak wrote:
> On 9/25/07, Chris Lattner <clattner@apple.com> wrote:
> 
> > > This small patch implements xlat insn. With attached patch,
> > > following testcase:
> 
> > Out of curiosity, why do this at all?  Your example above doesn't
> > make a compelling case for a size win here, and xlat seems like it
> > would always be slower than the equivalent load.  Is there a recent
> > x86 implementation where your first asm is faster than the second chunk?
> 
> Not that I know of. I was just experimenting with xlat, looking for
> optimization opportunities (performance or code size) using this insn.
> It looks like there is only a marginal 1-byte code size reduction
> (including push and pop of %ebx) with possible performance
> degradation.

Even the code size reduction is not present.
Compare:

00000000 <test>:
   0:   0f b6 44 24 04          movzbl 0x4(%esp),%eax
   5:   8a 80 00 00 00 00       mov    0x0(%eax),%al
                        7: R_386_32     table
   b:   c3                      ret    

0000000c <test2>:
   c:   0f b6 44 24 04          movzbl 0x4(%esp),%eax
  11:   8b 15 00 00 00 00       mov    0x0,%edx
                        13: R_386_32    table
  17:   8a 04 02                mov    (%edx,%eax,1),%al
  1a:   c3                      ret    

0000001b <test3>:
  1b:   53                      push   %ebx
  1c:   0f b6 44 24 04          movzbl 0x4(%esp),%eax
  21:   8b 1d 00 00 00 00       mov    0x0,%ebx
                        23: R_386_32    table
  27:   d7                      xlat   %ds:(%ebx)
  28:   5b                      pop    %ebx
  29:   c3                      ret    
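
For reference, xlat replaces %al with the byte at %ds:(%ebx + zero-extended
%al), so the load in test3 should compute the same value as the indexed movb
in test2. A minimal C sketch of that equivalence follows; the function names
are hypothetical and only model the two loads, they are not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Model of test2's load: mov (%edx,%eax,1),%al after the movzbl. */
uint8_t load_indexed(const uint8_t *edx, uint8_t index)
{
    uint32_t eax = index;            /* movzbl zero-extends the index */
    return edx[eax];
}

/* Model of xlat: AL = byte at DS:(EBX + zero-extended AL). */
uint8_t load_xlat(const uint8_t *ebx, uint8_t al)
{
    return *(ebx + (uint32_t)al);
}

/* Hypothetical self-check: both forms agree for every 8-bit index. */
void check_equivalence(const uint8_t table[256])
{
    for (unsigned i = 0; i < 256; i++)
        assert(load_xlat(table, (uint8_t)i) == load_indexed(table, (uint8_t)i));
}
```

Calling check_equivalence() on any 256-byte table exercises every index, so
the only open question between the two sequences is encoding size and speed.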

From what I can see, the first one (which GCC doesn't generate for -Os,
why?) is 3 bytes smaller than the one with xlat (12 vs. 15 bytes).  And the
3-byte movb in test2 vs. the 1-byte xlat + push + pop in test3 comes out
the same size.
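
The byte counts can be double-checked mechanically. Here is a rough sketch
that copies the encodings from the dump above into arrays (relocation
fields left as zeros, as objdump shows them) and compares their sizes; the
array and function names are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* test: movzbl; movb table(%eax),%al; ret */
const unsigned char test_code[] = {
    0x0f, 0xb6, 0x44, 0x24, 0x04,
    0x8a, 0x80, 0x00, 0x00, 0x00, 0x00,
    0xc3
};

/* test2: movzbl; movl table,%edx; movb (%edx,%eax,1),%al; ret */
const unsigned char test2_code[] = {
    0x0f, 0xb6, 0x44, 0x24, 0x04,
    0x8b, 0x15, 0x00, 0x00, 0x00, 0x00,
    0x8a, 0x04, 0x02,
    0xc3
};

/* test3: push %ebx; movzbl; movl table,%ebx; xlat; pop %ebx; ret */
const unsigned char test3_code[] = {
    0x53,
    0x0f, 0xb6, 0x44, 0x24, 0x04,
    0x8b, 0x1d, 0x00, 0x00, 0x00, 0x00,
    0xd7,
    0x5b,
    0xc3
};

/* Bytes saved by sequence a relative to b (positive means a is smaller). */
long bytes_saved(size_t a, size_t b)
{
    return (long)b - (long)a;
}

void check_sizes(void)
{
    assert(sizeof test_code == 12);
    assert(sizeof test2_code == 15);
    assert(sizeof test3_code == 15);
}
```

With these arrays, bytes_saved(sizeof test_code, sizeof test3_code) gives 3,
and test2 and test3 come out equal, which matches the offsets in the dump.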

	Jakub


