This is the mail archive of the egcs@egcs.cygnus.com mailing list for the EGCS project. See the EGCS home page for more information.
>On Fri, Mar 05, 1999 at 12:23:46PM -0500, Alfred Perlstein wrote:
>> > In any case, "movzb? %al,%?ax" and "and? $255,%?ax" both take 1 tick.
>> > So this is a kind of mystery with these instructions.
>>
>> I think the magic lies in that with register renaming, instruction
>> caches and all the 'behind the scenes' optimizations PPro and later
>> versions of x86 chips can do. It really should be investigated more.
>
>It has nothing to do with register renaming.
>
>It is most likely to be related to instruction alignment -- some
>important insn in the loop is straddling a 16-byte boundary, which
>requires an extra cycle to decode.
>
>I've seen such create up to a 20% difference in runtime on a small loop.
>

It has nothing to do with the para (16-byte) boundary. In the movz case the
xorb insn crosses a para boundary, while with andl no insn crosses a para
boundary.

Sincerely Yours, Eugene.

P.S. For H.J.Lu -- I do not state that things go slower with movz. The
slowdowns I got were about 1% (this can be statistical error). Nevertheless,
there is no speed-up in most cases either (and nothing like the huge speed-up
seen with decompression). We should try to find out more about why and how
this happens. BTW I have a PPro 180MHz.
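
For reference, the boundary test discussed above can be sketched in C. This is only an illustration: the helper name crosses_para is hypothetical, and the offsets in the comments are made up, not taken from the actual loop. It treats an insn as crossing a para boundary when its first and last bytes fall in different 16-byte blocks.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the original thread): returns 1 if an
 * instruction of `len` bytes starting at address `addr` spans two
 * different 16-byte blocks, 0 if it fits entirely inside one block.
 */
static int crosses_para(uintptr_t addr, unsigned len)
{
    assert(len > 0);                    /* an insn is at least one byte */
    uintptr_t first = addr >> 4;        /* block holding the first byte */
    uintptr_t last = (addr + len - 1) >> 4; /* block holding the last byte */
    return first != last;
}

/*
 * Example with made-up offsets:
 *   crosses_para(0x1e, 3) is 1  (bytes 0x1e..0x20 span two blocks)
 *   crosses_para(0x10, 3) is 0  (bytes 0x10..0x12 stay in one block)
 */
```

Feeding it the addresses and lengths from a disassembly listing of the loop would show which insns straddle a boundary in each variant.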