This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Patch to improve x86 FP -> integer conversions
- To: law at cygnus dot com
- Subject: Re: Patch to improve x86 FP -> integer conversions
- From: John Wehle <john at feith dot com>
- Date: Thu, 15 Apr 1999 12:50:45 -0400 (EDT)
- Cc: egcs-patches at egcs dot cygnus dot com
> >   Old                  uops    New                  uops
> >
> >   fnstcw -4(%ebp)      3       fnstcw -4(%ebp)      3
> >   movl -4(%ebp),%ecx   1       movb $12,-3(%ebp)    1
> >   movb $12,%ch         1       fnstcw -6(%ebp)      3
> >   movl %ecx,-12(%ebp)  1
> >   fldcw -12(%ebp)      3       fldcw -4(%ebp)       3
> >   ...                          ...
> >
> > The old sequence (which required the scratch register) was
> > 9 uops / 3 decode cycles / 2 memory writes / 2 memory reads.
> > The new sequence (which doesn't require the scratch register) is
> > 10 uops / 3 decode cycles / 3 memory writes / 1 memory read.
>
> In the old sequence we generated fewer uops. Of the instructions, only two
> of them were multi-uop insns (which is good if we are able to schedule the
> individual instructions in the sequence since it's more likely the single
> uop insns will be able to go to the 2nd & 3rd decoders).
Currently the insns are not scheduled. The sequence temporarily reprograms
the floating point control register in order to achieve the truncation,
so we can't allow other FP insns to mix with it, which makes the sequence
hard to schedule. Burning 1 extra uop seemed to me like a good trade
considering that there are only four QImode registers on the x86. We'll
make up that 1 uop if having that extra register prevents a spill.
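
To make the memory traffic of the new sequence concrete, here is a small C
model of it. This is only an illustrative sketch, not what the compiler
emits: the frame array, the SLOT_* names, and the helpers store16, load16
and new_sequence are all hypothetical. It assumes the usual x87 control
word layout, where the rounding control field is bits 10-11 and the byte
value 12 written by movb selects round toward zero (and also overwrites
the rest of the high byte).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: frame[0..5] stands for -6(%ebp) .. -1(%ebp). */
enum { SLOT_M6 = 0, SLOT_M4 = 2, SLOT_M3 = 3 };

/* Little-endian 16-bit store and load, as on the x86. */
static void store16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

static uint16_t load16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Models the four insns of the new sequence for a given control word.
   Returns the value fldcw would load; *saved receives the untouched
   copy that is available for restoring the control word afterwards. */
static uint16_t new_sequence(uint16_t cw, uint16_t *saved)
{
    uint8_t frame[6] = {0};

    assert(saved != NULL);

    store16(&frame[SLOT_M4], cw);    /* fnstcw -4(%ebp)    write 1 */
    frame[SLOT_M3] = 12;             /* movb $12,-3(%ebp)  write 2 */
    store16(&frame[SLOT_M6], cw);    /* fnstcw -6(%ebp)    write 3 */

    *saved = load16(&frame[SLOT_M6]);
    return load16(&frame[SLOT_M4]);  /* fldcw -4(%ebp)     read 1  */
}
```

Under those assumptions, passing the default control word 0x037f should
give back a word whose high byte is 0x0c, with the original value left in
the second slot. No register is touched in between, which is the point of
the change.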
> You've replaced 2 writes + 2 reads with 3 writes + 1 read. My experience
> has been that writes are generally more expensive than reads (writeback
> buffers have size limits, serialization issues, etc).
Two of the writes in the new sequence are to the same memory location.
Doesn't the x86 merge these into one write? The first read in the old
sequence is a large load following a small store, and the second read in
the old sequence is a small load following a large store. I believe that
both cause stalls on PPro / PII processors.
-- John
-------------------------------------------------------------------------
| Feith Systems | Voice: 1-215-646-8000 | Email: john@feith.com |
| John Wehle | Fax: 1-215-540-5495 | |
-------------------------------------------------------------------------