fixed point math?

Jeff Sturm jsturm@one-point.com
Fri Jun 27 13:33:00 GMT 2003


On 26 Jun 2003, Adam Megacz wrote:
> Do you know where I can find a comparison of the time to multiply two
> 32-bit floats vs two 32-bit ints on x86?
>
> <digs around a bit>
>
> Hrm, perhaps it's this:
>
> ...exploits the fact that the P5 architecture allows for scheduling
> integer and floating operations in parallel...

That last sentence is key, and it applies to nearly all modern CPUs.

If you simply compare int/fp instruction timings, you assume a simplistic
processor model that doesn't account for deep pipelines or multiple
execution units.

Some time ago I ran some experiments on an Alpha ev56 with strictly
integer code (string hashcode computation IIRC).  After some loop
unrolling the code peaked at 2 insns/cycle.  Although the ev56 has four
execution units, two are fp-only, so I was using just half the processor.
After changing some loop variables to fp registers, the insn scheduler was
able to achieve quad-issue on certain cycles, in spite of the increased
latency (most integer operations need 1-2 cycles on ev56, fp needs at
least 4).

Besides that, FP instructions give you access to registers that aren't
generally available to integer code.

I don't know much about x86, but presumably you can find some timing info
in the .md files (e.g. pentium.md).  Also try compiling with -dS to
generate a dump describing what the scheduler is doing:

gcc -O2 -fschedule-insns -dS foo.c

Jeff
