[PATCH, i386] Enable x86_ext_80387_constants for m_K8, m_CORE2 and m_GENERIC64

Richard Guenther richard.guenther@gmail.com
Mon Nov 27 10:12:00 GMT 2006


On 11/27/06, Uros Bizjak <ubizjak@gmail.com> wrote:
> On 11/26/06, Roger Sayle <roger@eyesopen.com> wrote:
>
> > For CPU tuning patches, I wonder if you could also post timing on some
> > trivial microbenchmark, when possible, to confirm that the published cycle
> > counts are representative and not dominated by other timing issues (decode
> > paths, pipeline stalls, memory latency, etc...) so that we can confirm at
> > least some piece of code will be faster on some CPU in practice.
> > Compiler tuning is often more of an art than a science :-)
>
> The following code can be used to measure latencies of _integer_
> instructions (perhaps the approach will be of general interest;
> otherwise it is an example of xchg vs rolw timings), but the results
> for FP instructions are totally unpredictable (due to out-of-order
> execution of FP insns, I guess). So for FP, there is no other way
> than to benchmark several million loops of instructions.
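
The code referred to above is not quoted in this reply. As a rough
illustration of the "several million loops" approach, here is a minimal
sketch in portable C. It is not the code from the original message: the
names rotl16, swap_chain, ns_per_iter and BENCH_MAIN are hypothetical, and
the 16-bit rotate by 8 only stands in for the xchg/rolw byte swap. Whether
the compiler actually emits either instruction has to be checked in the
generated assembly.

```c
/* Hypothetical sketch, not the code from the original message. */
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Rotate a 16-bit value left by n bits. With n == 8 this swaps the two
   bytes, the same effect as xchg %al,%ah or rolw $8,%ax. */
static uint16_t rotl16(uint16_t x, unsigned n)
{
    n &= 15;
    return (uint16_t)((x << n) | (x >> ((16 - n) & 15)));
}

/* Run iters byte swaps where each iteration consumes the previous
   result, so the loop time is bounded by latency, not throughput. */
static uint16_t swap_chain(uint16_t seed, unsigned long iters)
{
    uint16_t x = seed;
    for (unsigned long i = 0; i < iters; i++)
        x = (uint16_t)(rotl16(x, 8) + 1);
    return x;
}

/* Time the chain with the POSIX monotonic clock and return the average
   nanoseconds per iteration. The final value is handed back so the
   compiler cannot discard the loop as dead code. */
static double ns_per_iter(unsigned long iters, uint16_t *result)
{
    struct timespec t0, t1;

    assert(iters > 0);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    *result = swap_chain(0x1234, iters);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)iters;
}

#ifdef BENCH_MAIN
/* Build with -DBENCH_MAIN to get a standalone program. */
int main(void)
{
    uint16_t r;
    double ns = ns_per_iter(100000000UL, &r);
    printf("%.3f ns/iteration (result %u)\n", ns, (unsigned)r);
    return 0;
}
#endif
```

The figure includes loop overhead and the extra add, so it is only useful
for comparing two variants built with the same options, not as an absolute
cycle count.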

More useful is a peek into the relevant docs, which contain this
information (like insn latency and throughput) ;)

> And povray was a bit faster with this patch ;)

This, of course, is a more useful statement - as would be measurements
with any real-life-resembling microbenchmark.

Richard.


