This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH, i386] Enable x86_ext_80387_constants for m_K8, m_CORE2 and m_GENERIC64
On 11/27/06, Uros Bizjak <email@example.com> wrote:
On 11/26/06, Roger Sayle <firstname.lastname@example.org> wrote:
> For CPU tuning patches, I wonder if you could also post timing on some
> trivial microbenchmark, when possible, to confirm that the published cycle
> counts are representative and not dominated by other timing issues (decode
> paths, pipeline stalls, memory latency, etc...) so that we can confirm at
> least some piece of code will be faster on some CPU in practice.
> Compiler tuning is often more of an art than a science :-)
The following code can be used to measure the latencies of _integer_
instructions (perhaps the approach will be of general interest;
otherwise it is an example of xchg vs. rolw timings), but the results
for FP instructions are totally unpredictable (due to out-of-order
execution of FP insns, I guess). So for FP, there is no other way
than to benchmark several million loops of instructions.
More useful is a peek into the relevant docs, which contain this
information (like insn latency and throughput) ;)
And povray was a bit faster with this patch ;)
This, of course, is a more useful statement, as would be measurements
with any microbenchmark that resembles real-life code.