patch for k6_cost
Jan Hubicka
hubicka@atrey.karlin.mff.cuni.cz
Sat Apr 17 07:05:00 GMT 1999
Hi
Here is another patch to tune -mk6. Now there are no slowdowns compared
to -mpentium on my benchmark suite.
It changes the k6_cost array to be less optimistic about the speed of the
div and mul instructions. The K6 manual says that mul is cheap because it
takes only 2 cycles, but it forgets to note that it takes another 2 cycles
to decode, and decoding is the problem.
I've also added a comment describing how I got those magic values.
Honza
Fri Apr 16 22:47:09 CEST 1999
* i386.c (k6_cost): Take into account the decoding time.
*** i386.c.old Fri Apr 16 20:48:36 1999
--- i386.c Fri Apr 16 22:45:00 1999
*************** struct processor_costs pentiumpro_cost =
*** 100,113 ****
17 /* cost of a divide/mod */
};
struct processor_costs k6_cost = {
1, /* cost of an add instruction */
1, /* cost of a lea instruction */
1, /* variable shift costs */
1, /* constant shift costs */
! 2, /* cost of starting a multiply */
0, /* cost of multiply per each bit set */
! 18 /* cost of a divide/mod */
};
struct processor_costs *ix86_cost = &pentium_cost;
--- 100,118 ----
17 /* cost of a divide/mod */
};
+ /* We use decoding time together with execution time.
+ To get correct value add 1 for short decodable, 2 for long decodable
+ and 4 for vector decodable instruction to execution time and divide
+ by two (because CPU is able to do two insns at a time). */
+
struct processor_costs k6_cost = {
1, /* cost of an add instruction */
1, /* cost of a lea instruction */
1, /* variable shift costs */
1, /* constant shift costs */
! 3, /* cost of starting a multiply */
0, /* cost of multiply per each bit set */
! 20 /* cost of a divide/mod */
};
struct processor_costs *ix86_cost = &pentium_cost;
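
The rule in the new comment can be written out as a small helper, which makes
the arithmetic behind the new multiply cost easier to check. This is only a
sketch: k6_insn_cost and the enum names are hypothetical and are not part of
i386.c. The message also does not say which decode class each instruction
belongs to, so treating mul as vector decodable and add as a 1-cycle short
decodable instruction is an assumption (it does reproduce the table entries
3 and 1).

```c
/* Decode penalties taken from the comment added by the patch.
   The enum and its names are hypothetical, for illustration only.  */
enum k6_decode
{
  K6_SHORT_DECODE = 1,   /* short decodable instruction */
  K6_LONG_DECODE = 2,    /* long decodable instruction */
  K6_VECTOR_DECODE = 4   /* vector decodable instruction */
};

/* Hypothetical helper: add the decode penalty to the execution time
   and divide by two, because the CPU can do two insns at a time.
   Integer division truncates, as the table entries are integers.  */
static int
k6_insn_cost (int exec_cycles, enum k6_decode decode)
{
  return (exec_cycles + (int) decode) / 2;
}
```

Under those assumptions, k6_insn_cost (2, K6_VECTOR_DECODE) gives (2 + 4) / 2,
the new "cost of starting a multiply", and k6_insn_cost (1, K6_SHORT_DECODE)
gives (1 + 1) / 2, the unchanged add cost.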