patch for k6_cost

Jan Hubicka hubicka@atrey.karlin.mff.cuni.cz
Sat Apr 17 07:05:00 GMT 1999


Hi,
Here is another patch to tune -mk6. With it there are no slowdowns compared
to -mpentium on my benchmark suite.
It changes the k6_cost array to be less optimistic about the speed of the div
and mul instructions.  The K6 manual says that mul is cheap because it takes
only 2 cycles, but forgets to note that it takes another 2 cycles to decode,
and decoding is the problem.

I've also added a comment describing how I got those magic values.

Honza

Fri Apr 16 22:47:09 CEST 1999
	* i386.c (k6_cost): Take into account the decoding time.

*** i386.c.old	Fri Apr 16 20:48:36 1999
--- i386.c	Fri Apr 16 22:45:00 1999
*************** struct processor_costs pentiumpro_cost =
*** 100,113 ****
    17					/* cost of a divide/mod */
  };
  
  struct processor_costs k6_cost = {
    1,					/* cost of an add instruction */
    1,					/* cost of a lea instruction */
    1,					/* variable shift costs */
    1,					/* constant shift costs */
!   2,					/* cost of starting a multiply */
    0,					/* cost of multiply per each bit set */
!   18					/* cost of a divide/mod */
  };
  
  struct processor_costs *ix86_cost = &pentium_cost;
--- 100,118 ----
    17					/* cost of a divide/mod */
  };
  
+ /* We use decoding time together with execution time.
+    To get the correct value, add 1 for short decodable, 2 for long decodable
+    and 4 for vector decodable instruction to execution time and divide
+    by two (because CPU is able to do two insns at a time). */
+ 
  struct processor_costs k6_cost = {
    1,					/* cost of an add instruction */
    1,					/* cost of a lea instruction */
    1,					/* variable shift costs */
    1,					/* constant shift costs */
!   3,					/* cost of starting a multiply */
    0,					/* cost of multiply per each bit set */
!   20					/* cost of a divide/mod */
  };
  
  struct processor_costs *ix86_cost = &pentium_cost;

