This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Patch to global.c to use regclass' costs instead of N_REFS


Hi
This patch changes global alloc to use cost calculated by reg-class instead
of N_REFS as base for register preferrencing.

Why I am doing that?

I believe that it is stupid to maitain two cost caluclations that means actually
the same. Global alloc ought to depend the allocation on the gain get by
putting given pseudo to register and thats exactly what reg-class calculates
for it's preferrencing choices.

The values calculated by reg-class can be more sensible in various cases:

  1) it can take into account what insns supports memory operands and what not.
  In case memory operands are noticeably cheaper than loading of register, this
  can help (like on Pentiums fp units). I would like to add target macro to fine
  tune the memory operand costs in near future (in fact I've done it at my local
  tree).  This can be also used to represent that some operatios can be done only
  with aprameter in memory (like i387 integer operands).
  2) Reg-class takes into account extra cost caused by coliding requirements of
  various insns. Sometimes it is cheaper to keep register in memory than do
  reloading from one hard to another again and again. (When for example single
  pseudo is used by both fp and integer unit)
  3) the extra cost for loading parameter from memory (when pseudo is just copy
  of memory location) is calculated.
  4) When loading is cheaper/more expensive than storing, this fact is also taken
  into account.
  5) When different types have different costs (for example HImode spilling is
  less effective than SI or QI mode in i386) this is taken into account
  6) future fine tuning of reclass is possible. It is hard to do any tuning
  on N_REFS.  (for example multiple references into single pseudo in insn can be
  counted just once, because this is how the costs usually work).

How does the current cost diffes from original?

It is basically twice as big (because MOVE_COST is 2), it does not take into
account the mach_dup operands and loop costs are much higher (3^loop_depth
is used by regclass, while loop_depth is used by flow).
The last will probably need most experiments.

How does it affect performance/size?

In performance there are changes in both directions, as you would expect from
such kind of change.  I am certainly getting more improvements than loses.
Cca 6% plus on byte bechmarks at my home tree (all the patches here are 
actually re-make of it, because if was dirty).
The code is slightly bigger with -O2 mainly because of increased loop costs.
(20kb on cc1 binarry).
Code is noticeably smaller with -Os, because the memory operands are cheaper.

More experiments and tunning are certainly needed.

Honza

Mon Nov 22 18:53:25 MET 1999  Jan Hubicka  <hubicka@freesoft.cz>
	* flow.c (dump_flow_info): Dump register's priority too.
	* global.c (allocno structure): New field "priority".
	(global_alloc): Calculate priority field, do not allocate
	registers with negative priority.
	(allocno_compare): Use priority instead of n_refs.
	* regclass.c (reg_pref): New field "priority".
	(reg_priority): New function.
	(regclass): Set register's priority.
	* rtl.h (reg_priority): Declare.

Index: egcs/gcc/flow.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/flow.c,v
retrieving revision 1.189
diff -c -3 -p -r1.189 flow.c
*** flow.c	1999/11/18 06:45:55	1.189
--- flow.c	1999/11/22 17:38:24
*************** dump_flow_info (file)
*** 4967,4972 ****
--- 4967,4973 ----
  	      fprintf (file, "; pref %s, else %s",
  		       reg_class_names[(int) class],
  		       reg_class_names[(int) altclass]);
+ 	    fprintf (file, "; priority %d", reg_priority (i));
  	  }
  	if (REGNO_POINTER_FLAG (i))
  	  fprintf (file, "; pointer");
Index: egcs/gcc/global.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/global.c,v
retrieving revision 1.48
diff -c -3 -p -r1.48 global.c
*** global.c	1999/11/21 12:53:31	1.48
--- global.c	1999/11/22 17:38:25
*************** struct allocno
*** 90,95 ****
--- 90,98 ----
       pseudo reg.  */
    int size;
  
+   /* Priority for allocation.  */
+   int priority;
+ 
    /* Number of calls crossed by each allocno.  */
    int calls_crossed;
  
*************** global_alloc (file)
*** 451,456 ****
--- 454,460 ----
  	allocno[num].size = PSEUDO_REGNO_SIZE (i);
  	allocno[num].calls_crossed += REG_N_CALLS_CROSSED (i);
  	allocno[num].n_refs += REG_N_REFS (i);
+ 	allocno[num].priority += reg_priority (i);
  	if (allocno[num].live_length < REG_LIVE_LENGTH (i))
  	  allocno[num].live_length = REG_LIVE_LENGTH (i);
        }
*************** global_alloc (file)
*** 553,559 ****
        /* Try allocating them, one by one, in that order,
  	 except for parameters marked with reg_live_length[regno] == -2.  */
  
!       for (i = 0; i < (size_t) max_allocno; i++)
  	if (reg_renumber[allocno[allocno_order[i]].reg] < 0
  	    && REG_LIVE_LENGTH (allocno[allocno_order[i]].reg) >= 0)
  	  {
--- 557,564 ----
        /* Try allocating them, one by one, in that order,
  	 except for parameters marked with reg_live_length[regno] == -2.  */
  
!       for (i = 0; i < (size_t) max_allocno
! 		  && allocno[allocno_order[i]].priority > 0; i++)
  	if (reg_renumber[allocno[allocno_order[i]].reg] < 0
  	    && REG_LIVE_LENGTH (allocno[allocno_order[i]].reg) >= 0)
  	  {
*************** allocno_compare (v1p, v2p)
*** 608,621 ****
       the value of floor_log2 times the maximum number of
       times a register can occur in one insn (surely less than 100).
       Multiplying this by 10000 can't overflow.  */
!   register int pri1
!     = (((double) (floor_log2 (allocno[v1].n_refs) * allocno[v1].n_refs)
! 	/ allocno[v1].live_length)
!        * 10000 * allocno[v1].size);
!   register int pri2
!     = (((double) (floor_log2 (allocno[v2].n_refs) * allocno[v2].n_refs)
! 	/ allocno[v2].live_length)
!        * 10000 * allocno[v2].size);
    if (pri2 - pri1)
      return pri2 - pri1;
  
--- 613,630 ----
       the value of floor_log2 times the maximum number of
       times a register can occur in one insn (surely less than 100).
       Multiplying this by 10000 can't overflow.  */
!   register int pri1 = allocno[v1].priority;
!   register int pri2 = allocno[v2].priority;
! 
!   /* Registers with negative priority will not be allocated, so their
!      order doesn't matter much.  Just ensure that they are last.  */
!   if (pri1 >= 0 && pri2 >= 0)
!     {
!       pri1 = (((double) (floor_log2 (pri1) * pri1) / allocno[v1].live_length)
! 	      * 10000 * allocno[v1].size);
!       pri2 = (((double) (floor_log2 (pri2) * pri2) / allocno[v2].live_length)
! 	      * 10000 * allocno[v2].size);
!     }
    if (pri2 - pri1)
      return pri2 - pri1;
  
Index: egcs/gcc/regclass.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/regclass.c,v
retrieving revision 1.72
diff -c -3 -p -r1.72 regclass.c
*** regclass.c	1999/11/22 13:43:39	1.72
--- regclass.c	1999/11/22 17:38:26
*************** struct reg_pref
*** 675,680 ****
--- 675,684 ----
       but since it is recommended that there be a class corresponding to the
       union of most major pair of classes, that generality is not required.  */
    char altclass;
+ 
+   /* Cost we save by putting register into it's preferred class.
+      Negative values means, that it is better to allocate register on stack.  */
+   int priority;
  };
  
  /* Record the cost of each class for each pseudo.  */
*************** reg_alternate_class (regno)
*** 744,749 ****
--- 756,769 ----
    return (enum reg_class) reg_pref[regno].altclass;
  }
  
+ int
+ reg_priority (regno)
+      int regno;
+ {
+   if (reg_pref == 0)
+     abort();
+   return reg_pref[regno].priority;
+ }
  /* Initialize some global data for this pass.  */
  
  void
*************** regclass (f, nregs, dump)
*** 1138,1143 ****
--- 1801,1809 ----
  	  if (alt == best)
  	    alt = NO_REGS;
  
+ 	  /* Update priority only in first pass, when costs are caluclated correctly.  */
+ 	  reg_pref[i].priority = p->mem_cost - p->cost[(int) best];
+ 
  	  /* We cast to (int) because (char) hits bugs in some compilers.  */
  	  reg_pref[i].prefclass = (int) best;
  	  reg_pref[i].altclass = (int) alt;
Index: egcs/gcc/rtl.h
===================================================================
RCS file: /cvs/gcc/egcs/gcc/rtl.h,v
retrieving revision 1.157
diff -c -3 -p -r1.157 rtl.h
*** rtl.h	1999/11/21 12:33:17	1.157
--- rtl.h	1999/11/22 17:38:36
*************** extern char *decode_asm_operands	PROTO((
*** 1165,1170 ****
--- 1165,1171 ----
  
  extern enum reg_class reg_preferred_class PROTO((int));
  extern enum reg_class reg_alternate_class PROTO((int));
+ extern int reg_priority PROTO((int));
  
  extern rtx get_first_nonparm_insn	PROTO((void));
  


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]