This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[patch] cse.c: Speed up cse_reg_info maintainance - Part 2


Hi,

Attached is the second patch to speed up cse_reg_info maintenance.

A collection of cse_reg_info instances forms a mapping from a register
number to a quantity number among other things.  Since this mapping is
from an integer to a struct, you might guess that it is implemented as
an array indexed by a register number, but that's not the case.
Instead, CSE has a fixed-sized hash table which maps a register number
to a linked list of cse_reg_info entries with the same hash key.  If
you are looking for information associated with a register, you have
to go through one of these collision chains.

This patch simplifies all this by implementing the collection as a
single big array of cse_reg_info called cse_reg_info_table indexed by
a register number.  For example, if you are looking for a quantity
number associated with (reg:xx 37), you would say

  cse_reg_info_table[37].reg_qty

Very simple.  Unfortunately the story doesn't end here.  Here are
problems that we have to address:

1. We don't know the highest register number at compile time of GCC.

2. We forget everything at the beginning of a basic block, including
   the mapping from register numbers to quantity numbers, etc.  In
   other words, invalidation of this new table, cse_reg_info_table,
   must be fast.

To address the first issue, I keep doubling the size of the table
until it is big enough to store NREGS registers, where NREGS is
obtained from one of the arguments to cse_main.

To address the second issue, I am using a timestamp.  I keep a global
variable cse_reg_info_timestamp.  Also, I add a new member timestamp
to cse_reg_info.  When a cse_reg_info entry is allocated, its
timestamp entry is initialized to cse_reg_info_timestamp - 1,
indicating that the entry has not been initialized.  When
get_cse_reg_info visits this entry for the first time, we set its
timestamp to cse_reg_info_timestamp and initialize its other members.
When we start scanning a new basic block, we increment
cse_reg_info_timestamp, making it different from the timestamp values
in all existing cse_reg_info entries.  That is, the timestamp field of
a cse_reg_info entry matches the value of this variable if and only if
the entry has been initialized during the current run of
cse_basic_block.

Here are some remarks.

The new implmentation has better spatial locality because we use a
single big array of cse_reg_info instead of a bunch of entries
allocated one by one with malloc.

It's probably reasonable to assume that the set of registers numbers
is compact.  I expect most dead code to be eliminated at tree level.
Also, most constants should be propagated at tree level, so we
shouldn't have many pseudo registers that are created but gone due to
DCE, constant propagation, etc.  This property should even improve the
locality.

Here is a timing in seconds for five runs of ./cc1 -quiet -O2 -o
/dev/null.

             original patched   diff%
c-common.i     18.544  18.439 -0.566%
combine.i      17.541  17.492 -0.279%
fold-const.i   38.604  38.437 -0.432%
reload.i       13.601  13.593 -0.058%
reload1.i      14.425  14.407 -0.124%
cc1-i files   214.556 213.992 -0.262%

cc1-i files were compiled only once.

Tested on i686-pc-linux-gnu.  I confirmed that the patched GCC
produces identical code as the unpatched version while compiling cc1-i
files.  OK to apply?

Kazu Hirata

2005-01-30  Kazu Hirata  <kazu@cs.umass.edu>

	* cse.c (cse_reg_info): Remove hash_next, next, regno.  Add
	timestamp.
	(cse_reg_info_list, cse_reg_info_list_free, REGHASH_SHIFT,
	REGHASH_SIZE, REGHASH_MASK, reg_hash, REGHASH_FN,
	cached_cse_reg_info, GET_CSE_REG_INFO): Remove.
	(cached_regno): Initialize to INVALID_REGNUM.
	(REG_TICK, REG_IN_TABLE, SUBREG_TICKED, REG_QTY): Use
	get_cse_reg_info.
	(init_cse_reg_info, get_cse_reg_info_1): New.
	(get_cse_reg_info): Cache the last look-up.
	(new_basic_block): Update the code to clear mappings from
	registers to cse_reg_info entries.
	(cse_main): Call init_cse_reg_info.

Index: cse.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cse.c,v
retrieving revision 1.336
diff -c -d -p -r1.336 cse.c
*** cse.c	29 Jan 2005 12:08:04 -0000	1.336
--- cse.c	30 Jan 2005 06:41:29 -0000
*************** static struct reg_eqv_elem *reg_eqv_tabl
*** 302,315 ****
  
  struct cse_reg_info
  {
!   /* Next in hash chain.  */
!   struct cse_reg_info *hash_next;
! 
!   /* The next cse_reg_info structure in the free or used list.  */
!   struct cse_reg_info *next;
! 
!   /* Search key */
!   unsigned int regno;
  
    /* The quantity number of the register's current contents.  */
    int reg_qty;
--- 302,309 ----
  
  struct cse_reg_info
  {
!   /* The timestamp at which this register is initialized.  */
!   unsigned int timestamp;
  
    /* The quantity number of the register's current contents.  */
    int reg_qty;
*************** struct cse_reg_info
*** 329,367 ****
    unsigned int subreg_ticked;
  };
  
! /* We maintain a linked list of cse_reg_info instances, which is
!    partitioned into two pieces.  The first part, pointed to by
!    cse_reg_info_list, is a list of those entries that are in use.  The
!    second part, pointed to by cse_reg_info_list_free, is a list of
!    those entries that are not in use.
! 
!    We combine these two parts into one linked list for efficiency.
!    Specifically, when we take an element from the second part and want
!    to move it to the first part, all we have to do is move the pointer
!    cse_reg_info_list_free to the next element.  Also, if we wish to
!    move all elements into the second part, we just have to move the
!    pointer to the first element of the list.  */
! 
! /* A linked list of cse_reg_info entries that have been allocated so
!    far.  */
! static struct cse_reg_info *cse_reg_info_list;
! 
! /* A pointer to the first unused entry in the above linked list.  */
! static struct cse_reg_info *cse_reg_info_list_free;
  
! /* A mapping from registers to cse_reg_info data structures.  */
! #define REGHASH_SHIFT	7
! #define REGHASH_SIZE	(1 << REGHASH_SHIFT)
! #define REGHASH_MASK	(REGHASH_SIZE - 1)
! static struct cse_reg_info *reg_hash[REGHASH_SIZE];
  
! #define REGHASH_FN(REGNO)	\
! 	(((REGNO) ^ ((REGNO) >> REGHASH_SHIFT)) & REGHASH_MASK)
  
! /* The last lookup we did into the cse_reg_info_tree.  This allows us
     to cache repeated lookups.  */
! static unsigned int cached_regno;
! static struct cse_reg_info *cached_cse_reg_info;
  
  /* A HARD_REG_SET containing all the hard registers for which there is
     currently a REG expression in the hash table.  Note the difference
--- 323,345 ----
    unsigned int subreg_ticked;
  };
  
! /* A table of cse_reg_info indexed by register numbers.  */
! struct cse_reg_info *cse_reg_info_table;
  
! /* The size of the above table.  */
! static unsigned int cse_reg_info_table_size;
  
! /* The timestamp at the beginning of the current run of
!    cse_basic_block.  We increment this variable at at the beginning of
!    the current run of cse_basic_block.  The timestamp field of a
!    cse_reg_info entry matches the value of this variable if and only
!    if the entry has been initialized during the current run of
!    cse_basic_block.  */
! static unsigned int cse_reg_info_timestamp;
  
! /* The last lookup we did into the cse_reg_info_table.  This allows us
     to cache repeated lookups.  */
! static unsigned int cached_regno = INVALID_REGNUM;
  
  /* A HARD_REG_SET containing all the hard registers for which there is
     currently a REG expression in the hash table.  Note the difference
*************** struct table_elt
*** 523,551 ****
  #define COST(X) (REG_P (X) ? 0 : notreg_cost (X, SET))
  #define COST_IN(X,OUTER) (REG_P (X) ? 0 : notreg_cost (X, OUTER))
  
- /* Get the info associated with register N.  */
- 
- #define GET_CSE_REG_INFO(N)			\
-   (((N) == cached_regno && cached_cse_reg_info)	\
-    ? cached_cse_reg_info : get_cse_reg_info ((N)))
- 
  /* Get the number of times this register has been updated in this
     basic block.  */
  
! #define REG_TICK(N) ((GET_CSE_REG_INFO (N))->reg_tick)
  
  /* Get the point at which REG was recorded in the table.  */
  
! #define REG_IN_TABLE(N) ((GET_CSE_REG_INFO (N))->reg_in_table)
  
  /* Get the SUBREG set at the last increment to REG_TICK (-1 if not a
     SUBREG).  */
  
! #define SUBREG_TICKED(N) ((GET_CSE_REG_INFO (N))->subreg_ticked)
  
  /* Get the quantity number for REG.  */
  
! #define REG_QTY(N) ((GET_CSE_REG_INFO (N))->reg_qty)
  
  /* Determine if the quantity number for register X represents a valid index
     into the qty_table.  */
--- 501,523 ----
  #define COST(X) (REG_P (X) ? 0 : notreg_cost (X, SET))
  #define COST_IN(X,OUTER) (REG_P (X) ? 0 : notreg_cost (X, OUTER))
  
  /* Get the number of times this register has been updated in this
     basic block.  */
  
! #define REG_TICK(N) (get_cse_reg_info (N)->reg_tick)
  
  /* Get the point at which REG was recorded in the table.  */
  
! #define REG_IN_TABLE(N) (get_cse_reg_info (N)->reg_in_table)
  
  /* Get the SUBREG set at the last increment to REG_TICK (-1 if not a
     SUBREG).  */
  
! #define SUBREG_TICKED(N) (get_cse_reg_info (N)->subreg_ticked)
  
  /* Get the quantity number for REG.  */
  
! #define REG_QTY(N) (get_cse_reg_info (N)->reg_qty)
  
  /* Determine if the quantity number for register X represents a valid index
     into the qty_table.  */
*************** static rtx cse_basic_block (rtx, rtx, st
*** 647,653 ****
  static void count_reg_usage (rtx, int *, int);
  static int check_for_label_ref (rtx *, void *);
  extern void dump_class (struct table_elt*);
! static struct cse_reg_info * get_cse_reg_info (unsigned int);
  static int check_dependence (rtx *, void *);
  
  static void flush_hash_table (void);
--- 619,626 ----
  static void count_reg_usage (rtx, int *, int);
  static int check_for_label_ref (rtx *, void *);
  extern void dump_class (struct table_elt*);
! static void get_cse_reg_info_1 (unsigned int regno);
! static struct cse_reg_info * get_cse_reg_info (unsigned int regno);
  static int check_dependence (rtx *, void *);
  
  static void flush_hash_table (void);
*************** notreg_cost (rtx x, enum rtx_code outer)
*** 862,910 ****
  }
  
  
! static struct cse_reg_info *
! get_cse_reg_info (unsigned int regno)
  {
!   struct cse_reg_info **hash_head = &reg_hash[REGHASH_FN (regno)];
!   struct cse_reg_info *p;
  
!   for (p = *hash_head; p != NULL; p = p->hash_next)
!     if (p->regno == regno)
!       break;
  
!   if (p == NULL)
!     {
!       /* Get a new cse_reg_info structure.  */
!       if (cse_reg_info_list_free)
! 	{
! 	  p = cse_reg_info_list_free;
! 	  cse_reg_info_list_free = p->next;
! 	}
!       else
! 	{
! 	  p = xmalloc (sizeof (struct cse_reg_info));
! 	  p->next = cse_reg_info_list;
! 	  cse_reg_info_list = p;
! 	}
  
!       /* Insert into hash table.  */
!       p->hash_next = *hash_head;
!       *hash_head = p;
  
!       /* Initialize it.  */
!       p->reg_tick = 1;
!       p->reg_in_table = -1;
!       p->subreg_ticked = -1;
!       p->reg_qty = -regno - 1;
!       p->regno = regno;
      }
  
!   /* Cache this lookup; we tend to be looking up information about the
!      same register several times in a row.  */
    cached_regno = regno;
!   cached_cse_reg_info = p;
  
!   return p;
  }
  
  /* Clear the hash table and initialize each register with its own quantity,
--- 835,908 ----
  }
  
  
! /* Initialize CSE_REG_INFO_TABLE.  */
! 
! static void
! init_cse_reg_info (unsigned int nregs)
  {
!   /* Do we need to grow the table?  */
!   if (nregs > cse_reg_info_table_size)
!     {
!       unsigned int old_size = cse_reg_info_table_size;
!       unsigned int new_size = old_size ? old_size : 64;
!       unsigned int old_timestamp = cse_reg_info_timestamp - 1;
!       unsigned int i;
  
!       /* Compute a new size that is a power of 2 and no smaller than
! 	 NREGS.  */
!       while (new_size < nregs)
! 	new_size *= 2;
  
!       /* Reallocate the table with NEW_SIZE entries.  */
!       cse_reg_info_table = xrealloc (cse_reg_info_table,
! 				     (sizeof (struct cse_reg_info)
! 				      * new_size));
!       cse_reg_info_table_size = new_size;
  
!       /* Put the old timestamp on all newly allocated entries so that
! 	 they will all be considered out of date.  */
!       for (i = old_size; i < new_size; i++)
! 	cse_reg_info_table[i].timestamp = old_timestamp;
!     }
! }
  
! /* Given REGNO, ensure that a cse_reg_info entry exists for REGNO by
!    growing the cse_reg_info_table and/or initializing the entry for
!    REGNO.  */
! 
! static void
! get_cse_reg_info_1 (unsigned int regno)
! {
!   /* Is the entry invalid?  */
!   if (cse_reg_info_table[regno].timestamp != cse_reg_info_timestamp)
!     {
!       /* Set TIMESTAMP field to CSE_REG_INFO_TIMESTAMP so that this
! 	 entry will be considered to have been initialized.  */
!       cse_reg_info_table[regno].timestamp = cse_reg_info_timestamp;
! 
!       /* Initialize the rest of the entry.  */
!       cse_reg_info_table[regno].reg_tick = 1;
!       cse_reg_info_table[regno].reg_in_table = -1;
!       cse_reg_info_table[regno].subreg_ticked = -1;
!       cse_reg_info_table[regno].reg_qty = -regno - 1;
      }
  
!   /* Record that the last look-up is REGNO.  */
    cached_regno = regno;
! }
  
! /* Find a cse_reg_info entry for REGNO.  */
! 
! static inline struct cse_reg_info *
! get_cse_reg_info (unsigned int regno)
! {
!   /* If we are looking for REGNO that is different from the last
!      look-up, make sure the entry for REGNO exists and has been
!      initialized.  */
!   if (regno != cached_regno)
!     get_cse_reg_info_1 (regno);
! 
!   return &cse_reg_info_table[regno];
  }
  
  /* Clear the hash table and initialize each register with its own quantity,
*************** new_basic_block (void)
*** 917,930 ****
  
    next_qty = 0;
  
!   /* Clear out hash table state for this pass.  */
! 
!   memset (reg_hash, 0, sizeof reg_hash);
! 
!   cse_reg_info_list_free = cse_reg_info_list;
! 
!   cached_cse_reg_info = 0;
  
    CLEAR_HARD_REG_SET (hard_regs_in_table);
  
    /* The per-quantity values used to be initialized here, but it is
--- 915,925 ----
  
    next_qty = 0;
  
!   /* Invalidate cse_reg_info_table and its cache.  */
!   cse_reg_info_timestamp++;
!   cached_regno = INVALID_REGNUM;
  
+   /* Clear out hash table state for this pass.  */
    CLEAR_HARD_REG_SET (hard_regs_in_table);
  
    /* The per-quantity values used to be initialized here, but it is
*************** cse_main (rtx f, int nregs, FILE *file)
*** 6698,6703 ****
--- 6693,6700 ----
    rtx insn = f;
    int i;
  
+   init_cse_reg_info (nregs);
+ 
    val.path = xmalloc (sizeof (struct branch_path)
  		      * PARAM_VALUE (PARAM_MAX_CSE_PATH_LENGTH));
  


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]