This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][RFC] Speed up invalid_mode_change_p


On Tue, 7 Dec 2010, Vladimir Makarov wrote:

> On 12/07/2010 10:24 AM, Richard Guenther wrote:
> > When looking at PR18687 (again) I noticed that invalid_mode_change_p is
> > pretty high in the profiles, so this patch tries to rework what it does
> > based on the few existing callers.
> > 
> > This patch does two things, first, it changes the interface of
> > invalid_mode_change_p (only used by ira-cost.c) to not pass in
> > a source mode but assume the mode of the pseudo.  Second, it
> > caches and computes the outcome of invalid_mode_change_p at
> > record_subregs_of_mode time, in an array indexed by register class.
> > 
> > As there are usually less register classes than machine modes
> > this should save memory for the flags.  It also avoids the
> > loop over machine modes in invalid_mode_change_p, making the
> > predicate cheaper in favor of doing the equivalent of
> > N_REG_CLASSES checks for each subreg we encounter (we can
> > optimize this with some extra memory cost at record_subregs_of_mode
> > time to avoid the loop for mode pairs we've already seen).
> > 
> > The patch also caches the hashtable lookup (something which can
> > also be done independently).
> > 
> > A even more reworked interface would for example have a
> > sbitmap invalid_mode_change_classes (regno) interface
> > returning a bitmap indexed by reg_class, so the users can
> > use TEST_BIT on it directly.
> > 
> > The patch is a hack, just to make the following measurements
> > possible, I will rework it to be nice but expect some direction
> > from your IRA maintainer folks (maybe you've already done a
> > similar thing).
> > 
> I like the idea, Richard.  So if you want to do it and have time, I really
> appreciate this.
> 
> ira-improv branch has a code which decreases # of calls of
> invalid_mode_change.  So the effect of reworking of the function will be
> smaller when/if the branch is merged into the trunk.  Still implementation of
> your idea will give considerable compilation time improvement (at least for
> this test case) and is really worth to do.

Ok.  I have split the patch into two pieces, one obvious changing the
interface of invalid_mode_change_p (with no performance impact) and
one with reginfo.c local changes only.

Micha suggested to use a single bitmap for all information, this
exploits the sparseness of the information and with the current queries
which are iterate over all regnos should have O(1) bitmap operations.

I use a temporary bitmap to cache whether we have seen a mode change
of the same kind already to avoid the compute loop.  
Even though CANNOT_CHANGE_MODE_CLASS itself should be cheap, caching 
the case of a positive outcome with another bitmap test seems to be
worthwhile for the testcase.

Re-benchmarking the patch on the testcase shows a 2% improvement
and we also see slightly less number of minor pagefaults.

I'll bootstrap and regtest this now.

Do you have any suggestions for further performance testing?  Are
there any targets where we'd have a larger number of register
classes than modes?  For optimized compiles I can't measure an
off-noise effect of the patches.

Ok if the bootstrap + regtest passes?

Thanks,
Richard.

2010-12-08  Richard Guenther  <rguenther@suse.de>

	* rtl.h (invalid_mode_change_p): Adjust prototype.
	* reginfo.c (invalid_mode_change_p): Remove from argument.
	* ira-costs.c (print_allocno_costs): Adjust callers.
	(find_costs_and_classes): Likewise.

Index: trunk/gcc/ira-costs.c
===================================================================
*** trunk.orig/gcc/ira-costs.c	2010-12-08 13:30:37.000000000 +0100
--- trunk/gcc/ira-costs.c	2010-12-08 13:32:17.000000000 +0100
*************** print_allocno_costs (FILE *f)
*** 1093,1100 ****
  	      && (! in_inc_dec[i] || ! forbidden_inc_dec_class[rclass])
  #endif
  #ifdef CANNOT_CHANGE_MODE_CLASS
! 	      && ! invalid_mode_change_p (regno, (enum reg_class) rclass,
! 					  PSEUDO_REGNO_MODE (regno))
  #endif
  	      )
  	    {
--- 1093,1099 ----
  	      && (! in_inc_dec[i] || ! forbidden_inc_dec_class[rclass])
  #endif
  #ifdef CANNOT_CHANGE_MODE_CLASS
! 	      && ! invalid_mode_change_p (regno, (enum reg_class) rclass)
  #endif
  	      )
  	    {
*************** print_pseudo_costs (FILE *f)
*** 1131,1138 ****
  	      && (! in_inc_dec[regno] || ! forbidden_inc_dec_class[rclass])
  #endif
  #ifdef CANNOT_CHANGE_MODE_CLASS
! 	      && ! invalid_mode_change_p (regno, (enum reg_class) rclass,
! 					  PSEUDO_REGNO_MODE (regno))
  #endif
  	      )
  	    fprintf (f, " %s:%d", reg_class_names[rclass],
--- 1130,1136 ----
  	      && (! in_inc_dec[regno] || ! forbidden_inc_dec_class[rclass])
  #endif
  #ifdef CANNOT_CHANGE_MODE_CLASS
! 	      && ! invalid_mode_change_p (regno, (enum reg_class) rclass)
  #endif
  	      )
  	    fprintf (f, " %s:%d", reg_class_names[rclass],
*************** find_costs_and_classes (FILE *dump_file)
*** 1336,1343 ****
  		  || (inc_dec_p && forbidden_inc_dec_class[rclass])
  #endif
  #ifdef CANNOT_CHANGE_MODE_CLASS
! 		  || invalid_mode_change_p (i, (enum reg_class) rclass,
! 					    PSEUDO_REGNO_MODE (i))
  #endif
  		  )
  		continue;
--- 1334,1340 ----
  		  || (inc_dec_p && forbidden_inc_dec_class[rclass])
  #endif
  #ifdef CANNOT_CHANGE_MODE_CLASS
! 		  || invalid_mode_change_p (i, (enum reg_class) rclass)
  #endif
  		  )
  		continue;
*************** find_costs_and_classes (FILE *dump_file)
*** 1412,1419 ****
  			  || (inc_dec_p && forbidden_inc_dec_class[rclass])
  #endif
  #ifdef CANNOT_CHANGE_MODE_CLASS
! 			  || invalid_mode_change_p (i, (enum reg_class) rclass,
! 						    PSEUDO_REGNO_MODE (i))
  #endif
  			  )
  			;
--- 1409,1415 ----
  			  || (inc_dec_p && forbidden_inc_dec_class[rclass])
  #endif
  #ifdef CANNOT_CHANGE_MODE_CLASS
! 			  || invalid_mode_change_p (i, (enum reg_class) rclass)
  #endif
  			  )
  			;
Index: trunk/gcc/reginfo.c
===================================================================
*** trunk.orig/gcc/reginfo.c	2010-12-08 13:30:37.000000000 +0100
--- trunk/gcc/reginfo.c	2010-12-08 13:31:57.000000000 +0100
*************** init_subregs_of_mode (void)
*** 1338,1349 ****
     mode.  */
  bool
  invalid_mode_change_p (unsigned int regno,
! 		       enum reg_class rclass ATTRIBUTE_UNUSED,
! 		       enum machine_mode from)
  {
    struct subregs_of_mode_node dummy, *node;
    unsigned int to;
    unsigned char mask;
  
    gcc_assert (subregs_of_mode);
    dummy.block = regno & -8;
--- 1338,1349 ----
     mode.  */
  bool
  invalid_mode_change_p (unsigned int regno,
! 		       enum reg_class rclass ATTRIBUTE_UNUSED)
  {
    struct subregs_of_mode_node dummy, *node;
    unsigned int to;
    unsigned char mask;
+   enum machine_mode from = PSEUDO_REGNO_MODE (regno);
  
    gcc_assert (subregs_of_mode);
    dummy.block = regno & -8;
Index: trunk/gcc/rtl.h
===================================================================
*** trunk.orig/gcc/rtl.h	2010-11-24 13:21:33.000000000 +0100
--- trunk/gcc/rtl.h	2010-12-08 13:32:39.000000000 +0100
*************** extern void init_reg_sets (void);
*** 2455,2462 ****
  extern void regclass (rtx, int);
  extern void reg_scan (rtx, unsigned int);
  extern void fix_register (const char *, int, int);
! extern bool invalid_mode_change_p (unsigned int, enum reg_class,
! 				   enum machine_mode);
  
  /* In reorg.c */
  extern void dbr_schedule (rtx);
--- 2455,2461 ----
  extern void regclass (rtx, int);
  extern void reg_scan (rtx, unsigned int);
  extern void fix_register (const char *, int, int);
! extern bool invalid_mode_change_p (unsigned int, enum reg_class);
  
  /* In reorg.c */
  extern void dbr_schedule (rtx);


2010-12-08  Richard Guenther  <rguenther@suse.de>

	* reginfo.c (struct subregs_of_mode_node): Remove.
	(subregs_of_mode): Likewise.
	(som_hash): Likewise.
	(som_eq): Likewise.
	(invalid_mode_changes): New bitmap.
	(record_subregs_of_mode): Get subregs_of_mode argument.
	Fill in invalid_mode_changes bitmap.
	(find_subregs_of_mode): Get subregs_of_mode argument and pass
	it through.
	(init_subregs_of_mode): Adjust.
	(finish_subregs_of_mode): Likewise.
	(invalid_mode_change_p): Query invalid_mode_changes bitmap.

Index: trunk/gcc/reginfo.c
===================================================================
*** trunk.orig/gcc/reginfo.c	2010-12-08 13:31:57.000000000 +0100
--- trunk/gcc/reginfo.c	2010-12-08 14:36:41.000000000 +0100
*************** reg_classes_intersect_p (reg_class_t c1,
*** 1235,1273 ****
  
  #ifdef CANNOT_CHANGE_MODE_CLASS
  
! struct subregs_of_mode_node
! {
!   unsigned int block;
!   unsigned char modes[MAX_MACHINE_MODE];
! };
! 
! static htab_t subregs_of_mode;
! 
! static hashval_t
! som_hash (const void *x)
! {
!   const struct subregs_of_mode_node *const a =
!     (const struct subregs_of_mode_node *) x;
!   return a->block;
! }
! 
! static int
! som_eq (const void *x, const void *y)
! {
!   const struct subregs_of_mode_node *const a =
!     (const struct subregs_of_mode_node *) x;
!   const struct subregs_of_mode_node *const b =
!     (const struct subregs_of_mode_node *) y;
!   return a->block == b->block;
! }
  
  static void
! record_subregs_of_mode (rtx subreg)
  {
-   struct subregs_of_mode_node dummy, *node;
    enum machine_mode mode;
    unsigned int regno;
-   void **slot;
  
    if (!REG_P (SUBREG_REG (subreg)))
      return;
--- 1235,1247 ----
  
  #ifdef CANNOT_CHANGE_MODE_CLASS
  
! static bitmap invalid_mode_changes;
  
  static void
! record_subregs_of_mode (rtx subreg, bitmap subregs_of_mode)
  {
    enum machine_mode mode;
    unsigned int regno;
  
    if (!REG_P (SUBREG_REG (subreg)))
      return;
*************** record_subregs_of_mode (rtx subreg)
*** 1278,1318 ****
    if (regno < FIRST_PSEUDO_REGISTER)
      return;
  
!   dummy.block = regno & -8;
!   slot = htab_find_slot_with_hash (subregs_of_mode, &dummy,
! 				   dummy.block, INSERT);
!   node = (struct subregs_of_mode_node *) *slot;
!   if (node == NULL)
      {
!       node = XCNEW (struct subregs_of_mode_node);
!       node->block = regno & -8;
!       *slot = node;
      }
- 
-   node->modes[mode] |= 1 << (regno & 7);
  }
  
  /* Call record_subregs_of_mode for all the subregs in X.  */
  static void
! find_subregs_of_mode (rtx x)
  {
    enum rtx_code code = GET_CODE (x);
    const char * const fmt = GET_RTX_FORMAT (code);
    int i;
  
    if (code == SUBREG)
!     record_subregs_of_mode (x);
  
    /* Time for some deep diving.  */
    for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
      {
        if (fmt[i] == 'e')
! 	find_subregs_of_mode (XEXP (x, i));
        else if (fmt[i] == 'E')
  	{
  	  int j;
  	  for (j = XVECLEN (x, i) - 1; j >= 0; j--)
! 	    find_subregs_of_mode (XVECEXP (x, i, j));
  	}
      }
  }
--- 1252,1292 ----
    if (regno < FIRST_PSEUDO_REGISTER)
      return;
  
!   if (bitmap_set_bit (subregs_of_mode,
! 		      regno * NUM_MACHINE_MODES + (unsigned int) mode))
      {
!       unsigned int rclass;
!       for (rclass = 0; rclass < N_REG_CLASSES; rclass++)
! 	if (!bitmap_bit_p (invalid_mode_changes,
! 			   regno * N_REG_CLASSES + rclass)
! 	    && CANNOT_CHANGE_MODE_CLASS (PSEUDO_REGNO_MODE (regno),
! 					 mode, (enum reg_class) rclass))
! 	  bitmap_set_bit (invalid_mode_changes,
! 			  regno * N_REG_CLASSES + rclass);
      }
  }
  
  /* Call record_subregs_of_mode for all the subregs in X.  */
  static void
! find_subregs_of_mode (rtx x, bitmap subregs_of_mode)
  {
    enum rtx_code code = GET_CODE (x);
    const char * const fmt = GET_RTX_FORMAT (code);
    int i;
  
    if (code == SUBREG)
!     record_subregs_of_mode (x, subregs_of_mode);
  
    /* Time for some deep diving.  */
    for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
      {
        if (fmt[i] == 'e')
! 	find_subregs_of_mode (XEXP (x, i), subregs_of_mode);
        else if (fmt[i] == 'E')
  	{
  	  int j;
  	  for (j = XVECLEN (x, i) - 1; j >= 0; j--)
! 	    find_subregs_of_mode (XVECEXP (x, i, j), subregs_of_mode);
  	}
      }
  }
*************** init_subregs_of_mode (void)
*** 1322,1371 ****
  {
    basic_block bb;
    rtx insn;
  
!   if (subregs_of_mode)
!     htab_empty (subregs_of_mode);
!   else
!     subregs_of_mode = htab_create (100, som_hash, som_eq, free);
  
    FOR_EACH_BB (bb)
      FOR_BB_INSNS (bb, insn)
!     if (NONDEBUG_INSN_P (insn))
!       find_subregs_of_mode (PATTERN (insn));
  }
  
  /* Return 1 if REGNO has had an invalid mode change in CLASS from FROM
     mode.  */
  bool
  invalid_mode_change_p (unsigned int regno,
! 		       enum reg_class rclass ATTRIBUTE_UNUSED)
  {
!   struct subregs_of_mode_node dummy, *node;
!   unsigned int to;
!   unsigned char mask;
!   enum machine_mode from = PSEUDO_REGNO_MODE (regno);
! 
!   gcc_assert (subregs_of_mode);
!   dummy.block = regno & -8;
!   node = (struct subregs_of_mode_node *)
!     htab_find_with_hash (subregs_of_mode, &dummy, dummy.block);
!   if (node == NULL)
!     return false;
! 
!   mask = 1 << (regno & 7);
!   for (to = VOIDmode; to < NUM_MACHINE_MODES; to++)
!     if (node->modes[to] & mask)
!       if (CANNOT_CHANGE_MODE_CLASS (from, (enum machine_mode) to, rclass))
! 	return true;
! 
!   return false;
  }
  
  void
  finish_subregs_of_mode (void)
  {
!   htab_delete (subregs_of_mode);
!   subregs_of_mode = 0;
  }
  #else
  void
--- 1296,1332 ----
  {
    basic_block bb;
    rtx insn;
+   bitmap_obstack srom_obstack;
+   bitmap subregs_of_mode;
  
!   gcc_assert (invalid_mode_changes == NULL);
!   invalid_mode_changes = BITMAP_ALLOC (NULL);
!   bitmap_obstack_initialize (&srom_obstack);
!   subregs_of_mode = BITMAP_ALLOC (&srom_obstack);
  
    FOR_EACH_BB (bb)
      FOR_BB_INSNS (bb, insn)
!       if (NONDEBUG_INSN_P (insn))
!         find_subregs_of_mode (PATTERN (insn), subregs_of_mode);
! 
!   BITMAP_FREE (subregs_of_mode);
!   bitmap_obstack_release (&srom_obstack);
  }
  
  /* Return 1 if REGNO has had an invalid mode change in CLASS from FROM
     mode.  */
  bool
  invalid_mode_change_p (unsigned int regno,
! 		       enum reg_class rclass)
  {
!   return bitmap_bit_p (invalid_mode_changes,
! 		       regno * N_REG_CLASSES + (unsigned) rclass);
  }
  
  void
  finish_subregs_of_mode (void)
  {
!   BITMAP_FREE (invalid_mode_changes);
  }
  #else
  void


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]