This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Fix gcc.dg/lower-subreg-1.c failure, revisited.


> From: Richard Sandiford <rdsandiford@googlemail.com>
> Date: Wed, 9 May 2012 11:14:49 +0200

> Hans-Peter Nilsson <hans-peter.nilsson@axis.com> writes:
> >> From: Richard Sandiford <rdsandiford@googlemail.com>
> >> Date: Tue, 1 May 2012 16:46:38 +0200
> >
> >> To repeat: as things stand, very few targets define proper rtx costs
> >> for SET.
> >
> > IMHO it's wrong to start blaming targets when rtx_cost doesn't
> > take the mode in account in the first place, for the default
> > cost.  (Well, except for the modes-tieable subreg special-case.)
> > The targets where an operation in N * word_mode costs no more
> > than one in word_mode, if there even is one, is a minority,
> > let's adjust the defaults to that.
> 
> I'll pass on approving or disapproving this patch,

Here's an update.  Could you please reconsider?  I have to
appeal to your sense of fairness: after all you were involved in
the breaking patch.

> but for the record:
> a factor of word_mode seems a bit too simplistic.  It's OK for moves
> and logic ops, but addition of multiword modes is a bit more complicated.
> Multiplication and division by multiword modes is even more so, of course.

I adjusted multiplication and division to the more realistic
O(n*n).  That's just for show of course; we're comparing 2**2 to
instead of 2*2 as the focus is on the double size.  I did not
add specific adjustments for arithmetic codes not already there,
as I did not think such an adjustment to be appropriate.  The
intent is just to adapt the existing default costs to the change
of focus due to the lower-subreg change.  But if it's necessary
for approval, I'm willing to fix that.

> As things stand, rtx_cost is intentionally used for more than
> just valid target insns.  One of the main uses has always been
> to decide whether it is better to implement multiplications by a
> shift-and-add sequence, or whether to use multiplication instructions.
> The associated expmed code has never tried to decide which shifts are
> actually representable as matching rtl insns and which aren't.
> The same goes for division-using-multiplication.
> 
> So I think this patch

(referring to the commit that caused the major
gcc.dg/lower-subreg-1.c regression in May, not the predecessor
to *this* patch)

> is using rtx_cost according to its current
> interface.

Regardless of how "interface" is defined, that commit put much
more focus on the mode and made it about necessary to handle
SET, something that the default costs don't do without the patch
below.

> > Isn't the below better than doing virtually the same in each
> > target's rtx_costs?
> 
> FWIW, MIPS, SH and x86 all used something slightly different (and more
> complicated).  I imagine PowerPC and SPARC will too.  So "each" seems
> a bit strong.

Just strong enough: this patch indeed would have fixed the
gcc.dg/lower-subreg-1.c regression for MIPS and x86 (no
regressions) that was present at revision 187212, and no doubt
the predecessor patch too.  (I had to add commits 187299 and
187582 for MIPS and I couldn't get a baseline with java or
fortran enabled, thus --enable-languages=c,c++,lto,objc).  Also
similarly tested arm-eabi, sh-elf (though I couldn't repeat the
regression for those at the baseline) and cris-elf; no
regressions.  Fixes PR53176 for cris-elf.

Ok to commit?

	* rtlanal.c (rtx_cost): Adjust default cost for X with a
	UNITS_PER_WORD factor for all X according to the size of
	its mode, not just for SUBREGs with untieable modes.
	Handle SET.  Use factor * factor for MULT, DIV, UDIV,
	MOD, UMOD.

Index: gcc/rtlanal.c
===================================================================
--- gcc/rtlanal.c	(revision 188403)
+++ gcc/rtlanal.c	(working copy)
@@ -3755,10 +3755,17 @@ rtx_cost (rtx x, enum rtx_code outer_cod
   enum rtx_code code;
   const char *fmt;
   int total;
+  int factor;
 
   if (x == 0)
     return 0;
 
+  /* A size N times larger than UNITS_PER_WORD likely needs N times as
+     many insns, taking N times as long.  */
+  factor = GET_MODE_SIZE (GET_MODE (x)) / UNITS_PER_WORD;
+  if (factor == 0)
+    factor = 1;
+
   /* Compute the default costs of certain things.
      Note that targetm.rtx_costs can override the defaults.  */
 
@@ -3766,20 +3773,31 @@ rtx_cost (rtx x, enum rtx_code outer_cod
   switch (code)
     {
     case MULT:
-      total = COSTS_N_INSNS (5);
+      /* Multiplication has time-complexity O(N*N), where N is the
+	 number of units (translated from digits) when using
+	 schoolbook long multiplication.  */
+      total = factor * factor * COSTS_N_INSNS (5);
       break;
     case DIV:
     case UDIV:
     case MOD:
     case UMOD:
-      total = COSTS_N_INSNS (7);
+      /* Similarly, complexity for schoolbook long division.  */
+      total = factor * factor * COSTS_N_INSNS (7);
       break;
     case USE:
       /* Used in combine.c as a marker.  */
       total = 0;
       break;
+    case SET:
+      /* A SET doesn't have a mode, so let's look at the SET_DEST to get
+	 the mode for the factor.  */
+      factor = GET_MODE_SIZE (GET_MODE (SET_DEST (x))) / UNITS_PER_WORD;
+      if (factor == 0)
+	factor = 1;
+      /* Pass through.  */
     default:
-      total = COSTS_N_INSNS (1);
+      total = factor * COSTS_N_INSNS (1);
     }
 
   switch (code)
@@ -3792,8 +3810,7 @@ rtx_cost (rtx x, enum rtx_code outer_cod
       /* If we can't tie these modes, make this expensive.  The larger
 	 the mode, the more expensive it is.  */
       if (! MODES_TIEABLE_P (GET_MODE (x), GET_MODE (SUBREG_REG (x))))
-	return COSTS_N_INSNS (2
-			      + GET_MODE_SIZE (GET_MODE (x)) / UNITS_PER_WORD);
+	return COSTS_N_INSNS (2 + factor);
       break;
 
     default:


brgds, H-P


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]