This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon

From: Kenneth Zadeck <zadeck at naturalbridge dot com>
To: Jakub Jelinek <jakub at redhat dot com>
Cc: Richard Biener <richard dot guenther at gmail dot com>, gcc at gcc dot gnu dot org, gcc-patches at gcc dot gnu dot org, rdsandiford at googlemail dot com
Date: Wed, 31 Oct 2012 14:19:29 -0400
Subject: Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
References: <20121029175642.GK1752@tucnak.redhat.com> <5090410F.5010801@naturalbridge.com> <CAFiYyc1go51+Srjy9erkBBC+o-MC5U08S23e6A6vNuE9BxOgvg@mail.gmail.com> <877gq73qsm.fsf@sandifor-thinkpad.stglab.manchester.uk.ibm.com> <50912B52.60207@naturalbridge.com> <20121031135508.GI1752@tucnak.redhat.com> <5091300A.9020101@naturalbridge.com> <20121031142743.GK1752@tucnak.redhat.com>

Jakub,

it is hard from all of the threads to actually distill what the real issues are here. So let me start from a clean slate and state them simply.

Richi has three primary objections:

1) that we can do all of this with a templated version of double-int. 2) that we should not be passing in a precision and bitsize into the interface. 3) that the interface is too large.

I have attached a fragment of my patch #5 to illustrate the main thrust of my patches and to illustrate the usefulness to gcc right now.

In the current trunk, we have code that does simplification when the mode fits in an HWI and we have code that does the simplification if the mode fits in two HWIs. if the mode does not fit in two hwi's the code does not do the simplification.

Thus here and in a large number of other places we have two copies of the code. Richi wants there to be multiple template instantiations of double-int. This means that we are now going to have to have 3 copies of this code to support oi mode on a 64 bit host and 4 copies on a 32 bit host.

Further note that there are not as many cases for the 2*hwi in the code as their are for the hwi case and in general this is true through out the compiler. (CLRSB is missing from the 2hwi case in the patch) We really did not write twice the code when we stated supporting 2 hwi, we added about 1.5 times the code (simplify-rtx is better than most of the rest of the compiler). I am using the rtl level as an example here because i have posted all of those patches, but the tree level is no better.

I do not want to write this code a third time and certainly not a fourth time. Just fixing all of this is quite useful now: it fills in a lot of gaps in our transformations and it removes many edge case crashes because ti mode really is lightly tested. However, this patch becomes crucial as the world gets larger.

Richi's second point is that we should be doing everything at "infinite precision" and not passing in an explicit bitsize and precision. That works ok (sans the issues i raised with it in tree-vpn earlier) when the largest precision on the machine fits in a couple of hwis. However, for targets that have large integers or cross compilers, this becomes expensive. The idea behind my set of patches is that for the transformations that can work this way, we do the math in the precision of the type or mode. In general this means that almost all of the math will be done quickly, even on targets that support really big integers. For passes like tree-vrp, the math will be done at some multiple of the largest type seen in the actual program. The amount of the multiple is a function of the optimization, not the target or the host. Currently (on my home computer) the wide-int interface allows the optimization to go 4x the largest mode on the target.

I can get rid of this bound at the expense of doing an alloca rather than stack allocating a fixed sized structure. However, given the extremely heavy use of this interface, that does not seem like the best of tradeoffs.

The truth is that the vast majority of the compiler actually wants to see the math done the way that it is going to be done on the machine. Tree-vrp and the gimple constant prop do not. But i have made accommodations to handle both needs. I believe that the reason that double-int was never used at the rtl level is that it does not actually do the math in a way that is useful to the target.

Richi's third objection is that the interface is too large. I disagree. It was designed based on the actual usage of the interface. When i found places where i was writing the same code over and over again, i put it in a function as part of the interface. I later went back and optimized many of these because this is a very heavily used interface. Richi has many other objections, but i have agreed to fix almost all of them, so i am not going to address them here.

It really will be a huge burden to have to carry these patched until the next revision. We are currently in stage 1 and i believe that the minor issues that richi raises can be easily addressed.

kenny

@@ -1373,302 +1411,87 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode,
       return CONST_DOUBLE_FROM_REAL_VALUE (d, mode);
     }
 
-  if (CONST_INT_P (op)
-      && width <= HOST_BITS_PER_WIDE_INT && width > 0)
+  if (CONST_SCALAR_INT_P (op) && width > 0)
     {
-      HOST_WIDE_INT arg0 = INTVAL (op);
-      HOST_WIDE_INT val;
+      wide_int result;
+      enum machine_mode imode = op_mode == VOIDmode ? mode : op_mode;
+      wide_int op0 = wide_int::from_rtx (op, imode);
+
+#if TARGET_SUPPORTS_WIDE_INT == 0
+      /* This assert keeps the simplification from producing a result
+	 that cannot be represented in a CONST_DOUBLE but a lot of
+	 upstream callers expect that this function never fails to
+	 simplify something and so you if you added this to the test
+	 above the code would die later anyway.  If this assert
+	 happens, you just need to make the port support wide int.  */
+      gcc_assert (width <= HOST_BITS_PER_DOUBLE_INT); 
+#endif
 
       switch (code)
 	{
 	case NOT:
-	  val = ~ arg0;
+	  result = ~op0;
 	  break;
 
 	case NEG:
-	  val = - arg0;
+	  result = op0.neg ();
 	  break;
 
 	case ABS:
-	  val = (arg0 >= 0 ? arg0 : - arg0);
+	  result = op0.abs ();
 	  break;
 
 	case FFS:
-	  arg0 &= GET_MODE_MASK (mode);
-	  val = ffs_hwi (arg0);
+	  result = op0.ffs ();
 	  break;
 
 	case CLZ:
-	  arg0 &= GET_MODE_MASK (mode);
-	  if (arg0 == 0 && CLZ_DEFINED_VALUE_AT_ZERO (mode, val))
-	    ;
-	  else
-	    val = GET_MODE_PRECISION (mode) - floor_log2 (arg0) - 1;
+	  result = op0.clz (GET_MODE_BITSIZE (mode), 
+			    GET_MODE_PRECISION (mode));
 	  break;
 
 	case CLRSB:
-	  arg0 &= GET_MODE_MASK (mode);
-	  if (arg0 == 0)
-	    val = GET_MODE_PRECISION (mode) - 1;
-	  else if (arg0 >= 0)
-	    val = GET_MODE_PRECISION (mode) - floor_log2 (arg0) - 2;
-	  else if (arg0 < 0)
-	    val = GET_MODE_PRECISION (mode) - floor_log2 (~arg0) - 2;
+	  result = op0.clrsb (GET_MODE_BITSIZE (mode), 
+			      GET_MODE_PRECISION (mode));
 	  break;
-
+	  
 	case CTZ:
-	  arg0 &= GET_MODE_MASK (mode);
-	  if (arg0 == 0)
-	    {
-	      /* Even if the value at zero is undefined, we have to come
-		 up with some replacement.  Seems good enough.  */
-	      if (! CTZ_DEFINED_VALUE_AT_ZERO (mode, val))
-		val = GET_MODE_PRECISION (mode);
-	    }
-	  else
-	    val = ctz_hwi (arg0);
+	  result = op0.ctz (GET_MODE_BITSIZE (mode), 
+			    GET_MODE_PRECISION (mode));
 	  break;
 
 	case POPCOUNT:
-	  arg0 &= GET_MODE_MASK (mode);
-	  val = 0;
-	  while (arg0)
-	    val++, arg0 &= arg0 - 1;
+	  result = op0.popcount (GET_MODE_BITSIZE (mode), 
+				 GET_MODE_PRECISION (mode));
 	  break;
 
 	case PARITY:
-	  arg0 &= GET_MODE_MASK (mode);
-	  val = 0;
-	  while (arg0)
-	    val++, arg0 &= arg0 - 1;
-	  val &= 1;
+	  result = op0.parity (GET_MODE_BITSIZE (mode), 
+			       GET_MODE_PRECISION (mode));
 	  break;
 
 	case BSWAP:
-	  {
-	    unsigned int s;
-
-	    val = 0;
-	    for (s = 0; s < width; s += 8)
-	      {
-		unsigned int d = width - s - 8;
-		unsigned HOST_WIDE_INT byte;
-		byte = (arg0 >> s) & 0xff;
-		val |= byte << d;
-	      }
-	  }
+	  result = op0.bswap ();
 	  break;
 
 	case TRUNCATE:
-	  val = arg0;
+	  result = op0.sext (mode);
 	  break;
 
 	case ZERO_EXTEND:
-	  /* When zero-extending a CONST_INT, we need to know its
-             original mode.  */
-	  gcc_assert (op_mode != VOIDmode);
-	  if (op_width == HOST_BITS_PER_WIDE_INT)
-	    {
-	      /* If we were really extending the mode,
-		 we would have to distinguish between zero-extension
-		 and sign-extension.  */
-	      gcc_assert (width == op_width);
-	      val = arg0;
-	    }
-	  else if (GET_MODE_BITSIZE (op_mode) < HOST_BITS_PER_WIDE_INT)
-	    val = arg0 & GET_MODE_MASK (op_mode);
-	  else
-	    return 0;
+	  result = op0.zext (mode);
 	  break;
 
 	case SIGN_EXTEND:
-	  if (op_mode == VOIDmode)
-	    op_mode = mode;
-	  op_width = GET_MODE_PRECISION (op_mode);
-	  if (op_width == HOST_BITS_PER_WIDE_INT)
-	    {
-	      /* If we were really extending the mode,
-		 we would have to distinguish between zero-extension
-		 and sign-extension.  */
-	      gcc_assert (width == op_width);
-	      val = arg0;
-	    }
-	  else if (op_width < HOST_BITS_PER_WIDE_INT)
-	    {
-	      val = arg0 & GET_MODE_MASK (op_mode);
-	      if (val_signbit_known_set_p (op_mode, val))
-		val |= ~GET_MODE_MASK (op_mode);
-	    }
-	  else
-	    return 0;
+	  result = op0.sext (mode);
 	  break;
 
 	case SQRT:
-	case FLOAT_EXTEND:
-	case FLOAT_TRUNCATE:
-	case SS_TRUNCATE:
-	case US_TRUNCATE:
-	case SS_NEG:
-	case US_NEG:
-	case SS_ABS:
-	  return 0;
-
-	default:
-	  gcc_unreachable ();
-	}
-
-      return gen_int_mode (val, mode);
-    }
-
-  /* We can do some operations on integer CONST_DOUBLEs.  Also allow
-     for a DImode operation on a CONST_INT.  */
-  else if (width <= HOST_BITS_PER_DOUBLE_INT
-	   && (CONST_DOUBLE_AS_INT_P (op) || CONST_INT_P (op)))
-    {
-      double_int first, value;
-
-      if (CONST_DOUBLE_AS_INT_P (op))
-	first = double_int::from_pair (CONST_DOUBLE_HIGH (op),
-				       CONST_DOUBLE_LOW (op));
-      else
-	first = double_int::from_shwi (INTVAL (op));
-
-      switch (code)
-	{
-	case NOT:
-	  value = ~first;
-	  break;
-
-	case NEG:
-	  value = -first;
-	  break;
-
-	case ABS:
-	  if (first.is_negative ())
-	    value = -first;
-	  else
-	    value = first;
-	  break;
-
-	case FFS:
-	  value.high = 0;
-	  if (first.low != 0)
-	    value.low = ffs_hwi (first.low);
-	  else if (first.high != 0)
-	    value.low = HOST_BITS_PER_WIDE_INT + ffs_hwi (first.high);
-	  else
-	    value.low = 0;
-	  break;
-
-	case CLZ:
-	  value.high = 0;
-	  if (first.high != 0)
-	    value.low = GET_MODE_PRECISION (mode) - floor_log2 (first.high) - 1
-	              - HOST_BITS_PER_WIDE_INT;
-	  else if (first.low != 0)
-	    value.low = GET_MODE_PRECISION (mode) - floor_log2 (first.low) - 1;
-	  else if (! CLZ_DEFINED_VALUE_AT_ZERO (mode, value.low))
-	    value.low = GET_MODE_PRECISION (mode);
-	  break;
-
-	case CTZ:
-	  value.high = 0;
-	  if (first.low != 0)
-	    value.low = ctz_hwi (first.low);
-	  else if (first.high != 0)
-	    value.low = HOST_BITS_PER_WIDE_INT + ctz_hwi (first.high);
-	  else if (! CTZ_DEFINED_VALUE_AT_ZERO (mode, value.low))
-	    value.low = GET_MODE_PRECISION (mode);
-	  break;
-
-	case POPCOUNT:
-	  value = double_int_zero;
-	  while (first.low)
-	    {
-	      value.low++;
-	      first.low &= first.low - 1;
-	    }
-	  while (first.high)
-	    {
-	      value.low++;
-	      first.high &= first.high - 1;
-	    }
-	  break;
-
-	case PARITY:
-	  value = double_int_zero;
-	  while (first.low)
-	    {
-	      value.low++;
-	      first.low &= first.low - 1;
-	    }
-	  while (first.high)
-	    {
-	      value.low++;
-	      first.high &= first.high - 1;
-	    }
-	  value.low &= 1;
-	  break;
-
-	case BSWAP:
-	  {
-	    unsigned int s;
-
-	    value = double_int_zero;
-	    for (s = 0; s < width; s += 8)
-	      {
-		unsigned int d = width - s - 8;
-		unsigned HOST_WIDE_INT byte;
-
-		if (s < HOST_BITS_PER_WIDE_INT)
-		  byte = (first.low >> s) & 0xff;
-		else
-		  byte = (first.high >> (s - HOST_BITS_PER_WIDE_INT)) & 0xff;
-
-		if (d < HOST_BITS_PER_WIDE_INT)
-		  value.low |= byte << d;
-		else
-		  value.high |= byte << (d - HOST_BITS_PER_WIDE_INT);
-	      }
-	  }
-	  break;
-
-	case TRUNCATE:
-	  /* This is just a change-of-mode, so do nothing.  */
-	  value = first;
-	  break;
-
-	case ZERO_EXTEND:
-	  gcc_assert (op_mode != VOIDmode);
-
-	  if (op_width > HOST_BITS_PER_WIDE_INT)
-	    return 0;
-
-	  value = double_int::from_uhwi (first.low & GET_MODE_MASK (op_mode));
-	  break;
-
-	case SIGN_EXTEND:
-	  if (op_mode == VOIDmode
-	      || op_width > HOST_BITS_PER_WIDE_INT)
-	    return 0;
-	  else
-	    {
-	      value.low = first.low & GET_MODE_MASK (op_mode);
-	      if (val_signbit_known_set_p (op_mode, value.low))
-		value.low |= ~GET_MODE_MASK (op_mode);
-
-	      value.high = HWI_SIGN_EXTEND (value.low);
-	    }
-	  break;
-
-	case SQRT:
-	  return 0;
-
 	default:
 	  return 0;
 	}
 
-      return immed_double_int_const (value, mode);
+      return immed_wide_int_const (result, mode);
     }

References:
- GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  - From: Jakub Jelinek
- Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  - From: Kenneth Zadeck
- Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  - From: Richard Biener
- Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  - From: Richard Sandiford
- Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  - From: Kenneth Zadeck
- Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  - From: Jakub Jelinek
- Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  - From: Kenneth Zadeck
- Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
  - From: Jakub Jelinek

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]