This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: wide-int more performance fixes for wide multiplication.

From: Kenneth Zadeck <zadeck at naturalbridge dot com>
To: Richard Biener <rguenther at suse dot de>, Mike Stump <mikestump at comcast dot net>, gcc-patches <gcc-patches at gcc dot gnu dot org>, rdsandiford at googlemail dot com
Date: Sat, 14 Dec 2013 12:22:46 -0500
Subject: Re: wide-int more performance fixes for wide multiplication.
Authentication-results: sourceware.org; auth=none
References: <52A61506 dot 5000407 at naturalbridge dot com> <87sitv3bkq dot fsf at talisman dot default> <52AC4EC9 dot 3020806 at naturalbridge dot com> <87ob4j33pd dot fsf at talisman dot default>


On 12/14/2013 09:30 AM, Richard Sandiford wrote:

+  /* True if the result needs to be negated.  */
+  bool is_neg = false;

/* If the top level routine did not really pass in an overflow, then

        just make sure that we never attempt to set it.  */
     bool needs_overflow = (overflow != 0);
+  const HOST_WIDE_INT zero = 0;
     if (needs_overflow)
       *overflow = false;

@@ -1382,61 +1392,96 @@ wi::mul_internal (HOST_WIDE_INT *val, co

         return 1;
       }

- /* We do unsigned mul and then correct it. */

-  wi_unpack (u, (const unsigned HOST_WIDE_INT *) op1val, op1len,
-	     half_blocks_needed, prec, SIGNED);
-  wi_unpack (v, (const unsigned HOST_WIDE_INT *) op2val, op2len,
-	     half_blocks_needed, prec, SIGNED);
-
-  /* The 2 is for a full mult.  */
-  memset (r, 0, half_blocks_needed * 2
-	  * HOST_BITS_PER_HALF_WIDE_INT / CHAR_BIT);
+  ulen = op1len * 2;
+  vlen = op2len * 2;
+
+  if (sgn == SIGNED)
+    {
+      if (wi::neg_p (op1))
+	{
+	  HOST_WIDE_INT nop1[2 * WIDE_INT_MAX_ELTS];
+

E.g. following on from the above:

	  /* Add an extra HWI so that we can represent the negative of
	     the minimum.  */
	  HOST_WIDE_INT *nop1 = XALLOCAVEC (HOST_WIDE_INT, op1len + 1);

+	  int nop1len
+ = wi::sub_large (nop1, &zero, 1, op1val, op1len, prec + 1, SIGNED,
0);

Not sure about the "prec + 1" here, since it could cause nop1len to be
greater than the maximum length for "prec".  And you go on to unpack
nop1 in "prec" rather than "prec + 1".

the unpacking with prec is the wrong part.  That should have been prec + 1

Hmm, I still think it's the opposite, because...

AIUI, once you've taken the absolutes, you've got two unsigned numbers
of precision "prec" and create a "prec * 2" result, so naybe:
sub_large (..., prec, UNSIGNED, 0) instead?

the signed and unsigned does not matter here.    the prec + 1 means we
never overflow.

...my point is that after this we treat the number as unsigned,
so it isn't a question of whether the absolute value overflows
the precision as a signed number but whether it overflows as an
unsigned number.  And it never will.  It might need 1 extra HWI
than the original input if op1len < blocks_needed, but it shouldn't
need extra precision.

E.g. say for 8-bit HWIs we start out with a signed -0x80 * -0x80.
We convert that into an unsigned 0x80 * 0x80, which is still
precision 8 * precision 8 -> precision 16.  And we'd need to be
able to handle those precision-8 values correctly (with no extra
zero HWI) if we had an unsigned 0x80 * 0x80 from the outset.

it will not work like that. You are thinking that i really can do an8 bit X 8 bit multiply. I cannot.The case that i am worried about is 115 bits x 115 bits. in this casei have to make sure that the top bits of the top block are provisionedproperly. If you think about your case above where bit 114 is set andthe bits below it are zeros, then i need to invert it so that the bit115 and above are zero. otherwise i get the top of the blockpolluted with the 1 bits which would mess up the top block multiplied byany of the lower blocks.

however, you are, at a deeper level correct. if i change the parms towi_unpack to UNSIGNED then it is likely you are correct. I will thinkthis through today and see if i can find a counter example. but itmay be true that if i invert the number, i always want the top bits ofthe block to be 0 which is what the sign parameter controls.

+	  is_neg = !is_neg;
+	  ulen = nop1len * 2;
+	  wi_unpack (u, (const unsigned HOST_WIDE_INT*)nop1, nop1len,
+		     ulen, prec, SIGNED);

Back on the alloca theme, we'd have:

	  u = XALLOCAVEC (unsigned HOST_HALF_WIDE_INT, ulen);

before this and other unpacks.

Note that there's supposed to be space after ")" in casts.  Lots of
instances in the patch :-)

i actually did not know this.   i will fix.

+      /* UNSIGNED mul. */
+      /* For large positive integers inside regular wide-ints, we must
+	 explictly expand out the 1s to full width for the mul to be
+	 correct. Note that the top bit will never be set for a
+	 unsigned number in offset_int of widest-int.  */

s/of/or/.  But I think the comment is confusing because the unsignedness
is a property of the multiplication rather than the inputs.  We should never
be doing an unsigned operation on widest_int to begin with.  And if
offset_ints are being treated as having a sign then the same is true there.

If we ever do have an unsigned multiplication on those types for some reason
then -1 will (rightly) be treated as the maximum unsigned value.

The real question here is "do you want to do an assert or do you want to
explain what happens if the input comes in that way?"

References:
- wide-int more performance fixes for wide multiplication.
  - From: Kenneth Zadeck
- Re: wide-int more performance fixes for wide multiplication.
  - From: Richard Sandiford
- Re: wide-int more performance fixes for wide multiplication.
  - From: Kenneth Zadeck
- Re: wide-int more performance fixes for wide multiplication.
  - From: Richard Sandiford

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]