This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [wide-int] int_traits <tree>


On Thu, 17 Oct 2013, Richard Sandiford wrote:

> Richard Biener <rguenther@suse.de> writes:
> >> The new tree representation can have a length greater than max_len
> >> for an unsigned tree constant that occupies a whole number of HWIs.
> >> The tree representation of an unsigned 0x8000 is 0x00 0x80 0x00.
> >> When extended to max_wide_int the representation is the same.
> >> But a 2-HWI addr_wide_int would be 0x80 0x00, without the leading zero.
> >> The MIN trims the length from 3 to 2 in the last case.
> >
> > Oh, so it was the tree rep that changed?  _Why_ was it changed?
> > We still cannot use it directly from wide-int and the extra
> > word is redundant because we have access to TYPE_UNSIGNED (TREE_TYPE ()).
> 
> It means that we can now use the tree's HWI array directly, without any
> copying, for addr_wide_int and max_wide_int.  The only part of decompose ()
> that does a copy is the small_prec case, which is trivially compiled out
> for addr_wide_int and max_wide_int.

"     2) addr_wide_int.  This is a fixed size representation that is
     guaranteed to be large enough to compute any bit or byte sized
     address calculation on the target.  Currently the value is 64 + 4
     bits rounded up to the next number even multiple of
     HOST_BITS_PER_WIDE_INT (but this can be changed when the first
     port needs more than 64 bits for the size of a pointer).

     This flavor can be used for all address math on the target.  In
     this representation, the values are sign or zero extended based
     on their input types to the internal precision.  All math is done
     in this precision and then the values are truncated to fit in the
     result type.  Unlike most gimple or rtl intermediate code, it is
     not useful to perform the address arithmetic at the same
     precision in which the operands are represented because there has
     been no effort by the front ends to convert most addressing
     arithmetic to canonical types.

     In the addr_wide_int, all numbers are represented as signed
     numbers.  There are enough bits in the internal representation so
     that no information is lost by representing them this way."

so I guess from that that addr_wide_int.get_precision is always
that "64 + 4 rounded up".  Thus decompose gets that constant precision
input and the extra zeros make the necessary extension always a no-op.
Aha.
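
To make the "64 + 4 rounded up" figure concrete, here is a minimal sketch of the rounding, assuming a 64-bit HOST_WIDE_INT.  The names host_bits_per_wide_int, round_up and addr_precision are made up for illustration and are not the identifiers used in wide-int.h:

```cpp
// Assumed host word size for this sketch; in GCC the real value comes
// from HOST_BITS_PER_WIDE_INT and depends on the host.
constexpr unsigned int host_bits_per_wide_int = 64;

// Round BITS up to the next multiple of UNIT.  UNIT must be a power of
// two for the mask trick to be valid.
constexpr unsigned int
round_up (unsigned int bits, unsigned int unit)
{
  return (bits + unit - 1) & ~(unit - 1);
}

// 64 bits of address plus 4 extra bits, rounded to a whole number of
// HWIs.  With a 64-bit HWI this is two HWIs.
constexpr unsigned int addr_precision
  = round_up (64 + 4, host_bits_per_wide_int);

static_assert (addr_precision == 128, "two 64-bit HWIs");
```

Under that assumption the precision is a compile-time constant and a HWI multiple, which is why the extension in decompose can fold away for this flavor.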

For max_wide_int the same rules apply, just its size is larger.

Ok.  So the reps are only canonical wide-int because we only
ever use them with precision > xprecision (maybe we should assert
that).

Btw, we are not using them directly, but every time we actually
build an addr_wide_int / max_wide_int we copy them anyway:

/* Initialize the storage from integer X, in precision N.  */
template <int N>
template <typename T>
inline fixed_wide_int_storage <N>::fixed_wide_int_storage (const T &x)
{
  /* Check for type compatibility.  We don't want to initialize a
     fixed-width integer from something like a wide_int.  */
  WI_BINARY_RESULT (T, FIXED_WIDE_INT (N)) *assertion ATTRIBUTE_UNUSED;
  wide_int_ref xi (x, N);
  len = xi.len;
  for (unsigned int i = 0; i < len; ++i)
    val[i] = xi.val[i];
}

it avoids a 2nd copy though, which shows nicely what was rummaging in
my head for the last two days - that the int_traits <> abstraction
was somehow at the wrong level - it should have been traits that
are specific to the storage model?  or the above should use
int_traits<>::decompose manually with it always doing the 
copy (that would also optimize away one copy and eventually
would make the extra zeros not necessary).

I originally thought that extra zeros get rid of all copying from trees
to all wide-int kinds.

What's the reason again to not use my original proposed encoding
of the MSB being the sign bit?  RTL constants simply are all signed
then.  Just you have to also sign-extend in functions like lts_p
as not all constants are sign-extended.  But we can use both tree
(with the now appended zero) and RTL constants representation
unchanged.

Richard.

> >> >> > the case actually comes up on the ppc because they do a lot of 128 bit
> >> >> > math.    I think i got thru the x86-64 without noticing this.
> >> >> 
> >> >> Well, it'd be suspicious if we're directly using 128-bit numbers
> >> >> in addr_wide_int.  The justification for the assertion was that we
> >> >> should explicitly truncate to addr_wide_int when deliberately
> >> >> ignoring upper bits, beyond bit or byte address width.  128 bits
> >> >> definitely falls into that category on powerpc.
> >> >
> >> > My question is whether with 8-bit HWI 0x00 0xff 0xff is a valid
> >> > wide-int value if it has precision 16.
> >> 
> >> No, for a 16-bit wide_int it should be 0xff.  0x00 0xff 0xff is correct
> >> for any wide_int wider than 16 bits though.
> >> 
> >> > AFAIK that is what the code produces,
> >> 
> >> In which case?  This is:
> >> 
> >>    precision == 16
> >>    xprecision == 16
> >>    len == 3
> >>    max_len == 2
> >> 
> >> The MIN trims the len to 2 and then the loop Kenny added trims it
> >> again to 1, so the "0x00 0xff 0xff" becomes "0xff".  The "0x00 0xff"
> >> is still there in the array, but not used.
> >> 
> >> > but now Kenny says this is only for some kind
> >> > of wide-ints but not all?  That is, why is
> >> >
> >> > inline wi::storage_ref
> >> > wi::int_traits <const_tree>::decompose (HOST_WIDE_INT *scratch,
> >> >                                         unsigned int precision,
> >> >                                         const_tree x)
> >> > {
> >> >   unsigned int len = TREE_INT_CST_NUNITS (x);
> >> >   const HOST_WIDE_INT *val
> >> >     = (const HOST_WIDE_INT *) &TREE_INT_CST_ELT (x, 0);
> >> >   return wi::storage_ref (val, len, precision);
> >> > }
> >> >
> >> > not a valid implementation together with making sure that the
> >> > INTEGER_CST tree rep has that extra word of zeros if required?
> >> 
> >> The fundamental problem here is that we're trying to support two cases:
> >> 
> >> (a) doing N-bit arithmetic in cases where the inputs have N bits
> >> (b) doing N-bit arithmetic in cases where the inputs have fewer than N bits
> >>     and are extended according to TYPE_SIGN.
> >> 
> >> Let's assume 32-bit HWIs.  The 16-bit (4-hex-digit) constant 0x8000 is
> >> 0x8000 regardless of whether the type is signed or unsigned.  But if it's
> >> extended to 32-bits you get two different numbers 0xffff8000 and 0x00008000,
> >> depending on the sign.
> >> 
> >> So for one value of the "precision" parameter (i.e. xprecision), signed
> >> and unsigned constants produce the same number.  But for another value
> >> of the "precision" parameter (those greater than xprecision), signed and
> >> unsigned constants produce different numbers.  Yet at the moment the tree
> >> constant has a single representation.
> >
> > But a correctly extended one, up to its len!  (as opposed to RTL)
> 
> But extending the precision can change the right value of "len".
> Take the same example with 16-bit HWIs.  In wide_int terms, and with
> the original tree representation, the constant is a single HWI:
> 
>     0x8000
> 
> with len 1.  And in case (a) -- where we're asking for a 16-bit wide_int --
> this single HWI is all we want.  The signed and unsigned constants give
> the same wide_int.
> 
> But the same constant extended to 32 bits and left "uncompressed" would be
> two HWIs:
> 
>     0x0000 0x8000 for unsigned constants
>     0xffff 0x8000 for signed constants
> 
> Compressed according to the sign scheme they are:
> 
>     0x0000 0x8000 (len == 2) for unsigned constants
>            0x8000 (len == 1) for signed constants
> 
> which is also the new tree representation.
> 
> So the unsigned case is different for (a) and (b).
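
The 0x8000 example above can be replayed with a small stand-alone sketch.  It assumes 16-bit HWIs as in the example and stores blocks least-significant first (the text above writes them most-significant first).  The names hwi, canonical_len and extend_to_32 are hypothetical helpers for illustration only, not functions from the wide-int branch:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a 16-bit HOST_WIDE_INT.
using hwi = std::uint16_t;

// Return how many blocks of VAL (least-significant first) are needed
// once every upper block that is merely the sign extension of the
// block below it has been dropped.
std::size_t
canonical_len (const std::vector<hwi> &val)
{
  std::size_t len = val.size ();
  while (len > 1)
    {
      // What the upper block would be if it were pure sign extension.
      hwi sign_fill = (val[len - 2] & 0x8000) ? 0xffff : 0x0000;
      if (val[len - 1] != sign_fill)
        break;
      --len;
    }
  return len;
}

// Extend a 16-bit constant to two blocks (32 bits): zero-extend for
// unsigned types, sign-extend for signed types.
std::vector<hwi>
extend_to_32 (hwi bits, bool is_unsigned)
{
  hwi high = (!is_unsigned && (bits & 0x8000)) ? 0xffff : 0x0000;
  return {bits, high};
}
```

With these helpers, canonical_len (extend_to_32 (0x8000, true)) is 2 and canonical_len (extend_to_32 (0x8000, false)) is 1, matching the two compressed forms listed above.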



> >> So I think the possibilities are:
> >> 
> >> (1) Use the representation of an N-bit wide_int to store N-bit tree constants.
> >>     Do work when extending them to wider wide_ints.
> >> 
> >> (2) Use the representation of max_wide_int to store N-bit tree constants.
> >>     Do work when creating an N-bit wide_int.
> >> 
> >> (3) Store both representations in the tree constant.
> >> 
> >> (4) Require all tree arithmetic to be done in the same way as rtl arithmetic,
> >>     with explicit extensions.  This gets rid of case (b).
> >> 
> >> (5) Require all tree arithmetic to be done in wider wide_ints than the inputs,
> >>     which I think is what you preferred.  This gets rid of case (a).
> >> 
> >> (6) Allow the same wide_int constant to have several different representations.
> >>     Remember that this is to some extent what Kenny's original implementation
> >>     did, since partial HWIs were filled with don't-cares.  You and I are the
> >>     ones who argued that each wide_int should have a single representation.
> >
> > As far as I can see when we need to extend the tree rep to a bigger
> > precision (extend according to its sign) the only thing we have to
> > do to make the wide-int rep canonical is sign-extend the tree rep
> > at the desired precision.
> 
> That's what the new representation does.  It sign-extends from
> TYPE_PRECISION to any value greater than TYPE_PRECISION.
> 
> > And as we never truncate this sign-extension
> > is a no-op for signed tree constants, is a no-op for unsigned
> > tree constants when precision > xprecision but not for unsigned tree
> > constants and precision == xprecision when the MSB is set.
> 
> Are you talking about addr_wide_int and max_wide_int only,
> or wide_int too?  The above is right for addr_wide_int and max_wide_int,
> and is also all that decompose does for those cases.  The small_prec case
> is compiled out because the "precision" parameter is constant and
> known at compile time to be a HWI multiple.
> 
> But the small_prec case is needed for wide_ints whenever len == max_len.
> That includes the precision == xprecision case, but also something like:
> 
>      precision == 13, xprecision == 12
> 
> Which is unlikely to occur in practice, but given that the code is
> still needed for precision == xprecision == 12, there's not much
> point forbidding it.
> 
> Thanks,
> Richard
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend

