This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [wide-int] int_traits <tree>


Richard Biener <rguenther@suse.de> writes:
> On Thu, 17 Oct 2013, Richard Sandiford wrote:
>
>> Kenneth Zadeck <zadeck@naturalbridge.com> writes:
>> >> As mentioned in my message yesterday, I thought your new way of canonising
>> >> unsigned tree constants meant that there was always an upper zero bit.
>> >> Is that right?
>> > i believe this is correct.
>> >> If so, xprecision < precision is a no-op, because the number always
>> >> has the right form for wider precisions.  The only difficult case is
>> >> xprecision == precision, since then we need to peel off any upper -1 HWIs.
>> > say my HWI size is 8 bits (just to keep from typing a million 'f's).  if
>> > i have a 16 bit unsigned number that is all 1s, in the tree world it is 3
>> > hwis:
>> > 0x00 0xff 0xff.
>> >
>> > but inside regular wide int, it would take 1 hwi whose value is 0xff.
>> > inside of max it would be the same as the tree, but then the test 
>> > precision < xprecision + hbpwi never kicks in because precision is 
>> > guaranteed to be huge.
>> >
>> > inside of addr_wide_int, i think we tank with the assertion.
>> 
>> It should be OK for addr_wide_int too.  The precision still fits 2 HWIs.
>> The initial length is greater than the maximum length of an addr_wide_int,
>> but your "len = MAX (len, max_len)" deals with that.
>
> It's
>
>   len = MIN (len, max_len)

Oops, yeah, I meant MIN, sorry.

> which looked suspicious to me, but with precision >= xprecision
> precision can only be zero if xprecision is zero which looked to
> me like it cannot happen - or rather it should be fixed.

Despite the comment above the code, I don't think this MIN is there
for the zero-precision case.  I think it's there to handle the new
tree representation.

The new tree representation can have a length greater than max_len
for an unsigned tree constant that occupies a whole number of HWIs.
The tree representation of an unsigned 0x8000 is 0x00 0x80 0x00.
When extended to max_wide_int the representation is the same.
But a 2-HWI addr_wide_int would be 0x80 0x00, without the leading zero.
The MIN trims the length from 3 to 2 in the last case.
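
As a rough illustration of those three lengths, here is a toy model of the
encoding, assuming 8-bit HWIs as in Kenny's example.  It is only a sketch:
toy_hwi, encode_unsigned_tree and trimmed_len are names made up for this
illustration, not functions on the branch, and the blocks are stored least
significant first (the reverse of the order written above).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model only: an "HWI" is 8 bits wide, as in the example in this thread.
using toy_hwi = int8_t;
constexpr unsigned toy_hwi_bits = 8;

// Hypothetical helper: encode an unsigned constant of PRECISION bits so that
// it reads correctly at any wider precision.  Blocks are stored least
// significant first.
std::vector<toy_hwi>
encode_unsigned_tree (uint64_t value, unsigned precision)
{
  assert (precision > 0 && precision < 64 && (value >> precision) == 0);
  unsigned nblocks = (precision + toy_hwi_bits - 1) / toy_hwi_bits;
  std::vector<toy_hwi> blocks;
  for (unsigned i = 0; i < nblocks; ++i)
    blocks.push_back (static_cast<toy_hwi> ((value >> (i * toy_hwi_bits)) & 0xff));

  // An unsigned value whose top block looks negative needs an extra zero
  // block, otherwise it would be read as sign-extended at wider precisions.
  if (blocks.back () < 0)
    blocks.push_back (0);

  // Drop upper blocks that are pure sign extension of the block below.
  while (blocks.size () > 1)
    {
      toy_hwi top = blocks.back ();
      toy_hwi next = blocks[blocks.size () - 2];
      if ((top == 0 && next >= 0) || (top == -1 && next < 0))
        blocks.pop_back ();
      else
        break;
    }
  return blocks;
}

// Hypothetical helper: the "len = MIN (len, max_len)" step for a target
// wide_int of PRECISION bits.
unsigned
trimmed_len (unsigned len, unsigned precision)
{
  unsigned max_len = (precision + toy_hwi_bits - 1) / toy_hwi_bits;
  return std::min (len, max_len);
}
```

In this model encode_unsigned_tree (0x8000, 16) yields three blocks, and
trimmed_len (3, 16) gives 2, while a 24-bit or wider target keeps all three.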

>> > the case actually comes up on the ppc because they do a lot of 128 bit 
>> > math.    I think i got thru the x86-64 without noticing this.
>> 
>> Well, it'd be suspicious if we're directly using 128-bit numbers
>> in addr_wide_int.  The justification for the assertion was that we
>> should explicitly truncate to addr_wide_int when deliberately
>> ignoring upper bits, beyond bit or byte address width.  128 bits
>> definitely falls into that category on powerpc.
>
> My question is whether with 8-bit HWI 0x00 0xff 0xff is a valid
> wide-int value if it has precision 16.

No, for a 16-bit wide_int it should be 0xff.  0x00 0xff 0xff is correct
for any wide_int wider than 16 bits though.

> AFAIK that is what the code produces,

In which case?  This is:

   precision == 16
   xprecision == 16
   len == 3
   max_len == 2

The MIN trims the len to 2 and then the loop Kenny added trims it
again to 1, so the "0x00 0xff 0xff" becomes "0xff".  The "0x00 0xff"
is still there in the array, but not used.
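
The two trimming steps can be sketched as follows, again assuming 8-bit HWIs
and least-significant-first storage.  toy_decompose_len is a hypothetical
name, and the loop is a guess at the shape of the one Kenny added rather than
a copy of it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Toy model only: an "HWI" is 8 bits wide.
using toy_hwi = int8_t;
constexpr unsigned toy_hwi_bits = 8;

// Hypothetical sketch: given the LEN blocks of a tree constant of type
// precision XPRECISION, return the length to use for a wide_int of
// PRECISION bits.  VAL is least significant block first.
unsigned
toy_decompose_len (const toy_hwi *val, unsigned len,
                   unsigned precision, unsigned xprecision)
{
  assert (len > 0 && precision >= xprecision && xprecision > 0);

  // Step 1: never use more blocks than the target precision can hold.
  unsigned max_len = (precision + toy_hwi_bits - 1) / toy_hwi_bits;
  len = std::min (len, max_len);

  // Step 2: when the precisions match, peel off upper -1 blocks that are
  // now just the sign extension of the block below them.
  if (precision == xprecision)
    while (len > 1 && val[len - 1] == -1 && val[len - 2] < 0)
      --len;

  return len;
}
```

With val = { -1, -1, 0 } (the 16-bit all-ones constant) and both precisions
equal to 16, the length goes from 3 to 2 to 1.  With a 32-bit target the
length stays at 3.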

> but now Kenny says this is only for some kind
> of wide-ints but not all?  That is, why is
>
> inline wi::storage_ref
> wi::int_traits <const_tree>::decompose (HOST_WIDE_INT *scratch,
>                                         unsigned int precision, const_tree x)
> {
>   unsigned int len = TREE_INT_CST_NUNITS (x);
>   const HOST_WIDE_INT *val = (const HOST_WIDE_INT *) &TREE_INT_CST_ELT (x, 0);
>   return wi::storage_ref (val, len, precision);
> }
>
> not a valid implementation together with making sure that the
> INTEGER_CST tree rep has that extra word of zeros if required?

The fundamental problem here is that we're trying to support two cases:

(a) doing N-bit arithmetic in cases where the inputs have N bits
(b) doing N-bit arithmetic in cases where the inputs have fewer than N bits
    and are extended according to TYPE_SIGN.

Let's assume 32-bit HWIs.  The 16-bit (4-hex-digit) constant 0x8000 is
0x8000 regardless of whether the type is signed or unsigned.  But if it's
extended to 32 bits you get two different numbers, 0xffff8000 and 0x00008000,
depending on the sign.

So for one value of the "precision" parameter (i.e. xprecision), signed
and unsigned constants produce the same number.  But for another value
of the "precision" parameter (those greater than xprecision), signed and
unsigned constants produce different numbers.  Yet at the moment the tree
constant has a single representation.
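
A minimal sketch of that difference, using plain 32-bit arithmetic rather
than anything from wide-int.h (extend_to_32 is a made-up name):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: interpret the low XPRECISION bits of BITS as a
// signed or unsigned number and extend it to 32 bits.
uint32_t
extend_to_32 (uint32_t bits, unsigned xprecision, bool is_signed)
{
  assert (xprecision >= 1 && xprecision <= 32);
  if (xprecision == 32)
    return bits;

  // Keep only the bits that belong to the narrower type.
  uint32_t mask = (uint32_t (1) << xprecision) - 1;
  bits &= mask;
  if (!is_signed)
    return bits;                        // zero extension

  // Sign extension: flip the sign bit, then subtract it back out.
  uint32_t sign_bit = uint32_t (1) << (xprecision - 1);
  return (bits ^ sign_bit) - sign_bit;
}
```

Here extend_to_32 (0x8000, 16, true) is 0xffff8000 and
extend_to_32 (0x8000, 16, false) is 0x00008000, even though the 16 input
bits are identical.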

So I think the possibilities are:

(1) Use the representation of an N-bit wide_int to store N-bit tree constants.
    Do work when extending them to wider wide_ints.

(2) Use the representation of max_wide_int to store N-bit tree constants.
    Do work when creating an N-bit wide_int.

(3) Store both representations in the tree constant.

(4) Require all tree arithmetic to be done in the same way as rtl arithmetic,
    with explicit extensions.  This gets rid of case (b).

(5) Require all tree arithmetic to be done in wider wide_ints than the inputs,
    which I think is what you preferred.  This gets rid of case (a).

(6) Allow the same wide_int constant to have several different representations.
    Remember that this is to some extent what Kenny's original implementation
    did, since partial HWIs were filled with don't-cares.  You and I are the
    ones who argued that each wide_int should have a single representation.

> [This function shows another optimization issue:
>
>     case BOOLEAN_TYPE:
>       /* Cache false or true.  */
>       limit = 2;
>       if (wi::leu_p (cst, 1))
>         ix = cst.to_uhwi ();
>
> I would have expected cst <= 1 be optimized to cst.len == 1 &&
> cst.val[0] <= 1.  It expands to
>
> <L27>:
>   MEM[(long int *)&D.50698 + 16B] = 1;
>   MEM[(struct wide_int_ref_storage *)&D.50698] = &MEM[(struct wide_int_ref_storage *)&D.50698].scratch;
>   MEM[(struct wide_int_ref_storage *)&D.50698 + 8B] = 1;
>   MEM[(struct wide_int_ref_storage *)&D.50698 + 12B] = 32;
>   _277 = MEM[(const struct wide_int_storage *)&cst + 260B];
>   if (_277 <= 64)
>     goto <bb 42>;
>   else
>     goto <bb 43>;
>
>   <bb 42>:
>   xl_491 = zext_hwi (1, 32);  // ok, checking enabled and thus out-of-line
>   _494 = MEM[(const long int *)&cst];
>   _495 = (long unsigned int) _494;
>   yl_496 = zext_hwi (_495, _277);
>   _497 = xl_491 < yl_496;
>   goto <bb 44>;
>
>   <bb 43>:
>   _503 = wi::ltu_p_large (&MEM[(struct wide_int_ref_storage *)&D.50698].scratch, 1, 32, &MEM[(const struct wide_int_storage *)&cst].val, len_274, _277);
>
> this keeps D.50698 and cst un-SRAable - inline storage is problematic
> for this reason.  But the representation should guarantee the
> compare with a low precision (32 bit) constant is evaluatable
> at compile-time if len of the larger value is > 1, no?
>
>   <bb 44>:
>   # _504 = PHI <_497(42), _503(43)>
>   D.50698 ={v} {CLOBBER};
>   if (_504 != 0)
>     goto <bb 45>;
>   else
>     goto <bb 46>;
>
>   <bb 45>:
>   pretmp_563 = MEM[(const struct wide_int_storage *)&cst + 256B];
>   goto <bb 229> (<L131>);
>
>   <bb 46>:
>   _65 = generic_wide_int<wide_int_storage>::to_uhwi (&cst, 0);
>   ix_66 = (int) _65;
>   goto <bb 91>;
>
> The question is whether we should try to optimize wide-int for
> such cases or simply not use wi::leu_p (cst, 1) but rather
>
>  if (cst.fits_uhwi_p () == 1 && cst.to_uhwi () <= 1)
>
> ?

I think we should do the first, trying to optimise wide_int.
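
The kind of fast path in question might look roughly like this.  It is a toy
model, not the wide-int.h code: it assumes 64-bit HWIs, least-significant-first
storage, and a single canonical representation in which the upper bits are
implied by sign-extending the top block.  toy_leu_one_p is a made-up name.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: return true if the canonical value VAL[0..LEN-1] of
// PRECISION bits is <= 1 when treated as unsigned.
bool
toy_leu_one_p (const int64_t *val, unsigned len, unsigned precision)
{
  assert (len > 0 && precision > 0);

  // A canonical value that needs more than one block does not fit in a
  // signed HWI, so as an unsigned number it is certainly greater than 1.
  if (len > 1)
    return false;

  uint64_t low = static_cast<uint64_t> (val[0]);
  if (precision < 64)
    // Drop the sign-extension bits above the precision (e.g. a 1-bit "1"
    // is stored as -1 in this model).
    low &= (uint64_t (1) << precision) - 1;
  else if (val[0] < 0)
    // The implied upper blocks are all ones, so the unsigned value is huge.
    return false;

  return low <= 1;
}
```

The point is that the len > 1 case can be answered without calling an
out-of-line routine, which is what the dump above ends up doing.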

Thanks,
Richard

