This is the mail archive of the
`gcc-patches@gcc.gnu.org`
mailing list for the GCC project.

Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|

Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |

Other format: | [Raw text] |

*From*: Kenneth Zadeck <zadeck at naturalbridge dot com>*To*: Richard Biener <rguenther at suse dot de>, Richard Sandiford <rsandifo at linux dot vnet dot ibm dot com>*Cc*: gcc-patches at gcc dot gnu dot org, Mike Stump <mikestump at comcast dot net>*Date*: Thu, 17 Oct 2013 08:43:04 -0400*Subject*: Re: [wide-int] int_traits <tree>*Authentication-results*: sourceware.org; auth=none*References*: <alpine dot LNX dot 2 dot 00 dot 1310161537220 dot 11149 at zhemvz dot fhfr dot qr> <525EB50F dot 2090003 at naturalbridge dot com> <87ppr56on1 dot fsf at talisman dot default> <525EFC33 dot 4010304 at naturalbridge dot com> <87hacg720z dot fsf at talisman dot default> <alpine dot LNX dot 2 dot 00 dot 1310171015000 dot 11149 at zhemvz dot fhfr dot qr> <87k3hc9n9w dot fsf at sandifor-thinkpad dot stglab dot manchester dot uk dot ibm dot com> <alpine dot LNX dot 2 dot 00 dot 1310171309390 dot 11149 at zhemvz dot fhfr dot qr>

On 10/17/2013 07:30 AM, Richard Biener wrote:

On Thu, 17 Oct 2013, Richard Sandiford wrote:Richard Biener <rguenther@suse.de> writes:On Thu, 17 Oct 2013, Richard Sandiford wrote:Kenneth Zadeck <zadeck@naturalbridge.com> writes:As mentioned in my message yesterday, I thought your new way of canonising unsigned tree constants meant that there was always an upper zero bit. Is that right?i believe this is correct.If so, xprecision < precision is a no-op, because the number always has the right form for wider precisions. The only difficult case is xprecision == precision, since then we need to peel off any upper -1 HWIs.say my HWI size is 8 bits (just to keep from typing a million 'f's. if i have a 16 bit unsigned number that is all 1s in the tree world it is 3 hwis 0x00 0xff 0xff. but inside regular wide int, it would take 1 wide int whose value is 0xff. inside of max it would be the same as the tree, but then the test precision < xprecision + hbpwi never kicks in because precision is guaranteed to be huge. inside of addr_wide_int, i think we tank with the assertion.It should be OK for addr_wide_int too. The precision still fits 2 HWIs. The initial length is greater than the maximum length of an addr_wide_int, but your "len = MAX (len, max_len)" deals with that.It's len = MIN (len, max_len)Oops, yeah, I meant MIN, sorry.which looked suspicious to me, but with precision >= xprecision precision can only be zero if xprecision is zero which looked to me like it cannot happen - or rather it should be fixed.Despite the comment above the code, I don't think this MIN is there for the zero-precision case. I think it's there to handle the new tree representation. The new tree representation can have a length greater than max_len for an unsigned tree constant that occupies a whole number of HWIs. The tree representation of an unsigned 0x8000 is 0x00 0x80 0x00. When extended to max_wide_int the representation is the same. But a 2-HWI addr_wide_int would be 0x80 0x00, without the leading zero. The MIN trims the length from 3 to 2 in the last case.Oh, so it was the tree rep that changed? _Why_ was it changed? We still cannot use it directly from wide-int and the extra word is redundant because we have access to TYPE_UNSIGNED (TREE_TYPE ()).

the case actually comes up on the ppc because they do a lot of 128 bitmath. I think i got thru the x86-64 without noticing this.Well, it'd be suspicious if we're directly using 128-bit numbers in addr_wide_int. The justification for the assertion was that we should explicitly truncate to addr_wide_int when deliberately ignoring upper bits, beyond bit or byte address width. 128 bits definitely falls into that category on powerpc.My question is whether with 8-bit HWI 0x00 0xff 0xff is a valid wide-int value if it has precision 16.No, for a 16-bit wide_int it should be 0xff. 0x00 0xff 0xff is correct for any wide_int wider than 16 bits though.AFAIK that is what the code produces,In which case? This is: precision == 16 xprecision == 16 len == 3 max_len == 2 The MIN trims the len to 2 and then the loop Kenny added trims it again to 1, so the "0x00 0xff 0xff" becomes "0xff". The "0x00 0xff" is still there in the array, but not used.but now Kenny says this is only for some kind of wide-ints but not all? That is, why is inline wi::storage_ref wi::int_traits <const_tree>::decompose (HOST_WIDE_INT *scratch, unsigned int precision, const_tree x) { unsigned int len = TREE_INT_CST_NUNITS (x); const HOST_WIDE_INT *val = (const HOST_WIDE_INT *) &TREE_INT_CST_ELT (x, 0); return wi::storage_ref (val, len, precision); } not a valid implementation together with making sure that the INTEGER_CST tree rep has that extra word of zeros if required?The fundamental problem here is that we're trying to support two cases: (a) doing N-bit arithemtic in cases where the inputs have N bits (b) doing N-bit arithmetic in cases where the inputs have fewer than N bits and are extended according to TYPE_SIGN. Let's assume 32-bit HWIs. The 16-bit (4-hex-digit) constant 0x8000 is 0x8000 regardless of whether the type is signed or unsigned. But if it's extended to 32-bits you get two different numbers 0xffff8000 and 0x00008000, depending on the sign. So for one value of the "precision" parameter (i.e. xprecision), signed and unsigned constants produce the same number. But for another value of the "precision" parameter (those greater than xprecision), signed and unsigned constants produce different numbers. Yet at the moment the tree constant has a single representation.But a correctly extended one, up to its len! (as opposed to RTL)So I think the possibilities are: (1) Use the representation of an N-bit wide_int to store N-bit tree constants. Do work when extending them to wider wide_ints. (2) Use the representation of max_wide_int to store N-bit tree constants. Do work when creating an N-bit wide_int. (3) Store both representations in the tree constant. (4) Require all tree arithmetic to be done in the same way as rtl arithmetic, with explicit extensions. This gets rid of case (b). (5) Require all tree arithemtic to be done in wider wide_ints than the inputs, which I think is what you preferred. This gets rid of case (a). (6) Allow the same wide_int constant to have several different representations. Remember that this is to some extent what Kenny's original implementation did, since partial HWIs were filled with don't-cares. You and I are the ones who argued that each wide_int should have a single representation.As far as I can see when we need to extend the tree rep to a bigger precision (extend according to its sign) the only thing we have to do to make the wide-int rep canonical is sign-extend the tree rep at the desired precision. And as we never truncate this sign-extension is a no-op for signed tree constants, is a no-op for unsigned tree constants when precision > xprecision but not for unsigned tree constants and precision == xprecision when the MSB is set. I don't see where adding the fake zero padding for unsigneds in the tree rep helps here. Thus, Index: gcc/tree.h =================================================================== --- gcc/tree.h (revision 203743) +++ gcc/tree.h (working copy) @@ -5189,28 +5189,28 @@ wi::int_traits <const_tree>::decompose ( /* Got to be careful of precision 0 values. */ if (precision) len = MIN (len, max_len); - if (TYPE_SIGN (TREE_TYPE (x)) == UNSIGNED) + if (TYPE_SIGN (TREE_TYPE (x)) == UNSIGNED + && precision == xprecision + && len == max_len) { unsigned int small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1); - if (small_prec) + if (small_prec + && len == max_len) { /* We have to futz with this because the canonization for short unsigned numbers in wide-int is different from the canonized short unsigned numbers in the tree-cst. */ - if (len == max_len) - { - for (unsigned int i = 0; i < len - 1; i++) - scratch[i] = val[i]; - scratch[len - 1] = sext_hwi (val[len - 1], precision); - return wi::storage_ref (scratch, len, precision); - } + for (unsigned int i = 0; i < len - 1; i++) + scratch[i] = val[i]; + scratch[len - 1] = sext_hwi (val[len - 1], small_prec); + while (len > 1 && scratch[len - 1] == (HOST_WIDE_INT)-1) + len--; + return wi::storage_ref (scratch, len, precision); } /* We have to futz here because a large unsigned int with precision 128 may look (0x0 0xFFFFFFFFFFFFFFFF 0xF...) as a tree-cst and as (0xF...) as a wide-int. */ - else if (precision == xprecision && len == max_len) - while (len > 1 && val[len - 1] == (HOST_WIDE_INT)-1) - len--; + while (len > 1 && val[len - 1] == (HOST_WIDE_INT)-1) + len--; }/* Signed and the rest of the unsigned cases are easy. */explain: the current code misses compressing the now sign-extended value in the small_prec case (and it still sext_hwi's to the wrong precision if precision > HOST_BITS_PER_WIDE_INT). It restricts special-casing to the only relevant case, no explicit or implicit extension. ? Richard.[This function shows another optimization issue: case BOOLEAN_TYPE: /* Cache false or true. */ limit = 2; if (wi::leu_p (cst, 1)) ix = cst.to_uhwi (); I would have expected cst <= 1 be optimized to cst.len == 1 && cst.val[0] <= 1. It expands to <L27>: MEM[(long int *)&D.50698 + 16B] = 1; MEM[(struct wide_int_ref_storage *)&D.50698] = &MEM[(struct wide_int_ref_storage *)&D.50698].scratch; MEM[(struct wide_int_ref_storage *)&D.50698 + 8B] = 1; MEM[(struct wide_int_ref_storage *)&D.50698 + 12B] = 32; _277 = MEM[(const struct wide_int_storage *)&cst + 260B]; if (_277 <= 64) goto <bb 42>; else goto <bb 43>; <bb 42>: xl_491 = zext_hwi (1, 32); // ok, checking enabled and thus out-of-line _494 = MEM[(const long int *)&cst]; _495 = (long unsigned int) _494; yl_496 = zext_hwi (_495, _277); _497 = xl_491 < yl_496; goto <bb 44>; <bb 43>: _503 = wi::ltu_p_large (&MEM[(struct wide_int_ref_storage *)&D.50698].scratch, 1, 32, &MEM[(const struct wide_int_storage *)&cst].val, len_274, _277); this keeps D.50698 and cst un-SRAable - inline storage is problematic for this reason. But the representation should guarantee the compare with a low precision (32 bit) constant is evaluatable at compile-time if len of the larger value is > 1, no? <bb 44>: # _504 = PHI <_497(42), _503(43)> D.50698 ={v} {CLOBBER}; if (_504 != 0) goto <bb 45>; else goto <bb 46>; <bb 45>: pretmp_563 = MEM[(const struct wide_int_storage *)&cst + 256B]; goto <bb 229> (<L131>); <bb 46>: _65 = generic_wide_int<wide_int_storage>::to_uhwi (&cst, 0); ix_66 = (int) _65; goto <bb 91>; The question is whether we should try to optimize wide-int for such cases or simply not use wi:leu_p (cst, 1) but rather if (cst.fits_uhwi_p () == 1 && cst.to_uhwi () < 1) ?I think we should do the first, trying to optimise wide_int. Thanks, Richard

**Follow-Ups**:**Re: [wide-int] int_traits <tree>***From:*Richard Biener

**References**:**[wide-int] int_traits <tree>***From:*Richard Biener

**Re: [wide-int] int_traits <tree>***From:*Kenneth Zadeck

**Re: [wide-int] int_traits <tree>***From:*Richard Sandiford

**Re: [wide-int] int_traits <tree>***From:*Kenneth Zadeck

**Re: [wide-int] int_traits <tree>***From:*Richard Sandiford

**Re: [wide-int] int_traits <tree>***From:*Richard Biener

**Re: [wide-int] int_traits <tree>***From:*Richard Sandiford

**Re: [wide-int] int_traits <tree>***From:*Richard Biener

Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|

Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |