This is the mail archive of the
`gcc-patches@gcc.gnu.org`
mailing list for the GCC project.


*From*: Richard Sandiford <rsandifo at linux dot vnet dot ibm dot com>
*To*: Richard Biener <rguenther at suse dot de>
*Cc*: Kenneth Zadeck <zadeck at naturalbridge dot com>, gcc-patches at gcc dot gnu dot org, Mike Stump <mikestump at comcast dot net>
*Date*: Thu, 17 Oct 2013 11:58:51 +0100
*Subject*: Re: [wide-int] int_traits <tree>
*Authentication-results*: sourceware.org; auth=none
*References*: <alpine dot LNX dot 2 dot 00 dot 1310161537220 dot 11149 at zhemvz dot fhfr dot qr> <525EB50F dot 2090003 at naturalbridge dot com> <87ppr56on1 dot fsf at talisman dot default> <525EFC33 dot 4010304 at naturalbridge dot com> <87hacg720z dot fsf at talisman dot default> <alpine dot LNX dot 2 dot 00 dot 1310171015000 dot 11149 at zhemvz dot fhfr dot qr>

Richard Biener <rguenther@suse.de> writes:
> On Thu, 17 Oct 2013, Richard Sandiford wrote:
>
>> Kenneth Zadeck <zadeck@naturalbridge.com> writes:
>> >> As mentioned in my message yesterday, I thought your new way of canonising
>> >> unsigned tree constants meant that there was always an upper zero bit.
>> >> Is that right?
>> > i believe this is correct.
>> >> If so, xprecision < precision is a no-op, because the number always
>> >> has the right form for wider precisions.  The only difficult case is
>> >> xprecision == precision, since then we need to peel off any upper -1 HWIs.
>> > say my HWI size is 8 bits (just to keep from typing a million 'f's.  if
>> > i have a 16 bit unsigned number that is all 1s in the tree world it is 3
>> > hwis
>> > 0x00 0xff 0xff.
>> >
>> > but inside regular wide int, it would take 1 wide int whose value is 0xff.
>> > inside of max it would be the same as the tree, but then the test
>> > precision < xprecision + hbpwi never kicks in because precision is
>> > guaranteed to be huge.
>> >
>> > inside of addr_wide_int, i think we tank with the assertion.
>>
>> It should be OK for addr_wide_int too.  The precision still fits 2 HWIs.
>> The initial length is greater than the maximum length of an addr_wide_int,
>> but your "len = MAX (len, max_len)" deals with that.
>
> It's
>
>   len = MIN (len, max_len)

Oops, yeah, I meant MIN, sorry.

> which looked suspicious to me, but with precision >= xprecision
> precision can only be zero if xprecision is zero which looked to
> me like it cannot happen - or rather it should be fixed.

Despite the comment above the code, I don't think this MIN is there
for the zero-precision case.  I think it's there to handle the new
tree representation.  The new tree representation can have a length
greater than max_len for an unsigned tree constant that occupies a
whole number of HWIs.  The tree representation of an unsigned 0x8000
is 0x00 0x80 0x00.  When extended to max_wide_int the representation
is the same.
But a 2-HWI addr_wide_int would be 0x80 0x00, without the leading zero.
The MIN trims the length from 3 to 2 in the last case.

>> > the case actually comes up on the ppc because they do a lot of 128 bit
>> > math.  I think i got thru the x86-64 without noticing this.
>>
>> Well, it'd be suspicious if we're directly using 128-bit numbers
>> in addr_wide_int.  The justification for the assertion was that we
>> should explicitly truncate to addr_wide_int when deliberately
>> ignoring upper bits, beyond bit or byte address width.  128 bits
>> definitely falls into that category on powerpc.
>
> My question is whether with 8-bit HWI 0x00 0xff 0xff is a valid
> wide-int value if it has precision 16.

No, for a 16-bit wide_int it should be 0xff.  0x00 0xff 0xff is
correct for any wide_int wider than 16 bits though.

> AFAIK that is what the code produces,

In which case?  This is:

  precision == 16
  xprecision == 16
  len == 3
  max_len == 2

The MIN trims the len to 2 and then the loop Kenny added trims it
again to 1, so the "0x00 0xff 0xff" becomes "0xff".  The "0x00 0xff"
is still there in the array, but not used.

> but now Kenny says this is only for some kind
> of wide-ints but not all?  That is, why is
>
> inline wi::storage_ref
> wi::int_traits <const_tree>::decompose (HOST_WIDE_INT *scratch,
>                                         unsigned int precision, const_tree x)
> {
>   unsigned int len = TREE_INT_CST_NUNITS (x);
>   const HOST_WIDE_INT *val = (const HOST_WIDE_INT *) &TREE_INT_CST_ELT (x, 0);
>   return wi::storage_ref (val, len, precision);
> }
>
> not a valid implementation together with making sure that the
> INTEGER_CST tree rep has that extra word of zeros if required?

The fundamental problem here is that we're trying to support
two cases:

  (a) doing N-bit arithmetic in cases where the inputs have N bits

  (b) doing N-bit arithmetic in cases where the inputs have fewer
      than N bits and are extended according to TYPE_SIGN.

Let's assume 32-bit HWIs.
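The length trimming discussed here (the MIN followed by the loop that drops redundant upper blocks) can be sketched as follows. This is a hypothetical model rather than the GCC code, using 8-bit blocks as in Kenny's example, stored least-significant block first; `trim_len` is an invented name:

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical model, not the GCC implementation.  "hwi" is 8 bits
// wide as in Kenny's example; val[0] is the least-significant block.
typedef int8_t hwi;

unsigned int
trim_len (const hwi *val, unsigned int len, unsigned int max_len)
{
  // The MIN discussed above: an unsigned tree constant can carry one
  // more block than the target wide-int type can hold.
  len = std::min (len, max_len);
  // The loop Kenny added: drop upper blocks that are redundant
  // sign-extensions of the block below them.
  while (len > 1 && val[len - 1] == (val[len - 2] < 0 ? -1 : 0))
    len--;
  return len;
}
```

With this model the 16-bit all-ones constant, stored as the three blocks { 0xff, 0xff, 0x00 }, trims to a single 0xff block when max_len is 2, while unsigned 0x8000, stored as { 0x00, 0x80, 0x00 }, trims to the 2-block addr_wide_int form 0x80 0x00, matching the cases above.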
The 16-bit (4-hex-digit) constant 0x8000 is 0x8000 regardless of
whether the type is signed or unsigned.  But if it's extended to
32 bits you get two different numbers, 0xffff8000 and 0x00008000,
depending on the sign.  So for one value of the "precision" parameter
(i.e. xprecision), signed and unsigned constants produce the same
number.  But for another value of the "precision" parameter (those
greater than xprecision), signed and unsigned constants produce
different numbers.  Yet at the moment the tree constant has a single
representation.

So I think the possibilities are:

  (1) Use the representation of an N-bit wide_int to store N-bit
      tree constants.  Do work when extending them to wider wide_ints.

  (2) Use the representation of max_wide_int to store N-bit tree
      constants.  Do work when creating an N-bit wide_int.

  (3) Store both representations in the tree constant.

  (4) Require all tree arithmetic to be done in the same way as rtl
      arithmetic, with explicit extensions.  This gets rid of case (b).

  (5) Require all tree arithmetic to be done in wider wide_ints than
      the inputs, which I think is what you preferred.  This gets rid
      of case (a).

  (6) Allow the same wide_int constant to have several different
      representations.  Remember that this is to some extent what
      Kenny's original implementation did, since partial HWIs were
      filled with don't-cares.  You and I are the ones who argued
      that each wide_int should have a single representation.

> [This function shows another optimization issue:
>
>     case BOOLEAN_TYPE:
>       /* Cache false or true.  */
>       limit = 2;
>       if (wi::leu_p (cst, 1))
>         ix = cst.to_uhwi ();
>
> I would have expected cst <= 1 be optimized to cst.len == 1 &&
> cst.val[0] <= 1.
> It expands to
>
> <L27>:
> MEM[(long int *)&D.50698 + 16B] = 1;
> MEM[(struct wide_int_ref_storage *)&D.50698] = &MEM[(struct
> wide_int_ref_storage *)&D.50698].scratch;
> MEM[(struct wide_int_ref_storage *)&D.50698 + 8B] = 1;
> MEM[(struct wide_int_ref_storage *)&D.50698 + 12B] = 32;
> _277 = MEM[(const struct wide_int_storage *)&cst + 260B];
> if (_277 <= 64)
>   goto <bb 42>;
> else
>   goto <bb 43>;
>
> <bb 42>:
> xl_491 = zext_hwi (1, 32);  // ok, checking enabled and thus out-of-line
> _494 = MEM[(const long int *)&cst];
> _495 = (long unsigned int) _494;
> yl_496 = zext_hwi (_495, _277);
> _497 = xl_491 < yl_496;
> goto <bb 44>;
>
> <bb 43>:
> _503 = wi::ltu_p_large (&MEM[(struct wide_int_ref_storage
> *)&D.50698].scratch, 1, 32, &MEM[(const struct wide_int_storage
> *)&cst].val, len_274, _277);
>
> this keeps D.50698 and cst un-SRAable - inline storage is problematic
> for this reason.  But the representation should guarantee the
> compare with a low precision (32 bit) constant is evaluatable
> at compile-time if len of the larger value is > 1, no?
>
> <bb 44>:
> # _504 = PHI <_497(42), _503(43)>
> D.50698 ={v} {CLOBBER};
> if (_504 != 0)
>   goto <bb 45>;
> else
>   goto <bb 46>;
>
> <bb 45>:
> pretmp_563 = MEM[(const struct wide_int_storage *)&cst + 256B];
> goto <bb 229> (<L131>);
>
> <bb 46>:
> _65 = generic_wide_int<wide_int_storage>::to_uhwi (&cst, 0);
> ix_66 = (int) _65;
> goto <bb 91>;
>
> The question is whether we should try to optimize wide-int for
> such cases or simply not use wi:leu_p (cst, 1) but rather
>
>   if (cst.fits_uhwi_p () == 1 && cst.to_uhwi () < 1)
>
> ?

I think we should do the first, trying to optimise wide_int.

Thanks,
Richard
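The signed/unsigned extension point made earlier in the message is easy to check directly. A minimal standalone illustration in plain C++ (not wide-int code; `extend_to_32` is an invented name):

```cpp
#include <cstdint>

// The same 16-bit pattern 0x8000 extends to two different 32-bit
// numbers depending on the sign of the source type.
uint32_t
extend_to_32 (uint16_t bits, bool is_signed)
{
  if (is_signed)
    return (uint32_t) (int32_t) (int16_t) bits;  // sign-extend
  return (uint32_t) bits;                        // zero-extend
}
```

So extend_to_32 (0x8000, true) gives 0xffff8000 while extend_to_32 (0x8000, false) gives 0x00008000, which is why a single tree representation cannot serve both extensions without extra work somewhere.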

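The fast path under discussion for the BOOLEAN_TYPE check can be modelled like this. The names and struct are invented for illustration; the point, assuming the canonical representation and the unsigned reading described in the message, is that a value with len > 1 can never be unsigned-<= a small single-block constant, so the inline check suffices:

```cpp
#include <cstdint>

// Invented model of a canonical wide value: val[0] is the
// least-significant 64-bit block, len the number of significant blocks.
struct wide_val
{
  const uint64_t *val;
  unsigned int len;
};

// Unsigned "x <= c" for a small constant c: the inline form
// "len == 1 && val[0] <= c".  In canonical form a multi-block value
// exceeds any single-block unsigned constant, so an out-of-line
// ltu_p_large-style call would only ever be a fallback, not needed here.
bool
leu_small (const wide_val &x, uint64_t c)
{
  return x.len == 1 && x.val[0] <= c;
}
```

This is the shape of "cst.len == 1 && cst.val[0] <= 1" that the quoted GIMPLE dump fails to reduce to, and optimising wide_int so the compiler can see it is the first of the two options weighed above.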
**Follow-Ups**:

  * **Re: [wide-int] int_traits <tree>**, *From:* Richard Biener

**References**:

  * **[wide-int] int_traits <tree>**, *From:* Richard Biener
  * **Re: [wide-int] int_traits <tree>**, *From:* Kenneth Zadeck
  * **Re: [wide-int] int_traits <tree>**, *From:* Richard Sandiford
  * **Re: [wide-int] int_traits <tree>**, *From:* Kenneth Zadeck
  * **Re: [wide-int] int_traits <tree>**, *From:* Richard Sandiford
  * **Re: [wide-int] int_traits <tree>**, *From:* Richard Biener
