This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: wide-int branch now up for public comment and review
- From: Richard Biener <rguenther at suse dot de>
- To: Richard Sandiford <rdsandiford at googlemail dot com>
- Cc: Kenneth Zadeck <zadeck at naturalbridge dot com>, gcc-patches <gcc-patches at gcc dot gnu dot org>, Mike Stump <mikestump at comcast dot net>
- Date: Wed, 28 Aug 2013 14:48:31 +0200 (CEST)
- Subject: Re: wide-int branch now up for public comment and review
On Wed, 28 Aug 2013, Richard Sandiford wrote:
> Richard Biener <rguenther@suse.de> writes:
>
> > On Wed, 28 Aug 2013, Richard Sandiford wrote:
> >
> >> Richard Biener <rguenther@suse.de> writes:
> >> >> So the precision variable is good for the rtl level in several ways:
> >> >>
> >> >> - As you say, it avoids adding the explicit truncations that (in practice)
> >> >> every rtl operation would need
> >> >>
> >> >> - It's more efficient in that case, since we don't calculate high values
> >> >> and then discard them immediately. The common GET_MODE_PRECISION (mode)
> >> >> <= HOST_BITS_PER_WIDE_INT case stays a pure HWI operation, despite all
> >> >> the wide-int trappings.
> >> >>
> >> >> - It's a good way of checking type safety and making sure that excess
> >> >> bits aren't accidentally given a semantic meaning. This is the most
> >> >> important reason IMO.
> >> >>
> >> >> The branch has both the constant-precision, very wide integers that we
> >> >> want for trees and the variable-precision integers we want for rtl,
> >> >> so it's not an "either or". With the accessor-based implementation,
> >> >> there should be very little cost to having both.
> >> >
> >> > So what I wonder (and where we maybe disagree) is how much code
> >> > wants to inspect "intermediate" results. Say originally you have
> >> >
> >> > rtx foo (rtx x, rtx y)
> >> > {
> >> >   rtx tem = simplify_const_binary_operation (PLUS, GET_MODE (x), x,
> >> >                                              GEN_INT (1));
> >> >   rtx res = simplify_const_binary_operation (MINUS, GET_MODE (tem), tem,
> >> >                                              y);
> >> >   return res;
> >> > }
> >> >
> >> > and with wide-int you want to change that to
> >> >
> >> > rtx foo (rtx x, rtx y)
> >> > {
> >> >   wide_int tem = wide_int (x) + 1;
> >> >   wide_int res = tem - y;
> >> >   return res.to_rtx ();
> >> > }
> >> >
> >> > how much code ever wants to inspect 'tem' or 'res'?
> >> > That is, does it matter
> >> > if 'tem' and 'res' would have been calculated in "infinite precision"
> >> > and only to_rtx () would do the truncation to the desired mode?
> >> >
> >> > I think not. The amount of code performing multiple operations on
> >> > _constants_ in sequence is extremely low (if it even exists).
> >> >
> >> > So I'd rather have to_rtx get a mode argument (or a precision) and
> >> > perform the required truncation / sign-extension at RTX construction
> >> > time (which is an expensive operation anyway).
> >>
> >> I agree this is where we disagree. I don't understand why you think
> >> the above is better. Why do we want to do "infinite precision"
> >> addition of two values when only the lowest N bits of those values
> >> have a (semantically) defined meaning? Earlier in the thread it sounded
> >> like we both agreed that having undefined bits in the _representation_
> >> was bad. So why do we want to do calculations on parts of values that
> >> are undefined in the (rtx) semantics?
> >>
> >> E.g. say we're adding two rtx values whose mode just happens to be
> >> HOST_BITS_PER_WIDE_INT in size. Why does it make sense to calculate
> >> the carry from adding the two HWIs, only to add it to an upper HWI
> >> that has no semantically-defined value? It's garbage in, garbage out.
> >
> > Not garbage in, and not garbage out (just wasted work).
>
> Well, it's not garbage in the sense of an uninitialised HWI detected
> by valgrind (say). But it's semantic garbage.
>
> > That's the possible downside - the upside is to get rid of the notion
> > of a 'precision'.
>
> No, it's still there, just in a different place.
>
> > OTOH they still will be in some ways "undefined" if you consider
> >
> > wide_int xw = from_rtx (xr, mode);
> > tree xt = to_tree (xw, type);
> > wide_int xw2 = from_tree (xt);
> >
> > with an unsigned type xw and xw2 will not be equal (in the
> > 'extension' bits) for a value with MSB set.
>
> Do you mean it's undefined as things stand, or when using "infinite
> precision" for rtl? It shouldn't lead to anything undefined at
> the moment. Only the low GET_MODE_BITSIZE (mode) bits of xw are
> meaningful, but those are also the only bits that would be used.
>
> > That is, RTL chooses to always sign-extend, tree chooses to extend
> > according to sign information. wide-int chooses to ... ? (it seems
> > the wide-int overall comment lost the part that defined its encoding,
> > but it seems that we still sign-extend val[len-1], so (unsigned
> > HOST_WIDE_INT)-1 is { -1U, 0 } with len == 2 and (HOST_WIDE_INT)-1 is
> > { -1 } with len == 1).
>
> Only if the precision is > HOST_BITS_PER_WIDE_INT. If the precision
> is HOST_BITS_PER_WIDE_INT then both are { -1U }.
That wasn't my understanding of how things work.
> "len" is never
> greater than precision / HOST_BITS_PER_WIDE_INT (rounded up).
"len" can be one larger than precision / HOST_BITS_PER_WIDE_INT as
I originally designed the encoding scheme. It was supposed to
be able to capture the difference between a positive and a negative
number (unlike the RTL rep).
I see canonize() truncates to blocks_needed.
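To make the encoding question concrete, here is a minimal sketch of the behaviour described above, assuming 64-bit blocks stored least-significant first. The names block_t, BITS_PER_BLOCK and the body of canonize are hypothetical stand-ins for illustration, not the branch's actual code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for HOST_WIDE_INT and HOST_BITS_PER_WIDE_INT.
typedef int64_t block_t;
const unsigned BITS_PER_BLOCK = 64;

// Number of blocks required to hold PRECISION bits.
static unsigned
blocks_needed (unsigned precision)
{
  return (precision + BITS_PER_BLOCK - 1) / BITS_PER_BLOCK;
}

// Truncate VAL to blocks_needed (PRECISION) blocks, then drop any top
// block that only repeats the sign of the block below it.  The top
// remaining block is implicitly sign-extended.
static void
canonize (std::vector<block_t> &val, unsigned precision)
{
  unsigned max_len = blocks_needed (precision);
  if (val.size () > max_len)
    val.resize (max_len);
  while (val.size () > 1)
    {
      block_t top = val.back ();
      block_t below = val[val.size () - 2];
      block_t sign_fill = below < 0 ? -1 : 0;
      if (top != sign_fill)
        break;
      val.pop_back ();
    }
}
```

Under these assumptions, with a precision of 128 the unsigned all-ones value stays { -1, 0 } with len == 2 while signed -1 shrinks to { -1 }, and with a precision of 64 the truncation to blocks_needed makes both { -1 }.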
That said, I'm still missing one of my most important requests:
- all references to HOST_WIDE_INT (and HOST_BITS_PER_WIDE_INT and
friends) need to go and be replaced with a private typedef
and proper constants (apart from in the _hwi interface API of course)
- wide_int needs to work with the storage not being HOST_WIDE_INT,
in the end it should be HOST_WIDEST_FAST_INT, but for testing coverage
it ideally should work for 'signed char' as well (at _least_ it needs
to work for plain 'int')
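Roughly what I have in mind, as a sketch only: the block type becomes a single private parameter and every size is derived from it, so nothing else mentions HOST_WIDE_INT directly. The name wide_storage and its members are made up for illustration and do not exist on the branch.

```cpp
#include <cassert>
#include <climits>

// Hypothetical sketch: all knowledge of the storage type lives here.
template <typename Block>
struct wide_storage
{
  // Private typedef standing in for direct uses of HOST_WIDE_INT.
  typedef Block block_t;

  // Constant standing in for HOST_BITS_PER_WIDE_INT.
  static const unsigned bits_per_block = sizeof (Block) * CHAR_BIT;

  // Number of blocks required to hold PRECISION bits.
  static unsigned
  blocks_needed (unsigned precision)
  {
    return (precision + bits_per_block - 1) / bits_per_block;
  }
};
```

Instantiating this with 'signed char' forces multi-block paths even for small precisions, which is the testing coverage I am after; instantiating it with 'int' or a wider type exercises the same code with fewer blocks.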
> > In RTL both would be encoded with len == 1 (no
> > distinction between a signed and unsigned number with all bits set),
>
> Same again: both are -1 if the mode is HOST_BITS_PER_WIDE_INT or smaller.
> If the mode is wider then RTL too uses { -1, 0 }. So the current wide_int
> representation matches the RTL representation pretty closely, except for
> the part about wide_int leaving excess bits undefined. But that's just
> a convenience, it isn't important in terms of what operators do.
>
> > on the current tree representation the encoding would be with len ==
> > 1, too, as we have TYPE_UNSIGNED to tell us the sign.
>
> OK.
>
> > So we still need to somehow "map" between those representations.
>
> Right, that's what the constructors, from_* and to_* routines do.
I wonder where the from_tree and to_tree ones are? Are they
from_double_int / wide_int_to_tree (what's wide_int_to_infinite_tree?)?
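Whatever they end up being called, the mapping has to reconcile the two extension rules discussed above. A minimal sketch, assuming a single 64-bit holder and a precision below 64; sext and zext are hypothetical helper names, not routines from the branch:

```cpp
#include <cassert>
#include <cstdint>

// Zero-extend the low PRECISION bits of V (1 <= PRECISION < 64),
// as an unsigned tree type would.
static uint64_t
zext (uint64_t v, unsigned precision)
{
  uint64_t mask = (UINT64_C (1) << precision) - 1;
  return v & mask;
}

// Sign-extend the low PRECISION bits of V (1 <= PRECISION < 64),
// as RTL does for every constant.
static int64_t
sext (uint64_t v, unsigned precision)
{
  uint64_t sign = UINT64_C (1) << (precision - 1);
  uint64_t low = zext (v, precision);
  // Both operands fit in int64_t because PRECISION < 64.
  return (int64_t) (low ^ sign) - (int64_t) sign;
}
```

For an 8-bit value with the MSB set, the two results differ only above bit 7, so a round trip through an unsigned type changes the extension bits but not the low bits that carry meaning.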
> > Looking at the RTL representation from that wide-int representation
> > makes RTL look as if all constants are signed.
>
> Well, except for direct accessors like elt(), the wide-int representation
> is shielded from the interface (as it should be). It doesn't affect the
> result of arithmetic. The representation should have no visible effect
> for rtl or tree users who avoid elt().
True. Though it should be one that allows an efficient implementation.
Richard.