This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: wide-int branch now up for public comment and review

From: Richard Biener <rguenther at suse dot de>
To: Richard Sandiford <rdsandiford at googlemail dot com>
Cc: Kenneth Zadeck <zadeck at naturalbridge dot com>, gcc-patches <gcc-patches at gcc dot gnu dot org>, Mike Stump <mikestump at comcast dot net>
Date: Wed, 28 Aug 2013 13:57:54 +0200 (CEST)
Subject: Re: wide-int branch now up for public comment and review
Authentication-results: sourceware.org; auth=none
References: <520A9DCC dot 6080609 at naturalbridge dot com> <87ppt4e9hg dot fsf at talisman dot default> <alpine dot LNX dot 2 dot 00 dot 1308281040230 dot 20077 at zhemvz dot fhfr dot qr> <878uzmp2lg dot fsf at sandifor-thinkpad dot stglab dot manchester dot uk dot ibm dot com> <alpine dot LNX dot 2 dot 00 dot 1308281200380 dot 20077 at zhemvz dot fhfr dot qr> <87y57mnjjk dot fsf at sandifor-thinkpad dot stglab dot manchester dot uk dot ibm dot com>

On Wed, 28 Aug 2013, Richard Sandiford wrote:

> Richard Biener <rguenther@suse.de> writes:
> >> So the precision variable is good for the rtl level in several ways:
> >> 
> >> - As you say, it avoids adding the explicit truncations that (in practice)
> >>   every rtl operation would need
> >> 
> >> - It's more efficient in that case, since we don't calculate high values
> >>   and then discard them immediately.  The common GET_MODE_PRECISION (mode)
> >>   <= HOST_BITS_PER_WIDE_INT case stays a pure HWI operation, despite all
> >>   the wide-int trappings.
> >> 
> >> - It's a good way of checking type safety and making sure that excess
> >>   bits aren't accidentally given a semantic meaning.  This is the most
> >>   important reason IMO.
> >> 
> >> The branch has both the constant-precision, very wide integers that we
> >> want for trees and the variable-precision integers we want for rtl,
> >> so it's not an "either or".  With the accessor-based implementation,
> >> there should be very little cost to having both.
> >
> > So what I wonder (and where we maybe disagree) is how much code
> > wants to inspect "intermediate" results.  Say originally you have
> >
> > rtx foo (rtx x, rtx y)
> > {
> >   rtx tem = simplify_const_binary_operation (PLUS, GET_MODE (x), x, 
> > GEN_INT (1));
> >   rtx res = simplify_const_binary_operation (MINUS, GET_MODE (tem), tem, 
> > y);
> >   return res;
> > }
> >
> > and with wide-int you want to change that to
> >
> > rtx foo (rtx x, rtx y)
> > {
> >   wide_int tem = wide_int (x) + 1;
> >   wide_int res = tem - y;
> >   return res.to_rtx ();
> > }
> >
> > how much code ever wants to inspect 'tem' or 'res'?
> > That is, does it matter
> > if 'tem' and 'res' would have been calculated in "infinite precision"
> > and only to_rtx () would do the truncation to the desired mode?
> >
> > I think not.  The amount of code performing multiple operations on
> > _constants_ in sequence is extremely low (if it even exists).
> >
> > So I'd rather have to_rtx get a mode argument (or a precision) and
> > perform the required truncation / sign-extension at RTX construction
> > time (which is an expensive operation anyway).
> 
> I agree this is where we disagree.  I don't understand why you think
> the above is better.  Why do we want to do "infinite precision"
> addition of two values when only the lowest N bits of those values
> have a (semantically) defined meaning?  Earlier in the thread it sounded
> like we both agreed that having undefined bits in the _representation_
> was bad.  So why do we want to do calculations on parts of values that
> are undefined in the (rtx) semantics?
> 
> E.g. say we're adding two rtx values whose mode just happens to be
> HOST_BITS_PER_WIDE_INT in size.  Why does it make sense to calculate
> the carry from adding the two HWIs, only to add it to an upper HWI
> that has no semantically-defined value?  It's garbage in, garbage out.

Not garbage in, and not garbage out (just wasted work).  That's
the possible downside - the upside is to get rid of the notion of
a 'precision'.

But yes, it's good that we agree on the fact that undefined bits
in the _representation_ are wrong.

OTOH they still will be in some ways "undefined" if you consider

  wide_int xw = from_rtx (xr, mode);
  tree xt = to_tree (xw, type);
  wide_int xw2 = from_tree (xt);

with an unsigned type xw and xw2 will not be equal (in the
'extension' bits) for a value with MSB set.  That is, RTL
chooses to always sign-extend, tree chooses to extend according
to sign information.  wide-int chooses to ... ?  (it seems the
wide-int overall comment lost the part that defined its encoding,
but it seems that we still sign-extend val[len-1], so
(unsigned HOST_WIDE_INT)-1 is { -1U, 0 } with len == 2 and
(HOST_WIDE_INT)-1 is { -1 } with len == 1.  In RTL both
would be encoded with len == 1 (no distinction between a signed
and unsigned number with all bits set), on the current
tree representation the encoding would be with len == 1, too,
as we have TYPE_UNSIGNED to tell us the sign.

So we still need to somehow "map" between those representations.

Coming from the tree side (as opposed to from the RTL side) I'd
have argued you need a 'signed' flag ;)  We side-stepped that
by doing the extension trick in a way that preserves sign information.

Looking at the RTL representation from that wide-int representation
makes RTL look as if all constants are signed.  That's fine if
code that wants to do unsigned stuff properly extends.  So - do
we need a from_rtx_unsigned () constructor that does this?
I'm worried about all the extensions done in operations like add ():

  if (p1 <= HOST_BITS_PER_WIDE_INT)
    {
      result.len = 1;
      result.precision = p1;
      result.val[0] = val[0] + s[0];
      if (p1 < HOST_BITS_PER_WIDE_INT)
        result.val[0] = sext_hwi (result.val[0], p1);
      if (sgn == SIGNED)
        {
          HOST_WIDE_INT x
            = (((result.val[0] ^ val[0]) & (result.val[0] ^ s[0]))
               >> (p1 - 1)) & 1;
          *overflow = (x != 0);
        }
      else
        *overflow = ((unsigned HOST_WIDE_INT) result.val[0]
                     < (unsigned HOST_WIDE_INT) val[0]);
      }

that's supposed to be a cheap operation ... and as far as I can see
it even computes wrong outcome :/  Given precision 32 and the
unsigned value 0xfffffffe, thus { 0x00000000fffffffe }, and
adding 0 it will produce { 0xfffffffffffffffe }, the signed value -2.

So that

      if (p1 < HOST_BITS_PER_WIDE_INT)
        result.val[0] = sext_hwi (result.val[0], p1);

is clearly wrong.  Yes, it may be needed when constructing a
RTX with that value.

Richard.

Follow-Ups:
- Re: wide-int branch now up for public comment and review
  - From: Richard Sandiford

References:
- wide-int branch now up for public comment and review
  - From: Kenneth Zadeck
- Re: wide-int branch now up for public comment and review
  - From: Richard Sandiford
- Re: wide-int branch now up for public comment and review
  - From: Richard Biener
- Re: wide-int branch now up for public comment and review
  - From: Richard Sandiford
- Re: wide-int branch now up for public comment and review
  - From: Richard Biener
- Re: wide-int branch now up for public comment and review
  - From: Richard Sandiford

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]