This is the mail archive of the mailing list for the GCC project.

Re: wide-int branch now up for public comment and review

On 08/23/2013 11:02 AM, Richard Sandiford wrote:
Hi Kenny,

This is the first time I've looked at the implementation of wide-int.h
(rather than just looking at the rtl changes, which as you know I like
in general), so FWIW here are some comments on wide-int.h.  I expect
a lot of them overlap with Richard B.'s comments.

I also expect many of them are going to be annoying, sorry, but this
first one definitely will.  The coding conventions say that functions
should be defined outside the class:

and that opening braces should be on their own line, so most of the file
needs to be reformatted.  I went through and made that change with the
patch below, in the process of reading through.  I also removed "SGN
must be SIGNED or UNSIGNED." because it seemed redundant when those are
the only two values available.  The patch fixes a few other coding standard
problems and typos, but I've not made any actual code changes (or at least,
I didn't mean to).
I had started the file with the functions outside of the class, and Mike had started putting them in the class, so I went with putting them in the class: many of them were one-liners, and having them out of line just doubled the size of everything. However, we did not look at the coding conventions, and that really settles the argument. Thanks for doing this.
Does it look OK to install?
You can check it in.
I'm still unsure about these "infinite" precision types, but I understand
the motivation and I have no objections.  However:

     * Code that does widening conversions.  The canonical way that
       this is performed is to sign or zero extend the input value to
       the max width based on the sign of the type of the source and
       then to truncate that value to the target type.  This is in
       preference to using the sign of the target type to extend the
       value directly (which gets the wrong value for the conversion
       of large unsigned numbers to larger signed types).

I don't understand this particular reason.  Using the sign of the source
type is obviously right, but why does that mean we need "infinite" precision,
rather than just doubling the precision of the source?
In a sense, "infinite" does not really mean infinite; it just means large enough that you never lose any information from the top. For widening, all that "infinite" really needs to be is one bit larger than the destination type. We could have had an API where you specified the precision of every operation based on the precisions of the inputs. Instead, for these kinds of operations, we decided to sniff the port and determine a fixed width that is large enough for everything the port needs. We call that number infinite. This more or less follows the convention used with double-int, where infinite was 128 bits, but with our design and implementation we (hopefully) have no bugs where the size of the datatypes needed ever runs into implementation limits.
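
As a rough illustration of the widening rule quoted above, here is a minimal sketch. It is not code from the branch: wide_t, low_mask, extend and the two convert_* helpers are hypothetical names, and a 64-bit integer stands in for the "infinite" width, which is only adequate because the sketch limits precisions to 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the "infinite" width: 64 bits is more than
// one bit wider than any precision this sketch accepts (at most 32).
typedef int64_t wide_t;

// Mask covering the low PREC bits.
inline uint64_t
low_mask (unsigned prec)
{
  return (uint64_t (1) << prec) - 1;
}

// Extend the low PREC bits of BITS to the wide type, using IS_SIGNED
// to choose between sign and zero extension.
inline wide_t
extend (uint64_t bits, unsigned prec, bool is_signed)
{
  assert (prec > 0 && prec <= 32);
  wide_t v = (wide_t) (bits & low_mask (prec));
  if (is_signed && ((bits >> (prec - 1)) & 1))
    v -= (wide_t) 1 << prec;
  return v;
}

// Canonical route: extend by the sign of the SOURCE, then truncate.
inline uint64_t
convert_by_source_sign (uint64_t bits, unsigned src_prec, bool src_signed,
                        unsigned dst_prec)
{
  return (uint64_t) extend (bits, src_prec, src_signed) & low_mask (dst_prec);
}

// Mistaken route: extend by the sign of the TARGET.
inline uint64_t
convert_by_target_sign (uint64_t bits, unsigned src_prec, bool dst_signed,
                        unsigned dst_prec)
{
  return (uint64_t) extend (bits, src_prec, dst_signed) & low_mask (dst_prec);
}
```

Converting the unsigned 8-bit value 200 (0xC8) to a signed 16-bit type gives 0x00C8 by the first route and 0xFFC8 (that is, -56) by the second, which is the wrong-value case the comment describes.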

   * When a constant that has an integer type is converted to a
     wide-int it comes in with precision 0.  For these constants the
     top bit does accurately reflect the sign of that constant; this
     is an exception to the normal rule that the signedness is not
     represented.  When used in a binary operation, the wide-int
     implementation properly extends these constants so that they
     properly match the other operand of the computation.  This allows
     you write:

                tree t = ...
                wide_int x = t + 6;

     assuming t is a int_cst.
This seems dangerous.  Not all code that uses "unsigned HOST_WIDE_INT"
actually wants it to be an unsigned value.  Some code uses it to avoid
the undefinedness of signed overflow.  So these overloads could lead
to us accidentally zero-extending what's conceptually a signed value
without any obvious indication that that's happening.  Also, hex constants
are unsigned int, but it doesn't seem safe to assume that 0x80000000 was
meant to be zero-extended.

I realise the same thing can happen if you mix "unsigned int" with
HOST_WIDE_INT, but the point is that you shouldn't really do that
in general, whereas we're defining these overloads precisely so that
a mixture can be used.

I'd prefer some explicit indication of the sign, at least for anything
other than plain "int" (so that the compiler will complain about uses
of "unsigned int" and above).

There is a part of me that finds this scary and a part of me that feels that the concern is largely theoretical. It does make the code much easier to read and understand to be able to write "t + 6" rather than "wide_int (t) + wide_int::from_uhwi (6)", but of course you lose some control over how 6 is converted.
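
To make the 0x80000000 concern concrete, here is a small sketch with two hypothetical helpers (neither is part of wide-int). It assumes a host where int is 32 bits, so that the literal 0x80000000 has type unsigned int.

```cpp
#include <cassert>
#include <cstdint>

// Widen a 32-bit pattern to 64 bits, treating it as unsigned
// (this is what an implicit conversion from unsigned int does).
inline int64_t
widen_as_unsigned (uint32_t bits)
{
  return (int64_t) bits;
}

// Widen the same pattern, treating it as a two's complement signed
// value.  2^32 is subtracted by hand when the sign bit is set, so the
// result does not depend on unsigned-to-signed conversion rules.
inline int64_t
widen_as_signed (uint32_t bits)
{
  int64_t v = (int64_t) bits;
  return (bits & 0x80000000u) ? v - (int64_t (1) << 32) : v;
}
```

The same bit pattern yields 2147483648 in one reading and -2147483648 in the other, and nothing at the call site of an overloaded operator says which one the author had in mind.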

   Note that the bits above the precision are not defined and the
   algorithms used here are careful not to depend on their value.  In
   particular, values that come in from rtx constants may have random
I have a feeling I'm rehashing a past debate, sorry, but rtx constants can't
have random bits.  The upper bits must be a sign extension of the value.
There's exactly one valid rtx for each (value, mode) pair.  If you saw
something different then that sounds like a bug.  The rules should already
be fairly well enforced though, since something like (const_int 128) --
or (const_int 256) -- will not match a QImode operand.
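
The "one valid rtx for each (value, mode) pair" rule can be pictured with a small check. This is only a sketch under assumptions: sext and is_canonical are hypothetical helpers, and the mode is reduced to a bare precision in bits (8 for QImode).

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low PREC bits of X (0 < PREC <= 62 in this sketch).
inline int64_t
sext (int64_t x, unsigned prec)
{
  assert (prec > 0 && prec <= 62);
  uint64_t low = (uint64_t) x & ((uint64_t (1) << prec) - 1);
  int64_t v = (int64_t) low;
  if ((low >> (prec - 1)) & 1)
    v -= (int64_t) 1 << prec;
  return v;
}

// A stored value is canonical for PREC when sign-extending its low
// PREC bits gives back the same value.
inline bool
is_canonical (int64_t value, unsigned prec)
{
  return sext (value, prec) == value;
}
```

With an 8-bit precision, 128 and 256 are rejected while -128 and 127 are accepted, matching the QImode example above.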

This is probably the part of the representation that I disagree most with.
There seem to be two main ways we could handle the extension to whole HWIs:

(1) leave the stored upper bits undefined and extend them on read
(2) keep the stored upper bits in extended form
It is not a matter of opening old wounds. I had run some tests on x86-64 and was never able to assume that the bits above the precision had always been canonicalized. I will admit that I did not try to run down the bugs: since the rtx constructors did not have a mode associated with them, nor was one required in the constructors, I thought that this was not an easily solvable problem. So I have no idea if I hit the one and only bug or was about to start drinking from a fire hose. But the back ends are full of GEN_INT (a) where a came from god knows where, and you almost never see it properly canonicalized. I think that until GEN_INT takes a mandatory non-VOIDmode mode parameter, and that constructor canonicalizes it, you are doomed to chase this bug forever. My/our experience on the dataflow branch was that unless you go clean things up AND put in a bunch of traps to keep people honest, you are never going to be able to make this assumption.

Having said that, we actually do neither (1) nor (2) inside of wide-int. For rtl to wide-int, we leave the upper bits undefined and never allow you to look at them, because the constructor has a mode and that mode allows you to draw a line in the sand. There is no constructor for the "infinite" wide ints from rtl, so you have no way to look.

Doing this allows us to do something that richi really wanted: avoiding copying. We do not get to do as much as richi would like, and when he comes back from leave he may have other places where we can apply it, but right now if you say w = t + 6 as above, it "borrows" the rep from t to do the add; it does not really build a wide-int. We also do this if t is an rtx const. But if we had to canonicalize the number, then we could not borrow the rep.
The patch goes for (1) but (2) seems better to me, for a few reasons:

* As above, constants coming from rtl are already in the right form,
   so if you create a wide_int from an rtx and only query it, no explicit
   extension is needed.

* Things like logical operations and right shifts naturally preserve
   the sign-extended form, so only a subset of write operations need
   to take special measures.
Right now the internals of wide-int do not keep the bits above the precision clean. As you point out, this could be fixed by changing lshift, add, sub, mul, div (and anything else I have forgotten about) and removing the code that cleans this up on exit. I do not really care which way we go here, but if we do go with keeping the bits above the precision clean inside wide-int, we are going to have to clean the bits in the constructors from rtl, or fix some/a lot of bugs.

But if you want to go with the stay-clean plan you have to start clean, so at the rtl level this means copying. And it is the not-copying trick that pushed me in the direction we went.

At the tree level, this is not an issue. There are no constructors for tree-csts that do not have a type, and before we copy the rep from the wide-int to the tree, we clean the top bits. (BTW - if I had to guess what the bug is with the missing messages on the mips port, it would be that the front ends HAD a bad habit of creating constants that did not fit into a type and then later checking to see if there were any interesting bits above the precision in the int-cst. This no longer works because we clean out those top bits on construction, but it would not surprise me if we missed the fixed point constant path.) So at the tree level, we could easily go either way here, but there is a cost at the rtl level with doing (2).

* You have a public interface that exposes the underlying HWIs
   (which is fine with me FWIW), so it seems better to expose a fully-defined
   HWI rather than only a partially-defined HWI.

E.g. zero_p is:


   if (precision && precision < HOST_BITS_PER_WIDE_INT)
     x = sext_hwi (val[0], precision);
   else if (len == 0)
     {
       gcc_assert (precision == 0);
       return true;
     }
The above test for len == 0 has been removed because it is rot.

   else
     x = val[0];

   return len == 1 && x == 0;

but I think it really ought to be just:

   return len == 1 && val[0] == 0;
If we did your 2, it would be this way.
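
The difference between the two options for zero_p can be sketched on a toy model. This is not the wide-int class: toy_wide_int and both predicates are hypothetical, and masking is used instead of sext_hwi because it is equivalent for a zero test.

```cpp
#include <cassert>
#include <cstdint>

// Toy model, not GCC's class: one or two 64-bit elements plus a
// precision in bits.
struct toy_wide_int
{
  int64_t val[2];
  unsigned len;
  unsigned precision;
};

// Option (1): upper bits of val[0] may hold anything, so they are
// masked off on every read.
inline bool
zero_p_extend_on_read (const toy_wide_int &w)
{
  uint64_t x = (uint64_t) w.val[0];
  if (w.precision && w.precision < 64)
    x &= (uint64_t (1) << w.precision) - 1;
  return w.len == 1 && x == 0;
}

// Option (2): storage is always kept in extended form, so a plain
// compare is enough.
inline bool
zero_p_canonical (const toy_wide_int &w)
{
  return w.len == 1 && w.val[0] == 0;
}
```

An 8-bit zero stored with a stray bit above the precision, such as 0x100, is still zero under the first predicate but not under the second, which is why option (2) only works if every producer starts clean.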
   When the precision is 0, all the bits in the LEN elements of
   VEC are significant with no undefined bits.  Precisionless
   constants are limited to being one or two HOST_WIDE_INTs.  When two
   are used the upper value is 0, and the high order bit of the first
   value is set.  (Note that this may need to be generalized if it is
   ever necessary to support 32bit HWIs again).
I didn't understand this.  When are two HOST_WIDE_INTs needed for
"precision 0"?
If a large unsigned constant comes in with the top bit set, the canonicalized value takes 2 HWIs, the top HWI being 0.
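
A minimal sketch of that two-element case, assuming 64-bit HWIs and a sign-extended, least-significant-first array. The names as_signed and from_unsigned_constant are hypothetical, not functions from the branch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Reinterpret a 64-bit pattern as a two's complement signed value
// without relying on implementation-defined narrowing.
inline int64_t
as_signed (uint64_t x)
{
  if (x >> 63)
    return (int64_t) (x & (uint64_t) INT64_MAX) + INT64_MIN;
  return (int64_t) x;
}

// Build the element array for an unsigned constant.
inline std::vector<int64_t>
from_unsigned_constant (uint64_t x)
{
  std::vector<int64_t> val;
  val.push_back (as_signed (x));
  // With the top bit set, a lone element would read as negative, so a
  // zero element is appended to keep the value non-negative.
  if (x >> 63)
    val.push_back (0);
  return val;
}
```

For 0x8000000000000000 this produces two elements, the upper one being 0; for a small constant such as 5 a single element is enough.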
#define addr_max_bitsize (64)
#define addr_max_precision \
These should either be lower-case C++ constants or upper-case macros.
This will be fixed.
  /* VAL is set to a size that is capable of computing a full
     multiplication on the largest mode that is represented on the
     target.  Currently there is a part of tree-vrp that requires 2x +
     2 bits of precision where x is the precision of the variables
     being optimized.  */
This comment seems redundant with the one above WIDE_INT_MAX_ELTS
and likely to get out of date.
This will be fixed.
     So real hardware only looks at a small part of the shift amount.
     On IBM machines, this tends to be 1 more than what is necessary
     to encode the shift amount.  The rest of the world looks at only
     the minimum number of bits.  This means that only 3 gate delays
     are necessary to set up the shifter.
I agree that it makes sense for wide_int to provide a form of shift
in which the shift amount is truncated.  However, I strongly believe
wide-int.h should not test SHIFT_COUNT_TRUNCATED directly.  It should
be up to the callers to decide when truncation is needed (and to what width).
Richi does not like this either, so I will get rid of it.
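
One way to picture "the caller decides" is a pair of shift entry points, one that uses the count as given and one that truncates it to a width the caller passes in. This is a sketch on plain 64-bit values with hypothetical names, not a proposal for the actual wide-int interface.

```cpp
#include <cassert>
#include <cstdint>

// Shift left using the count as given; counts of 64 or more shift
// every bit out and yield 0.
inline uint64_t
lshift_full (uint64_t x, unsigned count)
{
  return count >= 64 ? 0 : x << count;
}

// Shift left after truncating the count to its low TRUNC_BITS bits.
// The caller chooses TRUNC_BITS; nothing here consults the target.
inline uint64_t
lshift_truncated (uint64_t x, unsigned count, unsigned trunc_bits)
{
  assert (trunc_bits <= 6);
  count &= (1u << trunc_bits) - 1;
  return x << count;
}
```

A shift of 1 by 65 gives 0 with the first function and 2 with the second when the caller asks for 6 bits of count, since 65 truncated to 6 bits is 1.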

We really need to get rid of the #include "tm.h" in wide-int.h.
MAX_BITSIZE_MODE_ANY_INT should be the only partially-target-dependent
thing in there.  If that comes from tm.h then perhaps we should put it
into a new header file instead.
I will talk to mike about fixing this.
/* Return THIS as a signed HOST_WIDE_INT.  If THIS does not fit in
    PREC, the information is lost. */
to_shwi (unsigned int prec = 0) const
Just dropping the excess bits seems dangerous.  I think we should assert
instead, at least when prec is 0.
There are times when this is useful. There is a lot of code that just wants to look at the bottom bits to do some alignment stuff. I guess that code could just grab the bottom elt of the array, but that has not generally been how this has been done.
/* Return true if THIS is negative based on the interpretation of SGN.
    For UNSIGNED, this is always false.  This is correct even if
    precision is 0.  */
inline bool
wide_int::neg_p (signop sgn) const
It seems odd that you have to pass SIGNED here.  I assume you were doing
it so that the caller is forced to confirm signedness in the cases where
a tree type is involved, but:

* neg_p kind of implies signedness anyway
* you don't require this for minus_one_p, so the interface isn't consistent
* at the rtl level signedness isn't a property of the "type" (mode),
   so it seems strange to add an extra hoop there
It was done this way so that you can pass TYPE_SIGN (t) in as the second parameter. We could default the parameter to SIGNED, and that would solve both cases. I will look into minus_one_p.

/* Return true if THIS fits in an unsigned HOST_WIDE_INT with no
    loss of precision.  */
inline bool
wide_int_ro::fits_uhwi_p () const
   return len == 1 || (len == 2 && val[1] == 0);
This doesn't look right, since len == 1 could mean that you have a
gazillion-bit all-ones number.  Also, the val[1] == 0 check seems
to assume the upper bits are defined when the precision isn't a multiple
of the HWI size (although as above I think that's a good thing and should be
You are correct.
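
For illustration only, here is one shape a stricter test could take. It is a sketch under stated assumptions, not the fix that went into the branch: toy_wide_int is a hypothetical two-element model, the precision is nonzero, and the stored elements are assumed to be kept in sign-extended form, so that a single element implies copies of its sign bit all the way up to the precision.

```cpp
#include <cassert>
#include <cstdint>

// Toy model, not GCC's class: one or two 64-bit elements plus a
// nonzero precision in bits.
struct toy_wide_int
{
  int64_t val[2];
  unsigned len;
  unsigned precision;
};

// Return true if W, read as an unsigned number of W.precision bits,
// fits in 64 bits.
inline bool
fits_uhwi_p_sketch (const toy_wide_int &w)
{
  assert (w.precision > 0);
  // Anything of at most 64 bits fits once it is zero-extended.
  if (w.precision <= 64)
    return true;
  // A single element implies its sign bit is repeated up to the
  // precision, so a negative element stands for a huge unsigned value.
  if (w.len == 1)
    return w.val[0] >= 0;
  // Two elements fit only when the upper one is zero.
  return w.len == 2 && w.val[1] == 0;
}
```

Under these assumptions a 128-bit all-ones value stored as the single element -1 is rejected, whereas the len == 1 test quoted above would accept it.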
sign_mask has:

   gcc_unreachable ();
#if 0
   return val[len - 1] >> (HOST_BITS_PER_WIDE_INT - 1);
Maybe remove this?

The inline div routines do:
I will work on this this weekend. Tree vrp has not been our friend and sometimes does not like to compile this function.

   if (overflow)
     *overflow = false;
but then just pass overflow to divmod_internal.  Seems better to initialise
*overflow there instead.

div_floor has:

     return divmod_internal (true, val, len, p1, s, cl, p2, sgn,
			    &remainder, false, overflow);

     if (quotient.neg_p (sgn) && !remainder.zero_p ())
       return quotient - 1;
     return quotient;
where the last bit is unreachable.
Not to mention that the compiler never complained.

/* Divide DIVISOR into THIS producing the remainder.  The result is
    the same size as the operands.  The sign is specified in SGN.
    The output is floor truncated.  OVERFLOW is set to true if the
    result overflows, false otherwise.  */
template <typename T>
inline wide_int_ro
wide_int_ro::mod_floor (const T &c, signop sgn, bool *overflow = 0) const
It's really the quotient that's floor-truncated, not the output
(the remainder).  I was a bit confused at first why you'd need to
truncate an integer remainder.  Same for the other functions.
The comments need work, not the code. You do have to adjust the remainder in some cases, but it is not truncation.
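
The relationship between the floor-rounded quotient and the adjusted remainder can be sketched on plain 64-bit integers. The names div_result and divmod_floor are hypothetical, and the sketch decides whether to adjust by comparing the sign of the remainder with the sign of the divisor, which is a choice made here rather than something taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

struct div_result
{
  int64_t quotient;
  int64_t remainder;
};

// Divide A by B, rounding the quotient toward negative infinity.
inline div_result
divmod_floor (int64_t a, int64_t b)
{
  assert (b != 0);
  assert (!(a == INT64_MIN && b == -1));
  // Built-in division rounds the quotient toward zero.
  div_result r = { a / b, a % b };
  // When the division is inexact and the true quotient is negative,
  // the remainder and the divisor have opposite signs.  Step the
  // quotient down by one and move the remainder by one divisor.
  if (r.remainder != 0 && ((r.remainder < 0) != (b < 0)))
    {
      r.quotient -= 1;
      r.remainder += b;
    }
  return r;
}
```

For -7 divided by 2 this gives quotient -4 and remainder 1: the quotient is what gets floored, and the remainder is merely adjusted so that quotient * divisor + remainder still equals the dividend.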
   debug_vwa ("wide_int_ro:: %d = (%s == %s)\n", result, *this, s, cl, p2);
I think these are going to bitrot quickly if we #if 0 them out.
I think we should either use:

     debug_vwa ("wide_int_ro:: %d = (%s == %s)\n", result, *this, s, cl, p2);

or drop them.
My plan is to leave these in while the branch is still being developed and then get rid of them before it is merged. My guess is that I am going to need them still when I try the 32-bit HWI test.

The implementations of the general to_shwi1 and to_shwi2 functions look
identical.  I think the comment should explain why two functions are needed.
I will check this
/* Negate THIS.  */
inline wide_int_ro
wide_int_ro::operator - () const
   wide_int_ro r;
   r = wide_int_ro (0) - *this;
   return r;

/* Negate THIS.  */
inline wide_int_ro
wide_int_ro::neg () const
   wide_int_ro z = wide_int_ro::from_shwi (0, precision);

   gcc_checking_assert (precision);
   return z - *this;
Why do we need both of these, and why are they implemented slightly
neg should go away.
template <int bitsize>
inline bool
fixed_wide_int <bitsize>::multiple_of_p (const wide_int_ro &factor,
					 signop sgn,
					 fixed_wide_int *multiple) const
   return wide_int_ro::multiple_of_p (factor, sgn,
				     reinterpret_cast <wide_int *> (multiple));
The patch has several instances of this kind of reinterpret_cast.
It looks like an aliasing violation.

The main thing that's changed since the early patches is that we now
have a mixture of wide-int types.  This seems to have led to a lot of
boiler-plate forwarding functions (or at least it felt like that while
moving them all out the class).  And that in turn seems to be because
you're trying to keep everything as member functions.  E.g. a lot of the
forwarders are from a member function to a static function.

Wouldn't it be better to have the actual classes be light-weight,
with little more than accessors, and do the actual work with non-member
template functions?  There seems to be 3 grades of wide-int:

   (1) read-only, constant precision  (from int, etc.)
   (2) read-write, constant precision  (fixed_wide_int)
   (3) read-write, variable precision  (wide_int proper)

but we should be able to hide that behind templates, with compiler errors
if you try to write to (1), etc.

To take one example, the reason we can't simply use things like
std::min on wide ints is because signedness needs to be specified
explicitly, but there's a good reason why the standard defined
std::min (x, y) rather than x.min (y).  It seems like we ought
to have smin and umin functions alongside std::min, rather than
make them member functions.  We could put them in a separate namespace
if necessary.
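
A rough sketch of what non-member smin and umin could look like. Everything here is hypothetical: toy_int is a bare 64-bit pattern standing in for a wide-int grade, and raw_bits is an invented accessor that each grade would overload.

```cpp
#include <cassert>
#include <cstdint>

// Toy value type: just a 64-bit pattern with no inherent sign.
struct toy_int
{
  uint64_t bits;
};

// Accessors used by the free functions below; each "grade" of wide
// int would supply its own overload.
inline uint64_t raw_bits (const toy_int &x) { return x.bits; }
inline uint64_t raw_bits (uint64_t x) { return x; }

// Unsigned minimum of two operands of possibly different types.
template <typename T1, typename T2>
inline uint64_t
umin (const T1 &a, const T2 &b)
{
  uint64_t x = raw_bits (a), y = raw_bits (b);
  return x < y ? x : y;
}

// Signed minimum.  Flipping the sign bit maps two's complement order
// onto unsigned order, so no signed conversion is needed.
template <typename T1, typename T2>
inline uint64_t
smin (const T1 &a, const T2 &b)
{
  const uint64_t sign = uint64_t (1) << 63;
  uint64_t x = raw_bits (a), y = raw_bits (b);
  return (x ^ sign) < (y ^ sign) ? x : y;
}
```

With an all-ones operand and the value 1, umin picks 1 while smin picks the all-ones pattern (read as -1), so the signedness is stated in the function name at each call site rather than in the type.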

I might have a go at trying this last part next week, unless Richard is
already in the process of rewriting things :-)
Mike will answer this.
