This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: wide-int branch now up for public comment and review
- From: Richard Biener <rguenther at suse dot de>
- To: Richard Sandiford <rdsandiford at googlemail dot com>
- Cc: Kenneth Zadeck <zadeck at naturalbridge dot com>, gcc-patches <gcc-patches at gcc dot gnu dot org>, Mike Stump <mikestump at comcast dot net>, r dot sandiford at uk dot ibm dot com
- Date: Wed, 28 Aug 2013 10:57:09 +0200 (CEST)
- Subject: Re: wide-int branch now up for public comment and review
- References: <520A9DCC dot 6080609 at naturalbridge dot com> <87ppt4e9hg dot fsf at talisman dot default>
On Fri, 23 Aug 2013, Richard Sandiford wrote:
> Hi Kenny,
> This is the first time I've looked at the implementation of wide-int.h
> (rather than just looking at the rtl changes, which as you know I like
> in general), so FWIW here are some comments on wide-int.h. I expect
> a lot of them overlap with Richard B.'s comments.
> I also expect many of them are going to be annoying, sorry, but this
> first one definitely will. The coding conventions say that functions
> should be defined outside the class,
> and that opening braces should be on their own line, so most of the file
> needs to be reformatted. I went through and made that change with the
> patch below, in the process of reading through. I also removed "SGN
> must be SIGNED or UNSIGNED." because it seemed redundant when those are
> the only two values available. The patch fixes a few other coding standard
> problems and typos, but I've not made any actual code changes (or at least,
> I didn't mean to).
> Does it look OK to install?
> I'm still unsure about these "infinite" precision types, but I understand
> the motivation and I have no objections. However:
> > * Code that does widening conversions. The canonical way that
> > this is performed is to sign or zero extend the input value to
> > the max width based on the sign of the type of the source and
> > then to truncate that value to the target type. This is in
> > preference to using the sign of the target type to extend the
> > value directly (which gets the wrong value for the conversion
> > of large unsigned numbers to larger signed types).
> I don't understand this particular reason. Using the sign of the source
> type is obviously right, but why does that mean we need "infinite" precision,
> rather than just doubling the precision of the source?
The comment indeed looks redundant - of course it is not correct
to extend the value "directly".
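To make that concrete with plain C++ types rather than wide-int itself (a sketch; the casts stand in for the two extension strategies, and assume a two's-complement target):
  #include <cstdint>
  uint32_t u = 0x80000000u;                    /* 2147483648, a "large unsigned number" */
  int64_t by_source = (int64_t) (uint64_t) u;  /* extend by the source's sign:
                                                  zero-extends, gives 2147483648 */
  int64_t by_target = (int64_t) (int32_t) u;   /* extend by the target's sign:
                                                  sign-extends, gives -2147483648 */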
> > * When a constant that has an integer type is converted to a
> > wide-int it comes in with precision 0. For these constants the
> > top bit does accurately reflect the sign of that constant; this
> > is an exception to the normal rule that the signedness is not
> > represented. When used in a binary operation, the wide-int
> > implementation properly extends these constants so that they
> > properly match the other operand of the computation. This allows
> > you to write:
> > tree t = ...
> > wide_int x = t + 6;
> > assuming t is an int_cst.
> This seems dangerous. Not all code that uses "unsigned HOST_WIDE_INT"
> actually wants it to be an unsigned value. Some code uses it to avoid
> the undefinedness of signed overflow. So these overloads could lead
> to us accidentally zero-extending what's conceptually a signed value
> without any obvious indication that that's happening. Also, hex constants
> are unsigned int, but it doesn't seem safe to assume that 0x80000000 was
> meant to be zero-extended.
> I realise the same thing can happen if you mix "unsigned int" with
> HOST_WIDE_INT, but the point is that you shouldn't really do that
> in general, whereas we're defining these overloads precisely so that
> a mixture can be used.
> I'd prefer some explicit indication of the sign, at least for anything
> other than plain "int" (so that the compiler will complain about uses
> of "unsigned int" and above).
I prefer the automatic promotion - it is exactly what regular C types
do. Now, consider
wide_int x = ... construct 5 with precision 16 ...
wide_int y = x + 6;
now, '6' is 'int' (precision 32), but with wide-int we treat it
as precision '0' ('infinite'). For x + 6 we then _truncate_ its
precision to that of 'x' (?), not exactly matching C behavior
(where we'd promote 'x' to 'int', perform the add and then truncate
to the precision of 'y' - which for wide-int gets its precision
from the result of x + 6).
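For comparison, here is the same computation in plain C++ types (a sketch of the C promotion rules the paragraph refers to):
  #include <cstdint>
  uint16_t x = 5;
  uint16_t y = x + 6;   /* C promotes 'x' to int (32 bits), adds at 32 bits,
                           and truncates back to 16 bits only on the store to
                           'y'; wide-int instead truncates '6' to the 16-bit
                           precision of 'x' before the addition.  */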
Mimicking C would support dropping those 'require equal precision'
asserts, but it would also require us to properly track a sign to be
able to promote properly (or, as I argued all the time, always
properly sign-extend values so we effectively have infinite precision).
The fits_uhwi_p implementation change scares me away from that
"upper bits are undefined" thing a lot again... (I hate introducing
'undefinedness' into the compiler 'by design')
> > Note that the bits above the precision are not defined and the
> > algorithms used here are careful not to depend on their value. In
> > particular, values that come in from rtx constants may have random
> > bits.
Which is a red herring. It should be fixed. I cannot even believe
that sentence given the uses of CONST_DOUBLE_LOW/CONST_DOUBLE_HIGH
or INTVAL/UINTVAL. I don't see accesses masking out 'undefined' bits.
> I have a feeling I'm rehashing a past debate, sorry, but rtx constants can't
> have random bits. The upper bits must be a sign extension of the value.
> There's exactly one valid rtx for each (value, mode) pair. If you saw
> something different then that sounds like a bug. The rules should already
> be fairly well enforced though, since something like (const_int 128) --
> or (const_int 256) -- will not match a QImode operand.
See. We're saved ;)
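For reference, that canonical form is maintained by sign-extending the value from the mode's sign bit; a minimal sketch of the idea (the function name is mine, roughly what trunc_int_for_mode does with a precision instead of a mode):
  /* Canonicalize a CONST_INT value: sign-extend from bit PREC-1 so the
     upper bits are never random.  */
  HOST_WIDE_INT
  canonical_const_int (HOST_WIDE_INT c, unsigned int prec)
  {
    if (prec < HOST_BITS_PER_WIDE_INT)
      {
        HOST_WIDE_INT sign = (HOST_WIDE_INT) 1 << (prec - 1);
        c &= ((unsigned HOST_WIDE_INT) 1 << prec) - 1;  /* keep the low PREC bits */
        c = (c ^ sign) - sign;                          /* sign-extend from bit PREC-1 */
      }
    return c;
  }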
> This is probably the part of the representation that I disagree most with.
> There seem to be two main ways we could handle the extension to whole HWIs:
> (1) leave the stored upper bits undefined and extend them on read
> (2) keep the stored upper bits in extended form
> The patch goes for (1) but (2) seems better to me, for a few reasons:
I agree whole-heartedly.
> * As above, constants coming from rtl are already in the right form,
> so if you create a wide_int from an rtx and only query it, no explicit
> extension is needed.
> * Things like logical operations and right shifts naturally preserve
> the sign-extended form, so only a subset of write operations need
> to take special measures.
> * You have a public interface that exposes the underlying HWIs
> (which is fine with me FWIW), so it seems better to expose a fully-defined
> HWI rather than only a partially-defined HWI.
> E.g. zero_p is:
> HOST_WIDE_INT x;
> if (precision && precision < HOST_BITS_PER_WIDE_INT)
>   x = sext_hwi (val, precision);
> else if (len == 0)
>   {
>     gcc_assert (precision == 0);
>     return true;
>   }
> else
>   x = val;
> return len == 1 && x == 0;
> but I think it really ought to be just:
> return len == 1 && val == 0;
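(For reference, sext_hwi above sign-extends the low PRECISION bits of a HWI; roughly:)
  /* Roughly what sext_hwi does: sign-extend the low PREC bits of SRC,
     assuming 0 < PREC <= HOST_BITS_PER_WIDE_INT.  */
  static inline HOST_WIDE_INT
  sext_hwi (HOST_WIDE_INT src, unsigned int prec)
  {
    if (prec == HOST_BITS_PER_WIDE_INT)
      return src;
    int shift = HOST_BITS_PER_WIDE_INT - prec;
    return (HOST_WIDE_INT) ((unsigned HOST_WIDE_INT) src << shift) >> shift;
  }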
But then - what value does keeping track of a 'precision' have
in this case? It seems to me it's only a "convenient carrier":
wide_int x = wide-int-from-RTX (y);
machine_mode saved_mode = mode-available? GET_MODE (y) : magic-mode;
... process x ...
RTX = RTX-from-wide_int (x, saved_mode);
that is, wide-int doesn't do anything with 'precision' but you
can extract it later to not need to remember a mode you were
working with.
Oh, and of course some operations require a 'precision', like rotate.
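Rotate indeed cannot be defined without one; a sketch of a PREC-bit left rotate (the name and helper are mine, not the patch's API):
  /* Left-rotate X by N within its low PREC bits; bits above PREC are
     zeroed here for clarity.  Assumes 0 < PREC <= HOST_BITS_PER_WIDE_INT.  */
  unsigned HOST_WIDE_INT
  lrotate_hwi (unsigned HOST_WIDE_INT x, unsigned int n, unsigned int prec)
  {
    unsigned HOST_WIDE_INT mask
      = prec < HOST_BITS_PER_WIDE_INT
        ? ((unsigned HOST_WIDE_INT) 1 << prec) - 1
        : ~(unsigned HOST_WIDE_INT) 0;
    x &= mask;
    n %= prec;
    if (n == 0)
      return x;
    return ((x << n) | (x >> (prec - n))) & mask;
  }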
> > When the precision is 0, all the bits in the LEN elements of
> > VEC are significant with no undefined bits. Precisionless
> > constants are limited to being one or two HOST_WIDE_INTs. When two
> > are used the upper value is 0, and the high order bit of the first
> > value is set. (Note that this may need to be generalized if it is
> > ever necessary to support 32-bit HWIs again).
> I didn't understand this. When are two HOST_WIDE_INTs needed for
> "precision 0"?
For the wide_int containing unsigned HOST_WIDE_INT ~0. As we
sign-extend the representation (heh, yes, we do or should!) we
require an extra HWI to store the fact that ~0 is unsigned.
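Concretely, with 64-bit HWIs: unsigned HOST_WIDE_INT ~0 is 2^64 - 1, and under a sign-extended reading a single all-ones element would mean -1, so a zero upper element is needed:
  /* Element vectors under the sign-extended reading (64-bit HWIs):  */
  HOST_WIDE_INT minus_one[] = { -1 };     /* len == 1: the value -1 */
  HOST_WIDE_INT umax_hwi[]  = { -1, 0 };  /* len == 2: the value 2^64 - 1;
                                             the zero element marks it positive */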
> The main thing that's changed since the early patches is that we now
> have a mixture of wide-int types. This seems to have led to a lot of
> boiler-plate forwarding functions (or at least it felt like that while
> moving them all out of the class). And that in turn seems to be because
> you're trying to keep everything as member functions. E.g. a lot of the
> forwarders are from a member function to a static function.
> Wouldn't it be better to have the actual classes be light-weight,
> with little more than accessors, and do the actual work with non-member
> template functions? There seem to be 3 grades of wide-int:
> (1) read-only, constant precision (from int, etc.)
> (2) read-write, constant precision (fixed_wide_int)
> (3) read-write, variable precision (wide_int proper)
> but we should be able to hide that behind templates, with compiler errors
> if you try to write to (1), etc.
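A rough sketch of what that split could look like (every name below is hypothetical, not the patch's API; the point is accessors in the classes, work in non-member templates):
  /* Grade (1): read-only view, constant precision.  */
  struct wi_ref
  {
    const HOST_WIDE_INT *val;
    unsigned int len;
  };
  /* Grade (2): read-write, constant precision.  */
  template <int PREC>
  struct fixed_wi
  {
    HOST_WIDE_INT val[(PREC + HOST_BITS_PER_WIDE_INT - 1)
                      / HOST_BITS_PER_WIDE_INT];
    unsigned int len;
  };
  /* Grade (3): read-write, variable precision.  */
  struct var_wi
  {
    HOST_WIDE_INT *val;
    unsigned int len, precision;
  };
  /* The arithmetic lives in non-member templates, so mixing grades needs
     no forwarding member functions, and writing to grade (1) simply
     fails to compile.  */
  template <typename R, typename A, typename B>
  void add (R &result, const A &a, const B &b);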
Yeah, I'll probably try to clean up the implementation once I
get past recovering from two months without GCC ...