This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: patch to fix constant math - 4th patch - the wide-int class - patch ping for the next stage 1

On Wed, Apr 3, 2013 at 2:05 PM, Kenneth Zadeck <> wrote:
> On 04/03/2013 05:17 AM, Richard Biener wrote:
>> In the end you will have a variable-size storage in TREE_INT_CST thus
>> you will have at least to emit _code_ copying over meta-data and data
>> from the tree representation to the wide-int (similar for RTX
>> I'm objecting to the amount of code you emit and agree that the runtime
>> cost is copying the meta-data (hopefully optimizable via CSE / SRA)
>> and in most cases one (or two) iterations of the loop copying the data
>> (not optimizable).
> i did get rid of the bitsize in the wide-int patch so at this point the meta
> data is the precision and the len.
> not really a lot here.   As usual we pay a high price in gcc for not pushing
> the tree rep down into the rtl level, then it would have been acceptable to
> have the tree type bleed into the wide-int code.
>>> 2)  You present this as if the implementor actually should care about the
>>> implementation and you give 3 alternatives:  the double_int, the current
>>> one, and HWI.     We have tried to make it so that the client should not
>>> care.   Certainly in my experience here, I have not found a place to
>>> care.
>> Well, similar as for the copying overhead for tree your approach requires
>> overloading operations for HOST_WIDE_INT operands to be able to
>> say wi + 1 (which is certainly desirable), or the overhead of using
>> wide_int_one ().
>>> In my opinion double_int needs to go away.  That is the main thrust of my
>>> patches.   There is no place in a compiler for an abi that depends on
>>> constants fitting into 2 two words whose size is defined by the host.
>> That's true.  I'm not arguing to preserve "double-int" - I'm arguing to
>> preserve a way to ask for an integer type on the host with (at least)
>> N bits.  Almost all double-int users really ask for an integer type on the
>> host that has at least as many bits as the pointer representation (or
>> word_mode) on
>> the target (we do have HOST_WIDEST_INT == 32bits for 64bit pointer
>> targets).  No double-int user specifically wants 2 * HOST_WIDE_INT
>> precision - that is just what happens to be there.  Thus I am providing
>> a way to say get me a host integer with at least N bits (VRP asks for
>> this, for example).
>> What I was asking for is that whatever can provide the above should share
>> the functional interface with wide-int (or the othert way around).  And I
>> was claiming that wide-int is too fat, because current users of double-int
>> eventually store double-ints permanently.
> The problem is that, in truth, double int is too fat. 99.something% of all
> constants fit in 1 hwi and that is likely to be true forever (i understand
> that tree vpn may need some thought here).  The rtl level, which has, for as
> long as i have known it, had 2 reps for integer constants. So it was
> relatively easy to slide the CONST_WIDE_INT in.  It seems like the right
> trickery here rather than adding a storage model for wide-ints might be a
> way to use the c++ to invisibly support several (and by "several" i really
> mean 2) classes of TREE_CSTs.

The truth is that _now_ TREE_INT_CSTs use double-ints and we have
CONST_INT and CONST_DOUBLE.  What I (and you) propose would
get us to use variable-size storage for both, allowing to just use a single
HOST_WIDE_INT in the majority of cases.  In my view the constant
length of the variable-size storage for TREE_INT_CSTs is determined
by its type (thus, it doesn't have "optimized" variable-size storage
but "unoptimized" fixed-size storage based on the maximum storage
requirement for the type).  Similar for RTX CONST_INT which would
have fixed-size storage based on the mode-size of the constant.
Using optimized space (thus using the encoding properties) requires you
to fit the 'short len' somewhere which possibly will not pay off in the end
(for tree we do have that storage available, so we could go with optimized
storage for it, not sure with RTL, I don't see available space there).

>>> This is not a beauty contest argument, we have public ports are beginning
>>> to
>>> use modes that are larger than two x86-64 HWIs and i have a private port
>>> that has such modes and it is my experience that any pass that uses this
>>> interface has one of three behaviors: it silently gets the wrong answer,
>>> it
>>> ices, or it fails to do the transformation.  If we leave double_int as an
>>> available option, then any use of it potentially will have one of these
>>> three behaviors.  And so one of my strong objections to this direction is
>>> that i do not want to fight this kind of bug for the rest of my life.
>>> Having a single storage model that just always works is in my opinion a
>>> highly desirable option.  What you have never answered in a concrete
>>> manner
>>> is, if we decide to provide this generality, what it would be used for.
>>> There is no place in a portable compiler where the right answer for every
>>> target is two HOST wide integers.
>>> However, i will admit that the HWI option has some merits.   We try to
>>> address this in our implementation by dividing what is done inline in
>>> wide-int.h to the cases that fit in an HWI and then only drop into the
>>> heavy
>>> code in wide-int.c if mode is larger (which it rarely will be).
>>> However, a
>>> case could be made that for certain kinds of things like string lengths
>>> and
>>> such, we could use another interface or as you argue, a different storage
>>> model with the same interface.   I just do not see that the cost of the
>>> conversion code is really going to show up on anyone's radar.
>> What's the issue with abstracting away the model so a fixed-size 'len'
>> is possible?  (let away the argument that this would easily allow an
>> adaptor to tree)
> I have a particularly pessimistic perspective because i have already written
> most of this patch.   It is not that i do not want to change that code, it
> is that i have seen a certain set of mistakes that were made and i do not
> want to fix them more than once.   At the rtl level you can see the
> transition from only supporting 32 bit ints to supporting 64 bit its to
> finally supporting two HWIs and that transition code is not pretty.  My rtl
> patch fixes the places where an optimization was only made if the data type
> was 32 bits or smaller as well as the places where the optimization was made
> only if the data type is smaller than 64 bits (in addition to fixing all of
> the places where the code ices or simply gets the wrong answer if it is
> larger than TImode.)  The tree level is only modestly better, I believe only
> because it is newer. I have not seen any 32 bit only code, but it is
> littered with transformations that only work for 64 bits.   What is that 64
> bit only code going to look like in 5 years?

The user interface of wide-int does not depend on whether a storage model
is abstracted or not.  If you take advantage of the storage model by
making its interface leaner then it will.  But I guess converting everything
before settling on the wide-int interface may not have been the wisest
choice in the end (providing a wide-int that can literally replace double-int
would have got you testing coverage without any change besides
double-int.[ch] and wide-int.[ch]).

> I want to make it easier to write the general code than to write the code
> that only solves the problem for the size port that the implementor is
> currently working on.   So I perceive the storage model as a way to keep
> having to fight this battle forever because it will allow the implementor to
> make a decision that the optimization only needs to be done for a particular
> sized integer.
> However, i get the fact that from your perspective, what you really want is
> a solution to the data structure problem in tree-vrp.

No, that's just a convenient example.  What I really want is a wide-int
that is less visibly a replacement for CONST_DOUBLE.

>  My patch for tree vrp
> scans the entire function to find the largest type used in the function and
> then does all of the math at 2x that size.  But i have to admit that i
> thought it was weird that you use tree cst as your long term storage.   If
> this were my pass, i would have allocated a 2d array of some type that was
> as large as the function in 1d and twice as large as the largest int used in
> the other dimension and not overloaded tree-cst and then had a set of friend
> functions in double int to get in and out. Of course you do not need friends
> in double int because the rep is exposed, but in wide-int that is now hidden
> since it now is purely functional.
> I just have to believe that there is a better way to do tree-vrp than
> messing up wide-int for the rest of the compiler.

It's not "messing up", it's making wide-int a generally useful thing and
not tying it so closely to RTL.

>>> 3) your trick will work at the tree level, but not at the rtl level.
>>> The
>>> wide-int code cannot share storage with the CONST_INTs.    We tried this,
>>> and there are a million bugs that would have to be fixed to make it work.
>>> It could have worked if CONST_INTs had carried a mode around, but since
>>> they
>>> do not, you end up with the same CONST_INT sharing the rep for several
>>> different types and that just did not work unless you are willing to do
>>> substantial cleanups.
>> I don't believe you.  I am only asking for the adaptors to tree and RTL to
>> work in an RVALUE-ish way (no modification, as obviously RTL and tree
>> constants may be shared).  I think your claim is because you have that
>> precision and bitsize members in your wide-int which I believe is a
>> design red herring.  I suppose we should concentrate on addressing that
>> one first.  Thus, let me repeat a few questions on your design to
>> eventually
>> let you understand my concerns:
>> Given two wide-ints, a and b, what precision will the result of
>>       a + b
>> have?  Consider a having precision 32 and b having precision 64
>> on a 32-bit HWI host.
>> You define wide-int to have no sign information:
>> +   The representation does not contain any information about
>> +   signedness of the represented value, so it can be used to represent
>> +   both signed and unsigned numbers.  For operations where the results
>> +   depend on signedness (division, comparisons), the signedness must
>> +   be specified separately.  For operations where the signness
>> +   matters, one of the operands to the operation specifies either
>> +   wide_int::SIGNED or wide_int::UNSIGNED.
>> but the result of operations on mixed precision operands _does_ depend
>> on the sign, nevertheless most operations do not get a signedness
>> argument.
>> Nor would that help, because it depends on the signedness of the
>> individual
>> operands!
>> double-int get's around this by having a common "precision" to which
>> all smaller precision constants have to be sign-/zero-extended.  So
>> Note that even with same precision you have introduced the same problem
>> with the variable len.
>> My proposal is simple - all wide-ints are signed!  wide-int is basically
>> an arbitrary precision signed integer format.  The sign is encoded in
>> the most significant bit of the last active HWI of the representation
>> (val[len - 1] & (1 << HOST_BITS_PER_WIDE_INT - 1)).  All values
>> with less precision than len * HOST_BITS_PER_WIDE_INT are
>> properly sign-/zero-extended to precision len * HOST_BITS_PER_WIDE_INT.
>> This let's you define mixed len operations by implicitely
>> sign-/zero-extending
>> the operands to whatever len is required for the operation.
>> Furthermore it allows you to get rid of the precision member (and
>> bitsize anyway).
>> Conversion from tree / RTL requires information on the signedness of the
>> input (trivial for tree, for RTL all constants are signed - well,
>> sign-extended).
>> Whenever you want to transfer the wide-int to tree / RTL you have to
>> sign-/zero-extend according to the desired precision.  If you need sth
>> else
>> than arbitrary precision arithmetic you have to explicitely truncate /
>> extend
>> at suitable places - with overflow checking being trivial here.  For
>> optimization
>> purposes selected operations may benefit from a combined implementation
>> receiving a target precision and signedness.  Whatever extra meta-data
>> RTL requires does not belong in wide-int but in the RTX.  Trivially
>> a mode comes to my mind (on tree we have a type), and trivially
>> each RTX has a mode.  And each mode has a precision and bitsize.
>> It lacks a sign, so all RTL integer constants are sign-extended for
>> encoding efficiency purposes.  mixed-mode operations will not
>> occur (mixed len operations will), mixed-mode ops are exclusively
>> sign-/zero-extensions and truncations.
>> Representation of (unsigned HOST_WIDE_INT)-1 would necessarily
>> be { 0, (unsigned HOST_WIDE_INT)-1 }, representation of -1 in any
>> precision would be { -1 }.
>> That was my proposal.  Now, can you please properly specify yours?

And you chose to not answer that fundamental question of how your
wide-int is _supposed_ to work?  Ok, maybe I shouldn't have distracted
you with the bits before this.


>> Thanks,
>> Richard.
>>> On 04/02/2013 11:04 AM, Richard Biener wrote:
>>>> On Wed, Feb 27, 2013 at 2:59 AM, Kenneth Zadeck
>>>> <> wrote:
>>>>> This patch contains a large number of the changes requested by Richi.
>>>>> It
>>>>> does not contain any of the changes that he requested to abstract the
>>>>> storage layer.   That suggestion appears to be quite unworkable.
>>>> I of course took this claim as a challenge ... with the following
>>>> result.
>>>> It is
>>>> of course quite workable ;)
>>>> The attached patch implements the core wide-int class and three storage
>>>> models (fixed size for things like plain HWI and double-int, variable
>>>> size
>>>> similar to how your wide-int works and an adaptor for the double-int as
>>>> contained in trees).  With that you can now do
>>>> wi_test (tree x)
>>>> {
>>>>     // template argument deduction doesn't do the magic we want it to do
>>>>     // to make this kind of implicit conversions work
>>>>     // overload resolution considers this kind of conversions so we
>>>>     // need some magic that combines both ... but seeding the overload
>>>>     // set with some instantiations doesn't seem to be possible :/
>>>>     // wide_int<> w = x + 1;
>>>>     wide_int<> w;
>>>>     w += x;
>>>>     w += 1;
>>>>     // template argument deduction doesn't deduce the return value type,
>>>>     // not considering the template default argument either ...
>>>>     // w = wi (x) + 1;
>>>>     // we could support this by providing rvalue-to-lvalue promotion
>>>>     // via a traits class?
>>>>     // otoh it would lead to sub-optimal code anyway so we should
>>>>     // make the result available as reference parameter and only support
>>>>     // wide_int <> res; add (res, x, 1); ?
>>>>     w = wi (x).operator+<wide_int<> >(1);
>>>>     wide_int<>::add(w, x, 1);
>>>>     return w.to_hwi ();
>>>> }
>>>> we are somewhat limited with C++ unless we want to get really fancy.
>>>> Eventually providing operator+ just doesn't make much sense for
>>>> generic wide-int combinations (though then the issue is its operands
>>>> are no longer commutative which I think is the case with your wide-int
>>>> or double-int as well - they don't suport "1 + wide_int" for obvious
>>>> reasons).
>>>> So there are implementation design choices left undecided.
>>>> Oh, and the operation implementations are crap (they compute nonsense).
>>>> But you should get the idea.
>>>> Richard.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]