This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: patch to fix constant math - 4th patch - the wide-int class.


On Wed, Oct 31, 2012 at 11:43 AM, Richard Sandiford
<rdsandiford@googlemail.com> wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
>> On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck
>> <zadeck@naturalbridge.com> wrote:
>>>
>>> On 10/25/2012 06:42 AM, Richard Biener wrote:
>>>>
>>>> On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump <mikestump@comcast.net> wrote:
>>>>>
>>>>> On Oct 24, 2012, at 2:43 AM, Richard Biener <richard.guenther@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck
>>>>>> <zadeck@naturalbridge.com> wrote:
>>>>>>>
>>>>>>> On 10/23/2012 10:12 AM, Richard Biener wrote:
>>>>>>>>
>>>>>>>> +  HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT /
>>>>>>>> HOST_BITS_PER_WIDE_INT];
>>>>>>>>
>>>>>>>> are we sure this rounds properly?  Consider a port with max byte mode
>>>>>>>> size 4 on a 64bit host.
>>>>>>>
>>>>>>> I do not believe that this can happen.   The core compiler includes all
>>>>>>> modes up to TI mode, so by default we already up to 128 bits.
>>>>>>
>>>>>> And mode bitsizes are always power-of-two?  I suppose so.
>>>>>
>>>>> Actually, no, they are not.  Partial int modes can have bit sizes that
>>>>> are not power of two, and, if there isn't an int mode that is bigger, we'd
>>>>> want to round up the partial int bit size.  Something like ((2 *
>>>>> MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) /
>>>>> HOST_BITS_PER_WIDE_INT should do it.
>>>>>
>>>>>>>> I still would like to have the ability to provide specializations of
>>>>>>>> wide_int
>>>>>>>> for "small" sizes, thus ideally wide_int would be a template templated
>>>>>>>> on the number of HWIs in val.  Interface-wise wide_int<2> should be
>>>>>>>> identical to double_int, thus we should be able to do
>>>>>>>>
>>>>>>>> typedef wide_int<2> double_int;
>>>>>>>
>>>>>>> If you want to go down this path after the patches get in, go for it.
>>>>>>> I
>>>>>>> see no use at all for this.
>>>>>>> This was not meant to be a plug in replacement for double int. This
>>>>>>> goal of
>>>>>>> this patch is to get the compiler to do the constant math the way that
>>>>>>> the
>>>>>>> target does it.   Any such instantiation is by definition placing some
>>>>>>> predefined limit that some target may not want.
>>>>>>
>>>>>> Well, what I don't really like is that we now have two implementations
>>>>>> of functions that perform integer math on two-HWI sized integers.  What
>>>>>> I also don't like too much is that we have two different interfaces to
>>>>>> operate
>>>>>> on them!  Can't you see how I come to not liking this?  Especially the
>>>>>> latter …
>>>>>
>>>>> double_int is logically dead.  Reactoring wide-int and double-int is a
>>>>> waste of time, as the time is better spent removing double-int from the
>>>>> compiler.  All the necessary semantics and code of double-int _has_ been
>>>>> refactored into wide-int already.  Changing wide-int in any way to vend
>>>>> anything to double-int is wrong, as once double-int is removed, then all the
>>>>> api changes to make double-int share from wide-int is wasted and must then
>>>>> be removed.  The path forward is the complete removal of double-int; it is
>>>>> wrong, has been wrong and always will be wrong, nothing can change that.
>>>>
>>>> double_int, compared to wide_int, is fast and lean.  I doubt we will
>>>> get rid of it - you
>>>> will make compile-time math a _lot_ slower.  Just profile when you for
>>>> example
>>>> change get_inner_reference to use wide_ints.
>>>>
>>>> To be able to remove double_int in favor of wide_int requires _at least_
>>>> templating wide_int on 'len' and providing specializations for 1 and 2.
>>>>
>>>> It might be a non-issue for math that operates on trees or RTXen due to
>>>> the allocation overhead we pay, but in recent years we transitioned
>>>> important
>>>> paths away from using tree math to using double_ints _for speed reasons_.
>>>>
>>>> Richard.
>>>
>>> i do not know why you believe this about the speed.     double int always
>>> does synthetic math since you do everything at 128 bit precision.
>>>
>>> the thing about wide int, is that since it does math to the precision's
>>> size, it almost never does uses synthetic operations since the sizes for
>>> almost every instance can be done using the native math on the machine.
>>> almost every call has a check to see if the operation can be done natively.
>>> I seriously doubt that you are going to do TI mode math much faster than i
>>> do it and if you do who cares.
>>>
>>> the number of calls does not effect the performance in any negative way and
>>> it fact is more efficient since common things that require more than one
>>> operation in double in are typically done in a single operation.
>>
>> Simple double-int operations like
>>
>> inline double_int
>> double_int::and_not (double_int b) const
>> {
>>   double_int result;
>>   result.low = low & ~b.low;
>>   result.high = high & ~b.high;
>>   return result;
>> }
>>
>> are always going to be faster than conditionally executing only one operation
>> (but inside an offline function).
>
> OK, this is really in reply to the 4.8 thing, but it felt more
> appropriate here.
>
> It's interesting that you gave this example, since before you were
> complaining about too many fused ops.  Clearly this one could be
> removed in favour of separate and() and not() operations, but why
> not provide a fused one if there are clients who'll make use of it?

I was more concerned about fused operations that use precision
or bitsize as input.  That is for example

>> +  bool only_sign_bit_p (unsigned int prec) const;
>> +  bool only_sign_bit_p () const;

The first is construct a wide-int with precision prec (and sign- or
zero-extend it) and then call only_sign_bit_p on it.  Such function
should not be necessary and existing callers should be questioned
instead of introducing it.

In fact wide-int seems to have so many "fused" operations that
we run out of sensible recognizable names for them.  Which results
in a lot of confusion on what the functions actually do (at least for me).

> I think Kenny's API is just taking that to its logical conclusion.
> There doesn't seem to be anything sacrosanct about the current choice
> of what's fused and what isn't.

Maybe.  I'd rather have seen an initial small wide-int API and fused
operations introduced separately together with the places they are
used.  In the current way it's way too tedious to go over all of them
and match them with callers, lookup enough context and then
make up my mind on whether the caller should do sth different or not.

Thus, consider the big initial API a reason that all this review takes
so long ...

> The speed problem we had using trees for internal arithmetic isn't
> IMO a good argument for keeping double_int in preference to wide_int.
> Allocating and constructing tree objects to hold temporary values,
> storing an integer representation in it, then calling tree arithmetic
> routines that pull out the integer representation again and create a
> tree to hold the result, is going to be much slower than using either
> double_int or wide_int.  I'd be very surprised if we notice any
> measurable difference between double_int and wide_int here.
>
> I still see no reason to keep double_int around.  The width of a host
> wide integer really shouldn't have any significance.
>
> Your main complaint seems to be that the wide_int API is different
> from the double_int one, but we can't literally use the same API, since
> double_int has an implicit precision and bitsize, and wide_int doesn't.
> Having a precision that is separate from the underlying representation
> is IMO the most important feature of wide_int, so:
>
>    template wide_int<2> double_int;
>
> is never going to be a drop-in, API-compatible replacement for double_int.

My reasoning was that if you strip wide-int of precision and bitsize
you have a double_int<N> class.  Thus wide-int should have a base
of that kind and just add precision / bitsize ontop of that.  It wouldn't
be a step forward if we end up replacing double_int uses with
wide_int uses with precision of 2 * HOST_BITS_PER_WIDE_INT,
would it?

> FWIW, I like the idea about having a class that wraps up the mode,
> tree, and (precision, bitsize) choice.

Yeah, if just to make the API look leaner.

Richard.

> Richard


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]