This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: patch to fix constant math


On 10/04/2012 12:58 PM, Richard Guenther wrote:
On Thu, Oct 4, 2012 at 3:55 PM, Kenneth Zadeck <zadeck@naturalbridge.com> wrote:
Let me talk about the mode here first.

What this interface/patch provides is a facility where the constant math
that is done in optimizations is done exactly the way that it would be done
on the target machine.   What we have now is a compiler that only does this
if it convenient to do on the host.   I admit that i care about this more
than others right now, but if intel adds a couple of more instructions to
their vector units, other people will start to really care about this issue.
If you take an OImode value with the current compiler and left shift it by
250 the middle end will say that the result is 0.   This is just wrong!!!

What this means is that the bitsize and precision of the operations need to
be carried along when doing math. when wide-int  checks for overflow on the
multiply or add, it is not checking the if the value overflowed on two HWIs,
it is checking if the add overflowed in the mode of the types that are
represented on the target.   When we do shift, we are not doing a shift
within two HWIs, we are truncating the shift value (if this is appropriate)
according to the bitsize and shifting according the precision.

I think that an argument could be made that storing the mode should be
changed to an explicit precision and bitsize.  (A possible other option
would be to store a tree type, but this would make the usage at the rtl
level very cumbersome since types are rare.) Aside from the work, you would
not get much push back.

But the signess is a different argument.   At the rtl level, the signess is
a matter of context.   (you could argue that this is a mistake and i would
agree, but that is an even bigger change.)   But more to the point, at the
tree level, there are a surprising number of places where the operation
desired does not follow the sign of the types that were used to construct
the constants.   Furthermore, not carrying the sign is more consistent with
the double int code, which as you point out carries nothing.
Well, on RTL the signedness is on the operation (you have sdiv and udiv, etc.).
yes, there is a complete enough set of operations that allow you to specify the signess where this matters.

double-int tries to present a sign-less twos-complement entity of size
2 * HOST_BITS_PER_WIDE_INT.  I think that is sensible and for
obvious reasons should not change.  Both tree and RTL rely on this.
What we do not want is that up to TImode you get an internal representation
done one way (twos-complement) and on OImode and larger you
suddenly get subtly different behavior.  That's a recepie for desaster.

This is the main difference between double-int and wide-int. Wide int does the math the way the machine does it or the way the front end would expect it to be done. There is nothing about the host that is visible in the interfaces.


I reiterate, our world is already bigger than 128 bits and the intel world is likely to be soon. Double int is stuck in a 64/128 bit world. these patches, which i admit are huge, are a way out of that box.


I'd like to clean up the interface to double-int some more (now with the
nice C++ stuff we have).  double-int should be pure twos-complement,
there should be no operations on double-ints that behave differently
when done signed or unsigned, instead we have signed and unsigned
versions of the operations (similar to how signedness is handled on
the RTL level).  With some trivial C++ fu you could have a
double_sint and double_uint type that would get rid of the bool
sign params we have to some functions (and then you could
write double_sint >> n using operator notation).

The problem is that size does matter. wide int is effectively infinite precision twos complement. In practice, we can get by by just looking at the bitsize and precision of the types/modes involved and this makes the implementation faster than true infinite precision.


I went done the road trying to fix all of the places where the compiler either iced or got the wrong answer. I showed this to Sandiford and he talked me out of it. He was right, it was a rat hole. It could have been a smaller patch but it was there were places where it was clearly going to take monumental work just to be able to back out and say that you had nothing. The number of places in the compiler where you compare against the largest and smallest representation of an integer is not small and some of them are buried very deep down chains that were not designed to say "i cannot answer that question".

I believe that i have all of the functionality of double int in wide int, it is just the calls look different because there are not all of the interfaces that take two HWI's. As mentioned before, all of the places where the overflow is computed for the purpose of asking if this is ok in two hwi's is gone.


I'd like wide-int (whatever it's internal representation is) to behave
exactly like double-ints with respect to precision and signedness
handling.  Ideally all static functions we have that operate on
double-ints would be 1:1 available for wide-ints, so I can change
the type of entities in an algorithm from double-ints to wide-ints
(or vice versa) and do not have to change the code at all.
As mentioned above, this is not that easy. the calls look very similar, but there are huge places in the both the rtl level and the tree level that just go away because you are always guaranteed to be doing things the way the target wants. Look closely at the patch for simplify-rtx. Huge blocks of code are gone AND THE COMPILER IS DOING MORE STUFF!!!!!

I will admit that the wide-int api is larger than the double-int one. This means that a lot of things turn out to be very easy to do with wide ints that are cumbersome in double-int, like you can just return a number with a single bit set, or make a shifted, complemented mask in one call.

Thus as first step I'd like you to go over the double-int stuff,
compare it to the wide-int stuff you introduce and point out
differences (changing double-ints or wide-ints to whatever is
the more general concept).
My goal here is to get rid of double-int and have wide-int be the replacement. The only real substantive difference between the two (aside from the divmod rounding issue i mentioned at the end of my first email) is that the clients no longer have to worry about overflow with respect to the host representation. When overflow is returned, you know it is because the math overflowed in the type or mode of the integer operands.


Now, as for 'modes' - similar to signedness some functions
that operate on double-ints take a precision argument (like
the various extensions).  You can add a similar wrapper
type like double_sint, but this time with a cost - a new precision
member, that can be constructed from a double_int (or wide_int)
that ends up specifying the desired precision (be it in terms
of a mode or a type).

You didn't question my suggestion to have the number of
HOST_WIDE_INTs in a wide-int be compile-time constant - was
that just an oversight on your side?  The consequence is that
code wanting to deal with arbitrary length wide-ints needs to
be a template.
It was an oversight. The number of HWIs is a compile time constant, it is just a compile time constant that is determined for you by looking at the target. I went the extra mile to have this be automatically computed based on the target rather than putting in yet another target hook.

There is the issue about vrp needing something two times the size of the largest mode. one way to solve that is just to get rid of mode inside the wide-int and keep an explicit precision and bitsize. Then we could change the wide-int type to just use a 2x buffer. Doing this would not be that expensive, except for vrp (which needed it anyway). In almost all of the rest of the compiler, wide-ints are very short lived objects that just live on the stack. So their size does not really effect anything.


As for the splitting out the patch in smaller pieces, i am all for it.   I
have done this twice already and i could get the const_scalar_int_p patch
out quickly.    But you do not get too far along that before you are still
left with a big patch.   I could split out wide-int.* and just commit those
files with no clients as a first step.   My guess is that Richard Sandiford
would appreciate that because while he has carefully checked the rtl stuff,
i think that the code inside wide-int is not in his comfort zone of things
he would approve.

As far as your btw - noticed this last night.   it is an artifact of the way
i produced the patch and "responsible people have been sacked".   However,
it shows that you read the patch carefully, and i really appreciate that.
i owe you a beer (not that you need another at this time of year).
You also didn't mention the missing tree bits ... was this just a 1/n patch
or is it at all usable for you in this state?  Where do the large integers
magically come from?

Richard.

Yes, the tree bits are missing!!! i am still working on them. The plan was to have 2 patches, one for the rtl and one for the tree. the idea was that the rtl one could go in first. Of course, if we can find ways to break this into more pieces, i agree that that helps. i am a few days away from getting the tree patch out the door, and unlike the rtl patch, richard sandiford is not going to give the private comments.


My plan is to submit the tree patch as soon as possible, but before stage 1 closes (unless i could get an extension from a friendly release manager).


Kenny



On 10/04/2012 08:48 AM, Richard Guenther wrote:
On Wed, Oct 3, 2012 at 7:15 PM, Kenneth Zadeck <zadeck@naturalbridge.com>
wrote:
The enclosed patch is the third of at least four patches that fix the
problems associated with supporting integers on the target that are
wider than two HOST_WIDE_INTs.

While GCC claims to support OI mode, and we have two public ports that
make minor use of this mode, in practice, compilation that uses OImode
mode commonly gives the wrong result or ices.  We have a private port
of GCC for an architecture that is further down the road to needing
comprehensive OImode and we have discovered that this is unusable. We
have decided to fix it in a general way that so that it is most
beneficial to the GCC community.  It is our belief that we are just a
little ahead of the X86 and the NEON and these patches will shortly be
essential.

The first two of these patches were primarily lexigraphical and have
already been committed.    They transformed the uses of CONST_DOUBLE
so that it is easy to tell what the intended usage is.

The underlying structures in the next two patches are very general:
once they are added to the compiler, the compiler will be able to
support targets with any size of integer from hosts of any size
integer.

The patch enclosed deals with the portable RTL parts of the compiler.
The next patch, which is currently under construction deals with the
tree level.  However, this patch can be put on the trunk as is, and it
will eleviate many, but not all of the current limitations in the rtl
parts of the compiler.

Some of the patch is conditional, depending on a port defining the
symbol 'TARGET_SUPPORTS_WIDE_INT' to be non zero.  Defining this
symbol to be non zero is declaring that the port has been converted to
use the new form or integer constants.  However, the patch is
completely backwards compatible to allow ports that do not need this
immediately to convert at their leasure.  The conversion process is
not difficult, but it does require some knowledge of the port, so we
are not volinteering to do this for all ports.

OVERVIEW OF THE PATCH:

The patch defines a new datatype, a 'wide_int' (defined in
wide-int.[ch], and this datatype will be used to perform all of the
integer constant math in the compiler.  Externally, wide-int is very
similar to double-int except that it does not have the limitation that
math must be done on exactly two HOST_WIDE_INTs.

Internally, a wide_int is a structure that contains a fixed sized
array of HOST_WIDE_INTs, a length field and a mode.  The size of the
That it has a mode sounds odd to me and makes it subtly different
from HOST_WIDE_INT and double-int.  Maybe the patch will tell
why this is so.

array is determined at generation time by dividing the number of bits
of the largest integer supported on the target by the number of bits
in a HOST_WIDE_INT of the host.  Thus, with this format, any length of
integer can be supported on any host.

A new rtx type is created, the CONST_WIDE_INT, which contains a
garbage collected array of HOST_WIDE_INTS that is large enough to hold
the constant.  For the targets that define TARGET_SUPPORTS_WIDE_INT to
be non zero, CONST_DOUBLES are only used to hold floating point
values.  If the target leaves TARGET_SUPPORTS_WIDE_INT defined as 0,
CONST_WIDE_INTs are not used and CONST_DOUBLEs are as they were
before.

CONST_INT does not change except that it is defined to hold all
constants that fit in exactly one HOST_WIDE_INT.  Note that is slightly
different than the current trunk.  Before this patch, the TImode
constant '5' could either be in a CONST_INT or CONST_DOUBLE depending
on which code path was used to create it.  This patch changes this so
that if the constant fits in a CONST_INT then it is represented in a
CONST_INT no matter how it is created.

For the array inside a CONST_WIDE_INT, and internally in wide-int, we
use a compressed form for integers that need more than one
HOST_WIDE_INT.  Higher elements of the array are not needed if they
are just a sign extension of the elements below them.  This does not
imply that constants are signed or are sign extended, this is only a
compression technique.

While it might seem to be more esthetically pleasing to have not
introduced the CONST_WIDE_INT and to have changed the representation
of the CONST_INT to accomodate larger numbers, this would have both
used more space and would be a time consuming change for the port
maintainers.  We believe that most ports can be quickly converted with
the current scheme because there is just not a lot of code in the back
ends that cares about large constants.  Furthermore, the CONST_INT is
very space efficient and even in a program that was heavy in large
values, most constants would still fit in a CONST_INT.

All of the parts of the rtl level that deal with CONST_DOUBLE as an
now conditionally work with CONST_WIDE_INTs depending on the value
of TARGET_SUPPORTS_WIDE_INT.  We believe that this patch removes all
of the ices and wrong code places at the portable rtl level. However,
there are still places in the portable rtl code that refuse to
transform the code unless it is a CONST_INT.  Since these do not cause
failures, they can be handled later.  The patch is already very large.

It should be noted that much of the constant overflow checking in the
constant math dissappears with these patches.  The overflow checking
code in the current compiler is really divided into two cases:
overflow on the host and overflow on the target.  The overflow
checking on the host was to make sure that the math did overflow when
done on two HOST_WIDE_INTs.  All of this code goes away.  These
patches allow the constant math to be done exactly the way it is done
on the target.

This patch also aids other cleanups that are being considered at the
rtl level:

    1) These patches remove most of the host dependencies on the
    optimizations.  Currently a 32 bit GCC host will produce different
    code for a specific target than a 64 bit host will.  This is because
    many of the transformations only work on constants that can be a
    represented with a single HWI or two HWIs.  If the target has larger
    integers than the host, the compilation suffers.

    2) Bernd's need to make GCC correctly support partial its is made
    easier by the wide-int library.  This library carefully does all
    arithmetic in the precision of the mode included in it.  While there
    are still places at the rtl level that still do arithmetic inline,
    we plan to convert those to use the library over time.   This patch
    converts a substantial number of those places.

    3) This patch is one step along the path to add modes to rtl integer
    constants.  There is no longer any checking to see if a CONST_DOUBLE
    has VOIDmode as its mode.  Furthermore, all constructors for various
    wide ints do take a mode and require that it not be VOIDmode. There
    is still a lot of work to do to make this conversion possible.

Richard Sandiford has been over the rtl portions of this patch a few
times.  He has not looked at the wide-int files in any detail.  This
patch has been heavily tested on my private ports and also on x86-64.


CONVERSION PROCESS


Converting a port mostly requires looking for the places where
CONST_DOUBLES are used with VOIDmode and replacing that code with code
that accesses CONST_WIDE_INTs.  "grep -i const_double" at the port
level gets you to 95% of the changes that need to be made.  There are
a few places that require a deeper look.

    1) There is no equivalent to hval and lval for CONST_WIDE_INTs.
    This would be difficult to express in the md language since there
    are a variable number of elements.

    Most ports only check that hval is either 0 or -1 to see if the int
    is small.  As mentioned above, this will no longer be necessary
    since small constants are always CONST_INT.  Of course there are
    still a few exceptions, the alpha's constraint used by the zap
    instruction certainly requires careful examination by C code.
    However, all the current code does is pass the hval and lval to C
    code, so evolving the c code to look at the CONST_WIDE_INT is not
    really a large change.

    2) Because there is no standard template that ports use to
    materialize constants, there is likely to be some futzing that is
    unique to each port in this code.

    3) The rtx costs may have to be adjusted to properly account for
    larger constants that are represented as CONST_WIDE_INT.

All and all it has not taken us long to convert ports that we are
familiar with.

OTHER COMMENTS

I did find what i believe is one interesting bug in the double-int
code.  I believe that the code that performs divide and mod with round
to nearest is seriously wrong for unsigned integers.  I believe that
it will get the wrong answer for any numbers that are large enough to
look negative if they consider signed integers.  Asside from that,
wide-int should perform in a very similar manner to double-int.

I am sorry for the size of this patch.   However, there does not appear
to change the underlying data structure to support wider integers
without doing something like this.
Some pieces can be easily split out, like the introduction and use
of CONST_SCALAR_INT_P.

As my general comment I would like to see double-int and wide-int
unified from an interface perspective.  Which means that double-int
should be a specialization of wide-int which should be a template
(which also means its size is constant).  Thus,

typedef wide_int<2> double_int;

should be the way to expose the double_int type.

The main question remains - why does wide_int have a mode?
That looks redundant, both with information stored in types
and the RTL constant, and with the len field (that would be
always GET_MODE_SIZE () / ...?).  Also when you associate
a mode it's weird you don't associate a signedness.

Thus I'd ask you to rework this to be a template on 'len'
(number of HOST_WIDE_INT words), drop the mode member
and unify double-int and wide-int.  Co-incidentially incrementally
doing this by converting double-int to a typedef of a
wide_int<2> specialization (thus moving double-int implementation
stuff to be wide_int<2> specialized) would be prefered.

Btw,

+/* Constructs tree in type TYPE from with value given by CST.  Signedness
+   of CST is assumed to be the same as the signedness of TYPE.  */
+
+tree
+wide_int_to_tree (tree type, const wide_int &cst)
+{
+  wide_int v;
+  if (TYPE_UNSIGNED (type))
+    v = cst.zext (TYPE_PRECISION (type));
+  else
+    v = cst.sext (TYPE_PRECISION (type));
+
+  return build_int_cst_wide (type, v.elt (0), v.elt (1));
+}

is surely broken.  A wide-int does not fit a double-int.  How are you
going to "fix" this?

Thanks,
Richard.

kenny



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]