This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [wide-int] int_traits <tree>


On 10/17/2013 09:16 AM, Richard Biener wrote:
On Thu, 17 Oct 2013, Kenneth Zadeck wrote:

On 10/17/2013 08:29 AM, Richard Biener wrote:
On Thu, 17 Oct 2013, Richard Sandiford wrote:

Richard Biener <rguenther@suse.de> writes:
The new tree representation can have a length greater than max_len
for an unsigned tree constant that occupies a whole number of HWIs.
The tree representation of an unsigned 0x8000 is 0x00 0x80 0x00.
When extended to max_wide_int the representation is the same.
But a 2-HWI addr_wide_int would be 0x80 0x00, without the leading
zero.
The MIN trims the length from 3 to 2 in the last case.
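
The 0x8000 example above can be reproduced with a small sketch. It assumes a hypothetical 8-bit block type so the byte-sized numbers from the example come out literally; real GCC uses HOST_WIDE_INT blocks. The names block_t, tree_rep_unsigned and usable_len are made up for illustration and are not part of the wide-int code. Blocks are stored least significant first here, whereas the example above lists them most significant first.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 8-bit block, standing in for HOST_WIDE_INT.
using block_t = std::uint8_t;
constexpr unsigned BLOCK_BITS = 8;

// Build the block array for an unsigned constant of PRECISION bits,
// least significant block first.  If the top bit of the top block is
// set, append one zero block so the value reads as non-negative when
// the array is interpreted as a signed number.
std::vector<block_t> tree_rep_unsigned(std::uint32_t value, unsigned precision)
{
  // Only the "whole number of blocks" case from the thread is modelled.
  assert(precision > 0 && precision <= 32 && precision % BLOCK_BITS == 0);

  std::vector<block_t> blocks;
  for (unsigned bit = 0; bit < precision; bit += BLOCK_BITS)
    blocks.push_back(static_cast<block_t>(value >> bit));

  if (blocks.back() & 0x80)
    blocks.push_back(0x00);  // the extra leading zero block
  return blocks;
}

// The MIN step: a fixed-size integer with room for MAX_LEN blocks
// only looks at the first MAX_LEN blocks of the tree's array.
std::size_t usable_len(const std::vector<block_t> &blocks, std::size_t max_len)
{
  return std::min(blocks.size(), max_len);
}
```

With these definitions, tree_rep_unsigned (0x8000, 16) yields three blocks, and usable_len with max_len 2 trims that to two, dropping the leading zero.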
Oh, so it was the tree rep that changed?  _Why_ was it changed?
We still cannot use it directly from wide-int and the extra
word is redundant because we have access to TYPE_UNSIGNED (TREE_TYPE ()).
It means that we can now use the tree's HWI array directly, without any
copying, for addr_wide_int and max_wide_int.  The only part of decompose ()
that does a copy is the small_prec case, which is trivially compiled out
for addr_wide_int and max_wide_int.
"     2) addr_wide_int.  This is a fixed size representation that is
       guaranteed to be large enough to compute any bit or byte sized
       address calculation on the target.  Currently the value is 64 + 4
       bits rounded up to the next number even multiple of
       HOST_BITS_PER_WIDE_INT (but this can be changed when the first
       port needs more than 64 bits for the size of a pointer).

       This flavor can be used for all address math on the target.  In
       this representation, the values are sign or zero extended based
       on their input types to the internal precision.  All math is done
       in this precision and then the values are truncated to fit in the
       result type.  Unlike most gimple or rtl intermediate code, it is
       not useful to perform the address arithmetic at the same
       precision in which the operands are represented because there has
       been no effort by the front ends to convert most addressing
       arithmetic to canonical types.

       In the addr_wide_int, all numbers are represented as signed
       numbers.  There are enough bits in the internal representation so
       that no information is lost by representing them this way."

so I guess from that that addr_wide_int.get_precision is always
that "64 + 4 rounded up".  Thus decompose gets that constant precision
input and the extra zeros make the necessary extension always a no-op.
Aha.
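
As a rough illustration of the "64 + 4 rounded up" arithmetic, here is a sketch that assumes a 64-bit HOST_WIDE_INT. The constant and function names are hypothetical, chosen for this example only.

```cpp
#include <cassert>

// Assumption: the host's HOST_WIDE_INT is 64 bits wide.
constexpr unsigned SKETCH_HOST_BITS_PER_WIDE_INT = 64;

// Round BITS up to the next multiple of BLOCK.
constexpr unsigned round_up(unsigned bits, unsigned block)
{
  return (bits + block - 1) / block * block;
}

// 64 address bits plus 4 extra bits, rounded up to whole blocks.
constexpr unsigned SKETCH_ADDR_PRECISION
  = round_up(64 + 4, SKETCH_HOST_BITS_PER_WIDE_INT);

// Number of blocks such a fixed-size integer can hold.
constexpr unsigned SKETCH_ADDR_MAX_LEN
  = SKETCH_ADDR_PRECISION / SKETCH_HOST_BITS_PER_WIDE_INT;

static_assert(SKETCH_ADDR_PRECISION == 128, "68 bits round up to two 64-bit blocks");
static_assert(SKETCH_ADDR_MAX_LEN == 2, "hence a 2-HWI addr_wide_int");
```

Under that assumption the precision is a compile-time constant of 128 bits, which is why any tree constant with a smaller precision needs no further extension work.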
It is, until someone comes up with a port that this will not work for; then
they will have to add some machinery to sniff the port and make this bigger.
I am hoping to be retired by the time this happens.
For max_wide_int the same rules apply, just its size is larger.

Ok.  So the reps are only canonical wide-int because we only
ever use them with precision > xprecision (maybe we should assert
that).
It is now asserted for (as of a few days ago).

Btw, we are not using them directly, but every time we actually
build an addr_wide_int / max_wide_int we copy them anyway:

/* Initialize the storage from integer X, in precision N.  */
template <int N>
template <typename T>
inline fixed_wide_int_storage <N>::fixed_wide_int_storage (const T &x)
{
    /* Check for type compatibility.  We don't want to initialize a
       fixed-width integer from something like a wide_int.  */
    WI_BINARY_RESULT (T, FIXED_WIDE_INT (N)) *assertion ATTRIBUTE_UNUSED;
    wide_int_ref xi (x, N);
    len = xi.len;
    for (unsigned int i = 0; i < len; ++i)
      val[i] = xi.val[i];
}

it avoids a 2nd copy though, which shows nicely what was rummaging in
my head for the last two days - that the int_traits <> abstraction
was somehow at the wrong level - it should have been traits that
are specific to the storage model?  or the above should use
int_traits<>::decompose manually with it always doing the
copy (that would also optimize away one copy and eventually
would make the extra zeros not necessary).
This came in with Richard's storage manager patch.  In my older code, we
tried and succeeded many times to just borrow the underlying rep.  I think
that Richard needs to work this out.
I originally thought that extra zeros get rid of all copying from trees
to all wide-int kinds.

What's the reason again to not use my original proposed encoding
of the MSB being the sign bit?  RTL constants simply are all signed
then.  Just you have to also sign-extend in functions like lts_p
as not all constants are sign-extended.  But we can use both tree
(with the now appended zero) and RTL constants representation
unchanged.
I am not following you here.   In trees, the msb is effectively a sign bit,
even for unsigned numbers because we add that extra block.

but inside of wide int, we do not add extra blocks beyond the precision.
That would be messy for a lot of other reasons.
Can you elaborate?  It would make tree and RTX reps directly usable,
only wide-int-to-tree and wide-int-to-rtx need special handling.

Richard.
Richi,

The reason that you had me do the branch was that you wanted to look at the whole thing. I understand that it is large, but one of the things that comes out is that most of the tree level and all of the rtl level want to do fixed precision math. Furthermore, some of the places that use the max wide ints, like tree-ssa-ccp, should not, because not only is it slower, but it means that they miss some important constants. I did not convert that because you did not want me to make changes that would have made it harder to do a/b comparisons of the machine code (and you were completely right about this). But the fact remains that if you mess with the fixed precision and make it look like infinite precision, it will not match what the compiler needs. (And as soon as this patch goes in, tree-ssa-ccp will get a face lift.) We did not have an extra block on top of double-int and we really do not want one on top of wide-int. It is just more garbage that we have to carry around for add, sub, mul, div and shift for the 90% of the code that does not care, because the decision was already made that those bits would not be looked at.

I went along with keeping the upper bits of short blocks canonical because I guessed that it made enough things faster to pay for itself, not to mention that it is unpleasing to have random bits.
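
To illustrate what keeping the upper bits of a short block canonical could look like, here is a minimal sketch. It assumes "canonical" means sign-extended from the value's precision and a 64-bit block; sign_extend_block is a hypothetical helper written for this example, not a function taken from the branch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: treat the low PREC bits of VALUE as a signed
// number and sign-extend it to fill the whole 64-bit block, discarding
// whatever was in the bits above PREC.
std::int64_t sign_extend_block(std::int64_t value, unsigned prec)
{
  assert(prec >= 1 && prec <= 64);
  if (prec == 64)
    return value;  // nothing above the precision to fix up

  const std::uint64_t mask = (std::uint64_t{1} << prec) - 1;
  const std::uint64_t sign = std::uint64_t{1} << (prec - 1);
  const std::uint64_t low = static_cast<std::uint64_t>(value) & mask;

  // Flipping the sign bit and subtracting it again propagates the
  // sign into all higher bits (unsigned arithmetic wraps).
  return static_cast<std::int64_t>((low ^ sign) - sign);
}
```

For example, an 8-bit 0xFF becomes -1 in the block, and stray bits above bit 7 (as in 0x1FF) are replaced by the extension, so two equal values always have identical blocks.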

We can go back to the case where the large unsigned trees look more like the wide-ints. But there is still hair on the conversion back and forth for the things with small precisions.

Kenny

