This is the mail archive of the
mailing list for the GCC project.
Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon
I spent a good part of the afternoon talking to Mike about this. He is
on the c++ standards committee and is a much more seasoned c++
programmer than I am.
He convinced me that with a large amount of engineering and c++
"foolishness" that it was indeed possible to get your proposal to
POSSIBLY work as well as what we did.
But now the question is why would any want to do this?
At the very least you are talking about instantiating two instances of
wide-ints, one for the stack allocated uses and one for the places where
we just move a pointer from the tree or the rtx. Then you are talking
about creating connectors so that the stack allocated functions can take
parameters of pointer version and visa versa.
Then there is the issue that rather than just saying that something is a
wide int, that the programmer is going to have to track it's origin.
In particular, where in the code right now i say.
wide_int foo = wide_int::from_rtx (r1);
wide_int bar = wide_int::from_rtx (r2) + foo;
now i would have to say
wide_int_ptr foo = wide_int_ptr::from_rtx (r1);
wide_int_stack bar = wide_int_ptr::from_rtx (r2) + foo;
then when i want to call some function using a wide_int ref that
function now must be either overloaded to take both or i have to choose
one of the two instantiations (presumably based on which is going to be
more common) and just have the compiler fix up everything (which it is
likely to do).
And so what is the payoff:
1) No one except the c++ elite is going to understand the code. The rest
of the community will hate me and curse the ground that i walk on.
2) I will end up with a version of wide-int that can be used as a medium
life container (where i define medium life as not allowed to survive a
gc since they will contain pointers into rtxes and trees.)
3) An no clients that actually wanted to do this!! I could use as an
example one of your favorite passes, tree-vrp. The current double-int
could have been a medium lifetime container since it has a smaller
footprint, but in fact tree-vrp converts those double-ints back into
trees for medium storage. Why, because it needs the other fields of a
tree-cst to store the entire state. Wide-ints also "suffer" this
problem. their only state are the data, and the three length fields.
They have no type and none of the other tree info so the most obvious
client for a medium lifetime object is really not going to be a good
match even if you "solve the storage problem".
The fact is that wide-ints are an excellent short term storage class
that can be very quickly converted into our two long term storage
classes. Your proposal is requires a lot of work, will not be easy to
use and as far as i can see has no payoff on the horizon. It could be
that there could be future clients for a medium lifetime value, but
asking for this with no clients in hand is really beyond the scope of a
I remind you that the purpose of these patches is to solve problems that
exist in the current compiler that we have papered over for years. If
someone needs wide-ints in some way that is not foreseen then they can
On 11/26/2012 11:30 AM, Richard Biener wrote:
On Mon, Nov 26, 2012 at 5:03 PM, Kenneth Zadeck
On 11/26/2012 10:03 AM, Richard Biener wrote:
I thought they were all short-lived.
On Mon, Nov 5, 2012 at 2:59 PM, Kenneth Zadeck <firstname.lastname@example.org>
On 11/04/2012 11:54 AM, Richard Biener wrote:
On Thu, Nov 1, 2012 at 2:10 PM, Richard Sandiford
Kenneth Zadeck <email@example.com> writes:
I would like you to respond to at least point 1 of this email. In it
there is code from the rtl level that was written twice, once for the
case when the size of the mode is less than the size of a HWI and once
for the case where the size of the mode is less that 2 HWIs.
my patch changes this to one instance of the code that works no matter
how large the data passed to it is.
you have made a specific requirement for wide int to be a template
can be instantiated in several sizes, one for 1 HWI, one for 2 HWI.
would like to know how this particular fragment is to be rewritten in
this model? It seems that I would have to retain the structure where
there is one version of the code for each size that the template is
I think richi's argument was that wide_int should be split into two.
There should be a "bare-metal" class that just has a length and HWIs,
and the main wide_int class should be an extension on top of that
that does things to a bit precision instead. Presumably with some
template magic so that the length (number of HWIs) is a constant for:
typedef foo<2> double_int;
and a variable for wide_int (because in wide_int the length would be
the number of significant HWIs rather than the size of the underlying
array). wide_int would also record the precision and apply it after
the full HWI operation.
So the wide_int class would still provide "as wide as we need"
as in your rtl patch. I don't think he was objecting to that.
That summarizes one part of my complaints / suggestions correctly. In
mails I suggested to not make it a template but a constant over object
'bitsize' (or maxlen) field. Both suggestions likely require more
I put into them. The main reason is that with C++ you can abstract from
wide-int information pieces are stored and thus use the arithmetic /
workers without copying the (source) "wide-int" objects. Thus you
be able to write adaptors for double-int storage, tree or RTX storage.
We had considered something along these lines and rejected it. I am not
really opposed to doing something like this, but it is not an obvious
winning idea and is likely not to be a good idea. Here was our thought
if you abstract away the storage inside a wide int, then you should be
to copy a pointer to the block of data from either the rtl level integer
constant or the tree level one into the wide int. It is certainly true
that making a wide_int from one of these is an extremely common operation
and doing this would avoid those copies.
However, this causes two problems:
1) Mike's first cut at the CONST_WIDE_INT did two ggc allocations to
the object. it created the base object and then it allocated the array.
Richard S noticed that we could just allocate one CONST_WIDE_INT that had
the array in it. Doing it this way saves one ggc allocation and one
indirection when accessing the data within the CONST_WIDE_INT. Our plan
to use the same trick at the tree level. So to avoid the copying, you
to have to have a more expensive rep for CONST_WIDE_INT and INT_CST.
I did not propose having a pointer to the data in the RTX or tree int.
the short-lived wide-ints (which are on the stack) would have a pointer to
the data - which can then obviously point into the RTX and tree data.
There is the issue then what if some wide-ints are not short lived. It makes
me nervous to create internal pointers to gc ed memory.
No, those would simply use the embedded storage model.
2) You are now stuck either ggcing the storage inside a wide_int when
are created as part of an expression or you have to play some game to
represent the two different storage plans inside of wide_int.
Hm? wide-ints are short-lived and thus never live across a garbage
point. We create non-GCed objects pointing to GCed objects all the time
and everywhere this way.
Again, this makes me nervous but it could be done. However, it does mean
that now the wide ints that are not created from rtxes or trees will be more
expensive because they are not going to get their storage "for free", they
are going to alloca it.
however, it still is not clear, given that 99% of the wide ints are going to
fit in a single hwi, that this would be a noticeable win.
Currently even if they fit into a HWI you will still allocate 4 times the
larges integer mode size. You say that doesn't matter because they
are short-lived, but I say it does matter because not all of them are
short-lived enough. If 99% fit in a HWI why allocate 4 times the
largest integer mode size in 99% of the cases?
is where you think that we should be going by suggesting that we abstract
away the internal storage. However, this comes at a price: what is
currently an array access in my patches would (i believe) become a
No, the workers (that perform the array accesses) will simply get
a pointer to the first data element. Then whether it's embedded or
external is of no interest to them.
so is your plan that the wide int constructors from rtx or tree would just
copy the pointer to the array on top of the array that is otherwise
allocated on the stack? I can easily do this. But as i said, the gain
seems quite small.
And of course, going the other way still does need the copy.
The proposal was to template wide_int on a storage model, the embedded
one would work as-is (embedding 4 times largest integer mode), the
external one would have a pointer to data. All functions that return a
wide_int produce a wide_int with the embedded model. To avoid
the function call penalty you described the storage model provides
a way to get a pointer to the first element and the templated operations
simply dispatch to a worker that takes this pointer to the first element
(as the storage model is designed as a template its abstraction is going
to be optimized away by means of inlining).
Of course - GCing wide-ints is a non-starter.
From a performance point of view, i believe that this is a non
starter. If you can figure out how to design this so that it is not a
function call, i would consider this a viable option.
On the other side of this you are clearly correct that we are copying the
data when we are making wide ints from INT_CSTs or CONST_WIDE_INTs.
this is why we represent data inside of the wide_ints, the INT_CSTs and
CONST_WIDE_INTs in a compressed form. Even with very big types, which
generally rare, the constants them selves are very small. So the copy
operation is a loop that almost always copies one element, even with
tree-vrp which doubles the sizes of every type.
There is the third option which is that the storage inside the wide int
just ggced storage. We rejected this because of the functional nature of
wide-ints. There are zillions created, they can be stack allocated,
they last for very short periods of time.
As is probably obvious, I don't agree FWIW. It seems like an
complication without any clear use. Especially since the number of
Maybe the double_int typedef is without any clear use. Properly
abstracting from the storage / information providers will save
compile-time, memory and code though. I don't see that any thought
was spent on how to avoid excessive copying or dealing with
long(er)-lived objects and their storage needs.
I actually disagree. Wide ints can use a bloated amount of storage
because they are designed to be very short lived and very low cost
that are stack allocated. For long term storage, there is INT_CST at
tree level and CONST_WIDE_INT at the rtl level. Those use a very compact
storage model. The copying entailed is only a small part of the overall
Well, but both trees and RTXen are not viable for short-lived things
the are GCed! double-ints were suitable for this kind of stuff because
the also have a moderate size. With wide-ints size becomes a problem
(or GC, if you instead use trees or RTXen).
Everything that you are suggesting along these lines is adding to the
of a wide-int object.
On the contrary - it lessens their weight (with external already
or does not do anything to it (with the embedded storage).
You have to understand there will be many more
wide-ints created in a normal compilation than were ever created with
double-int. This is because the rtl level had no object like this at
and at the tree level, many of the places that should have used double
short cut the code and only did the transformations if the types fit in a
Your argument shows that the copy-in/out from tree/RTX to/from wide-int
will become a very frequent operation and thus it is worth optimizing it.
I'm sure you did.
This is why we are extremely defensive about this issue. We really did
think a lot about it.