[PATCH] irange_pool class

Andrew MacLeod amacleod@redhat.com
Sun Sep 20 00:40:28 GMT 2020


On 9/19/20 4:32 PM, Martin Sebor wrote:
> On 9/18/20 3:09 PM, Andrew MacLeod wrote:
>>> What I meant by proxy is a substitute class each of whose objects
>>> stands in for a single instance of int_range<N> where N is
>>> a runtime value.   There's no class like that.
>>>
>
> [Just to be clear, I don't mean for this discussion to hold up
> the patch if you need the pool internally.  Another class like
> the one I propose can be added later if necessary.]
>
It won't  :-)


>>
>> No, but that doesn't serve a lot of purpose either.  You can call 
>> allocator.allocate (N), which is effectively that... ?
>
> Yes, but the allocator object isn't always conveniently accessible.
> Imagine needing to allocate a dynamically sized irange in a copy
> ctor of some class that's stored in a vec, as the vec is being
> resized.  The allocator would need to be a global pointer.  That's
> of course possible but not the most elegant design.
>
I am now imagining an overly complex C++ system that doesn't really fit 
with GCC :-)  I don't think GCC's vec could deal with that model.

When that's a reality, we'll deal with it; for now, I'm not overly concerned.
If we reintroduce a dynamic object, we'll worry about it then, and make 
sure a dynamic object can be associated with an allocation object.
<....>

>>>
>>>
>>> A more fancy class would be parameterized on an Allocator policy
>>> that it would allocate memory with, much like C++ standard classes
>>> do.  That would let you define an obstack-based allocator class to
>>> use the way you need, as well as any others.  (Providing
>>> the allocator as a template argument to just the ctor as opposed
>>> to the whole class itself would make different "instances"
>>> interchangeable for one another.)
>>>
>>> Martin
>>
>>   We had a dynamically sized irange; I told Aldy to kill it off and we 
>> replaced it with int_range_max with 255 ranges, because the combo of 
>> int_range_max for calculating and the allocator for storing seemed to solve 
>> all the problems with far less allocation overhead, and wasn't 
>> particularly onerous.
>>
>> int_range_max used to have a small vector of something like 5 pairs. 
>> If a range was being created and we needed more, by accessing elements 
>> higher than that, it would allocate a hunk of memory to be able to 
>> handle it, plus a little extra buffer space, and point to that 
>> instead.  So it was a dynamically sized int_range_max that managed its 
>> own memory.  We found that in practice the pairing of the current 
>> int_range_max and the allocation pool was far more efficient 99% of 
>> the time, so we just eliminated it.
>>
>> Something like that could certainly be revived... but most of the 
>> time it doesn't seem necessary.  Generally, you need to ask for a 
>> range and then store it.  Ask for it with int_range_max, and store it 
>> with the allocator if you are going to need a lot of them.  So instead of
>>
>> range_of_expr (vec[x], name..)
>>
>> you do
>>
>> int_range_max m;
>> range_of_expr (m, name);
>> vec[x] = allocator.allocate (m);
>>
>> Do you really need 6 or 10 subranges to find out the answer to the 
>> questions you are looking for?  Most of the time, 2 or 3 pairs 
>> carry all the information anyone needs, and that's efficient.  Switches 
>> are the biggest beneficiary of the multiple ranges, allowing us to be 
>> quite precise about what reaches the interior of a case or the default.
>>
>> The next question is, how many of these do you need?  The ranger is 
>> doing it with the allocator because it could in theory have #BB * 
>> #SSA_NAMES ranges, which could be a lot.  If you have just one or two 
>> vectors over SSA names, and they are sparsely filled, just use 
>> int_range_max.
>
> The use case I'm thinking of would have an irange of some size for
> every decl or result of an allocation call accessed in a function,
> plus another irange for every SSA_NAME that's an object pointer.
> The size of the first irange would be that of the allocation
> argument in the first case.  In the second case it would be
> the combined range of the offsets of the pointer from whatever it
> points to (e.g., in p0 = &x; p1 = p0 + i; p2 = p1 + j; p2's
> offset range would be Ri + Rj, where R is the value range of i
> or j).
>
> It probably doesn't make sense to keep track of 255 subranges
> (or even many more than 5) but in compliance with the guidance
> in the irange best practices document to write code for [as if]
> infinite subranges, the data structures should be free of any
> hardwired limit.  So I envision I might have something like
I think you are interpreting that guidance too literally.
It would perhaps be better to word it more in line with what is 
practical.  If what you are interested in is the upper and lower limit, 
then 1 or 2 sub-ranges is plenty; you really don't care about the middle 
ranges.
If you are a switch operation, then infinite might be appropriate.
Pointers tracking null and non-null?  int_range<1> is likely enough.

You are dealing with offsets; you really don't care about those middle 
sub-ranges.  Really.  I see little benefit for you from that.  The 
benefit for you is the more accurate, easier-to-use ranges.

For now, I would simply use an int_range<2> or <3> and see if you miss 
something you expect.  That's what I started with internally in the 
ranger, and 95% (made-up stat) of ranges in the compiler are represented 
by an int_range<3>.
Remove switches and we're at 98.5% (another made-up number).
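
As a rough sketch of what that looks like (placeholder names only; I'm 
assuming the range_of_expr interface discussed up-thread, with 'ranger', 
'name' and 'stmt' standing in for whatever your pass has handy):

  // A fixed-size range on the stack; the ranger trims its result to fit
  // when it copies the answer back.
  int_range<3> r;
  if (ranger.range_of_expr (r, name, stmt) && !r.undefined_p ())
    {
      // For offset tracking only the outer bounds matter.
      wide_int lo = r.lower_bound ();
      wide_int hi = r.upper_bound ();
      // ... use lo/hi ...
    }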

Perhaps we should rework that part of the document.  I think maybe the 
exuberance of being able to work with infinite subranges is overstated.


It's probably worth reintroducing the dynamic version at some point, but 
I really don't think you need it.
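
For reference, here is the int_range_max-plus-pool pattern from up-thread 
written out a bit more fully.  A sketch only: 'pool' and 'cache' are 
made-up names, and I'm assuming the pool class from this patch provides an 
allocate (const irange &) overload that returns a right-sized copy:

  irange_pool pool;                       // the allocator from this patch
  irange **cache = XCNEWVEC (irange *, num_ssa_names);

  int_range_max m;                        // full precision while computing
  if (ranger.range_of_expr (m, name, stmt))
    // Store a copy sized to the sub-ranges m actually uses.
    cache[SSA_NAME_VERSION (name)] = pool.allocate (m);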

> a pair of dynamic_range members in each of these objects (along
> with the SSA_NAME and other stuff), and some runtime parameter
> to cap the number of subranges to some reasonable limit, merging
> those in excess of it.
>

If you pass in an int_range<3>, internally the ranger will work with 
infinite ranges, but it will compress the result automatically to an 
int_range<3> when it returns the result to you.  There won't be any loss 
of precision during calculations, only in what is passed back to you.
Perhaps that information is not clear either.  And I think you mostly 
care about the upper and lower limits?  (And overflow, but that's a 
different story.)  int_range<2> is probably sufficient, maybe even 
int_range<1>...
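
In other words, something like this should hold (again just a sketch to 
make the compression point, with the same placeholder names as above):

  int_range_max full;
  int_range<2> small;
  ranger.range_of_expr (full, name, stmt);
  ranger.range_of_expr (small, name, stmt);
  // The internal computation is identical; 'small' is only squashed down
  // on the way out, so the outer bounds still agree.
  if (!full.undefined_p ())
    gcc_assert (full.lower_bound () == small.lower_bound ()
                && full.upper_bound () == small.upper_bound ());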


Andrew


