[PATCH] irange_pool class

Fri Sep 18 20:35:12 GMT 2020

On 9/18/20 11:36 AM, Andrew MacLeod wrote:
> On 9/18/20 1:07 PM, Martin Sebor wrote:
>> On 9/18/20 8:10 AM, Aldy Hernandez via Gcc-patches wrote:
>>>
>>>
>>> On 9/18/20 2:28 PM, David Malcolm wrote:
>>>> On Fri, 2020-09-18 at 07:49 +0200, Aldy Hernandez wrote:
>>>>>
>>>>> On 9/18/20 3:43 AM, David Malcolm wrote:
>>>>>> On Thu, 2020-09-17 at 12:36 +0200, Aldy Hernandez via Gcc-patches
>>>>>> wrote:
>>>>>>> This is the irange storage class. It is used to allocate the
>>>>>>> minimum
>>>>>>> amount of storage needed for a given irange.  Storage is
>>>>>>> automatically
>>>>>>> freed at destruction.
>>>>>>>
>>>>>>> It is meant for long term storage, as opposed to int_range_max
>>>>>>> which
>>>>>>> is
>>>>>>> meant for intermediate temporary results on the stack.
>>>>>>>
>>>>>>> The general gist is:
>>>>>>>
>>>>>>>     irange_pool pool;
>>>>>>>
>>>>>>>     // Allocate an irange of 5 sub-ranges.
>>>>>>>     irange *p = pool.allocate (5);
>>>>>>>
>>>>>>>     // Allocate an irange of 3 sub-ranges.
>>>>>>>     irange *q = pool.allocate (3);
>>>>>>>
>>>>>>>     // Allocate an irange with as many sub-ranges as are currently
>>>>>>>     // used in "some_other_range".
>>>>>>>     irange *r = pool.allocate (some_other_range);
>>>>>>
>>>>>> FWIW my first thoughts reading this example were - "how do I
>>>>>> deallocate
>>>>>> these iranges?" and "do I need to call pool.deallocate on them, or
>>>>>> is
>>>>>> that done for me by the irange dtor?"
>>>>>
>>>>> Thus the description front and center of the header file:
>>>>>
>>>>> "Storage is automatically freed at destruction..."
>>>>>
>>>>>> I think of a "pool allocator" as something that makes a small
>>>>>> number of
>>>>>> large allocation under the covers, and then uses that to serve
>>>>>> large
>>>>>> numbers of fixed sized small allocations and deallocations with
>>>>>> O(1)
>>>>>> using a free list.
>>>>>
>>>>> Ah, I didn't know pool had a different meaning.
>>>>
>>>> See e.g. gcc/alloc-pool.h
>>>>
>>>>
>>>>>> [...]
>>>>>>
>>>>>>> +// This is the irange storage class.  It is used to allocate the
>>>>>>> +// minimum amount of storage needed for a given irange.  Storage
>>>>>>> is
>>>>>>> +// automatically freed at destruction.
>>>>>>
>>>>>> "at destruction" of what object - the irange or the irange_pool?
>>>>>> Reading the code, it turns out to be "at destruction of the
>>>>>> irange_pool", and it turns out that irange_pool is an obstack under
>>>>>> the
>>>>>> covers (also called a "bump allocator") and thus, I believe, the
>>>>>> lifetime of the irange instances is that of the storage instance.
>>>>>
>>>>> The sentence is talking about the storage class, so I thought it was
>>>>> obvious that the destruction we talk about is the storage class
>>>>> itself.
>>>>> I suppose if it isn't clear we could say:
>>>>>
>>>>> "Storage is automatically freed at destruction of the storage class."
>>>>
>>>>
>>>>>> I think it would be clearer to name this "irange_obstack", or
>>>>>> somesuch.
>>>>>
>>>>> I'd prefer something more generic.  We don't want to tie the name of
>>>>> the
>>>>> allocator to the underlying implementation.  What if we later change
>>>>> to
>>>>> malloc?  We'd have to change the name to irange_malloc.
>>>>
>>>>> irange_allocator?  Or is there something more generically appropriate
>>>>> here?
>>>>
>>>> How about "irange_bump_allocator?"   Rather long, but it expresses the
>>>> key point that the irange instances coming out of it don't have
>>>> independent lifetimes - their lifetimes are those of the allocator; the
>>>> client code doesn't need to find and clean up all of those individual
>>>> iranges, right?  (assuming I'm reading the code correctly) (That name
>>>> also avoids mentioning the implementation detail that it uses obstack).
>>>
>>> I'm sorry, but that's _really_ ugly.
>>>
>>> Surely irange_allocator can't be that confusing.  A casual look at 
>>> the header file would dispel all doubts.
>>
>> David's point abut different strategies was also in the back my
>> mind but it took me a bit to formulate a question about the design.
>> Is a pool of ranges with a fixed allocation policy the right design
>> for long-term storage of irange objects?  What are some example use
>> cases?
>>
>> Here's one that comes to mind based on what I want to do in
>> gimple-ssa-isolate-paths.c.  I need to store irange instances as
>> members of my own class, but I don't know how many subranges each
>> instance might need to store (it depends on the IL).  I store
>> objects of this class a container (e.g., hash_map or set).
>> I don't want to use int_range_max because that would be wasteful
>> but I can't use the pool as a proxy because it's not copyable.
>> So I either have to dynamically allocate the pool and store
>> a pointer to it instead, in addition to the instances, or derive
>> my own class from irange and manage the tree arrays myself.  In
>> both cases I'm adding a layer of memory management on top of what
>> that the pool is there to provide.  So the design doesn't seem
>> very well suited for my use case.
>>
>> I don't mean this as an objection to the patch (I'm sure there's
>> a use case I'm not thinking of), but more as a question.
>>
>> Martin
> 
> 
> I don't know why  this is confusing...
> 
> it works exactly like one would expect a simple allocator to work.. as 
> long as the allcoator is "live", its allocations are live.  once it is 
> destructed, all the memory it manages is freed..    It purpose is to 
> avoid having to walk/find everything that was allocated so it can be freed.
> 
> I'll give you the use case from the ranger. in fact, it *is* the 
> ranger's allocator, exposed for other passes to use.
> 
> Ranger uses the allocator to store the live-on-entry ranges for 
> ssa-names.  Each basic block has a vec<irange *> allocated as needed 
> which is indexed by ssa-name.
> 
> int_range_max is passed to range_on_entry() to go off and find the 
> range..  when it comes back, it could have 0-255 subranges,. it doesnt 
> matter.
> we call allocate(range) to get a pointer to an efficent copy and store 
> it in the vector for the ssa name in that block.
> If the updater later discovers that the range can be made more accurate, 
> it checks if the new range fits in the memory allocated and if it does, 
> simply copies.  if it doesnt fit, then it frees the old hunk, and 
> allocates  a new one and stores that.
> 
> The ranger creates an allocator when it starts up, and then frees it 
> when its being destructed.  Thats the life of the on-entry cache, so 
> thats matches the life of the allocator..
> 
> I don't understand the proxy comment, or why one would ever want to copy 
> the allocator itself? or why would you derive from irange? why cant you 
> just create an allocator when the pass starts, use it when you need to 
> store a range, and then let it go at the end of the pass with the other 
> memory?

The int_range template is derived from irange and provides the array
of trees that the irange works with.  The pool also provides memory
for iranges and effectively returns objects "derived" from irange
(they're bigger than it is).

What I meant by proxy is a substitute class each of whose objects
stands in for a single instance of int_range<N> where N is
a runtime value.   There's no class like that.

> 
> its mean to be a convenient way to get a range allocated to store via a 
> pointer. nothing more.  if you have more complex needs,then you need to 
> manage those needs.  The ranger manages the live on entry vectors 
> separately from the ranges..

What I'm thinking of is actually more basic than that: an irange
class with a runtime number of subranges, one that can be either
directly stored in a container like vec:

   vec<dynamic_irange>

where dynamic_range is such a class, or that can be a member of
a class that's stored in it.  I expect that will be the default
use case for the passes that walk the IL looking for the sizes
and offsets into the objects, accesses to which they validate.
This can be done with int_range<N> for constant N but not with
irange because it doesn't own the memory it works with).

To illustrate what I mean here's a completely untested outline
of a plain-Jane dynamic_irange class (it won't compile because
it accesses private and protected members of irange, but it
should give you the idea):

   class dynamic_irange: public irange
   {
   public:
     dynamic_irange (unsigned num_pairs)
       : irange (new tree[num_pairs], num_pairs) { }

     dynamic_irange (const dynamic_irange &rhs)
       : irange (new tree[rhs.m_max_ranges], rhs.m_num_ranges)
     { irange::operator= (rhs); }

     dynamic_irange (dynamic_irange &&rhs)
       : irange (rhs.m_base, rhs.m_max_ranges)
     { rhs.m_base = 0; }

     dynamic_irange& operator= (const dynamic_irange &rhs)
     {
       if (this != &rhs)
         {
           delete[] m_base;
           m_base = new tree[rhs.m_max_ranges];
           m_num_ranges = rhs.m_num_ranges;
           irange::operator= (rhs);
         }
       return *this;
     }
     ~dynamic_irange () { delete[] m_base; }
   };

A more fancy class would be parameterized on an Allocator policy
that it would allocate memory with, much like C++ standard classes
do.  That would let you define an obstack-based allocator class to
use the way you need, as well and any others.  (Providing
the allocator as a template argument to just the ctor as opposed
to the whole class itself would make different "instances"
interchangeable for one another.)

Martin

> 
> It currently based on an obstack, and it works exactly like an obstack 
> does...  provide hunks of memory with a specific known lifetime. nothing 
> more, nothing less.
> 
> 
> Andrew
>