[trunk r143197] patch adding optional extra marking to GGC

Basile STARYNKEVITCH basile@starynkevitch.net
Mon Jan 12 19:59:00 GMT 2009


Hello All

Ian Lance Taylor wrote:
> Basile STARYNKEVITCH <basile@starynkevitch.net> writes:
>
>   
>> I forgot to insist that I am talking about calling the GGC collector 
>> from *inside* passes. It being called from the pass manager is ok for
>> me, but there are situations where it is not enough. I repeat that my
>> focus is the rare situations where invoking the GGC collector inside a
>> pass is useful. I have no issues with the way it is called from the
>> pass manager.
>>     
>
> I think I'm confused.  Memory which is allocated entirely within a
> single pass, and freed by that pass, should not use the garbage
> collector.
The point is that there are situations where you cannot predict whether 
a piece of data lives only inside a pass or outlives it. The MELT 
framework (which does not claim to be "general purpose"; it is only for 
specialized work, mostly long-lasting analysis, AST processing, or 
prototyping) takes the opposite approach and has already implemented it: 
"everything" is garbage collected. MELT has a "specialized" garbage 
collector, backed up by GGC. So the intuition is that some MELT passes 
allocate a lot of data, a tiny fraction of which may be useful to 
later passes. The point is that we don't want to care about which 
fraction (we don't know whether a given piece of data is temporary or not).

But my proposal would be useful for any kind of data which could 
potentially remain after a pass. Sometimes you don't know whether data 
is temporary to a pass or remains after the pass has ended.

>   I appreciate that in some cases it is easier to not bother
> to keep track of your memory usage.  But in C/C++ that is ultimately
> sloppy coding.  In fact, the reason that we use a garbage collector in
> gcc itself amounts to sloppy coding.  We should strive to eliminate
> uses of the garbage collector, not add to them.
>   
When coding in MELT you don't bother about memory management. You just 
allocate stuff, which will be deallocated at some point (either by the 
MELT GC or by GGC, depending on whether it is freed in a minor or a full 
GC). I really believe that this approach has some benefits (for 
information, ASTREE and Frama-C are both static analysers coded in OCaml, 
where you don't care about manual memory management), but I don't claim 
that all of GCC should go that way.

> Within a single pass, why not just use an obstack?  Does your pass
> really use so much memory that you need to free in the middle, and you
> are also unable to track which memory you are using and which you
> should free?  That seems unusually complicated for a single pass.
>
>
>   

Again, I don't track it. But remember that MELT is a Lisp dialect. You 
(or any MELT user) code in the Lisp dialect (you never code in C within 
MELT [except very small C chunks in a few MELT primitives, which are for 
MELT experts]), and you really don't want to bother about manual memory 
deallocation (this was IIRC one of McCarthy's earliest Lisp insights, 
50 years ago: don't manage memory by hand).
>>>  Would your
>>> needs be satisfied if there were a way to dynamically add roots to the
>>> GC?
>>>   
>>>       
>> It would be satisfied if I can add dynamically a marking *routine*
>> (not only plain struct-s static data). I believe this could be hided
>> as adding an extra root [a fake structure] and providing its marking
>> routine, perhaps using tricky mark_hook or if_marked GTY
>> options. http://gcc.gnu.org/onlinedocs/gccint/GTY-Options.html#GTY-Options
>> But I really don't think it is simpler or safer. My  proposed patch
>> ggc_collect_extra_marking is extremely short (most of the patch is
>> documentation or comments, the real added code is 18 lines).
>>     
>
> The patch is short but it means that people must effectively write
> their own GC traversal routines.  If they are doing that for
> pass-specific data, I see little point to hooking into the main GC at
> all.  Allocate the memory in your own allocation pool instead.
>   
What if the memory is data which persists? A typical example would be 
some kind of (peculiar) optimisation pass coded in MELT. It has to 
allocate GIMPLE nodes!
> In any case, a short patch doesn't mean one that is easy to use.  If
> it is really desirable to hook into the GC, I think it will be easier
> for people to write their own traversal routine and their own root
> set, and let the generic GC code handle it, 
I don't concretely understand this suggestion (probably my English is 
too poor). What is meant by "people writing their traversal routine"? 
The MELT runtime already has code [mostly generated] which traverses all 
of the MELT call stack (for local pointers). How would it get called? 
What do you concretely suggest? I'm open to suggestions, but I really 
don't understand what you mean... How can the generic GC handle 
traversal of pointers without calling some code which marks stuff?
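
To make "code which traverses the call stack for local pointers" 
concrete, here is a minimal sketch of an explicit chain of frames 
holding local pointers, plus a routine that marks everything reachable 
from it. It is a generic shadow-stack illustration under assumed names: 
every identifier (demo_value, demo_frame, demo_mark_call_stack, ...) is 
hypothetical, and it is neither the code MELT generates nor any GGC 
interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical collected value: a mark bit and one outgoing pointer. */
typedef struct demo_value {
  int marked;
  struct demo_value *child;
} demo_value;

#define DEMO_FRAME_SLOTS 4

/* Hypothetical frame: the local pointers of one active call. */
typedef struct demo_frame {
  struct demo_frame *prev;              /* frame of the caller */
  size_t nslots;                        /* number of slots in use */
  demo_value *slots[DEMO_FRAME_SLOTS];  /* local pointers, NULL if unset */
} demo_frame;

static demo_frame *demo_top_frame;      /* innermost active frame */

/* Link a frame in on function entry and clear its slots. */
void demo_push_frame (demo_frame *f, size_t nslots)
{
  assert (nslots <= DEMO_FRAME_SLOTS);
  f->prev = demo_top_frame;
  f->nslots = nslots;
  for (size_t i = 0; i < nslots; i++)
    f->slots[i] = NULL;
  demo_top_frame = f;
}

/* Unlink the innermost frame on function exit. */
void demo_pop_frame (void)
{
  assert (demo_top_frame != NULL);
  demo_top_frame = demo_top_frame->prev;
}

/* Mark a value and everything reachable through its child chain. */
static void demo_mark_value (demo_value *v)
{
  while (v != NULL && !v->marked)
    {
      v->marked = 1;
      v = v->child;
    }
}

/* Walk every active frame and mark from each non-NULL local pointer.
   Returns the number of non-NULL slots visited.  */
size_t demo_mark_call_stack (void)
{
  size_t visited = 0;
  for (demo_frame *f = demo_top_frame; f != NULL; f = f->prev)
    for (size_t i = 0; i < f->nslots; i++)
      if (f->slots[i] != NULL)
        {
          demo_mark_value (f->slots[i]);
          visited++;
        }
  return visited;
}
```

The collector cannot discover these frames by itself, since they are 
not static data; some code like demo_mark_call_stack has to be called 
while the collector is marking.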

> than it will be for them
> to understand the details of how the GC works and when their
> additional traversal code will be invoked.
>
> Your patch, although it is short, appears to me to move gcc toward
> greater complexity.  I'm certainly open to hearing different opinions
> from other people.

I don't see the point here. I already have code (all the MELT runtime) 
which scans all relevant data.

And I am extremely surprised that the discussion around my patch is 
getting so long... Much longer than the patch...

Again, the feature I am pushing would just be used by very few people, 
those (rare) who need to call the GGC collector from inside their 
passes while having some local data (non-static, non-external, i.e. 
either on the C stack or in some heap) which points to GC-ed data (like, 
for instance, incomplete GIMPLE nodes, ...)
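
As an illustration of the idea only, here is a toy mark-and-sweep 
collector whose collection entry point accepts an optional extra 
marking routine, so that a caller can keep alive objects reachable only 
from its local pointers. Everything in it is an assumption made for the 
sketch: the toy_ names are hypothetical, the signature is not the one in 
the patch, and GGC's real data structures are not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy heap object: one outgoing reference and a mark bit. */
typedef struct toy_obj {
  struct toy_obj *next_alloc;   /* links every allocated object */
  struct toy_obj *ref;          /* single outgoing reference */
  int marked;
} toy_obj;

/* Hypothetical type of the caller-supplied extra marking routine. */
typedef void (*toy_extra_marker) (void *data);

static toy_obj *toy_all_objects;  /* list of all live allocations */
static toy_obj *toy_static_root;  /* stands in for statically known roots */

toy_obj *toy_alloc (toy_obj *ref)
{
  toy_obj *o = calloc (1, sizeof *o);
  assert (o != NULL);
  o->ref = ref;
  o->next_alloc = toy_all_objects;
  toy_all_objects = o;
  return o;
}

/* Mark an object and everything reachable through its ref chain. */
void toy_mark (toy_obj *o)
{
  while (o != NULL && !o->marked)
    {
      o->marked = 1;
      o = o->ref;
    }
}

/* Mark from the static root, then let the caller mark its own data,
   then sweep.  Returns the number of objects freed.  */
size_t toy_collect_extra_marking (toy_extra_marker marker, void *data)
{
  size_t freed = 0;
  toy_obj **link = &toy_all_objects;

  toy_mark (toy_static_root);
  if (marker != NULL)
    marker (data);              /* the optional extra marking step */

  while (*link != NULL)
    {
      toy_obj *o = *link;
      if (o->marked)
        {
          o->marked = 0;        /* reset for the next collection */
          link = &o->next_alloc;
        }
      else
        {
          *link = o->next_alloc;
          free (o);
          freed++;
        }
    }
  return freed;
}

/* Example extra marker: DATA is the address of one local toy_obj pointer. */
void toy_mark_local (void *data)
{
  toy_mark (*(toy_obj **) data);
}
```

A pass holding a local pointer would call 
toy_collect_extra_marking (toy_mark_local, &local); passing NULL as the 
marker gives the ordinary collection from static roots only.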

And I still don't see how future plugins (if they happen) would add 
their data to be handled by GGC.

Actually, I believe that
* for most people GGC is basically a mistake, and a good compiler should 
manage all its memory manually, perhaps using C++-like reference 
counters (but then handling cycles is not that simple...)
* for some people (and I am definitely one of them), a garbage collector 
is required in any complex compiler (because a compiler manages complex, 
cyclic data with unknown lifetime). I am not particularly happy with 
GGC specifically [in particular its abnormal lack of local pointer 
handling] but I do think that some good garbage collector is required 
inside GCC. I implemented MELT on that idea.

So assuming GGC stays (which I still hope), I don't understand why 
my ggc_collect_extra_marking proposal raises issues... By 
definition, it should be used by people who understand it (but this is 
tautologically true of any feature) and, as you have noticed, the code 
patch (18 lines) is smaller than the documentation patch. What kind of 
specific trouble could it bring? I would guess that, much like the 
mark_hook GTY option (which I proposed and implemented, and the patch 
was accepted more than a year ago), the ggc_collect_extra_marking 
routine would be very rarely used, but for the few people using it, it 
would be indispensable. So I am extremely surprised by the length of 
this discussion!

Perhaps the ggc_collect_extra_marking name is badly chosen. Feel free 
to suggest a better name.

And how could I invoke GGC with additional marking (required by MELT, 
and already implemented) without a trick similar to 
ggc_collect_extra_marking?

I am sure that future plugins are also concerned with such issues.

(Sorry, my English is poor; I am not a native English speaker.)

Regards.

-- 
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
email: basile<at>starynkevitch<dot>net mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***
member of APRIL, "promoting and defending free software"
Join now: more than 3900 members http://www.april.org


