This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Combine location with block using block_locations


On Wed, Sep 12, 2012 at 10:44 PM, Dehao Chen <dehao@google.com> wrote:
> Attached is the memory consumption report for a very large source
> file. Looks like this patch actually reduced the memory consumption by
> 2%.

Please make sure to test large C++ expression template users.  Large
regular programs do not stress this part.

Richard.

> Dehao
>
> On Thu, Sep 13, 2012 at 1:18 AM, Xinliang David Li <davidxl@google.com> wrote:
>> On Wed, Sep 12, 2012 at 10:05 AM, Dehao Chen <dehao@google.com> wrote:
>>> There are two parts that needs memory management:
>>>
>>> 1. The BLOCK structure. This is managed by GC. I originally thought
>>> that removing blocks from tree.gsbase would paralyze GC. This turned
>>> out not to be a concern because DECL_INITIAL will still mark those
>>> used tree nodes. This patch may decrease the memory consumption by
>>> removing blocks from tree/gimple. However, as it makes more blocks
>>> become used, they also increase the memory consumption.
>>
>> You mean when you also make the location table GC root.
>>
>> Can you share the mem-stats information for the large program with and
>> without your patch?
>>
>> thanks,
>>
>> David
>>
>>> 2. The data structure in libcpp that maintains the hashtable for the
>>> location->block mapping. This is relatively minor because for the
>>> largest source I've seen, it only maintains less than 100K entries in
>>> the array (less than 1M total memory consumption). However, as it is a
>>> global data structure, it may make LTO unhappy. Honza is helping
>>> testing the memory consumption on LTO (but we first need to make this
>>> patch work for LTO). If the LTO result turns out ok, we probably don't
>>> want to put these under GC because: 1. it'll make things much more
>>> complicated. 2. using self managed memory is more efficient (as this
>>> is frequently used in many passes). 3. not using GC actually saves
>>> memory because even though the block is in the map, it can still be
>>> GCed as soon as it's not reachable from DECL_INITIAL.
>>>
>>> I've tested this on some very large C++ files (each one takes more
>>> than 10s to build), the memory consumption does not see noticeable
>>> increase/decrease.
>>>
>>> Thanks,
>>> Dehao
>>>
>>> On Wed, Sep 12, 2012 at 9:39 AM, Xinliang David Li <davidxl@google.com> wrote:
>>>> On Wed, Sep 12, 2012 at 2:13 AM, Richard Guenther
>>>> <richard.guenther@gmail.com> wrote:
>>>>> On Wed, Sep 12, 2012 at 7:06 AM, Dehao Chen <dehao@google.com> wrote:
>>>>>> Now I think we are facing a more complex problem. The data structure
>>>>>> we use to store the location_adhoc_data are file-static in linemap.c
>>>>>> in libcpp. These data structures are not guarded by GTY(()).
>>>>>> Meanwhile, as we have removed the block data structure from
>>>>>> gimple.gsbase as well as tree.exp (encoding them into an location_t).
>>>>>> This could cause block being GCed and the LOCATION_BLOCK becoming
>>>>>> dangling pointers.
>>>>>
>>>>> Uh.  Note that it is quite important that we are able to garbage-collect unused
>>>>> BLOCKs, this is the whole point of removing unused BLOCK scopes in
>>>>> remove_unused_locals.  So this indeed becomes much more complicated ...
>>>>> What would be desired is that the garbage collector can NULL an entry in
>>>>> the mapping table when it is not referenced in any other way (that other
>>>>> reference would be the BLOCK tree as stored in a FUNCTION_DECLs DECL_INITIAL).
>>>>
>>>> It would be nice to GC those unused BLOCKS. I wonder how many BLOCKS
>>>> are created for a large C++ program. This patch saves memory by
>>>> shrinking tree size, is it a net win or loss without GC those BLOCKS?
>>>>
>>>> thanks,
>>>>
>>>> David
>>>>
>>>>
>>>>>
>>>>>> I tried to manipulate GTY to make it recognize the LOCATION_BLOCK from
>>>>>> gimple.gsbase.location. However, neigher nested_ptr nor mark_hook can
>>>>>> help me.
>>>>>>
>>>>>> Another approach would be guard the location_adhoc_data and related
>>>>>> data structures in GTY(()). However, this is non-trivial because tree
>>>>>> is not visible in libcpp. At the same time, my implementation heavily
>>>>>> relies on hashtable to make the code efficient, thus it's quite tricky
>>>>>> to make "param_is" and "use_params" work.
>>>>>>
>>>>>> The final approach, which I'll try tomorrow, would be move all my
>>>>>> implementation from libcpp to gcc, and guard them with GTY(()). I
>>>>>> still haven't thought of any potential problem of this approach. Any
>>>>>> comments?
>>>>>
>>>>> I think moving the mapping to GC in a lazy manner as I described above
>>>>> would be the way to go.  For hashtables GC already supports if_marked,
>>>>> not sure if similar support is available for arrays/vecs.
>>>>>
>>>>> Richard.
>>>>>
>>>>>> Thanks,
>>>>>> Dehao
>>>>>>
>>>>>> On Tue, Sep 11, 2012 at 9:00 AM, Dehao Chen <dehao@google.com> wrote:
>>>>>>> I saw comments in tree-streamer-out.c:
>>>>>>>
>>>>>>>   /* Do not stream BLOCK_SOURCE_LOCATION.  We cannot handle debug information
>>>>>>>      for early inlining so drop it on the floor instead of ICEing in
>>>>>>>      dwarf2out.c.  */
>>>>>>>   streamer_write_chain (ob, BLOCK_VARS (expr), ref_p);
>>>>>>>
>>>>>>> However, what the code is doing seemed contradictory with the comment.
>>>>>>> Or am I missing something?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Sep 11, 2012 at 8:32 AM, Michael Matz <matz@suse.de> wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On Tue, 11 Sep 2012, Dehao Chen wrote:
>>>>>>>>
>>>>>>>>> Looks like we have two choices:
>>>>>>>>>
>>>>>>>>> 1. Stream out block info, and use LTO_SET_PREVAIL for TREE_CHAIN(t)
>>>>>>>>
>>>>>>>> This will actually not work correctly in some cases.  The problem is, if
>>>>>>>> the prevailing decl is already part of another chain (say in another
>>>>>>>> block_var list) you would break the current chain.  Hence block vars need
>>>>>>>> special handling in the lto streamer (another reason why tree_chain is not
>>>>>>>> the most clever think to use for this chain).  This problem area needs to
>>>>>>>> be solved somehow if block info is to be preserved correctly.
>>>>>>>>
>>>>>>>>> 2. Don't stream out block info for LTO, and still call LTO_NO_PREVAIL
>>>>>>>>> (TREE_CHAIN (t)).
>>>>>>>>
>>>>>>>> That's also a large hammer as it basically will mean no debug info after
>>>>>>>> LTO :-/ Sigh, at this point I have no good solution that doesn't involve
>>>>>>>> quite some work, perhaps your hack is good enough for the time being,
>>>>>>>> though I hate it :)
>>>>>>>
>>>>>>> I got it. Then I'll keep the patch as it is (remove the
>>>>>>> LTO_NO_PREVAIL), and work with Honza to resolve the issue he had, and
>>>>>>> then we should be good to check in?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Dehao
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Ciao,
>>>>>>>> Michael.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]