[RFC, PATCH]: Introduction of callgraph annotation class

Thu Oct 16 11:44:00 GMT 2014

On 10/16/2014 01:31 PM, Richard Biener wrote:
> On Wed, Oct 15, 2014 at 6:26 PM, Martin Liška <mliska@suse.cz> wrote:
>> Hello.
>>
>> The following patch introduces a new class called callgraph_annotation. The idea
>> behind the patch is to provide a generic interface one can use to register
>> custom info related to a cgraph_node. As you know, symbol_table provides
>> hooks for creation, deletion and duplication of a cgraph_node. If you have a
>> pass, you need to handle all these hooks and store custom data in your own data
>> structure.
>>
>> As an example, after discussion with Martin, I chose usage in ipa-prop.h:
>>
>> data structure:
>> vec<ipa_node_params> ipa_node_params_vector
>>
>> if the pass handles an event, following chunk is executed:
>> if (ipa_node_params_vector.length () <= (unsigned) symtab->cgraph_max_uid)
>>      ipa_node_params_vector.safe_grow_cleared (symtab->cgraph_max_uid + 1);
>>
>> The problem is that you can have sparse UIDs of cgraph_nodes and every time
>> you have to allocate a vector of size equal to cgraph_max_uid.
>>
>> As a replacement, I implemented a first version of cgraph_annotation that
>> internally uses hash_map<cgraph_unique_identifier, T>.
>> Every time a node is deleted, we remove the corresponding data associated with the
>> node.
>>
>> What do you think about it?
>
> I don't like "generic annotation" facilities at all.  Would it be possible
> to make cgraph UIDs not sparse?  (keep a free-list of cgraph nodes
> with UID < cgraph_max_uid, only really free nodes at the end)
> Using a different data structure than a vector indexed by cgraph UID
> should also be easily possible (a map from UID to data, hash_map <int, T>).

Hello.

If I recall correctly, we recycle cgraph_nodes, so the same UID can be given to different nodes (see symbol_table::allocate_cgraph_symbol (void)). Such a UID is problematic because it cannot be used as an index into a vector.

Honza also noted that one can choose the inner implementation of such an annotation class: we can implement both a sparse (hash_map) and a consecutive (vector) data structure.

According to the first numbers I was given, Inkscape allocates about 64k cgraph_nodes in WPA. After function merging is processed, that shrinks to about half, so our free list then contains half of the nodes. If we use a consecutive vector, our memory footprint is bigger than necessary.
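
To make the trade-off concrete, here is a minimal sketch of the two storage options, not the code from the patch. It assumes std::unordered_map as a stand-in for GCC's hash_map, plain int UIDs, and hypothetical names (node_info, sparse_annotation, dense_annotation, slots_after_removing_half). The remove () methods model what a node-removal hook would do.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical per-node data a pass might keep (not GCC's ipa_node_params).
struct node_info
{
  int param_count = 0;
};

// Sparse variant: only live nodes occupy storage.
// Assumption: std::unordered_map stands in for GCC's hash_map.
template <typename T>
class sparse_annotation
{
public:
  // Return the data for UID, default-constructing an entry if none exists.
  T &get (int uid) { return m_map[uid]; }

  // Would be called from a node-removal hook: drop the entry for UID.
  void remove (int uid) { m_map.erase (uid); }

  std::size_t allocated_slots () const { return m_map.size (); }

private:
  std::unordered_map<int, T> m_map;
};

// Dense variant: a vector indexed by UID, grown to the largest UID seen + 1.
template <typename T>
class dense_annotation
{
public:
  T &get (int uid)
  {
    std::size_t idx = static_cast<std::size_t> (uid);
    if (m_vec.size () <= idx)
      m_vec.resize (idx + 1);
    return m_vec[idx];
  }

  // The slot is reset, but the vector keeps its size.
  void remove (int uid)
  {
    std::size_t idx = static_cast<std::size_t> (uid);
    if (idx < m_vec.size ())
      m_vec[idx] = T ();
  }

  std::size_t allocated_slots () const { return m_vec.size (); }

private:
  std::vector<T> m_vec;
};

// Create N nodes with UIDs 0..N-1, remove every second one, and report
// how many slots the container still holds.
template <typename Annotation>
std::size_t slots_after_removing_half (int n)
{
  Annotation a;
  for (int uid = 0; uid < n; uid++)
    a.get (uid).param_count = uid;
  for (int uid = 0; uid < n; uid += 2)
    a.remove (uid);
  return a.allocated_slots ();
}
```

With 1000 nodes, slots_after_removing_half<sparse_annotation<node_info>> (1000) leaves 500 entries, while the dense variant still holds 1000 slots. The dense version pays for every UID ever handed out; the sparse one pays hashing overhead per lookup instead.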

Martin

>
> Richard.
>
>> Thank you,
>> Martin