This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Cgraph Modification Plan


On Wed, Sep 5, 2012 at 10:14 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> On Wed, Sep 5, 2012 at 5:41 PM, Lawrence Crowl <crowl@googlers.com> wrote:
>> > On 9/5/12, Xinliang David Li <davidxl@google.com> wrote:
>> >> On Sep 5, 2012 Jan Hubicka <hubicka@ucw.cz> wrote:
>> >> > OK, the basic idea is that symtab_node is basetype of
>> >> > cgraph_node and varpool_node.  We may want to drop the historica
>> >> > cgraph/varpool names here, since function_node/variable_node
>> >> > would sound better. Cgraph still exists, but varpool is more
>> >> > or less historical relic.
>> >>
>> >> The cgraph redesign probably deserves more discussion.
>> >
>> > Hence, the post!
>> >
>> >> 1) It may be worthwhile to abstract the graph manipulation code
>> >> into a utility class which is templatized.
>> >>
>> >> graph<T>, node<T> with node inheriting from T.
>> >
>> > While I won't dispute the statement, getting from here to there
>> > will likely involve some intermediate steps.
>>
>> Yes. My point is that the intermediate steps should be driven by the
>> final goal. It is important to first have  the design of final class
>> hierarchy and interfaces nailed and then the intermediate steps
>> defined instead of the other way around.
>
> Yes, it seems resonable to me. We just need to keep in mind the specifics
> callgraph has (i.e. many types of edges that are relevant in different
> contextes and have good reason for different kind of in memory reppresentation)
> There are also different types of nodes - functions, variables, aliases. I plan
> to add labels in 4.8 timeframe, too).

Areas that are confusing and need clean up (IMO) include:
1) handling of aliases and clones
2) reachability, needed, analyzed bits.  The needed bit is not in sync
with the call to check if a function is needed.  Bits and attributes
used in analysis should be isolated out from the core data structure.
3) ipa references -- should they be modeled as special edges?

>>
>> >
>> >> 2) Introduce a global symbol table containing a function table
>> >> and a global variable table. The function table should replace
>> >> the current cgraph node link list, and the variable table replaces
>> >> the varpool.  The symbol table should provide basic interfaces to
>> >> do named based lookup, traversal, alias handling etc.  I noticed
>> >> trunk already has some of that -- but it seems more abstraction
>> >> is needed.
>> >
>> > Do you mean moving away from a pointer-based approach?
>>
>> See above. I mean it is important to have well defined symtab
>> interfaces that hide implementation as much as possible. It will make
>> the interfaces more stable. It is currently quite difficult for
>> cgraph/varpool related changes in gcc branches to keep up with trunk
>> without stable core APIs.
>
> Well, the APIs did not changed much over years (LTO brought in some more busy
> changes, but many were just hacks that needed redesign anyway), in 4.8 we
> changed quite a lot because of introduction of the symbol table. It would be
> really nice to get as much as C++ conversion into 4.8, too, since we broke most
> of existing patches anyway, to get more stability frm 4.9 up.
>
>> >> 3) it seems natural to drop 'node' in the naming scheme
>> >>
>> >> symbol (symtab_entry) --> base class;
>> >> function --> derived from symbol;  (It conflicts with the existing
>> >> struct function though);
>> >> variable --> derived from symbol;
>> >>
>> >> typedef node<function> cnode;
>> >
>> > A node<function> is not a derived class of node<symbol> even when
>> > function is derived from symbol.  That property is helpful in
>> > ensuring usable type safety.
>>
>>
>> We don't need node<symbol> -- only node<function> is needed, and it is
>> derived from function and function is derived from symbol.
>
> Yes, this is where I expected to land.  I expect to see more complex inheritance
> tree.  Currently we have:
>
> symbol
>   |
>   +-function (cgraph)
>   |
>   +-variable (varpool)
>
> While function can have five types - just external declarations, actual
> definitions with body (analyzed flag is set), thunk, alias and virtual clone
>
> Similarly variables are declarations, ones with body (analyzed flag is set),
> and aliases.
>
> We do not represent labels and const_decl references. We ought: those are also
> symbols and they are important for partitioning.
>
> We probably want rither structure here in longer term:
>
> symbol
>   |
>   +-function declaration
>   | |
>   | +-function definition
>   | |
>   | +-function alternative entry point (thunk)
>   | |
>   | +-function alias
>   | |
>   | +-virtual clone
>   |
>   +-variable declaration
>   | |
>   | +-variable definition (possibly constant pool references via const_decl here, too)
>   | |
>   | +-variable alias
>   |
>   +-function label definition
>
> In particular we do not want to store all the stuff we store for function
> definitions for declarations.

True, but that is at the cost of too complex class hierarchy. Is it
enough to have 'has_body()' to differentiate def vs declarations (the
meaning of 'is_external' is overloaded)? The additional information
can be accessed via indirection (to additional data structure).

For  aliases, why can't those nodes be merged into one node (mapped
from all assembler names in the symtab)? aliases will just be
attribute of the symbol).

Thunks are only referenced from vtables. Is there a need to create
cgraph nodes for them?

The clone handling code is also quite complicated. Is it possible to
simplify it?

>
> variable/function aliases do have a lot in common (similarly for
> function/variable definitions), too. The reason why I did not introduce
> separate alias_node type is that function aliases are different from variable
> aliases in several ways. For example, they can not be destinations of function
> calls.
>>
>> >
>> >> 4) coding convention is needed for functions that do 'casting'
>> >> and 'trial casting'.
>> >
>> > The functions I have suggestion are ptr_... and try_... respectively.
>> > Are you suggesting amending the coding conventions?
>> >
>>
>> Is this something agreed upon in previous discussions or something
>> newly brought up? How about using  template function in base to do
>> casting:
>>
>> trial cast:
>>
>> template<typename T> T* symbol::cast_to(symbol* p) {
>>    if (p->is<T>())
>>       return static_cast<T*>(p);
>>    return 0;
>>  }
>>
>> cast:
>>
>> template<typename T> T& symbol:as(symbol* p) {
>>    assert(p->is<T>())
>>    return static_cast<T&>(*p);
>>
>>  }
>
> This also seems fine to me (shorter API names are preferable for me, even if I
> still conisder <...> in code disturbing ;)
>
> This is not the only case, it really depends on conding standards we employ for
> GCC datastructures in general.  I suppose cgraph/varpool is just firt of several
> conversions we plan to do in foreseeable future.
>
> Another thing that calls for C++ API is the way we annotate cgraph/varpool with
> additional information in the IPA passes.  I.e. the whole area using node/edge
> creation/duplicatoin/deletion hooks.

I smell plugin classes such as Listeners, Visitors etc.

thanks,

David

>
> Honza


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]