This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Proposed gcc plugin plugin API mk 2 (this time without camel case!)


On Tue, 2012-04-03 at 15:23 +0200, Richard Guenther wrote:
> On Tue, Apr 3, 2012 at 12:03 PM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
> > On Mon, Apr 2, 2012 at 7:21 PM, David Malcolm <dmalcolm@redhat.com> wrote:
> >> I wrote a script and ported my proposed API for GCC plugins from my
> >> CamelCase naming convention to an underscore_based_convention (and
> >> manually fixed up things in a few places also).
> >>
> >> The result compiles and passes the full test suite for the Python
> >> plugin; that said, I'm still breaking the encapsulation in quite a few
> >> places (hey, this is an experimental prototype).
> >>
> >> You can see the latest version of it within the "proposed-plugin-api"
> >> branch of the Python plugin here:
> >> http://git.fedorahosted.org/git/?p=gcc-python-plugin.git;a=shortlog;h=refs/heads/proposed-plugin-api
> >>
> >> within the "proposed-plugin-api" subdirectory.
> >
> > Hmm, how do I get it?  I did
> >
> > git clone http://git.fedorahosted.org/git/proposed-plugin-api
> >
> > but there is nothing in gcc-python-plugin/.  And
> >
> > git checkout proposed-plugin-api
> >
> > says I'm already there ...?
> 
> Meanwhile the directory magically appeared (heh ...).

[The ways of git are something of a mystery to me: 95% of the time it's
the best revision control system I've ever used, but 5% of the time the
most obtuse]

> Overall it looks good
Thanks for taking a look.

>  - but it seems to leak GCC headers into the
> plugin API (via gcc-plugin.h and input.h inclusion).  Thus, it
> lacks separating the plugin API headers from the plugin API implementation
> headers?  
That's true.  The big information "leak" happens inside
gcc-semiprivate-types.h, which defines the various structs that act like
pointers, each with a decl like this:

struct gcc_something {
   some_private_gcc_pointer_type inner;
};

It would be possible to make this more opaque like this:

struct gcc_something {
   struct some_private_gcc_struct *inner;
};

given that you then don't need a full definition of that inner struct
visible to users.  Though location_t is leaked, and in this approach,
there isn't a way around that, I think.


> Or maybe I'm not looking at the right place.
> I also dislike the use of GCC_PUBLIC_API, etc. - everything in
> the plugin API headers should be obviously public - and the implementation
> detail should be an implementation detail that should not need to care.

I added that macro based on the example of other plugin/embedding
systems I've seen (e.g. Python), where it's handy to make that macro be
non-trivial on some platforms.  See e.g. CPython's pyport.h:
 http://hg.python.org/cpython/file/9599f091faa6/Include/pyport.h
where the macros PyAPI_FUNC and PyAPI_DATA have a 44-line definition,
handling various different compatibility cases.

For example, GCC_PRIVATE_API could have some magic linker flag that lets
us automatically strip out those symbols so that they're not visible
externally after the compiler has been linked.

> >> If this landed e.g. in GCC 4.8, would this be usable by other plugins?
> 
> For sure.  I'd say get the copyright stuff sorted out and start pushing things
> into the main GCC repository - where the obvious starting point is to
> get the makefile, configure and install parts correct.
> 
> I'd concentrate on providing enough API for introspection, like simple
> things, counting basic-blocks, counting calls, etc.  I'm not sure we
> want to expose gcc_gimple_walk_tree or gcc_gimple_print (or
> the gcc_printers for a start) - the latter would something that the
> python plugin
> would provide, resorting to introspecting the stmt itself.
FWIW the Python plugin already heavily uses gcc's pretty-printer code,
so that *is* something I'd want to wrap (but it's fairly easy to do so).

> I also wonder about gcc_gimple_phi_upcast - why would you specialize
> PHI nodes but not any others?  I'd have exposed gcc_gimple_code.
> In general the plugin API should provide exactly one (and the most flexible)
> way to do a thing (thus, not have gcc_gimple_assign_single_p) and
> the high-level consumers like the python plugin should provide
> nice-to-use interfaces.

This touches on the biggest thing that's missing in the API: the
multitude of tree types, gimple statement types, and rtl types.  All I
added were the "base classes", and gimple-phi is the only instance so
far in the API of a subclass.

I would like the API to expose these (my plugin uses them to good
effect), but doing it in C is likely to be messy: lots of upcasting and
downcasting functions.

An alternative approach might be to bite the bullet and go to C++; maybe
something like this (caveat: I last did any C++ about a decade ago, no
idea if the following compiles):

// Everything exposed in the public headers are pure virtual functions
// leading to abstract classes: no public data
namespace gcc {

   // Abstract base class for objects managed by GCC's garbage-collector
   class gcmanaged {
   public:
      virtual void mark_in_use() const = 0;
   };

   class location : public gcmanaged {
       // it isn't gcmanaged yet, but might become so
   public:
      virtual std::string get_filename() const = 0;
      virtual int get_line() const = 0;
      virtual int get_column() const = 0;
   };

   namespace gimple {
       // enum to assist in RTTI by non-C++ code:
       enum code {
           INVALID_GIMPLE_CODE = -1,
	   GIMPLE_NOP = 0,
	   GIMPLE_ASSIGNMENT,
	   GIMPLE_SWITCH,
	   //etc; don't have to expose all of the regular gimple codes.
           NUM_CODES,
       };

       class statement : public gcmanaged {
       public:
	   virtual location *get_location() const = 0;
	   virtual enum code get_code() const = 0;
       };

       class nop : public statement {
       };

       class phi : public statement {
       public:
           virtual tree::ssaname* get_lhs() const = 0;
           virtual std::list<tree::expr*, cfg::edge*> get_rhs() const = 0;
           // or whatever
       };
   };

   namespace tree {
       class declaration : public gcmanaged {
       public:
	   virtual std::string get_name() const = 0;
	   virtual location*   get_location() const = 0;
       };

       class funcdecl :: public declaration {
       public:
	   virtual function* get_function() const = 0;
	   virtual std::list<parmdecl*> get_arguments() const = 0;
	   virtual resultdecl* get_result() const = 0;
           // etc: accessor functions
       };

       class type : public gcmanaged {
       // etc...
       };

       class integertype :: public type {
       // etc...
       };

       // does there actually need to be a "tree" base class?  I think
       // the tcc_codes express the base classes, at a first approximation
       // could even lose the "tree" concept altogether
   };
};

[though I'd greatly prefer to capitalize the classnames (sigh) :) ]

The implementation would have entirely separate (private) headers, with
something like:

class location_impl : public gcc::location {
private:
   location_t loc;

public:
    // ctor:
    location_impl(location_t loc) : loc(loc) {} 

    // implement the public API:
    void mark_in_use() const {
         // empty: location_t currently isn't gc managed
    }

    std::string get_filename() const;
    int get_line() const;
    int get_column() const;
};

class phi_impl : public gcc::gimple::phi {
private:
    gimple stmt; // of correct subtype

public:
    // gcmanaged:
    void mark_in_use() const;

    // ctor:
    phi(gimple stmt);

    // accessors:
    location *get_location() const;
    enum code get_code() const;

    tree::ssaname *get_lhs() const;
    std::list<tree::expr*, cfg::edge*> get_rhs() const;
};

That way the implementation is *entirely* hidden, with entirely separate
headers (though during prototyping I'd need an escape hatch).  But it
does mean that the API is handing off pointers-to-pointers, rather than
just pointers (allocating lots of tiny wrapper objects, like the ones
above, so that the caller doesn't know the size, and so there can be a
vtable).  

Does this sound sane?

Long-term, perhaps these impl classes could perhaps *become* the
implementations - everything's hidden.

Thanks again for looking at this
Dave


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]