[TODO] The discussion should be eventually merged with GCC Plugin API page?.. (probably not so much the whole discussion, but rather the findings/results of it?)
[TODO] Add Sean's plugin system comparison
[TODO] Agree on the minimalistic common API for now and later extend it ...
Motivation
The plugins are intended to be a way of extending the functionality of GCC. We foresee three complementary categories of GCC plugins, depending on the nature of the extension: production, quick prototyping/experimentation/research, and new pass integration. Each category calls for slightly different API features:
production (low-level) plugins (Mozilla, MELT et al.), which extend GCC's functionality as a production compiler, need efficiency and close integration with GCC internal structures. Such plugins will be written by GCC developers, who by definition have an intimate knowledge of GCC internals. The API should be as efficient as possible to minimize the overhead of performance-critical operations (e.g., fine-grain data dependence analysis, cf. Mozilla's interests). To achieve this goal, the API should provide unlimited direct access to GCC internal data structures, with the plugin code fully responsible for their consistency. The set of triggering events is stable and clearly defined, and can therefore be hard-coded directly in the compiler. Eventually, this plugin system can also be used to integrate new passes written in languages other than C, etc.
experimental (high-level) plugins (ICI, MELT, MILEPOST et al) should promote experimentation and rapid prototyping of new features for GCC to foster its technology innovation rate. GCC is the only open-source production-quality compiler that supports more than 30 different families of architectures and a large number of optimizations. It is of great interest to the large research community which may have a lot of relevant compiler technology knowledge, but little familiarity with GCC internals.
To make GCC more accessible to this community, the plugins should abstract the internal state of the compiler, require minimal changes to GCC and should offer a very generic API to enable quick prototyping of new ideas and boost innovation in compiler technology. For the experimentation plugins, it is not so much the complete program representation that is interesting as its selected properties, and only those should be exported. Also, the feedback from the plugin is typically not a transformed program, but a decision different from what the default GCC behaviour would suggest (different pass order, different parameter values etc.).
In this experimentation context, trigger events can be arbitrary (any decision point in the compiler can trigger an "interesting event"), and event parameters can be arbitrary as well, calling for a free-form binding of event parameter names to actual internal data of GCC. To achieve this, the "experimentation plugin" API should offer a flexible pass-by-name interface with a simple set of generic concepts (passes, events, callbacks, callback parameters, general compiler state hooks). Where and when necessary, the indirection level (and thus, the overhead) introduced by the explicit naming of manipulated entities could be shunted by directly accessing the corresponding internal data structures, meaning that the "high-level plugin API" could be seen as an extension of the efficient "production plugin" API.
Experimental features implemented using the experimental plugin API should be moved to the production API if/when they stabilize. An additional benefit is that new features developed in this "staged" way are likely to use only the strictly necessary information, eventually contributing to the streamlining of GCC's internal design.
ICI specific note: of course, one could object that this is opening the door for non-GPL code, but then, the GPL v3 FAQ says that the GPLed code can communicate with non-GPL code "at arm's length", and the interface we propose does exactly that: no awareness of / access to GCC header files, zero knowledge of GCC's internal data structures, meaning that the plugin code has no way to access/use GCC internals other than the information "judged acceptable" by the hooks in the compiler.
Side-by-side comparison of production and experimental plugin APIs
Aspect | Production (low-level) plugins | Experimental (high-level) plugins |
Purpose | Implement features for release (production) compilers, simplify development of new passes (potentially write them in other languages) | Quick experimentation and prototyping of new features for future GCC versions while abstracting GCC internals |
Expected impact | Addition of new features without compromising the speed or efficiency of the compiler | Increase the rate of innovation in GCC by attracting new technologies and new developers |
Targeted community | GCC developers | General compiler developers and researchers (not necessarily familiar with all GCC internals) |
Design goals and constraints | efficiency and stability; access to all aspects of the compiler; access to complete programs | flexibility; access to abstracted information; access to selected information |
Implementation complexity and status | Depends, prototype exists | Minimal changes to existing GCC code, prototype exists |
Prototype | Mozilla, MELT et al. | Interactive Compilation Interface (ICI), MILEPOST GCC |
GCC branch | GCC API branch | Tracks latest GCC trunk (currently SVN r144014), should be merged with GCC API branch |
Current usage (available plugins) | to be updated | INRIA, IBM, ARC, STMicroelectronics, Univ. of Edinburgh, HiPEAC institutions: |
Current development and support | to be updated | INRIA, IBM, HiPEAC community and other users |
Approach details:
The ICI plugin approach focusses on providing access to a simplified view of compiler state. There are five kinds of entities available in the ICI "prototyping/experimentation plugin" API:
- passes,
- events,
- callback functions,
- callback arguments (parameters),
- features.
Features provide an interface to selected elements of compiler state and configuration, in particular:
- name of the current function (cfun),
- name of the current and the next pass (print names must be defined for all passes!),
- list and current values of command-line options,
- list and current values of compiler parameters,
- compiler version string,
- current working directory of the compiler, etc.
All entities except passes and features are identified by free-form textual names, and can be dynamically registered and unregistered. Both passes and features have predefined names (for passes - the print name), and cannot be registered or unregistered dynamically. The association between names and actual objects is maintained using hash tables, making the access to actual object data reasonably fast.
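The name-to-object association described above can be sketched as a small chained hash table. This is only an illustration, not the ICI implementation: the function names (register_name, unregister_name, lookup_name), the bucket count and the choice of the djb2 hash are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define NAME_TABLE_BUCKETS 64   /* hypothetical size, for illustration only */

/* One binding between a textual name and an arbitrary object. */
struct name_entry {
  const char *name;          /* not copied: the caller keeps it alive */
  void *object;
  struct name_entry *next;   /* chain of entries sharing a bucket */
};

static struct name_entry *name_table[NAME_TABLE_BUCKETS];

/* djb2 string hash, reduced to a bucket index. */
static size_t name_bucket (const char *name)
{
  unsigned long h = 5381;
  while (*name)
    h = h * 33 + (unsigned char) *name++;
  return (size_t) (h % NAME_TABLE_BUCKETS);
}

/* Return the link that points to the entry for NAME, or the terminating
   NULL link of its bucket if NAME is not bound. */
static struct name_entry **find_link (const char *name)
{
  struct name_entry **link = &name_table[name_bucket (name)];
  while (*link != NULL && strcmp ((*link)->name, name) != 0)
    link = &(*link)->next;
  return link;
}

/* Bind NAME to OBJECT.  Returns 0 on success, -1 if NAME is already
   bound or memory is exhausted. */
int register_name (const char *name, void *object)
{
  struct name_entry **link;
  struct name_entry *entry;

  assert (name != NULL);
  link = find_link (name);
  if (*link != NULL)
    return -1;
  entry = malloc (sizeof *entry);
  if (entry == NULL)
    return -1;
  entry->name = name;
  entry->object = object;
  entry->next = NULL;
  *link = entry;
  return 0;
}

/* Remove the binding for NAME.  Returns 0 on success, -1 if unbound. */
int unregister_name (const char *name)
{
  struct name_entry **link = find_link (name);
  struct name_entry *entry = *link;

  if (entry == NULL)
    return -1;
  *link = entry->next;
  free (entry);
  return 0;
}

/* Return the object bound to NAME, or NULL if NAME is not bound. */
void *lookup_name (const char *name)
{
  struct name_entry *entry = *find_link (name);
  return entry != NULL ? entry->object : NULL;
}
```

With a table like this, dynamically registered entities (events, callbacks, parameters) could share one registration mechanism, while passes and features would be entered once at start-up and never removed.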
The key differences from Mozilla's "production plugin" API are:
- the possibility of creating an arbitrary pass manager which can invoke any pass by its name;
  - the well-foundedness of executing the pass at any given point of the compilation process is not (yet) verified and must be checked independently;
- the passing of callback parameters through separate named objects instead of a single opaque pointer to non-const user data;
  - prior to triggering an event, the triggering side must set up the parameters for this event instance. In the ICI API this is done by registering the parameters, which creates temporary bindings between parameter names and values;
- the use of textual names to designate all entities; for events this is actually a minor difference, since the enum-based and name-based designations are interchangeable as long as the printable event names are all defined and unique; the events in the Mozilla API already have both an enum-based ID and a printable name, so the two approaches can be easily merged;
- the generic concept of features to access information about compiler state/configuration.
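The temporary binding of named parameters around an event can be sketched as follows. The function names follow the ICI prototypes listed on this page; everything else is an assumption for illustration: the fixed-size linear tables, the MAX_ENTRIES limit, the use of bool for the gate status, and the helper names query_gate and force_gate_open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 16   /* hypothetical limit for this sketch */

typedef void (*callback_func) (void);

struct event_parameter { const char *name; void *value; };
struct plugin_event { const char *name; callback_func run; };

/* Free slots are marked by a NULL name. */
static struct event_parameter params[MAX_ENTRIES];
static struct plugin_event events[MAX_ENTRIES];

static struct event_parameter *find_param (const char *name)
{
  size_t i;
  for (i = 0; i < MAX_ENTRIES; i++)
    if (params[i].name != NULL && strcmp (params[i].name, name) == 0)
      return &params[i];
  return NULL;
}

/* Bind NAME to VALUE, reusing an existing binding or the first free slot. */
void register_event_parameter (const char *name, void *value)
{
  size_t i;
  struct event_parameter *p = find_param (name);
  for (i = 0; p == NULL && i < MAX_ENTRIES; i++)
    if (params[i].name == NULL)
      p = &params[i];
  assert (p != NULL);   /* table full: a real implementation would grow */
  p->name = name;
  p->value = value;
}

void unregister_event_parameter (const char *name)
{
  struct event_parameter *p = find_param (name);
  if (p != NULL)
    p->name = NULL;
}

const void *get_event_parameter (const char *name)
{
  struct event_parameter *p = find_param (name);
  return p != NULL ? p->value : NULL;
}

/* Store FUNC in the first free slot (one callback per event name). */
void register_plugin_event (const char *name, callback_func func)
{
  size_t i;
  for (i = 0; i < MAX_ENTRIES; i++)
    if (events[i].name == NULL)
      {
        events[i].name = name;
        events[i].run = func;
        return;
      }
}

/* Run the callback registered for NAME; return true if one was found. */
bool call_plugin_event (const char *name)
{
  size_t i;
  for (i = 0; i < MAX_ENTRIES; i++)
    if (events[i].name != NULL && strcmp (events[i].name, name) == 0)
      {
        events[i].run ();
        return true;
      }
  return false;
}

/* Plugin side (hypothetical): force the gate open through the named
   parameter.  The const is cast away because the bound object is writable. */
void force_gate_open (void)
{
  bool *gate = (bool *) get_event_parameter ("gate_status");
  if (gate != NULL)
    *gate = true;
}

/* Compiler side (hypothetical): bind, trigger, unbind, then read back. */
bool query_gate (bool default_status)
{
  bool gate_status = default_status;
  register_event_parameter ("gate_status", &gate_status);
  call_plugin_event ("avoid_gate");
  unregister_event_parameter ("gate_status");
  return gate_status;
}
```

Because the binding exists only between the register and unregister calls, a plugin cannot reach the value outside the event, which is the intended contrast with a long-lived opaque user-data pointer.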
The Way Forward
Implement both: they are complementary, and each plugin API already has its own community, so we may need both to make GCC lead ...
ICI Patch and plugins are available.
The "production" and "experimentation" APIs each already have an existing developer community. It would be good to combine them as seamlessly as possible, factoring out the common parts, preventing any duplication of work and leveraging the expertise built so far.
API design
The plugin API could be layered, with the Mozilla "production plugin" API as the base layer, and the "experimentation plugin" ICI API extensions on top. The Mozilla API will provide a fast mechanism for direct communication between the plugins and the internal data structures of GCC, including low-level information. The experimentation (abstraction?) layer will provide an additional mechanism for exchanging higher-level information between GCC and the plugins. When using the "experimentation" ICI API layer, the plugin will have less access to the compiler internals, but will be freed from ensuring the consistency of low-level data structures. This purposely limits the functionality scope of prototyping/experimental plugins, but it will also make their development easier and faster.
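One possible shape of this layering is sketched here: name-based, argument-less events are implemented on top of an enum-based callback layer. The base layer is a simplified stand-in, not GCC code. The PLUGIN_EVENT_COUNT sentinel, the printable event names, the one-callback-per-event restriction and the helper names (named_adapter, event_id_from_name, count_unit) are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* --- Base layer: stand-in for the enum-based "production" API. --- */
enum plugin_event {
  PLUGIN_PASS_MANAGER_SETUP,
  PLUGIN_FINISH_UNIT,
  PLUGIN_FINISH,
  PLUGIN_EVENT_COUNT   /* hypothetical sentinel, not in the original enum */
};

typedef void (*plugin_callback_func) (void *gcc_data, void *user_data);

/* One callback per event, for brevity. */
static struct {
  plugin_callback_func callback;
  void *user_data;
} base_slots[PLUGIN_EVENT_COUNT];

void register_callback (char *name, enum plugin_event event,
                        plugin_callback_func callback, void *user_data)
{
  (void) name;   /* the display name is not used in this sketch */
  base_slots[event].callback = callback;
  base_slots[event].user_data = user_data;
}

void invoke_plugin_callbacks (enum plugin_event id, void *gcc_data)
{
  if (base_slots[id].callback != NULL)
    base_slots[id].callback (gcc_data, base_slots[id].user_data);
}

/* --- Upper layer: name-based, argument-less events. --- */
typedef void (*callback_func) (void);

/* Hypothetical printable names, one per enum value. */
static const char *const event_names[PLUGIN_EVENT_COUNT] = {
  "pass_manager_setup", "finish_unit", "finish"
};

static callback_func named_callbacks[PLUGIN_EVENT_COUNT];

/* Adapter: user_data points at the slot holding the named callback. */
static void named_adapter (void *gcc_data, void *user_data)
{
  callback_func *slot = user_data;
  (void) gcc_data;   /* the upper layer passes data through named parameters */
  if (*slot != NULL)
    (*slot) ();
}

static int event_id_from_name (const char *name)
{
  int id;
  for (id = 0; id < PLUGIN_EVENT_COUNT; id++)
    if (strcmp (event_names[id], name) == 0)
      return id;
  return -1;
}

/* Register FUNC for the event called NAME; unknown names are ignored. */
void register_plugin_event (const char *name, callback_func func)
{
  int id = event_id_from_name (name);
  if (id < 0)
    return;
  named_callbacks[id] = func;
  register_callback ((char *) name, (enum plugin_event) id,
                     named_adapter, &named_callbacks[id]);
}

/* Trigger the event called NAME; return true if a callback handled it. */
bool call_plugin_event (const char *name)
{
  int id = event_id_from_name (name);
  if (id < 0 || named_callbacks[id] == NULL)
    return false;
  invoke_plugin_callbacks ((enum plugin_event) id, NULL);
  return true;
}

/* Example upper-layer callback (hypothetical). */
int units_seen;
void count_unit (void) { units_seen++; }
```

In this arrangement a callback registered by name is also reached when the compiler triggers the event through the enum-based entry point, so the two designations stay interchangeable.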
API implementation process
The API implementation process should happen on the plugins branch. Since the plugin support is quite orthogonal to other compiler features, regular tracking of trunk developments should not be a big overhead, and will reduce the cost of the future integration into the trunk.
Testing and QA
The QA constraints are the same as for other branches: bootstrap builds on multiple platforms, addition of unit tests to the testsuite, regular testsuite runs, etc.
Details of the different APIs:
Production (low-level) plugins:
enum plugin_event {
  PLUGIN_PASS_MANAGER_SETUP,    // to hook into pass manager
  PLUGIN_FINISH_STRUCT,
  PLUGIN_FINISH_UNIT,           // useful for summary processing
  PLUGIN_CXX_CP_PRE_GENERICIZE, // allows to see low level AST in C++ FE
  PLUGIN_FINISH,                // called before GCC exits
  ...
};

// The prototype for a plugin callback function.
//   gcc_data  - event-specific data provided by GCC
//   user_data - plugin-specific data provided by the plug-in
typedef void (*plugin_callback_func)(void *gcc_data,
                                     void *user_data);

struct callback_registration {
  char* name;                    // display name for this plug-in
  enum plugin_event event;       // which event the callback is for
  plugin_callback_func callback; // the callback to be called at the event
  void* user_data;               // user-specified data passed in user_data
};

// Called from inside GCC. Invokes all plug-in callbacks compatible with an event.
//   id       - the event identifier. There should be exactly one for each call.
//   gcc_data - event-specific data provided by the compiler
void invoke_plugin_callbacks(enum plugin_event id,
                             void *gcc_data);

// Called from the plugin's initialization code. Registers a number of callbacks.
// This function can be called multiple times.
//   nregistrations - the size of the registrations array
//   registrations  - an array of registrations
void register_callbacks(int nregistrations,
                        struct callback_registration *registrations);

// A wrapper for the above that registers one callback.
// All parameters are as described for struct callback_registration.
void register_callback(char* name,
                       enum plugin_event event,
                       plugin_callback_func callback,
                       void* user_data);

struct plugin_argument {
  char *key;
  char *value;
};

// The prototype for a module initialization function. Each module should define this
// as an externally-visible function with name "plugin_init."
//   argc - the size of the argv array
//   argv - an array of key-value pairs
typedef void (*plugin_init_func)(int argc, struct plugin_argument *argv);
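A plugin written against the prototypes above might look like the following sketch. The compiler side (register_callback, invoke_plugin_callbacks) is replaced by a small stand-in so the example is self-contained, and only two enum values are reproduced. The plugin name "unit-counter", the "verbose" argument key and the unit_stats structure are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Subset of the declarations listed above. */
enum plugin_event { PLUGIN_FINISH_UNIT, PLUGIN_FINISH };
typedef void (*plugin_callback_func) (void *gcc_data, void *user_data);
struct plugin_argument { char *key; char *value; };

/* --- Stand-in for the compiler side of the API (assumption). --- */
#define MAX_CALLBACKS 8

static struct {
  enum plugin_event event;
  plugin_callback_func callback;
  void *user_data;
} registry[MAX_CALLBACKS];
static int registry_size;

void register_callback (char *name, enum plugin_event event,
                        plugin_callback_func callback, void *user_data)
{
  (void) name;   /* display name unused in this stand-in */
  assert (registry_size < MAX_CALLBACKS);
  registry[registry_size].event = event;
  registry[registry_size].callback = callback;
  registry[registry_size].user_data = user_data;
  registry_size++;
}

void invoke_plugin_callbacks (enum plugin_event id, void *gcc_data)
{
  int i;
  for (i = 0; i < registry_size; i++)
    if (registry[i].event == id)
      registry[i].callback (gcc_data, registry[i].user_data);
}

/* --- Plugin side. --- */
struct unit_stats {
  int units_finished;   /* how many PLUGIN_FINISH_UNIT events were seen */
  int verbose;          /* set from the hypothetical "verbose" argument */
};

struct unit_stats stats;

/* Event callback: plugin state arrives through the opaque user_data. */
static void on_finish_unit (void *gcc_data, void *user_data)
{
  struct unit_stats *s = user_data;
  (void) gcc_data;
  s->units_finished++;
}

/* Module entry point: read key/value arguments, then register. */
void plugin_init (int argc, struct plugin_argument *argv)
{
  static char plugin_name[] = "unit-counter";
  int i;

  for (i = 0; i < argc; i++)
    if (strcmp (argv[i].key, "verbose") == 0)
      stats.verbose = (strcmp (argv[i].value, "1") == 0);

  register_callback (plugin_name, PLUGIN_FINISH_UNIT, on_finish_unit, &stats);
}
```

The single user_data pointer is what keeps this interface cheap: no name lookup happens when the event fires, at the price of the plugin and the compiler agreeing on the layout of gcc_data for each event.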
Experimental (high-level) plugins:
// -----------------------------------------------------------------------
// Pass manager. Two primitives:
// - initialize the pass table
// - invoke a pass identified by its (print) name
void init_passes (void);
void run_pass (const char *pass_name);

// -----------------------------------------------------------------------
// Event parameter: associates a name with a value, typically abstracted
// from the program representation.
struct event_parameter {
  const char *name;   // name for the parameter
  void *value;        // pointer to data
};

// -----------------------------------------------------------------------
// Callback function: argument-less.
// All args (both incoming and outgoing) are passed through named parameters.
typedef void (*callback_func) (void);

// -----------------------------------------------------------------------
// Events associate textual name with the callback.
struct plugin_event {
  const char *name;    // name for the event
  callback_func run;   // callback function
};

// -----------------------------------------------------------------------
// Event parameters are accessed by name. Primitives:
// - register and set an event parameter,
// - unregister (and clear) an event parameter,
// - get the value of an event parameter,
// - list all registered event parameter names.
void register_event_parameter (const char *name, void *value);
void unregister_event_parameter (const char *name);
const void *get_event_parameter (const char *name);
const char **list_event_parameters (void);

// -----------------------------------------------------------------------
// Events are accessed by event name. Primitives:
// - register an event callback (FORNOW: only one callback per event),
// - unregister an event completely,
// - trigger an event (return true if event was handled),
// - list all registered event names.
void register_plugin_event (const char *name, callback_func func);
void unregister_plugin_event (const char *name);
bool call_plugin_event (const char *event_name);
const char **list_plugin_events (void);

// -----------------------------------------------------------------------
// Features: interface to compiler configuration or state (cmdline options,
// params, cfun, version string, cwd, etc.). Sub-features are used to
// identify individual cmdline options or params.
struct feature {
  const char *name;     // feature name
  void *data;           // pointer for static feature data or memoization
  long int data_size;   // size of the data - for copying etc.
  const void * (*callback) (void);
                        // callback for dynamic feature extraction
  const void * (*get_subfeature) (const char *name);
                        // callback for dynamic subfeature extraction
  void * (*set_subfeature) (const char *name, void *value);
                        // callback to set values of subfeatures
};

// -----------------------------------------------------------------------
// Features and subfeatures are accessed by name. Primitives:
// - initialize the feature table,
// - get a feature,
// - get a subfeature of a given feature,
// - set a subfeature of a given feature (return value: user-definable),
// - get the size of the stored data of a feature (in bytes),
// - get the size of the stored data of a subfeature (in bytes),
// - get the array of names of all available features,
// - get the count of all available features,
// - get the array of names of all subfeatures of a given feature,
// - get the count of all subfeatures of a given feature.
void init_features (void);
const void *get_feature (const char *feature_name);
const void *get_subfeature (const char *feature_name, const char *subfeat_name);
void *set_subfeature (const char *feature_name, const char *subfeat_name, void *value);
int get_feature_size (const char *feature_name);
const char **get_available_features (int type);
int get_num_available_features (int type);
const char **get_available_subfeatures (const char *feat_name, int type);
int get_num_available_subfeatures (const char *feat_name, int type);
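How a feature lookup could combine static data with dynamic extraction is sketched below, using a reduced version of struct feature. The table contents, the "compiler_version" feature name, the placeholder version string and the rule that a dynamic callback takes precedence over static data are assumptions for illustration. "function_name" is the feature name used by the pass tracker plugin on this page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reduced version of struct feature: name, static data, dynamic callback. */
struct feature {
  const char *name;
  void *data;                        /* static data, or NULL */
  const void *(*callback) (void);    /* dynamic extraction, or NULL */
};

/* Stand-in for the name of the function being compiled (cfun). */
const char *current_function = "main";

static const void *function_name_callback (void)
{
  return current_function;
}

/* Placeholder value: not a real GCC version string. */
static char version_string[] = "x.y.z (sketch)";

static struct feature feature_table[] = {
  { "compiler_version", version_string, NULL },
  { "function_name", NULL, function_name_callback },
};

/* Return the value of the feature called FEATURE_NAME, or NULL if there
   is no such feature.  A dynamic callback, when present, is preferred
   over the static data pointer (an assumption of this sketch). */
const void *get_feature (const char *feature_name)
{
  size_t i;
  for (i = 0; i < sizeof feature_table / sizeof feature_table[0]; i++)
    if (strcmp (feature_table[i].name, feature_name) == 0)
      return feature_table[i].callback != NULL
             ? feature_table[i].callback ()
             : feature_table[i].data;
  return NULL;
}
```

The plugin only ever sees the returned pointer and the feature name, so the compiler side remains free to change how each value is computed or cached.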
Current experimental (ICI) plugin usage examples: recording and changing pass order
GCC : plugin-framework.c, passes.c:
callback_func plugin_start, plugin_stop;
...
static int
load_plugin (char *dynlib_file)
{
  …
  void *dl_handle;
  bool error = 0;

  dl_handle = dlopen (dynlib_file, RTLD_LAZY);
  error |= check_for_dlerror ();

  plugin_start = (callback_func) dlsym (dl_handle, "");
  error |= check_for_dlerror ();

  plugin_stop = (callback_func) dlsym (dl_handle, "");
  error |= check_for_dlerror ();
  ...
}
In execute_one_pass(): events for gate override and pass execution
...
current_pass = pass;

/* Override the default gate behavior.  The plugin controls the state of
   the gate through event parameter "gate_status".  */
register_event_parameter ("gate_status", (void *) &gate_status);
call_plugin_event ("avoid_gate");
unregister_event_parameter ("gate_status");

if (!gate_status)
  return false;

/* gate was open (or forced so): run the exec callback */
register_event_parameter ("pass_name", (void *) (pass->name));
call_plugin_event ("pass_execution");
unregister_event_parameter ("pass_name");

if (!quiet_flag && !cfun)
  fprintf (stderr, " <%s>", pass->name ? pass->name : "");
...
GCC : pass manager substitution in tree-optimize.c
In tree_rest_of_compilation: override the default pass manager, falling back on the default pass ordering if the callback invocation fails for whatever reason:
...
gimple_register_cfg_hooks ();

bitmap_obstack_initialize (&reg_obstack); /* FIXME, only at RTL generation*/

/* Perform all tree transforms and optimizations.  */

/* Plugin event: substitution of pass manager.
   'plugin_process_all_passes': if non-zero, let the plugin process all
   passes, even if the gate is false.  */
plugin_process_all_passes = 1;
register_event_parameter ("all_passes", &plugin_process_all_passes);

/* if the callback fails, fall back on default pass chain */
if (!call_plugin_event ("all_passes_manager"))
  execute_pass_list (all_passes);

unregister_event_parameter ("all_passes");

bitmap_obstack_release (&reg_obstack);

/* Release the default bitmap obstack.  */
bitmap_obstack_release (NULL);
...
Plugin : pass tracker - save-executed-passes.c
Iterate over all pass executions to report cfun/pass combinations
#include "include/plugin-interface.h"
...
void executed_pass (void) /* event callback function */
{
  const char *pass_name;
  const char *func_name;

  func_name = (const char *) get_feature ("function_name");
  pass_name = (const char *) get_event_parameter ("pass_name");
  printf ("%s %s\n", func_name, pass_name);
}

void start (void)
{
  register_plugin_event ("pass_execution", &executed_pass);
}

void stop (void)
{
  /* unregister the event registered in start () */
  unregister_plugin_event ("pass_execution");
}
Plugin : pass manager with gate override - use-pass-order.c
#include "include/plugin-interface.h"
...
void skip_gate (void)
{
  const int *all_passes = get_event_parameter ("all_passes");

  if ((all_passes != NULL) && (*all_passes != 0))
    {
      /* the gate is written through, so the const qualifier is cast away */
      unsigned char *gate = (unsigned char *) get_event_parameter ("gate_status");
      if (gate != NULL)
        *gate = 1;
    }
}

void run_passes_from_list (void)
{
  const char *current_pass;
  const char **pass_list = (const char **) get_event_parameter ("passes_to_run");

  if (pass_list == NULL)
    return;

  /* the list is expected to be NULL-terminated */
  while ((current_pass = *pass_list++) != NULL)
    /* look up matching opt_pass, then run execute_one_pass on it */
    run_pass (current_pass);
}

void start (void)
{
  register_plugin_event ("all_passes_manager", &run_passes_from_list);
  register_plugin_event ("avoid_gate", &skip_gate);
}

void stop (void)
{
  unregister_plugin_event ("all_passes_manager");
  unregister_plugin_event ("avoid_gate");
}