This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Compiler support for write barrier insertion ?


Hello all,

I have a question about possible cooperation between the compiler and hypothetical
garbage collector. Unfortunately, my experience around GCC internals is too small,
so I would like to ask compiler specialists before re-inventing an ugly bicycle...

The most non-straightforward garbage collection methods, including, but not limited to,
the most frequently used generational and incremental techniques, requires a 'write
barrier' or 'store barrier' - a piece of code which is executed when a pointer (within
one object, usually) to some another object is written. The following methods are
used widely:
 1) call barrier code explicitly when it's needed, determining such places
    by static analysis performed by the programmer;
 2) allocate objects from the heaps with OS 'page-aware' structures, then use OS
    memory protection for the underlying pages and handle protection faults;
 3) rely on the compiler support, which means the compiler should emit some
    code when the pointer store is generated.

Each of these methods has their own pitfails. In short, 1) it's very error-prone - one
missed barrier may break everything. For 2), it's system-dependend and slow due to
signal handling. For 3), an insertion of a write barrier at each pointer store is
obviously redundant and will introduce an enormous overhead for any real program.

So I'm investigating the possibility (and usability) of a hybrid scheme which is based
on both 1) and 3). An idea is to use GCC attributes machinery to inform the compiler
about special treatment of some pointers.

As an example, consider the following structure, with hypothetical attribute attached
to one of it's member:

struct obj {
  int value;
  char *name;
  struct obj *next __attribute__((trapped));
};

Here 'next' is 'write-barriered' pointer. During compilation, the compiler should
see that at least one member of 'struct obj' is trapped, and emit a call for special
function, for example '__builtin_obj_trap' when seeing a write to 'next' via pointer
of type 'struct obj *', for example:

struct obj *prev, *curr = alloc_obj ();
...
curr = alloc_obj ();
curr->value = 1234; /* Works as usual. */
curr->name = name;  /* Works as usual. */
curr->next = prev;  /* A call of __builtin_obj_trap() is arranged here,
                       for example, immediately after store instruction */
...

The special function may have the following prototype:

void __builtin_obj_trap (void *obj, int offset)

where 'obj' is a pointer to the structure contains trapped member ('curr' in this
example) and 'offset' is equal to 'offsetof (struct obj, next)'.

This functon must be provided by programmer (which is similar to providing
__cyg_profile_func_* when '-finstrument-functions' is used).

This method has an obvious pitfail: 'memset (curr, 0, sizeof (struct obj))' or
'memcpy (curr, otherobj, sizeof (struct obj))' probably can't be catched by the
compiler.

Another interesting situation is:

struct obj **opp, *prev, *curr = alloc_obj ();
...
opp = &curr->next;
*opp = prev;

For this case, taking an address of trapped pointer may issue a warning since it
creates a way to rewrite trapped pointer bypassing write barrier. (The real hack
is to treat 'trapped' attribute as 'promoted by assignment', i.e. 'opp' becomes
'trapped' automagically after initialization and writes through 'opp' are also
surrounded by write barrier).

Of course, you may ask: why not just having

'void set_obj_next (struct obj *obj, struct obj *next)'

with write barrier inside ? The short answers are:
 1) If 'struct obj' has 100 trapped members, having 100 set_XXX functions
    or macros just to set the fields is ugly;
 2) Migration from explicit memory management to garbage collection - if you
    have 1M lines of code which uses 'XXX->next = ...', it's quite hard
    to rewrite all stuff even with the help of modern refactoring tools.

I realize that the whole thing is very specific and probably will never used
by the most of compiler users. But, anyway, is it technically possible to
implement such thing ? How much overhead it may introduce ?

Thanks,
Dmitry


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]