This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[tree-ssa][rfc] plans for nested functions


This is stuff that was running around my head this morning.
I mostly wanted to write this down so that I don't forget...

The problem I'm trying to solve is that, at present, we have
ordering issues wrt compilation of nested functions.

(1) We depend heavily on rtl-level state in order to compile 
    nested functions.

    A couple of weeks ago I tried to un-nest compilation of nested
    functions with cgraph.  It turns out there is so much rtl state
    needed that one can not compile the nested function either before
    or after the parent -- one *must* compile it *during* the parent.

(2) We cannot inline the parent function, because that would leave
    the nested function without a well-defined static-chain frame
    to access outer variables.

The solution would be to generate a "virtual frame" directly early
in the translation of the parent/nested function pair.  Consider:

	int f1(int p1)
	{
	  int a1 = 0;
	  auto int f2(void)
	  {
	    int a2 = 0;
	    auto int f3(void)
	    {
	      return a2++ + a1++;
	    }
	    bar(f3);
	    return a2 + p1;
	  }
	  bar(f2);
	  return a1;
	}

=>

	int f1(int p1)
	{
	  struct f1_frame
	  {
	    int p1;
	    int a1;
	  } frame;

	  auto int f2(void)
	  {
	    struct f2_frame
	    {
	      struct f1_frame * const chain;
	      int a2;
	    } frame;

	    auto int f3(void)
	    {
	      struct f2_chain * const chain = <<<incomming_static_chain>>>;
	      return chain->a2++ + chain->chain->a1++;
	    }

	    frame.chain = <<<incomming_static_chain>>>;
	    frame.a2 = 0;

	    bar(__builtin_construct_trampoline(f3, &frame));

	    return frame.a2 + frame.chain->p1;
	  }

	  frame.p1 = p1;
	  frame.a1 = 0;

	  bar(__builtin_construct_trampoline(f2, &frame));

	  return frame.a1;
	}

I envision the order of operations as follows:

(1) Nested functions are initially translated, then ignored until 
    they are actually referenced.  (Indeed, all inline functions
    ought to be treated this way.)

(2) Before gimplification, f1 undergoes a remove_useless pass.
    During this pass, we notice if any references to nested 
    functions exist.

(3) Process all referenced nested functions:

    (a) Recurse to step 2.

    (b) For each decl that comes from an outer scope, enter it into
	a hash-table that maps var_decl/parm_decl -> field_decl, 
	where the field_decl is going to be an element of the frame
	structures above.  This includes synthetic decls such as the
	static chain itself.

    (c) Replace the direct decl references with the appropriate
	chain->field reference.

(4) Sort the fields for nice packing and layout the structure type.
    Construct the struct as with an anonymous struct, such that
    debugging turns out ok.  (Not sure exactly how to handle 
    debugging for the variables accessed from the inner scope, but
    since I'm certain that doesn't work at the moment, there's no
    loss of functionality.)

(5) Replace references to decls in the parent with field accesses.
    For all parm_decls (and the static chain), insert code to copy
    the value from the incomming location to the structure.

(6) Replace references to pointers to nested functions with a 
    trampoline constructed via a builtin.  This we still want to
    defer to target-specific ugliness.

Now we've got something completely independent from rtl that suffers
from neither of the discussed problems.


r~


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]