This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [gimplefe] [gsoc16] Gimple Front End Project


Hi,

On Thu, 10 Mar 2016, Richard Biener wrote:

> Then I'd like to be able to re-construct SSA without jumping through 
> hoops (usually you can get close but if you require copies propagated in 
> a special way you are basically lost for example).
> 
> Thus my proposal to make the GSoC student attack the unit-testing 
> problem by doing modifications to the pass manager and "extending" an 
> existing frontend (C for simplicity).

I think it's wrong to try to shoehorn the gimple FE into the C FE.  C is 
fundamentally different from gimple and you'd have to sprinkle 
gimple_dialect_p() all over the place, and maintaining that while 
developing future C improvements will turn out to be much work.  Some 
differences of C and gimple:

* C has recursive expressions, gimple is n-op stmts, no expressions at all
* C has type promotions, gimple is explicit
* C has all other kinds of automatic conversion (e.g. pointer decay)
* C has scopes, gimple doesn't (well, global and local only), i.e. symbol 
  lookup is much more complicated
* C doesn't have exceptions
* C doesn't have class types, gimple has
* C doesn't have SSA (yes, I'm aware of your suggestions for that)
* C doesn't have self-referential types
* C FE generates GENERIC, not GIMPLE (so you'd need to go through the
  gimplifier and again would feed gimple directly into the passes)

I really don't think changing the C FE to accept gimple is a useful way 
forward.

I agree with others that the gimple FE is rather more similar to a textual 
form of LTO bytecode.  In difference to the byte code its syntax has to be 
defined and kept stable, and yes, it's sensible to get inspiration from C 
syntax.

> It's true that in the ideal world a "GIMPLE frontend" has to work 
> differently but then to get a 100% GIMPLE frontend you indeed arrive at 
> what LTO does and then I agree we should attack that from the LTO side 
> and not develop another thing besides it.  But this is a _much_ larger 
> task and I don't see anyone finishing that.

So, you think the advantage of starting with the C FE would be faster 
return on investment?  Even if it were true (and I'm not sure it is), what 
would it help if it ultimately wouldn't be acceptable for inclusion in GCC 
proper?

> Also keep in mind that the LTO side will likely simplify a lot by 
> stripping out the ability to stream things that are only required for 
> debug information (I hopefully will finish LTO early debug for GCC 7).

Btw. I'm not suggesting to fit the gimple FE into the LTO infrastructure, 
that's also leading to a mess IMO.  But I do think creating a gimple FE 
from scratch is easier than fitting it into the C FE.  Taking some 
infrastructure from the LTO frontend (namely all the tree/gimple building 
routines) and some from the C frontend (namely parts of the parsers, at 
least for types).

> So - can we please somehow focus on the original question about a GSoC 
> project around the "GIMPLE frontend"?  Of course you now can take Davids 
> promise of spending some development time on this into consideration.  
> Try drafting something that can be reasonably accomplished with giving 
> us something to actually use in the GCC 7 timeframe.

Okay, here's a rough plan:

1) take the LTO frontend, remove all bytecode 
   routines, partitioning stuff, lto symtab merging stuff, tree merging 
   stuff, WPA stuff; i.e. retain only the few pieces needed for creating 
   tree and those that every frontend needs to provide.  Now that 
   frontend should be able to generate an empty compilation unit.
2) implement two stub functions: parsetype and lookuptype, the first 
   always generates a type named 'int32' with the obvious INTEGER_TYPE 
   tree, the second always gives back that type.  Now hack on the frontend
   long enough that this simple input can be parsed and the obvious asm 
   file is generated: "int32 i;".  Then "int32 i = 0;".

   Now the frontend knows global decls and initializers, but the type
   parsing problem is deferred.
3) implement function parsing.  Hack on the frontend long enough
   that this program is accepted and generates the obvious gimple:
     int32 main(int32 argc) {
       int32 a;
       a_1 = argc_0;
       _2 = a_1 - argc_0;
       return _2;
     }
   While doing this, think about how to represent default defs, and 
   generally SSA names.
4) implement some more operators for one and two-op statements
5) implement syntax for conditional code, for instance:
     {
       L0: if (argc_0 == 1) goto L1; else goto L2;
       L1: $preds (L0)
           return 1;
       L2: $preds (L0)
           return 2;
     }
5) think about and implement syntax for PHI nodes, e.g.:
     {
       int a;
       L0: if (argc_0 == 1) goto L1; else goto L2;
       L1: $preds (L0)
           a_1 = 1;
           goto L3;
       L2: $preds (L0)
           a_2 = 42 + argc_0;
           goto L3;
       L3: $preds (L1, L2)
           a_3 = $phi (a_1, a_2);
           return a_3;
     }

   See for instance how it's important above that all destinations that
   are reached by multiple gotos make the order of incoming edges 
   explicit.  It can't be left implicit (or needs very strict and 
   checkable rules).
6) implement function calls, requires parsing global function decls (still 
   only using 'int' types) and parsing/generating gcall insns.
7) implement more scalar base types, including pointers,
   add parsing/generating conversion statements
8) think about and implement syntax for _specifying_ layout of scalar 
   types, as well as attributes (i.e. how to say that type "T" in an
   signed integer type of 47 bits, aligned 8 bit, size 64 bits)
9) think about and implement syntax for aggregate types
10) implement more parsing for accessing elements of aggregate types
11) hack on the whole think long enough that this can be parsed and 
    generates the obvious gimple:

    type T1 *char;
    type T2 const C1;
    type T3 (C2, ...) : int;  // function type
    decl T3 *printf;
    type T {int a[10];}
    T x;
    int main(int argc, char *argv[]) {
      T *t;
      t_1 = &x;
      t_1->a[0] = argc_0;
      printf ("you passed %d arguments\n", t1->a[0]);
      return 0;
    }

Some of the later steps can be done in different order.

I think this would give enough to do for GSoC, but with mentoring 
(especially with the first part, stripping down the LTO frontend to 
something trivial) it seems achievable.  Bonus: implement exceptions, 
implement generic mean to annotate either statements or types or decls, 
implement parsing the various GCC builtins.

Obviously our gimple dumper should be amended to also be able to emit 
whatever syntax is decided in the course of implementation.  In the 
process focus should be on generatic a grammar that's easy to parse while 
retaining some visual pleasance (not all of the above might be good 
examples).  At least the gimple statement syntax should probably be quite 
similar to our current dump format.  At least during development we 
shouldn't fear revamping the syntax even severely if it furthers those 
goals, even if testcases then need to be thrown away.


Ciao,
Michael.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]