This is the mail archive of the mailing list for the GCC project.

Re: [gimplefe] [gsoc16] Gimple Front End Project

On Mon, Mar 07, 2016 at 11:33:55AM -0500, David Malcolm wrote:
> On Mon, 2016-03-07 at 13:26 +0100, Richard Biener wrote:
> > On Mon, Mar 7, 2016 at 7:27 AM, Prasad Ghangal wrote:
> > > On 6 March 2016 at 21:13, Richard Biener wrote:
> > > > 
> > > > I'll be willing to mentor this.  Though I'd rather have us
> > > > starting from scratch and look at having a C-like input language,
> > > > even piggy-backing on the C frontend maybe.
> > > 
> > > That's great. I would like to know scope of the project for gsoc so
> > > that I can start preparing for proposal.
> > 
> > In my view (this may require discussion) the GIMPLE FE provides a way
> > to do better unit-testing in GCC as in
> > feeding a GIMPLE pass with specific IL to work with rather than
> > trying
> > to get that into proper shape via a C
> > testcase.  Especially making the input IL into that pass stable over
> > the development of GCC is hard.
> I've been looking at the gimple FE recently, and the above is precisely
> my own motivation.  Much of our current testing involves taking a C
> file, running the pass pipeline over it, and then verifying properties
> of one specific pass, and this worries me, since all of the intervening
> passes can change, and thus can change the effective input seen by the
> pass we were hoping to test, invalidating the test case.
> As part of the "unit tests" idea:
>   v1:
>   v2:
>   v3:
> I attempted to write unit tests for specific passes.  The closest I got
> was this, which built the function in tree form, then gimplified it,
> then expanded it:
> Whilst writing this I attempted to build test cases by constructing IR
> directly via API calls, but it became clear to me that that approach
> isn't a good one: it's very verbose, and would tie us to the internal
> API.
> (I think the above patch kit has merit for testing things other than
> passes, as a "-fself-test" option, which I want to pursue for gcc 7).
> So for testing specific passes, I'd much rather have an input format
> for testing individual passes that:
>   * can be easily generated by GCC from real test cases
>   * ability to turn into unit tests, which implies:
>     * human-readable and editable
>   * compatibility and stability: can load gcc 7 test cases into gcc 8;
> have e.g. gcc 6 generate test cases
>   * roundtrippable: can load the format, then save out the IR, and get
> the same file, for at least some subset of the format (consider e.g.
> location data: explicit location data can be roundtripped, implicit
> location data not so much).
> ...which suggests that we'd want to use gimple dumps as the input
> format to a test framework - which leads naturally to the idea of a
> gimple frontend.

Assuming you mean the format from -fdump-tree-*, that's a kind of C-like
language, so it argues against using tuples like the existing gimple-fe.

> I'm thinking of something like a testsuite/gimple.dg subdirectory full
> of gimple dumps.
> We could have a new kind of diagnostic, a "remark", with DejaGnu
> directives to detect for it e.g.
>   a_5 = b_1 * c_2;  /* { dg-remark "propagated constant; became a_5 =
> b_1 * 3" } */
> or whatnot. 
> I see our dumpfiles as being something aimed at us, whereas remarks
> could be aimed at sophisticated end-users; they would be enabled on a
> per-pass basis, or perhaps for certain topics (e.g. vectorization) and
> could look something like:

That's interesting; as you sort of note, the other option is to just scan
the output dump for what you intend to check.  The remark idea is
interesting though; the -Wsuggest-final-{method,type} warnings are
trying to be that, and ISTR something else like that.
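
A minimal sketch of how a harness might check remarks like the one in the
quoted example, assuming the dg-remark directive syntax shown there.  The
directive is only a proposal in this thread, and expected_remarks() and
check() are hypothetical helper names.  Real DejaGnu directives take
regular expressions; plain substring matching is used here for brevity.

```python
import re

# Hypothetical directive syntax, modelled on the dg-remark example quoted
# above: /* { dg-remark "text" } */ at the end of a statement line.
DIRECTIVE = re.compile(r'/\*\s*\{\s*dg-remark\s+"([^"]*)"\s*\}\s*\*/')


def expected_remarks(source):
    """Map 1-based line numbers to the remark text expected on that line."""
    expected = {}
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = DIRECTIVE.search(line)
        if match:
            expected[lineno] = match.group(1)
    return expected


def check(source, emitted):
    """Compare expected remarks against those emitted by a pass.

    `emitted` maps line numbers to remark text; how a pass would produce
    it is outside this sketch.  Returns a list of failure messages.
    """
    failures = []
    for lineno, want in expected_remarks(source).items():
        got = emitted.get(lineno)
        if got is None:
            failures.append(f"line {lineno}: missing remark {want!r}")
        elif want not in got:
            failures.append(f"line {lineno}: expected {want!r}, got {got!r}")
    return failures
```

An empty list from check() means every directive was satisfied; remarks
emitted on lines without a directive are ignored in this sketch.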

> foo.c:27:10: remark: loop is not vectorizable since the iterator can be
> modified... [-Rvectorization]
> foo.c.35:20:
> or similar, where the user passed say "-Rvectorization" as a command
> line option to request more info on vectorization, and our test suites
> could do this.
> As a thought-experiment, consider that as well as cc1 etc, we could
> have an executable for every pass.  Then you could run individual
> passes e.g.:
>   $ run-vrp foo.gimple -o bar.gimple
>   $ run-switchconv quux.gimple -o baz.gimple
> etc.   (I'm not convinced that it makes sense to split things up so
> much, but I find it useful for inspiration, for getting ideas about the
> things that we could do if we had that level of modularity, especially
> from a testing perspective).

yeah, though if you got rid of most / all of the other global state
maybe it wouldn't be hard?  but yeah it doesn't seem like the most
important thing either.

> FWIW I started looking at the existing gimple FE branch last week.  It
> implements a parser for a tuple syntax, rather than the C-like syntax.
> The existing parser doesn't actually generate any gimple IR
> internally, it just syntax-checks the input file.  Building IR
> internally seemed like a good next step, since I'm sure there are lots
> of state issues to sort out.  So I started looking at implementing a
> testsuite/gimple.dg/roundtrip subdirectory: the idea is that this would
> be full of gimple dumps; the parser would read them in, and then (with
> a flag supplied by roundtrip.exp) would write them out, and
> roundtrip.exp would compare input to output and require them to be
> identical.  I got as far as (partially) building a GIMPLE_ASSIGN
> internally when parsing a file containing one.
> That said, I don't care for the tuple syntax in the existing gimple
> dump format; I'd prefer a C-like syntax.

agreed, and being compatible with the existing dumps suggests it too.
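
The roundtrip check described in the quoted text (read a dump in, write it
back out, require identical text) can be sketched roughly as below.  This
is only an illustration under stated assumptions: the one-statement-per-line
grammar, and the names parse(), dump() and roundtrips(), are hypothetical
and stand in for a real GIMPLE parser and printer.

```python
import re

# Hypothetical toy grammar: one binary assignment per line,
# e.g. "a_5 = b_1 * c_2;".  Real GIMPLE dumps are far richer.
STMT = re.compile(r"^(\w+) = (\w+) ([-+*/]) (\w+);$")


def parse(text):
    """Parse toy statements into (lhs, op, rhs1, rhs2) tuples."""
    stmts = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = STMT.match(line.strip())
        if match is None:
            raise ValueError(f"line {lineno}: cannot parse {line!r}")
        lhs, rhs1, op, rhs2 = match.groups()
        stmts.append((lhs, op, rhs1, rhs2))
    return stmts


def dump(stmts):
    """Print statements back out in one canonical layout."""
    return "".join(
        f"{lhs} = {rhs1} {op} {rhs2};\n" for lhs, op, rhs1, rhs2 in stmts
    )


def roundtrips(text):
    """True if parsing and then dumping reproduces the input exactly."""
    return dump(parse(text)) == text
```

Input that parses but is not in canonical layout (for example, an indented
statement) does not roundtrip here, which mirrors the point that only some
subset of the format can be expected to come back byte-for-byte.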

> My thought was to hack up the existing gimple FE branch to change the
> parser to accept a more C-like syntax, but...
> > A C-like syntax is preferred, a syntax that is also valid C would be
> > even more preferred so that you can
> > write "torture" testcases that have fixed IL into a specific pass but
> > also run as regular testcases through
> > the whole optimization pipeline.
> > 
> > Piggy-backing on the C frontend makes it possible to leave all the
> > details of types and declarations
> > and global initializers as plain C while interpreting function bodies
> > as "GIMPLE" when leaving the frontend.
> sounds like you have a radically different implementation idea,
> in which the gimple frontend effectively becomes part of the C
> frontend, with some different behaviors.

Well, it seems like if the existing gimple-fe is basically just a parser
for a language we don't like there isn't much value in building off of
it instead of writing something from scratch.

Being compatible with C, probably with some builtins to do SSA stuff,
seems pretty nice.  I worry some about the work to avoid folding and
stuff, but sharing code with the c-family languages seems good if we


> > I expect that in the process of completing GIMPLE IL features you'll
> > have to add a few GNU C extensions,
> > mostly for features used by Ada (self-referential types come to my
> > mind).
> > 
> > I expect the first thing the project needs to do is add the "tooling"
> > side, signalling the C frontend it
> > should accept GIMPLE (add a -fgimple flag) plus adding a way to input
> > IL into a specific pass
> > (-ftest=<pass> or a function attribute so it affects only a specific
> > function so you can write a testcase
> > driver in plain C and have the actual testcase in a single function).
> > The first actual frontend
> > implementation challenge will then be emitting GIMPLE / CFG / SSA
> > directly which I'd do in the
> > "genericization" phase.  Adjustments to how the C FE handles
> > expressions should be made as well,
> > for example I'd remove any promotions done, letting it only literally
> > parse expressions.  Maybe
> > statement and expression parsing should be forked directly to not
> > make
> > the C FE's code too unwieldy
> > but as said I'd keep type and decl parsing and its data structures as
> > is.
> > 
> > Eventually the dump file format used by GCC's GIMPLE dumps should be
> > changed to be valid
> > GIMPLE FE inputs (and thus valid C inputs).  Adjustments mainly need
> > to be done to basic-block
> > labels and PHI nodes.
> > 
> > I'd not think about our on-the-side data too much initially
> > (range info, points-to info, etc).
> > 
> > Richard.
> Hope this is constructive
> Dave
