This is the mail archive of the
mailing list for the GCC project.
Re: [gimplefe] [gsoc16] Gimple Front End Project
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Andrew MacLeod <amacleod at redhat dot com>
- Cc: David Malcolm <dmalcolm at redhat dot com>, Prasad Ghangal <prasad dot ghangal at gmail dot com>, Diego Novillo <dnovillo at google dot com>, gcc Mailing List <gcc at gcc dot gnu dot org>, sandeep at gcc dot gnu dot org
- Date: Wed, 9 Mar 2016 16:47:51 +0100
- Subject: Re: [gimplefe] [gsoc16] Gimple Front End Project
- Authentication-results: sourceware.org; auth=none
- References: <CAE+uiWabe9W088+CaKh+8VgSdadk+pyt2C6QEbxgj=bQs=Nkdg at mail dot gmail dot com> <CAE+uiWajGum8ccJer8E9w56KVm_VcM8jXB2atXSwpWeuYenFpg at mail dot gmail dot com> <CAD_=9DSJBCdKtY+K2FDt5FS85hAue7MznyUX2Z4RUffOmuoDFA at mail dot gmail dot com> <FA69E188-E41B-4A3C-AC4A-2D21F0ADA713 at gmail dot com> <CAE+uiWbJ7+mY_2xYNQBTT1emXf5J+E79nK+c2cE2u1Deh8Zf=w at mail dot gmail dot com> <CAFiYyc2_93J4K7vNDZngW=5wMxUK1s+JxQo2k7TByUkDT_cz7w at mail dot gmail dot com> <1457368435 dot 9813 dot 68 dot camel at redhat dot com> <56E032E1 dot 3020909 at redhat dot com>
On Wed, Mar 9, 2016 at 3:27 PM, Andrew MacLeod <amacleod at redhat dot com> wrote:
> On 03/07/2016 11:33 AM, David Malcolm wrote:
>>> So for testing specific passes, I'd much rather have an input format
>>> for testing individual passes that:
>>> * can be easily generated by GCC from real test cases
>>> * ability to turn into unit tests, which implies:
>>> * human-readable and editable
>>> * compatibility and stability: can load gcc 7 test cases into gcc 8;
>>> have e.g. gcc 6 generate test cases
>>> * roundtrippable: can load the format, then save out the IR, and get
>>> the same file, for at least some subset of the format (consider e.g.
>>> location data: explicit location data can be roundtripped, implicit
>>> location data not so much).
>>> ...which suggests that we'd want to use gimple dumps as the input
>>> format to a test framework - which leads naturally to the idea of a
>>> gimple frontend.
> We already read and write gimple IL in LTO, we just do it in binary form. I
> think the kind of effort you are talking about here is best placed in
> attaching a gimple parser to LTO, thus giving LTO the ability to read and
> write textual gimple as well as the current binary form. The current
> dump format could in theory be a starting point, but it's clearly missing
> hunks of stuff. There is probably a better representation.
> LTO already knows all the bits required to reconstruct gimple. The
> definition of the textual representation can make intelligent choices about
> defaults so that you don't have to specify every single bit in the textual
> form that the binary form requires. This seems far easier to me than
> starting with the incomplete form that the current dumps generate and trying
> to discover what other bits need to be added to properly reconstruct the IL.
> I think it's hard to get a lot of the subtle things right. I also think
> the scope of defining and reading/writing should be relatively manageable.
> We can optimize the details once it's working.
> It would also be very useful then to have LTO enhanced so that it can read
> and write before or after any pass... Then we can unit test any pass by
> injecting the IL immediately before the pass. No jumping through any hoops
> to make sure the pass you care about sees the exact IL you want. That is
> also a good proof that the LTO form (both binary and text) does fully
> represent gimple. We can also use this output as our debugging dumps and
> archive the current dumper.
> As gimple changes and evolves the result is only one place to worry about
> for reading and writing... and as we progress (slowly) towards uncoupling
> the middle/backend from the front ends, we'd have a single well defined
> "front end" for gimple that accepts binary or text.
So I chose to reply to this one (and will refrain from replying to the others
but try to address comments there).
First, while the LTO approach works, it's quite overkill in the details
and thus too closely tied to our internal bits, which means testcases will
bitrot too quickly for the number one goal of having human
It's nice if there's going to be somebody spending quite some of his
unit-testing (hope not specifically "the GIMPLE frontend").
In my view the C frontend can already target most of the middle-end features,
and for those it can't it should be straightforward to add GNU extensions.
A critical piece here is of course SSA, specifically PHIs. I think a
reasonable way to express those in C is to use labels:
i_1 = 2;
i_3 = __PHI (L1:i_1, L2:i);
so the testcases would be valid GNU C (not C). What would be missing for
unit-testing would be some "raw" mode to avoid having the C FE fold things
or apply type promotions (so you can actually write a signed short addition).
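To make the two points above concrete, here is a minimal sketch in plain GNU C. It assumes a second definition (written i_2 = 5 in the comments) that is not part of the fragment above, and the function names merge_value and short_add_is_promoted are made up for illustration. The first function shows the kind of two-way merge a PHI would describe. The second shows that today's C FE promotes a short addition to int, which is what a "raw" mode would have to suppress.

```c
#include <stddef.h>

/* Plain GNU C stand-in for a two-predecessor merge.  The SSA names in
   the comments (i_1, i_2, i_3) and the value 5 are illustrative only. */
int merge_value(int take_second)
{
    int i;
    if (take_second)
        goto L2;
    goto L1;
L1:
    i = 2;              /* would be i_1 = 2 in SSA form */
    goto merge;
L2:
    i = 5;              /* hypothetical second definition, i_2 = 5 */
merge:
    /* In the proposed syntax: i_3 = __PHI (L1:i_1, L2:i_2); */
    return i;
}

/* Returns 1 when the C front end has promoted a short addition to int. */
int short_add_is_promoted(void)
{
    short a = 1, b = 2;
    /* a + b has type int after the integer promotions, not short. */
    return sizeof(a + b) == sizeof(int);
}
```

With an ordinary C compiler the sum in the second function has type int, so there is currently no way to hand the middle-end a genuine short addition from C source.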
As for restricting statements to GIMPLE, I think that's not necessary - I'd simply
run the GENERIC from the FE through the gimplifier (I have patches that deal
with SSA pre into-SSA just fine, at least for non-PHIs, and if all the
could be just an internal function pre "real" SSA).
Note that I don't think we should restrict ourselves by connecting what LTO does
with what the requirements for unit testing are. The convenient bit of
extending the C FE here is that dumping a function body in the required form
is going to be easy and that you can have a testcase harness in plain C while
feeding in a unit-test function as "GNU C GIMPLE" (or whatever you'll call it). Say,
extern void abort (void);
int __attribute__((GIMPLE)) foo ()
_1 = x;
if (foo () != 1)
and the above would extend to __attribute__((RTL)) if anybody wants to
Give 'GIMPLE' an argument like __attribute__((GIMPLE("tree-pre"))) to
place to inject the function [you still have to feed it to the cgraph from the
beginning of course, but the pass manager would skip anything before tree-pre,
for example, but still eventually compute IL side-data via required PROP_s]
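The "skip anything before the named pass" idea can be sketched as a toy model. This is not GCC's pass manager: the pass list, its order and the function passes_run_from are hypothetical and only illustrate how a start-pass argument would select where execution begins.

```c
#include <string.h>

/* Hypothetical, simplified pass list; the real GCC pipeline differs. */
static const char *const pass_names[] = {
    "tree-ccp", "tree-fre", "tree-pre", "tree-dce"
};
enum { NUM_PASSES = sizeof pass_names / sizeof pass_names[0] };

/* Returns how many passes would run when execution starts at START,
   or -1 if START names no pass.  Everything before START is skipped. */
int passes_run_from(const char *start)
{
    for (int i = 0; i < NUM_PASSES; i++)
        if (strcmp(pass_names[i], start) == 0)
            return NUM_PASSES - i;
    return -1;
}
```

In this model, starting at "tree-pre" runs only that pass and the one after it. A real implementation would also have to compute whatever side-data the skipped passes were expected to provide.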
Yes, a textual form for LTO data would be nice (or rather a self-descriptive LTO
data format so you can have external tools dump it). But I don't think using
the LTO dumper will work for unit testing.
About using the LLVM IR - similar issue I think, plus it is probably too far
away from GCC, so what we'll end up with will only look like LLVM IR but not
actually be LLVM IR.
I think with sticking to C and re-using (parts of) the frontend the path to
first "success" can be much shorter which I think is important for the
project to not bitrot in an unusable state like the last attempt. Of course, while
I can spend some cycles mentoring a GSoC student, I won't spend a
significant fraction of my work time on this project.