This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] [rtlfe] Barebones implementation of "__RTL"; next steps?


On Thu, May 12, 2016 at 4:29 PM, David Malcolm <dmalcolm@redhat.com> wrote:
> On Wed, 2016-05-11 at 11:57 +0200, Richard Biener wrote:
>> On Wed, May 11, 2016 at 3:31 AM, Trevor Saunders <
>> tbsaunde@tbsaunde.org> wrote:
>> > On Tue, May 10, 2016 at 05:01:00PM -0400, David Malcolm wrote:
>> > > [CCing Prasad since this may be useful for his gimple FE work, by
>> > > replacing "rtl" with "gimple" in the patch]
>> > >
>> > > On Mon, 2016-05-09 at 11:44 +0200, Richard Biener wrote:
>> > > > On Wed, May 4, 2016 at 10:49 PM, David Malcolm <
>> > > > dmalcolm@redhat.com>
>> > > > wrote:
>> > >
>> > > > > This patch kit introduces an RTL frontend, for the purpose
>> > > > > of unit testing: primarly for unit testing of RTL passes, and
>> > > > > possibly for unit testing of .md files.
>> > > > >
>> > > > > It's very much a work-in-progress; I'm posting it now to get
>> > > > > feedback.
>> > >
>> > > [...snip...]
>> > >
>> > > > > * The RTL frontend doesn't have any knowledge of the name of
>> > > > > the
>> > > > > function,
>> > > > >   of parameters, types, locals, globals, etc.  It creates a
>> > > > > single
>> > > > > function.
>> > > > >   The function is currently hardcoded to have this signature:
>> > > > >
>> > > > >      int test_1 (int, int, int);
>> > > > >
>> > > > >   since there's no syntax for specify otherwise, and we need
>> > > > > to
>> > > > > provide
>> > > > >   a FUNCTION_DECL tree when building a function object (by
>> > > > > calling
>> > > > >   allocate_struct_function).
>> > > > >
>> > > > > * Similarly, there are no types beyond the built-in ones; all
>> > > > > expressions
>> > > > >   are treated as being of type int.  I suspect that this
>> > > > > approach
>> > > > >   will be too simplistic when it comes to e.g. aliasing.
>> > > >
>> > > > To address this and the previous issue I suggest to implement
>> > > > the RTL
>> > > > FE
>> > > > similar to how I proposed the GIMPLE FE - piggy-back on the C
>> > > > FE and
>> > > > thus
>> > > > allow
>> > > >
>> > > > int __RTL foo (int a, int b) // gets you function decl and
>> > > > param
>> > > > decls
>> > > > {
>> > > >  (insn ...)
>> > > > ...
>> > > >
>> > > > }
>> > > >
>> > > > int main()
>> > > > {
>> > > >   if (foo (1) != 0)
>> > > >     abort ();
>> > > > }
>> > > >
>> > > > That would also allow dg-do run testcases and not rely solely
>> > > > on
>> > > > scanning
>> > > > RTL dumps.
>> > >
>> > > The following is an attempt at implementing this, by adding a new
>> > > "__RTL" keyword, and detecting it in the C frontend, switching
>> > > to a custom parser for the function body.
>> > >
>> > > Does this look like the kind of thing you had in mind for the
>> > > RTL and gimple "frontends"?
>>
>> Yes!
>>
>> > > Wiring this up to the existing RTL parser might be awkward: the
>> > > existing RTL parser in read-md.c etc works at the level of
>> > > characters (read_char and unread_char from a FILE *), whereas the
>> > > C frontend is in terms of tokens.
>> > >
>> > > I have a patch that ports the RTL parser to using libcpp for
>> > > location-tracking, and another that updates it to use libcpp
>> > > for diagnostics.  This adds more dependency on a build-time
>> > > libcpp to the gen* tools.  Both patches are currently messy.
>> > > Potentially I could build on them and attempt to update the RTL
>> > > parser further, to use libcpp's tokenizer.
>> > >
>> > > Does that general approach sound sane?  In particular:
>> > > - is it sane to eliminate errors.c in favor of building
>> > > diagnostics*.c for the build machine as well as the host machine?
>> > > - is it sane to rewrite the read-md.c/read-rtl.c code to
>> > > a token-based approach, using libcpp?
>> > >
>> > > Alternatively, the shorter term approach would be to kludge
>> > > in reading from a FILE * in read-md.c based on where the
>> > > C parser has got to, with a hybrid of the two approaches
>> > > (character-based vs token-based).
>> >
>> > Another option is to make read-md.c use tokens, but instead of
>> > building
>> > libcpp for the build machine write a new token parser that is text
>> > compatible with the libcpp one, but just enough to do what read
>> > -md.c
>> > needs.  However that seems silly, and suggests just using libcpp is
>> > the
>> > sane thing to do :)
>> >
>> > > Thoughts?
>> >
>> > I'm not aware of any pitfalls in using libcpp in build tools,
>> > though it
>> > does seem slightly unfortunate to need to build so much build tool
>> > stuff.
>> >
>> > Thinking about this I wonder if libcpp would be useful in gengtype
>> > to
>> > get around some of the wierdness with headers (and maybe even
>> > languages?) but that doesn't need to be thought about now.
>>
>> genmatch also uses libcpp, it's really convenient for the diagnostics
>> as well.
>>
>> That said, another kludge would be to simply use cpp_token_as_text
>> (see genmatch.c:c_expr::gen_transform), write the whole function
>> to a temporary file and parse that back in with read_md ;)
>>
>> In my mind sth to continue prototyping is more important than to
>> clean
>> this piece up righ now.
>
> Thanks.
>
> Given that I have a semi-working "rtl1" and "make check-rtl", I've
> moved the parser out of the gcc/rtl directory and into a new
>   gcc/read-rtl-function.c
> so that it can be shared between cc1 and rtl1, using "rtl1" as a
> testbed whilst I bring up the __RTL idea within cc1.
>
> I tried the idea of writing the tokens out to a tempfile, but it didn't
> work well: the existing parser I have is very sensitive to whitespace
> (for dealing with all of the non-standard syntax in print-rtl.c) and
> once we just have tokens we've lost that whitespace information.

Ah, indeed.  But you know whether there was whitespace before
a token at least...

> So I tried a different approach: record the locations of the opening
> and closing brace of the __RTL function body, consume all of the
> tokens, and then reopen the file as a FILE * and run the RTL parser on
> it, filtering the results of read_char () so it returns only whitespace
> until the open brace, and sees only EOF from the closing brace onwards.

I guess that works as well unless you get input from stdin or the parseable
code results from macro expansion.

> This seems to work well enough for now, and my prototype is able to
> have cc1 read in an existing RTL dump (with the C function wrapper) and
> then (as a test) dump it out again; an example input file can be seen
> at:
> https://dmalcolm.fedorapeople.org/gcc/2016-05-12/gcc-git-rtl-frontend-v13/0022-FIXME-add-test-case.patch
>
> The prototype parses the RTL as the C is being parsed.  Another way to
> do it might be to delay parsing the RTL until the function reaches a
> desired RTL pass: have a dummy tree for the body of the function, and
> compile the function normally (perhaps marking it noinline noclone or
> somesuch).  On reaching the pass we wish to test, we could then parse
> the RTL, replacing whatever RTL we already had.  (I don't know if this
> is a good idea, but it seems worth mentioning here).

What would be the benefit of doing this though?  I guess parsing might rely
on all declarations parsed (so you have more freedom in placing the __RTL)?

For the GIMPLE FE I'd like to annotate functions with an attribute telling
the pass manager where to start/end compilation.  For the RTL FE you
could tell it to start at a random RTL pass and make a missing
PROP_rtl start parsing...

> One wart I ran into is that system.h has this:
>
> /* Front ends should never have to include middle-end headers.  Enforce
>    this by poisoning the header double-include protection defines.  */
> #ifdef IN_GCC_FRONTEND
> #pragma GCC poison GCC_RTL_H GCC_EXCEPT_H GCC_EXPR_H
> #endif
>
> i.e. the idea of running RTL code from inside the C frontend seems to
> be banned.
>
> For now I simply removed that pragma.  Presumably I can avoid it
> properly by having c/c-parser.c call into something in the gcc
> directory that isn't compiled with -DIN_GCC_FRONTEND.

Yeah, or do the delayed parsing (though that makes you need to implement
your own symbol lookup for parsing things like MEM_EXPRs).

> Next steps:
> * I plan to try to get single passes running, and some form of round
> -trip dump test.
> * I'll also try to split out some of the work into generic cleanups of
> read-md.c that I hope can go into trunk independently of this work.
>
> Hope this sounds sane

yeah.

Thanks,
Richard.

> Dave
>
> FWIW I've backed up the latest version of the (very messy) patch kit
> to:
> https://dmalcolm.fedorapeople.org/gcc/2016-05-12/gcc-git-rtl-frontend-v13/
> It's on top of r235783, and has various unrelated stuff in it.
>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]