This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] [rtlfe] Barebones implementation of "__RTL"; next steps?


On Wed, 2016-05-11 at 11:57 +0200, Richard Biener wrote:
> On Wed, May 11, 2016 at 3:31 AM, Trevor Saunders <
> tbsaunde@tbsaunde.org> wrote:
> > On Tue, May 10, 2016 at 05:01:00PM -0400, David Malcolm wrote:
> > > [CCing Prasad since this may be useful for his gimple FE work, by
> > > replacing "rtl" with "gimple" in the patch]
> > > 
> > > On Mon, 2016-05-09 at 11:44 +0200, Richard Biener wrote:
> > > > On Wed, May 4, 2016 at 10:49 PM, David Malcolm <
> > > > dmalcolm@redhat.com>
> > > > wrote:
> > > 
> > > > > This patch kit introduces an RTL frontend, for the purpose
> > > > > of unit testing: primarly for unit testing of RTL passes, and
> > > > > possibly for unit testing of .md files.
> > > > > 
> > > > > It's very much a work-in-progress; I'm posting it now to get
> > > > > feedback.
> > > 
> > > [...snip...]
> > > 
> > > > > * The RTL frontend doesn't have any knowledge of the name of
> > > > > the
> > > > > function,
> > > > >   of parameters, types, locals, globals, etc.  It creates a
> > > > > single
> > > > > function.
> > > > >   The function is currently hardcoded to have this signature:
> > > > > 
> > > > >      int test_1 (int, int, int);
> > > > > 
> > > > >   since there's no syntax for specify otherwise, and we need
> > > > > to
> > > > > provide
> > > > >   a FUNCTION_DECL tree when building a function object (by
> > > > > calling
> > > > >   allocate_struct_function).
> > > > > 
> > > > > * Similarly, there are no types beyond the built-in ones; all
> > > > > expressions
> > > > >   are treated as being of type int.  I suspect that this
> > > > > approach
> > > > >   will be too simplistic when it comes to e.g. aliasing.
> > > > 
> > > > To address this and the previous issue I suggest to implement
> > > > the RTL
> > > > FE
> > > > similar to how I proposed the GIMPLE FE - piggy-back on the C
> > > > FE and
> > > > thus
> > > > allow
> > > > 
> > > > int __RTL foo (int a, int b) // gets you function decl and
> > > > param
> > > > decls
> > > > {
> > > >  (insn ...)
> > > > ...
> > > > 
> > > > }
> > > > 
> > > > int main()
> > > > {
> > > >   if (foo (1) != 0)
> > > >     abort ();
> > > > }
> > > > 
> > > > That would also allow dg-do run testcases and not rely solely
> > > > on
> > > > scanning
> > > > RTL dumps.
> > > 
> > > The following is an attempt at implementing this, by adding a new
> > > "__RTL" keyword, and detecting it in the C frontend, switching
> > > to a custom parser for the function body.
> > > 
> > > Does this look like the kind of thing you had in mind for the
> > > RTL and gimple "frontends"?
> 
> Yes!
> 
> > > Wiring this up to the existing RTL parser might be awkward: the
> > > existing RTL parser in read-md.c etc works at the level of
> > > characters (read_char and unread_char from a FILE *), whereas the
> > > C frontend is in terms of tokens.
> > > 
> > > I have a patch that ports the RTL parser to using libcpp for
> > > location-tracking, and another that updates it to use libcpp
> > > for diagnostics.  This adds more dependency on a build-time
> > > libcpp to the gen* tools.  Both patches are currently messy.
> > > Potentially I could build on them and attempt to update the RTL
> > > parser further, to use libcpp's tokenizer.
> > > 
> > > Does that general approach sound sane?  In particular:
> > > - is it sane to eliminate errors.c in favor of building
> > > diagnostics*.c for the build machine as well as the host machine?
> > > - is it sane to rewrite the read-md.c/read-rtl.c code to
> > > a token-based approach, using libcpp?
> > > 
> > > Alternatively, the shorter term approach would be to kludge
> > > in reading from a FILE * in read-md.c based on where the
> > > C parser has got to, with a hybrid of the two approaches
> > > (character-based vs token-based).
> > 
> > Another option is to make read-md.c use tokens, but instead of
> > building
> > libcpp for the build machine write a new token parser that is text
> > compatible with the libcpp one, but just enough to do what read
> > -md.c
> > needs.  However that seems silly, and suggests just using libcpp is
> > the
> > sane thing to do :)
> > 
> > > Thoughts?
> > 
> > I'm not aware of any pitfalls in using libcpp in build tools,
> > though it
> > does seem slightly unfortunate to need to build so much build tool
> > stuff.
> > 
> > Thinking about this I wonder if libcpp would be useful in gengtype
> > to
> > get around some of the wierdness with headers (and maybe even
> > languages?) but that doesn't need to be thought about now.
> 
> genmatch also uses libcpp, it's really convenient for the diagnostics
> as well.
> 
> That said, another kludge would be to simply use cpp_token_as_text
> (see genmatch.c:c_expr::gen_transform), write the whole function
> to a temporary file and parse that back in with read_md ;)
> 
> In my mind sth to continue prototyping is more important than to
> clean
> this piece up righ now.

Thanks.  

Given that I have a semi-working "rtl1" and "make check-rtl", I've
moved the parser out of the gcc/rtl directory and into a new
  gcc/read-rtl-function.c
so that it can be shared between cc1 and rtl1, using "rtl1" as a
testbed whilst I bring up the __RTL idea within cc1.

I tried the idea of writing the tokens out to a tempfile, but it didn't
work well: the existing parser I have is very sensitive to whitespace
(for dealing with all of the non-standard syntax in print-rtl.c) and
once we just have tokens we've lost that whitespace information.

So I tried a different approach: record the locations of the opening
and closing brace of the __RTL function body, consume all of the
tokens, and then reopen the file as a FILE * and run the RTL parser on
it, filtering the results of read_char () so it returns only whitespace
until the open brace, and sees only EOF from the closing brace onwards.

This seems to work well enough for now, and my prototype is able to
have cc1 read in an existing RTL dump (with the C function wrapper) and
then (as a test) dump it out again; an example input file can be seen
at:
https://dmalcolm.fedorapeople.org/gcc/2016-05-12/gcc-git-rtl-frontend-v13/0022-FIXME-add-test-case.patch

The prototype parses the RTL as the C is being parsed.  Another way to
do it might be to delay parsing the RTL until the function reaches a
desired RTL pass: have a dummy tree for the body of the function, and
compile the function normally (perhaps marking it noinline noclone or
somesuch).  On reaching the pass we wish to test, we could then parse
the RTL, replacing whatever RTL we already had.  (I don't know if this
is a good idea, but it seems worth mentioning here).

One wart I ran into is that system.h has this:

/* Front ends should never have to include middle-end headers.  Enforce
   this by poisoning the header double-include protection defines.  */
#ifdef IN_GCC_FRONTEND
#pragma GCC poison GCC_RTL_H GCC_EXCEPT_H GCC_EXPR_H
#endif

i.e. the idea of running RTL code from inside the C frontend seems to
be banned.

For now I simply removed that pragma.  Presumably I can avoid it
properly by having c/c-parser.c call into something in the gcc
directory that isn't compiled with -DIN_GCC_FRONTEND.

Next steps:
* I plan to try to get single passes running, and some form of round
-trip dump test.
* I'll also try to split out some of the work into generic cleanups of
read-md.c that I hope can go into trunk independently of this work.

Hope this sounds sane
Dave

FWIW I've backed up the latest version of the (very messy) patch kit
to:
https://dmalcolm.fedorapeople.org/gcc/2016-05-12/gcc-git-rtl-frontend-v13/
It's on top of r235783, and has various unrelated stuff in it.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]