This is the mail archive of the
java-patches@gcc.gnu.org
mailing list for the Java project.
Re: [gcjx] Patch: FYI: parser and lexer changes
Tom Tromey wrote:
I hate to say this, since I think it is a transient condition, and I
don't want people to really remember it, but gcjx as it is today is
really, really slow. It was more than 10x slower than jikes for
building classpath; now it is merely 6x slower.
Does Jikes handle the JDK 1.5 language? If not, it's not a relevant
comparison, I think.
Also, Jikes only generates bytecode. We generate native code as well,
which means more general-purpose and probably less efficient trees.
If Jikes's parser really is a lot faster, it may be worthwhile
reading its code to see what they do.
Ranjit> Why do you think the current parser is beyond redemption?
I'm not totally certain that it is. But it does show up in the
profile a lot more than I think it ought to.
A suggestion: If we can get rid of peek1, then we can get rid of the
token_stream and replace it with a single "current token".
The XQuery parser in Kawa does this and XQuery is actually a fairly
complicated language to parse.
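To make the "current token" idea concrete, here is a rough sketch in Python. It is not taken from Kawa or gcjx; the names scan, Parser, advance and evaluate are hypothetical, and the toy grammar (integer arithmetic) is only an assumption chosen to keep the example short. The parser's only token state is the field cur; the scanner is a lazy generator, so nothing is buffered.

```python
import re

def scan(text):
    """Yield tokens lazily: ints for numbers, 1-char strings otherwise."""
    for m in re.finditer(r"(\d+)|\S", text):
        if m.group(1) is not None:
            yield int(m.group(1))
        else:
            yield m.group()
    yield None  # end-of-input marker

class Parser:
    def __init__(self, text):
        self._scanner = scan(text)
        self.cur = next(self._scanner)  # the single "current token"

    def advance(self):
        self.cur = next(self._scanner)

    def expr(self):  # expr := term (('+' | '-') term)*
        value = self.term()
        while self.cur in ("+", "-"):
            op = self.cur
            self.advance()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):  # term := primary (('*' | '/') primary)*
        value = self.primary()
        while self.cur in ("*", "/"):
            op = self.cur
            self.advance()
            rhs = self.primary()
            value = value * rhs if op == "*" else value // rhs
        return value

    def primary(self):  # primary := NUMBER | '(' expr ')'
        if isinstance(self.cur, int):
            value = self.cur
            self.advance()
            return value
        if self.cur == "(":
            self.advance()
            value = self.expr()
            if self.cur != ")":
                raise SyntaxError("expected ')'")
            self.advance()
            return value
        raise SyntaxError(f"unexpected token {self.cur!r}")

def evaluate(text):
    parser = Parser(text)
    value = parser.expr()
    if parser.cur is not None:
        raise SyntaxError(f"trailing token {parser.cur!r}")
    return value
```

For example, evaluate("1 + 2 * 3") returns 7 and evaluate("(1 + 2) * 3") returns 9. Every decision is made by looking at cur alone.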
In any case, if we only need peek and peek1, do we really need
a deque for the tokens?
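As an illustration of what could stand in for the deque, here is a sketch with two fixed slots. I am assuming that peek means "the next token" and peek1 means "the one after it"; the class name TwoTokenLookahead and the method get are hypothetical, and None is assumed never to be a real token, so it can mark end of input.

```python
class TwoTokenLookahead:
    """Wrap a token iterator with exactly two lookahead slots (no deque)."""

    def __init__(self, tokens):
        self._it = iter(tokens)
        self._cur = next(self._it, None)   # returned by peek()
        self._next = next(self._it, None)  # returned by peek1()

    def peek(self):
        return self._cur

    def peek1(self):
        return self._next

    def get(self):
        """Consume and return the current token, shifting the slots."""
        tok = self._cur
        self._cur = self._next
        self._next = next(self._it, None)
        return tok
```

The shift in get is two assignments and one call to the underlying scanner, with no allocation and no indexing.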
I see that you have a mechanism for "mark" and "backtrack", and it's
used heavily, e.g. in parse::primary. Is that really needed?
My guess is that a generated parser would solve both of these
problems.
Basically I took a crazy approach to writing a parser and now I'm
having second thoughts :-)
I don't think a generated parser is the solution. Instead, think about
the simplest data structures you can use.
You might find some ideas in the XQuery parser:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/kawa/gnu/xquery/lang/XQParser.java?rev=1.99&content-type=text/plain&cvsroot=kawa
(There are definitely some ugly and pointless kludges there, of course.)
Start with parsePrimaryExpr, perhaps. No token buffer or backtracking.
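To show the general shape of a primary-expression routine with no token buffer and no backtracking, here is a small Python sketch. It is not the Kawa code and not a Java grammar: the toy grammar, the tuple node format, and the names scan, Parser, expect and parse are all assumptions made for illustration. The point is left-factoring: a bare name and a call share the prefix NAME, so the routine consumes the name first and only then looks at the current token to decide, instead of marking, trying one alternative, and rewinding.

```python
import re

TOKEN = re.compile(r"(?P<num>\d+)|(?P<name>[A-Za-z_]\w*)|(?P<punct>\S)")

def scan(text):
    """Yield (kind, text) pairs lazily, ending with an 'eof' token."""
    for m in TOKEN.finditer(text):
        yield m.lastgroup, m.group()
    yield "eof", ""

class Parser:
    def __init__(self, text):
        self._scanner = scan(text)
        self.cur = next(self._scanner)  # the only token the parser holds

    def advance(self):
        self.cur = next(self._scanner)

    def expect(self, punct):
        if self.cur != ("punct", punct):
            raise SyntaxError(f"expected {punct!r}, got {self.cur[1]!r}")
        self.advance()

    def primary(self):
        # primary := NUM | NAME | NAME '(' [primary (',' primary)*] ')'
        #          | '(' primary ')'
        kind, text = self.cur
        if kind == "num":
            self.advance()
            return ("num", int(text))
        if kind == "name":
            self.advance()
            # Common prefix consumed; decide name vs. call from cur alone.
            if self.cur != ("punct", "("):
                return ("name", text)
            self.advance()
            args = []
            if self.cur != ("punct", ")"):
                args.append(self.primary())
                while self.cur == ("punct", ","):
                    self.advance()
                    args.append(self.primary())
            self.expect(")")
            return ("call", text, args)
        if self.cur == ("punct", "("):
            self.advance()
            inner = self.primary()
            self.expect(")")
            return inner
        raise SyntaxError(f"unexpected token {text!r}")

def parse(text):
    parser = Parser(text)
    node = parser.primary()
    if parser.cur[0] != "eof":
        raise SyntaxError(f"trailing token {parser.cur[1]!r}")
    return node
```

A real Java primary has harder cases than this (casts versus parenthesized expressions, for one), so this only shows the technique, not that it covers everything gcjx needs.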
--
--Per Bothner
per@bothner.com http://per.bothner.com/