This is the mail archive of the java-patches@gcc.gnu.org mailing list for the Java project.


Re: [gcjx] Patch: FYI: parser and lexer changes

From: Per Bothner <per at bothner dot com>
To: tromey at redhat dot com
Cc: GCJ Patches <java-patches at gcc dot gnu dot org>
Date: Tue, 04 Oct 2005 00:03:24 -0700
Subject: Re: [gcjx] Patch: FYI: parser and lexer changes
References: <m3br26l4j8.fsf@localhost.localdomain> <4341EDBC.1020203@gmail.com> <m3zmpqhtq5.fsf@localhost.localdomain>

Tom Tromey wrote:

> I hate to say this, since I think it is a transient condition, and I
> don't want people to really remember it, but gcjx as it is today is
> really, really slow.  It was more than 10x slower than jikes for
> building classpath; now it is merely 6x slower.


Does Jikes handle the JDK 1.5 language?  If not, it's not a relevant
comparison, I think.

Also, Jikes only generates bytecode.  We generate native code as well,
which means more general-purpose and probably less efficient trees.

If Jikes's parser really is a lot faster, it may be worthwhile
reading its code to see what they do.

> Ranjit> Why do you think the current parser is beyond redemption?
>
> I'm not totally certain that it is. But it does show up in the
> profile a lot more than I think it ought to.


A suggestion: If we can get rid of peek1, then we can get rid of the
token_stream and replace it with a single "current token".

The XQuery parser in Kawa does this and XQuery is actually a fairly
complicated language to parse.

In any case, if we only need peek and peek1, do we really need
a deque for the tokens?
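
To illustrate the point about peek and peek1, here is a minimal sketch in
Java of a two-slot lookahead window that needs no deque.  It is not gcjx
code: the class name Lookahead2, the use of String tokens, and the
Iterator standing in for the lexer are all assumptions made for the
example.

```java
import java.util.Iterator;

/** Hypothetical two-slot lookahead window; illustrative only, not gcjx code. */
final class Lookahead2 {
    private final Iterator<String> source; // stand-in for a real lexer
    private String current; // returned by peek(); null at end of input
    private String next;    // returned by peek1(); null at end of input

    Lookahead2(Iterator<String> source) {
        this.source = source;
        this.current = pull();
        this.next = pull();
    }

    /** Fetches one more token from the source, or null when it is exhausted. */
    private String pull() {
        return source.hasNext() ? source.next() : null;
    }

    String peek()  { return current; }
    String peek1() { return next; }

    /** Consumes and returns the current token, shifting the window by one. */
    String advance() {
        String consumed = current;
        current = next;
        next = pull();
        return consumed;
    }
}
```

Two fields and a shift on advance() cover both lookahead operations.  If
peek1 can be eliminated as well, the `next` field goes away and only the
current token remains.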

I see that you have a mechanism for "mark" and "backtrack", and it's
used heavily - i.e. for parse::primary.  Is that really needed?

> My guess
> is that a generated parser would solve both of these problems.
> Basically I took a crazy approach to writing a parser and now I'm
> having second thoughts :-)


I don't think a generated parser is the solution.  Instead, think about
the simplest data structures you can use.

You might find some ideas in the XQuery parser:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/kawa/gnu/xquery/lang/XQParser.java?rev=1.99&content-type=text/plain&cvsroot=kawa
(There are definitely some ugly and pointless kludges there, of course.)
Start with parsePrimaryExpr, perhaps.  No token buffer or backtracking.
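
As a rough sketch of the "no token buffer or backtracking" style, here is
a toy recursive-descent parser in Java that keeps a single current token
and picks the primary production by looking only at that token.  It is
not taken from XQParser or gcjx: the class TinyParser, its grammar, and
the S-expression output are hypothetical and exist only for
illustration.

```java
/** Hypothetical toy parser; the grammar and all names are illustrative only. */
final class TinyParser {
    private final String input;
    private int pos = 0;
    private String token; // the single "current token"; null at end of input

    TinyParser(String input) {
        this.input = input;
        advance(); // load the first token
    }

    /** Replaces the current token with the next one read from the input. */
    private void advance() {
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) pos++;
        if (pos >= input.length()) { token = null; return; }
        int start = pos;
        if (Character.isLetterOrDigit(input.charAt(pos))) {
            while (pos < input.length() && Character.isLetterOrDigit(input.charAt(pos))) pos++;
        } else {
            pos++; // single-character operator or parenthesis
        }
        token = input.substring(start, pos);
    }

    /** expr := primary ('+' primary)* */
    String parseExpr() {
        String left = parsePrimary();
        while ("+".equals(token)) {
            advance();
            left = "(+ " + left + " " + parsePrimary() + ")";
        }
        return left;
    }

    /** primary := NAME_OR_NUMBER | '(' expr ')', decided from the current token alone. */
    String parsePrimary() {
        if (token == null) throw new IllegalStateException("unexpected end of input");
        if ("(".equals(token)) {
            advance();
            String inner = parseExpr();
            if (!")".equals(token)) throw new IllegalStateException("expected ')'");
            advance();
            return inner;
        }
        if (Character.isLetterOrDigit(token.charAt(0))) {
            String atom = token;
            advance();
            return atom;
        }
        throw new IllegalStateException("unexpected token: " + token);
    }
}
```

For example, `new TinyParser("a + (b + 1)").parseExpr()` returns
`(+ a (+ b 1))`.  Nothing is ever pushed back or re-read; each method
commits to a production as soon as it has inspected the current token.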
--
	--Per Bothner
per@bothner.com   http://per.bothner.com/

Follow-Ups:
- Re: [gcjx] Patch: FYI: parser and lexer changes
  - From: Tom Tromey

References:
- [gcjx] Patch: FYI: parser and lexer changes
  - From: Tom Tromey
- Re: [gcjx] Patch: FYI: parser and lexer changes
  - From: Ranjit Mathew
- Re: [gcjx] Patch: FYI: parser and lexer changes
  - From: Tom Tromey
