This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: gcc compile-time performance
- From: dewar at gnat dot com (Robert Dewar)
- To: dewar at gnat dot com, neil at daikokuya dot demon dot co dot uk
- Cc: aoliva at redhat dot com, chip dot cuntz at earthling dot net, davem at redhat dot com, gcc at gcc dot gnu dot org, jh at suse dot cz
- Date: Sun, 19 May 2002 09:58:31 -0400 (EDT)
- Subject: Re: gcc compile-time performance
>
> You need to look ahead many times, such as when seeing '.' you need two
> chars to see if it's '...'. But that can be arbitrarily long because
> it could be '.\\n.\\n.'. If you're using the mb functions, what do you
> do with the chars you've just read in if the 3rd one wasn't a dot?
> You can't just go back to after the initial dot, because the mb functions
> have state. So I imagine you have to buffer them elsewhere, and that
> means maintaining a buffer that needs to be checked whenever you read
> a character. It gets nasty.
But with all the escaped representations that I am familiar with, a dot
looks like a dot, so looking at the next character in the source buffer
to see if it is a dot is easily done. The point is to avoid calling the
functions that interpret extended characters unless you actually have an
extended character.
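
As a rough illustration of that idea (a sketch only, not code from GCC or
GNAT), the lookahead for '...' can be done on raw bytes in the source
buffer. The helper names skip_line_splices, lex_dot and is_extended_byte
are hypothetical. The sketch assumes an encoding such as UTF-8, in which
bytes below 0x80 never occur inside a multibyte character; a stateful
encoding, which is the case raised in the quoted text, would need more
care.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: step over any backslash-newline pairs at p. */
static const char *skip_line_splices(const char *p, const char *end)
{
    while (end - p >= 2 && p[0] == '\\' && p[1] == '\n')
        p += 2;
    return p;
}

/* p points at a '.' in the source buffer.
   Returns 1 and sets *next just past the third dot if the token is '...'.
   Otherwise returns 0 and sets *next just past the first dot; "backing
   up" is only a pointer reset because no decoder state was touched. */
static int lex_dot(const char *p, const char *end, const char **next)
{
    assert(p < end && *p == '.');

    const char *after_first = p + 1;
    const char *q = skip_line_splices(after_first, end);

    if (q < end && *q == '.') {
        q = skip_line_splices(q + 1, end);
        if (q < end && *q == '.') {
            *next = q + 1;
            return 1;
        }
    }
    *next = after_first;
    return 0;
}

/* Assumption: only bytes with the high bit set need the (slow)
   multibyte decoding functions; plain ASCII stays on the fast path. */
static int is_extended_byte(unsigned char c)
{
    return c >= 0x80;
}
```

A lexer built this way would call something like mbrtowc only when
is_extended_byte reports a high byte, so the common ASCII tokens never
pay for the conversion functions.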
I must be missing something; I don't see why C is so different from
Ada here. I wrote all the circuitry for handling multiple representations
in the Ada lexer, and it seemed quite straightforward.