[gcjx] Patch: FYI: lexer and line numbers
Tom Tromey
tromey@redhat.com
Sat Jan 15 22:00:00 GMT 2005
I'm checking this in on the gcjx branch.
This fixes all the line-number-related Jacks failures.
Tom
Index: ChangeLog
from Tom Tromey <tromey@redhat.com>
* source/lex.cc (get): Rewrote.
(get_raw): Don't handle newlines for purposes of location
updating.
Index: source/lex.cc
===================================================================
RCS file: /cvs/gcc/gcc/gcjx/source/Attic/lex.cc,v
retrieving revision 1.1.2.1
diff -u -r1.1.2.1 lex.cc
--- source/lex.cc 13 Jan 2005 03:18:37 -0000 1.1.2.1
+++ source/lex.cc 15 Jan 2005 21:59:00 -0000
@@ -227,16 +227,7 @@
throw exc;
}
- // FIXME: this is bogus, since it does the wrong thing when we
- // see \r\n. Note that we must do this processing before we do
- // escape handling, however, since otherwise a \u escape can
- // resemble a line feed.
- if (c == UNICODE_LINE_FEED || c == UNICODE_CARRIAGE_RETURN)
- {
- column = 0;
- ++line;
- }
- else if (c == UNICODE_TAB)
+ if (c == UNICODE_TAB)
{
// Advance to next multiple of tab width. Note we don't
// subtract one from tab_width since we start columns at
@@ -307,34 +298,44 @@
unicode_w_t
lexer::get ()
{
+ unicode_w_t c;
if (cooked_unget_value != UNICODE_W_NONE)
{
- unicode_w_t c = cooked_unget_value;
+ c = cooked_unget_value;
cooked_unget_value = UNICODE_W_NONE;
- // This should never happen, because the cooking includes \r\n
- // rewriting.
- assert (c != UNICODE_CARRIAGE_RETURN);
- return c;
}
+ else
+ c = read_handling_escapes ();
- bool was_return = false;
- while (true)
+ if (c == UNICODE_CARRIAGE_RETURN)
{
- unicode_w_t c = read_handling_escapes ();
- if (was_return && c == UNICODE_LINE_FEED)
- {
- // Saw \r\n and already returned the \n. Loop.
- was_return = false;
- continue;
- }
- was_return = false;
- if (c == UNICODE_CARRIAGE_RETURN)
+ c = read_handling_escapes ();
+ if (c != UNICODE_LINE_FEED)
{
- was_return = true;
+ // Saw \r followed by something else, so unget and return the
+ // line terminator that the rest of the lexer understands.
+ unget (c);
c = UNICODE_LINE_FEED;
}
- return c;
}
+
+ if (c == UNICODE_LINE_FEED)
+ {
+ // Note that it is somewhat bogus for this to be here, since it
+ // means an escape like \u000a will increment the line number.
+ // (Jacks tests for this, but I think the Jacks test is wrong
+ // because most editors do not work this way, and the line
+ // numbers are mostly useful for interacting with editors.) We
+ // do it this way because it is more work to do anything else.
+ // One approach might be to have read_handling_escapes tell us
+ // whether a character came from an escape sequence. It isn't
+ // extremely important, I suspect, since newlines-as-escapes are
+ // probably quite rare.
+ column = 0;
+ ++line;
+ }
+
+ return c;
}
unicode_w_t
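
For readers who want to try the rewritten get() logic outside the gcjx tree, here is a minimal standalone sketch of the same idea: fold \r\n and a lone \r into a single \n using one character of pushback, and bump the line counter only at that point. It is not the gcjx code. The names line_reader, read_raw, END_OF_INPUT, NONE and final_line are made up for illustration, and read_raw reads plain bytes from a string where the real lexer would call read_handling_escapes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for the lexer.  It reads bytes from an
// in-memory string; the real code reads Unicode characters through
// read_handling_escapes ().
class line_reader
{
public:
  // END_OF_INPUT marks exhausted input; NONE plays the role that
  // UNICODE_W_NONE plays in the patch (empty pushback slot).
  enum { END_OF_INPUT = -1, NONE = -2 };

  int line;

  explicit line_reader (std::string text)
    : line (1), input (std::move (text)), pos (0), unget_value (NONE)
  {
  }

  // Return the next character, with \r\n and bare \r both reported
  // as a single \n.  The line counter is updated here and nowhere else.
  int get ()
  {
    int c;
    if (unget_value != NONE)
      {
        c = unget_value;
        unget_value = NONE;
      }
    else
      c = read_raw ();

    if (c == '\r')
      {
        c = read_raw ();
        if (c != '\n')
          {
            // Saw \r followed by something else (possibly end of
            // input): push that back and report a line feed.
            unget_value = c;
            c = '\n';
          }
      }

    if (c == '\n')
      ++line;

    return c;
  }

private:
  std::string input;
  std::size_t pos;
  int unget_value;

  int read_raw ()
  {
    if (pos < input.size ())
      return (unsigned char) input[pos++];
    return END_OF_INPUT;
  }
};

// Consume all of TEXT and return the line number reached at the end.
int final_line (const std::string &text)
{
  line_reader reader (text);
  while (reader.get () != line_reader::END_OF_INPUT)
    ;
  return reader.line;
}
```

With this sketch, final_line ("a\r\nb\rc\nd") sees three line terminators and ends on line 4. Like the patch, it cannot tell a literal line feed from one produced by an escape; that would need the reader to report where each character came from, as the comment in get() suggests.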