This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



Re: \0 in strings


On Sun, Feb 27, 2000 at 03:18:17PM -0300, Horst von Brand wrote:
> Per Hedbor <per@idonex.se> said:
> > The newest snapshot version of gcc gives errors when it encounters a ÿ
> > after encountering a \0 in a string:
> > 
> > test.c:1: Unterminated string constant
> > test.c:2: Unterminated string constant
> > test.c:2: parse error at end of input
> > 
> > The whole contents of 'test.c' is:
> > unsigned char *f = "\0ÿ";
> 
> What machine is this? ÿ is what I see when somebody tries to print an EOF,
> so this is probably working in reverse somewhere.
>
> In any case, I see the same here (egcs-20000221, glibc-2.1.3 on
> i686-redhat-linux). egcs-1.1.2-24 and gcc-2.95.2 work fine.

I found the bug.  It is indeed a case of confusing ÿ with EOF.  The ÿ
is seen first by readescape, which correctly decides that it is not
part of the escape and pushes it back.  Then yylex calls getch to
retrieve the next character.  getch pulls the ÿ out of the pushback
buffer, widens it to an int, and returns it.  The trouble is that the
pushback buffer is an array of plain char, which is signed on this
target, so ÿ (0xFF) is _sign_ extended to (int)-1, which is EOF.
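
The widening step can be reproduced outside the compiler.  The sketch
below is only an illustration: the two helper names are hypothetical
and do not appear in c-lex.c.  It assumes a typical two's-complement
target where storing 0xFF in a signed char yields -1, and a C library
where EOF is -1 (as with glibc; the standard only requires EOF to be
negative).

```c
#include <stdio.h>   /* EOF */

/* Hypothetical helper: store a byte in a signed slot, as the old
   pushback buffer effectively did, then widen it to int.  On the
   assumed target the byte 0xFF comes back as -1. */
int read_via_signed_slot(unsigned char byte)
{
  signed char slot = (signed char) byte;
  return slot;                 /* sign extended */
}

/* Hypothetical helper: the same round trip through an unsigned slot.
   The byte 0xFF comes back as 255. */
int read_via_unsigned_slot(unsigned char byte)
{
  unsigned char slot = byte;
  return slot;                 /* zero extended */
}
```

Calling read_via_signed_slot(0xFF) gives a value the lexer cannot tell
apart from EOF, while read_via_unsigned_slot(0xFF) stays in the range
0..255 and so can never collide with a negative EOF.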

The one-line fix is to change the pushback buffer to an array of
unsigned char, so 0xFF will be zero extended and interpreted as ÿ
instead of EOF.  I'll be committing the appended patch under the
obvious-bugfix rule as soon as I verify it doesn't break anything
else.
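
To show why the element type matters, here is a minimal sketch of a
pushback buffer with unsigned char storage.  It is not the code from
c-lex.c: the struct layout, the fixed capacity, and the names
pushback_put and pushback_get are assumptions made for the example.

```c
#include <stdio.h>   /* EOF */

#define PUSHBACK_MAX 16   /* illustrative capacity, not GCC's */

struct pushback {
  unsigned char buffer[PUSHBACK_MAX];  /* unsigned, as in the patch */
  int index;                           /* number of bytes held */
};

/* Save one byte for later.  Returns 0 on success, -1 if full. */
int pushback_put(struct pushback *pb, int c)
{
  if (pb->index >= PUSHBACK_MAX)
    return -1;
  pb->buffer[pb->index++] = (unsigned char) c;
  return 0;
}

/* Return the most recently saved byte as a value in 0..255,
   or EOF if nothing has been pushed back. */
int pushback_get(struct pushback *pb)
{
  if (pb->index == 0)
    return EOF;
  return pb->buffer[--pb->index];   /* zero extended to int */
}
```

With this layout, pushing back 0xFF and reading it again yields 255,
and EOF is returned only when the buffer is genuinely empty.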

A test case has already been added: gcc.c-torture/execute/20000227-1.c.

zw

	* c-lex.c (struct putback_buffer): Change type of 'buffer'
	element to unsigned char.
===================================================================
Index: c-lex.c
--- c-lex.c	2000/02/26 05:45:17	1.75
+++ c-lex.c	2000/02/27 19:03:20
@@ -85,7 +85,7 @@ extern int yy_get_token ();
 #define UNGETC(c) put_back (c)
 
 struct putback_buffer {
-  char *buffer;
+  unsigned char *buffer;
   int   buffer_size;
   int   index;
 };

