This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
Re: \0 in strings
- To: Horst von Brand <vonbrand at sleipnir dot valparaiso dot cl>
- Subject: Re: \0 in strings
- From: Zack Weinberg <zack at wolery dot cumb dot org>
- Date: Sun, 27 Feb 2000 11:07:53 -0800
- Cc: Per Hedbor <per at idonex dot se>, gcc-bugs at gcc dot gnu dot org
- References: <per@idonex.se> <200002271818.e1RIIHj20090@sleipnir.valparaiso.cl>
On Sun, Feb 27, 2000 at 03:18:17PM -0300, Horst von Brand wrote:
> Per Hedbor <per@idonex.se> said:
> > The newest snapshot version of gcc gives errors when it encounters a ÿ
> > after encountering a \0 in a string:
> >
> > test.c:1: Unterminated string constant
> > test.c:2: Unterminated string constant
> > test.c:2: parse error at end of input
> >
> > The whole contents of 'test.c' is:
> > unsigned char *f = "\0ÿ";
>
> What machine is this? ÿ is what I see when somebody tries to print an EOF,
> so this is probably working in reverse somewhere.
>
> In any case, I see the same here (egcs-20000221, glibc-2.1.3 on
> i686-redhat-linux). egcs-1.1.2-24 and gcc-2.95.2 work fine.
I found the bug. It is indeed a case of confusing ÿ with EOF.
The ÿ is seen first by readescape, which correctly decides it is
not part of the escape sequence and pushes it back. Then yylex
calls getch to retrieve the next character. getch pulls the ÿ out
of the pushback buffer, extends it to an int, and returns it. The
trouble is that the pushback buffer is an array of plain char, which
is signed on this target, so ÿ (0xFF) is _sign_ extended to (int)-1,
which is EOF.
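To make the widening step concrete, here is a minimal sketch. It is not
the c-lex.c code: widen_signed and widen_unsigned are hypothetical helper
names, and the sketch assumes a two's-complement target where EOF is -1,
as on glibc.

```c
#include <assert.h>  /* assert */
#include <stdio.h>   /* EOF */

/* Store a byte in a signed char, then widen it to int.  Converting
   0xFF to signed char is implementation-defined before C23; GCC and
   other two's-complement compilers yield -1. */
int widen_signed(int byte)
{
    assert(byte >= 0 && byte <= 0xFF);
    signed char stored = (signed char) byte;
    return stored;              /* sign-extended: 0xFF comes back as -1 */
}

/* Same round trip through an unsigned char. */
int widen_unsigned(int byte)
{
    assert(byte >= 0 && byte <= 0xFF);
    unsigned char stored = (unsigned char) byte;
    return stored;              /* zero-extended: 0xFF comes back as 255 */
}
```

Under those assumptions, widen_signed(0xFF) compares equal to EOF, while
widen_unsigned(0xFF) returns 255 and stays distinct from it. Bytes below
0x80 come back unchanged either way.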
The one-line fix is to change the pushback buffer to an array of
unsigned char, so 0xFF will be zero-extended and interpreted as ÿ
instead of EOF. I'll be committing the appended patch under the
obvious-bugfix rule as soon as I verify it doesn't break anything
else.
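As a rough illustration of why the unsigned element type is enough, here
is a simplified pushback stack. It is a sketch under stated assumptions,
not the real c-lex.c code: struct pushback_sketch, sketch_put_back and
sketch_getch are hypothetical names, and the fixed-size array stands in
for the heap-allocated buffer that the real struct putback_buffer uses.

```c
#include <assert.h>  /* assert */
#include <stddef.h>  /* NULL */
#include <stdio.h>   /* EOF */

/* Hypothetical fixed-size pushback stack. */
struct pushback_sketch {
    unsigned char buffer[8];  /* unsigned, so 0xFF widens to 255 */
    int count;                /* number of characters currently held */
};

/* Push a character back.  Returns 0 on success, -1 if the stack is full. */
int sketch_put_back(struct pushback_sketch *pb, int c)
{
    assert(pb != NULL);
    if (pb->count >= (int) sizeof pb->buffer)
        return -1;
    pb->buffer[pb->count++] = (unsigned char) c;
    return 0;
}

/* Pop the most recently pushed character, or return EOF if nothing is
   pending.  A real lexer would read from the input file at that point. */
int sketch_getch(struct pushback_sketch *pb)
{
    assert(pb != NULL);
    if (pb->count == 0)
        return EOF;
    return pb->buffer[--pb->count];  /* zero-extended to int */
}
```

Pushing 0xFF and popping it again yields 255, so the caller can tell it
apart from the EOF that sketch_getch returns once the stack is empty.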
A test case has already been added: gcc.c-torture/execute/20000227-1.c.
zw
* c-lex.c (struct putback_buffer): Change type of 'buffer'
element to unsigned char.
===================================================================
Index: c-lex.c
--- c-lex.c 2000/02/26 05:45:17 1.75
+++ c-lex.c 2000/02/27 19:03:20
@@ -85,7 +85,7 @@ extern int yy_get_token ();
 #define UNGETC(c) put_back (c)
 
 struct putback_buffer {
-  char *buffer;
+  unsigned char *buffer;
   int buffer_size;
   int index;
 };