Bug 33167 - Hex constant characters with \x escape not parsing correctly
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: c
Version: 4.1.0
Importance: P3 minor
Target Milestone: ---
Assignee: Not yet assigned to anyone
 
Reported: 2007-08-23 21:49 UTC by Weston Hopkins
Modified: 2007-08-24 19:45 UTC
Host: i586-suse-linux
Target: i586-suse-linux
Build: i586-suse-linux


Description Weston Hopkins 2007-08-23 21:49:33 UTC
There seems to be a problem with how gcc parses \x escape sequences. It doesn't stop after the first 2 hex digits; instead it appears to take the rightmost 2 hex digits in a run of hex digits.

[Recreate]
---------------[ SNIP ]---------------------
// test.c
#include <stdio.h>
#include <string.h>

int main(void) {
	const char *string = "\x01\x02\x03Bob";
	/* strlen() returns size_t, so print it with %zu rather than %d */
	printf("len: %zu\n", strlen(string));
	return 0;
}
---------------[ SNIP ]---------------------

[Compilation options]
gcc -Wall test.c -o test

[Expected Results]
You would expect this to print "len: 6", but it actually prints "len: 5". It seems that it's parsing the last \x escape as the hex value 0x3B instead of as 2 characters, 0x03 and 'B'.

[Platforms]
I've noticed this problem in gcc 4.1.0 and 4.0.1 (on a Mac). Here's more info on one of the systems I've experienced this on:

gcc (GCC) 4.1.0 (SUSE Linux)
Using built-in specs.
Target: i586-suse-linux
Configured with: ../configure --enable-threads=posix --prefix=/usr --with-local-prefix=/usr/local --infodir=/usr/share/info --mandir=/usr/share/man --libdir=/usr/lib --libexecdir=/usr/lib --enable-languages=c,c++,objc,fortran,java,ada --enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.1.0 --enable-ssp --disable-libssp --enable-java-awt=gtk --enable-gtk-cairo --disable-libjava-multilib --with-slibdir=/lib --with-system-zlib --enable-shared --enable-__cxa_atexit --enable-libstdcxx-allocator=new --without-system-libunwind --with-cpu=generic --host=i586-suse-linux
Thread model: posix
gcc version 4.1.0 (SUSE Linux)

$ uname -a
linux haldol 2.6.16.13-4-default #1 Wed May 3 04:53:23 UTC 2006 i686 athlon i386 GNU/Linux
Comment 1 Andrew Pinski 2007-08-23 21:59:37 UTC
No, GCC is correct.
The standard says:
Each octal or hexadecimal escape sequence is the longest sequence of characters that can constitute the escape sequence.

So that means the B is consumed as part of the hexadecimal escape sequence: \x03B is read as the single value 0x3B.
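
As a rough illustration of the rule quoted above, the sketch below inspects the literal from the report. The names greedy, greedy_length, and greedy_third_byte are made up for this example and are not part of the original test case. It assumes an ASCII-compatible execution character set, where 0x3B is ';'.

```c
#include <stddef.h>
#include <string.h>

/* The literal from the report. Because a hex escape is the longest
 * possible run of hex digits, \x03B is one escape with value 0x3B;
 * the 'B' is a hex digit and is absorbed into it. */
static const char greedy[] = "\x01\x02\x03Bob";

/* Length of the string as the compiler parsed it:
 * bytes 0x01, 0x02, 0x3B, 'o', 'b'. */
size_t greedy_length(void) {
    return strlen(greedy);
}

/* The third byte, which is 0x3B rather than 0x03. */
unsigned char greedy_third_byte(void) {
    return (unsigned char)greedy[2];
}
```

Calling greedy_length() from a small main() should give 5, which matches the "len: 5" output in the description.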
Comment 2 Weston Hopkins 2007-08-24 16:04:37 UTC
Yep, looks like you are right from the standard.  That sucks then. I wish it were the other way, because I don't see a way to enter a literal single character in hex followed by a single character [A-Z0-9] without escape sequences.  Thanks for the quick response.
Comment 3 Ken Raeburn 2007-08-24 19:45:36 UTC
(In reply to comment #2)
> Yep, looks like you are right from the standard.  That sucks then. I wish it
> were the other way, because I don't see a way to enter a literal single
> character in hex followed by a single character [A-Z0-9] without escape
> sequences.

char *string = "\x01\x02\x03" "Bob";

should work.
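
A minimal sketch of why the split literal works: escape sequences are converted before adjacent string literals are concatenated, so the \x03 escape ends at the closing quote and cannot absorb the 'B'. An octal escape is another option, since it takes at most three octal digits. The names split, octal, split_length, octal_length, and same_bytes are hypothetical and only serve this example.

```c
#include <stddef.h>
#include <string.h>

/* Workaround from this comment: the closing quote ends the \x03 escape,
 * and the two adjacent literals are joined afterwards. */
static const char split[] = "\x01\x02\x03" "Bob";

/* Alternative: octal escapes stop after at most three octal digits,
 * so \003 cannot swallow the characters that follow it. */
static const char octal[] = "\001\002\003Bob";

/* Both strings are 0x01, 0x02, 0x03, 'B', 'o', 'b'. */
size_t split_length(void) { return strlen(split); }
size_t octal_length(void) { return strlen(octal); }

/* Nonzero when both arrays, including the terminating NUL, are identical. */
int same_bytes(void) {
    return sizeof split == sizeof octal
        && memcmp(split, octal, sizeof split) == 0;
}
```

With either form, strlen() reports 6, the length the reporter originally expected.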