[RFC] Adjust output for strings in tree-pretty-print.c
FX
fxcoudert@gmail.com
Mon May 19 14:26:00 GMT 2008
Hi all,
The Fortran front-end now handles wide character strings
(UCS-4/UTF-32); for these, the string literals are emitted as strings
with the type of an array of unsigned 32-bit integers. The issue is
that pretty_print_string(), in tree-pretty-print.c, assumes strings are
composed of chars and are NUL-terminated. This fails, for example, if
you look at the tree dump for the following Fortran source file:
subroutine foo
call test(4_"I'm here!")
end subroutine foo
you currently get:
foo ()
{
test (&"I"[1]{lb: 1 sz: 4}, 9);
On my little-endian compiler, "I'm here!" is in UTF-32:
"I\0\0\0'\0\0\0m\0\0\0 \0\0\0h\0\0\0e\0\0\0r\0\0\0e\0\0\0!\0\0\0". So,
tree-pretty-print.c stops at the first '\0', and we get "I". To make
this work better, as STRING_CSTs have an attached length
(TREE_STRING_LENGTH), I suggest using that length to output the full
string, instead of stopping at the first NUL character.
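
To illustrate the difference between the two approaches, here is a minimal standalone sketch. It is not the attached patch and not the actual code of pretty_print_string(): the helpers escape_byte, escape_until_nul and escape_with_length are hypothetical names, and they write into a caller-supplied buffer rather than a GCC pretty-printer.

```c
#include <stddef.h>

/* Hypothetical helper (not GCC code): write the escaped form of one byte
   to OUT and return the number of characters written (1 or 2).  */
static size_t
escape_byte (unsigned char c, char *out)
{
  switch (c)
    {
    case '\0': out[0] = '\\'; out[1] = '0';  return 2;
    case '\n': out[0] = '\\'; out[1] = 'n';  return 2;
    case '\'': out[0] = '\\'; out[1] = '\''; return 2;
    case '"':  out[0] = '\\'; out[1] = '"';  return 2;
    case '\\': out[0] = '\\'; out[1] = '\\'; return 2;
    default:   out[0] = (char) c;            return 1;
    }
}

/* NUL-terminated walk: everything after the first '\0' is lost.  */
static void
escape_until_nul (const char *str, char *out)
{
  size_t n = 0;
  for (; *str != '\0'; str++)
    n += escape_byte ((unsigned char) *str, out + n);
  out[n] = '\0';
}

/* Length-based walk: emit exactly LEN bytes, embedded NULs included.
   OUT must have room for at least 2 * LEN + 1 characters.  */
static void
escape_with_length (const char *str, size_t len, char *out)
{
  size_t n = 0;
  for (size_t i = 0; i < len; i++)
    n += escape_byte ((unsigned char) str[i], out + n);
  out[n] = '\0';
}
```

In this sketch, whatever length the caller passes is printed in full, so a length that counts a terminating NUL makes that NUL appear in the output as well.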
With that patch, the tree dump for the same Fortran source file looks like this:
test (&"I\0\0\0\'\0\0\0m\0\0\0 \0\0\0h\0\0\0e\0\0\0r\0\0\0e\0\0\0!\0\0\0"[1]{lb: 1 sz: 4}, 9);
and the tree dump for the following C testcase:
unsigned char *foo(void) { return "look\0here"; }
which was like this:
return (unsigned char *) "look";
is now like this:
return (unsigned char *) "look\0here\0";
Notice the added final '\0' in the C case; I don't know if it's bad to
have it there, but I don't see a way to not output it and still have
the correct output for Fortran (whose strings are not NUL-terminated).
Any comments? Is it OK to commit as is? It bootstraps and regtests
fine on x86_64-linux, with C and Fortran enabled, except for
gcc.dg/tree-ssa/builtin-{v,}{f,}printf-1.c which need their
scan-tree-dump patterns adjusted accordingly. If there is no
objection, I'll do that and build and regtest C++, objc and objc++ as
well before going ahead.
Thanks,
FX
--
FX Coudert
http://www.homepages.ucl.ac.uk/~uccafco/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: wide_char_part6_gcc.diff
Type: application/octet-stream
Size: 2095 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20080519/1ef8926b/attachment.obj>