This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: 20010325-1 vs wide strings
- To: DJ Delorie <dj at redhat dot com>
- Subject: Re: 20010325-1 vs wide strings
- From: "Joseph S. Myers" <jsm28 at cam dot ac dot uk>
- Date: Thu, 10 May 2001 10:13:27 +0100 (BST)
- cc: <gcc at gcc dot gnu dot org>
On Wed, 9 May 2001, DJ Delorie wrote:
> Yet c-common.c (combine_strings) naively does this:
>
>   for (i = 0; i < len; i++)
>     {
>       if (WCHAR_TYPE_SIZE == HOST_BITS_PER_SHORT)
>         ((short *) q)[i] = TREE_STRING_POINTER (t)[i];
>       else
>         ((int *) q)[i] = TREE_STRING_POINTER (t)[i];
>     }
>
> Sure enough, 20010325-1 fails for a big endian target on a little
> endian machine. Is the above code obviously wrong, or is there
> something subtle going on here?
The code above is wrong, and it is exactly the bug that 20010325-1 tests
for: it stores each wide character in host byte order. As documented in
c-tree.texi:
For wide string constants, the @code{TREE_STRING_LENGTH} is the number
of wide characters in the string, and the @code{TREE_STRING_POINTER}
points to an array of the bytes of the string, as represented on the
target system (that is, as integers in the target endianness). Wide and
non-wide string constants are distinguished only by the @code{TREE_TYPE}
of the @code{STRING_CST}.
FIXME: The formats of string constants are not well-defined when the
target system bytes are not the same width as host system bytes.
The code that creates the string constants when they don't need merging,
and Jakub's code to split up string constants in array initializers (to
allow parts to be overridden), handle this correctly. Providing general
helper functions for access to wide strings is something I'd do when
implementing wide string format checking - but to a large extent I've put
off such development until after the release of 3.0, while focus is
supposed to be on the branch.
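
To illustrate what such helpers might look like, here is a minimal sketch.
It is not code from GCC: the names store_target_wchar, load_target_wchar and
widen_to_target are hypothetical, the target endianness is passed in as a
plain flag, and it assumes 8-bit bytes on both host and target (the case the
FIXME above leaves undefined is not handled). Narrow characters are widened
as unsigned values, which is also an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: write VALUE into the WIDTH bytes at DST using the
   target's byte order, regardless of the host's byte order.  */
static void
store_target_wchar (unsigned char *dst, unsigned long value,
                    size_t width, int target_big_endian)
{
  size_t i;

  assert (width <= sizeof (unsigned long));
  for (i = 0; i < width; i++)
    {
      /* Byte I of VALUE, counted from the least significant end.  */
      unsigned char byte = (unsigned char) ((value >> (8 * i)) & 0xff);

      if (target_big_endian)
        dst[width - 1 - i] = byte;
      else
        dst[i] = byte;
    }
}

/* Hypothetical helper: read back one wide character of WIDTH bytes stored
   at SRC in the target's byte order.  */
static unsigned long
load_target_wchar (const unsigned char *src, size_t width,
                   int target_big_endian)
{
  unsigned long value = 0;
  size_t i;

  assert (width <= sizeof (unsigned long));
  for (i = 0; i < width; i++)
    {
      unsigned char byte = target_big_endian ? src[width - 1 - i] : src[i];

      value |= (unsigned long) byte << (8 * i);
    }
  return value;
}

/* Widen LEN narrow characters from SRC into DST, which must have room for
   LEN * WIDTH bytes.  Each character is written byte by byte instead of
   through a host short or int pointer.  */
static void
widen_to_target (unsigned char *dst, const char *src, size_t len,
                 size_t width, int target_big_endian)
{
  size_t i;

  for (i = 0; i < len; i++)
    store_target_wchar (dst + i * width, (unsigned char) src[i],
                        width, target_big_endian);
}
```

With a 2-byte wchar_t and a big endian target, widen_to_target would store
'A' as the bytes 0x00 0x41 on any host, whereas the loop quoted above would
store 0x41 0x00 on a little endian host.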
--
Joseph S. Myers
jsm28@cam.ac.uk