This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: 20010325-1 vs wide strings


On Wed, 9 May 2001, DJ Delorie wrote:

> Yet c-common.c (combine_strings) naively does this:
> 
>       for (i = 0; i < len; i++)
> 	{
> 	  if (WCHAR_TYPE_SIZE == HOST_BITS_PER_SHORT)
> 	    ((short *) q)[i] = TREE_STRING_POINTER (t)[i];
> 	  else
> 	    ((int *) q)[i] = TREE_STRING_POINTER (t)[i];
> 	}
> 
> Sure enough, 20010325-1 fails for a big endian target on a little
> endian machine.  Is the above code obviously wrong, or is there
> something subtle going on here?

The code above is wrong, and it is the bug that 20010325-1 tests for: it 
stores each wide character in host byte order.  As documented in 
c-tree.texi:

	For wide string constants, the @code{TREE_STRING_LENGTH} is the number
	of wide characters in the string, and the @code{TREE_STRING_POINTER}
	points to an array of the bytes of the string, as represented on the
	target system (that is, as integers in the target endianness).  Wide and
	non-wide string constants are distinguished only by the @code{TREE_TYPE}
	of the @code{STRING_CST}.

	FIXME: The formats of string constants are not well-defined when the
	target system bytes are not the same width as host system bytes.
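
To illustrate the representation described in that documentation, here is a
minimal sketch of storing one wide character into a byte array in target
byte order.  The function name and its parameters (store_target_wchar,
wchar_bytes, target_big_endian) are hypothetical and are not GCC
interfaces; the sketch also assumes 8-bit bytes on both host and target,
which is exactly the case the FIXME leaves aside.

```c
#include <stddef.h>

/* Hypothetical helper, not part of GCC: store VALUE as wide character
   number I in the byte array BUF.  Each wide character occupies
   WCHAR_BYTES bytes, laid out in the target's byte order rather than
   the host's.  Assumes 8-bit bytes on host and target.  */
static void
store_target_wchar (unsigned char *buf, size_t i, unsigned long value,
                    int wchar_bytes, int target_big_endian)
{
  int b;

  for (b = 0; b < wchar_bytes; b++)
    {
      /* B counts bytes of VALUE from the least significant end; POS is
         where that byte lives within the target's wide character.  */
      int pos = target_big_endian ? wchar_bytes - 1 - b : b;

      buf[i * wchar_bytes + pos]
        = (unsigned char) ((value >> (8 * b)) & 0xff);
    }
}
```

Unlike a store through a (short *) or (int *) cast, the result does not
depend on the byte order of the machine running the compiler.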

The code that creates the string constants when they don't need merging, 
and Jakub's code to split up string constants in array initializers (to 
allow parts to be overridden), handle this correctly.  Providing general 
helper functions for access to wide strings is something I'd do when 
implementing wide string format checking - but to a large extent I've put 
off such development until after the release of 3.0, while focus is 
supposed to be on the branch.
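
As a rough idea of what an access helper of that kind could look like,
here is a sketch that reads one wide character back out of a byte array
held in target byte order.  It is not taken from GCC: the name
fetch_target_wchar and its parameters are invented for illustration, and
8-bit bytes are assumed on both host and target.

```c
#include <stddef.h>

/* Hypothetical helper, not part of GCC: return wide character number I
   from BUF, where each wide character occupies WCHAR_BYTES bytes stored
   in the target's byte order.  Assumes 8-bit bytes on host and target.  */
static unsigned long
fetch_target_wchar (const unsigned char *buf, size_t i,
                    int wchar_bytes, int target_big_endian)
{
  unsigned long value = 0;
  int b;

  /* Walk the bytes from most significant to least significant,
     shifting each one into the result.  */
  for (b = 0; b < wchar_bytes; b++)
    {
      int pos = target_big_endian ? b : wchar_bytes - 1 - b;

      value = (value << 8) | buf[i * wchar_bytes + pos];
    }
  return value;
}
```

A caller would pass the bytes of the string constant, the index of the
wide character wanted, and the target's wide character width and byte
order, and would get the character value back as a host integer.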

-- 
Joseph S. Myers
jsm28@cam.ac.uk

