This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
- From: Jason Merrill <jason at redhat dot com>
- To: Kris Van Hees <kris dot van dot hees at oracle dot com>
- Cc: "Joseph S. Myers" <joseph at codesourcery dot com>, Andrew Pinski <pinskia at gmail dot com>, gcc-patches at gcc dot gnu dot org
- Date: Fri, 04 Apr 2008 16:39:16 -0400
- Subject: Re: [PATCH] Revision 2: utf-16 and utf-32 support in C and C++
- References: <20080331185400.GB3167@oracle.com>
Kris Van Hees wrote:
Looking pretty good now, just a few nits:
+#if 0
My preference would be to omit the #if 0 blocks. If you really want to
leave them in, please add a comment explaining why.
+ else if (type == char16_type_node && cxx_dialect == cxx0x)
+ write_string ("u8char16_t");
+ else if (type == char32_type_node && cxx_dialect == cxx0x)
+ write_string ("u8char32_t");
The mangling doesn't need to be conditional on C++0x mode (but see below).
+ if (type != tok->type)
+ {
+ if (type == CPP_STRING)
+ type = tok->type;
+ else if (tok->type != CPP_STRING)
The "if" here is redundant.
+#ifndef CHAR16_TYPE
+#define CHAR16_TYPE "short unsigned int"
+#endif
+
+#ifndef CHAR32_TYPE
+#define CHAR32_TYPE "unsigned int"
+#endif
I'm not sure we can always rely on these being at least 16/32 bits, but
I suppose any targets for which they aren't can override the macros. So
this bit is OK.
- const int wide_flag = TREE_TYPE (value) == wchar_array_type_node;
+ const bool wide = TREE_TYPE (value)
+ && TREE_TYPE (value) != char_array_type_node;
The "wide" variable seems unnecessary: it's only used once, and it would
be clearer just to test for char_array_type_node before the others.
+ /* Define 'char16_t'. */
+ char16_type_node = get_identifier (CHAR16_TYPE);
+ char16_type_node = TREE_TYPE (identifier_global_value (char16_type_node));
+ char16_type_size = TYPE_PRECISION (char16_type_node);
+ if (c_dialect_cxx () && cxx_dialect == cxx0x)
+ {
+ char16_type_node = make_unsigned_type (char16_type_size);
+ record_builtin_type (RID_CHAR16, "char16_t", char16_type_node);
+ }
Hmm, now I see why you made char16_t mangling conditional on C++0x mode.
I'm very unhappy with changing the ABI of the extended character types
depending on C++98 or C++0x mode; if char16_type_node can come up at all in
C++98 mode, it must have the same meaning as in C++0x mode. I'd much
rather make the type creation unconditional, and only have the compiler
mode affect parsing.
Index: gcc/testsuite/gcc.dg/utf-dflt.c
Index: gcc/testsuite/g++.dg/ext/utf-gnuxx98.C
I don't think these tests are testing what you intended; the u/U uses
are well-formed whether or not the macros are expanded.
+ /* 'L', 'u' or 'U' may introduce wide characters or strings. */
+ if (c == 'L'
+ || (CPP_OPTION (pfile, c99)
+ && (CPP_OPTION (pfile, cplusplus) || !CPP_OPTION (pfile, std))))
This test seems wrong.
Jason