This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Go patch committed: Reject surrogate pairs converting int to string


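This patch to the Go frontend and libgo rejects surrogate code points
(0xd800 through 0xdfff) when converting an int to a string.  They are
not valid in UTF-8, so the frontend warns and both the frontend and the
runtime substitute the replacement character 0xfffd.  The patch also
makes the runtime reject a negative int in the same way; the compiler
already rejected negative ints, but the runtime did not.  Bootstrapped
and ran the Go testsuite on x86_64-unknown-linux-gnu.  Committed to
mainline and the 4.7 branch.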
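
For readers who want the resulting behaviour in one place, here is a
minimal standalone sketch of the conversion with the checks applied.
It is not the libgo source: the function name encode_code_point and
its signature are hypothetical, and it folds the separate tests from
the patch into a single condition up front.

```c
#include <stddef.h>

/* Hypothetical helper, not the libgo function: encode V as UTF-8 into
   BUF, which must hold at least 4 bytes, and return the number of
   bytes written.  */
size_t
encode_code_point (long v, unsigned char *buf)
{
  /* Negative values, values above 0x10ffff, and surrogates
     (0xd800 through 0xdfff) are not valid code points for UTF-8;
     turn them into the replacement character.  */
  if (v < 0 || v > 0x10ffff || (v >= 0xd800 && v < 0xe000))
    v = 0xfffd;

  if (v <= 0x7f)
    {
      buf[0] = (unsigned char) v;
      return 1;
    }
  if (v <= 0x7ff)
    {
      buf[0] = (unsigned char) (0xc0 + (v >> 6));
      buf[1] = (unsigned char) (0x80 + (v & 0x3f));
      return 2;
    }
  if (v <= 0xffff)
    {
      buf[0] = (unsigned char) (0xe0 + (v >> 12));
      buf[1] = (unsigned char) (0x80 + ((v >> 6) & 0x3f));
      buf[2] = (unsigned char) (0x80 + (v & 0x3f));
      return 3;
    }
  buf[0] = (unsigned char) (0xf0 + (v >> 18));
  buf[1] = (unsigned char) (0x80 + ((v >> 12) & 0x3f));
  buf[2] = (unsigned char) (0x80 + ((v >> 6) & 0x3f));
  buf[3] = (unsigned char) (0x80 + (v & 0x3f));
  return 4;
}
```

With this sketch, the inputs -1, 0xd800, 0xdfff and 0x110000 all yield
the three bytes 0xef 0xbf 0xbd, the UTF-8 encoding of 0xfffd, while
0xe000 is encoded normally.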

Ian

diff -r f16ad4ccc868 go/lex.cc
--- a/go/lex.cc	Fri Sep 21 23:32:36 2012 -0700
+++ b/go/lex.cc	Fri Sep 21 23:42:31 2012 -0700
@@ -1312,6 +1312,12 @@
 	  // Turn it into the "replacement character".
 	  v = 0xfffd;
 	}
+      if (v >= 0xd800 && v < 0xe000)
+	{
+	  warning_at(location, 0,
+		     "unicode code point 0x%x is invalid surrogate pair", v);
+	  v = 0xfffd;
+	}
       if (v <= 0xffff)
 	{
 	  buf[0] = 0xe0 + (v >> 12);
diff -r f16ad4ccc868 libgo/runtime/go-int-to-string.c
--- a/libgo/runtime/go-int-to-string.c	Fri Sep 21 23:32:36 2012 -0700
+++ b/libgo/runtime/go-int-to-string.c	Fri Sep 21 23:42:31 2012 -0700
@@ -17,6 +17,11 @@
   unsigned char *retdata;
   struct __go_string ret;
 
+  /* A negative value is not valid UTF-8; turn it into the replacement
+     character.  */
+  if (v < 0)
+    v = 0xfffd;
+
   if (v <= 0x7f)
     {
       buf[0] = v;
@@ -34,6 +39,10 @@
 	 "replacement character".  */
       if (v > 0x10ffff)
 	v = 0xfffd;
+      /* If the value is a surrogate pair, which is invalid in UTF-8,
+	 turn it into the replacement character.  */
+      if (v >= 0xd800 && v < 0xe000)
+	v = 0xfffd;
 
       if (v <= 0xffff)
 	{
