This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Go patch committed: Reject surrogate pairs converting int to string
- From: Ian Lance Taylor <iant at google dot com>
- To: gcc-patches at gcc dot gnu dot org, gofrontend-dev at googlegroups dot com
- Date: Fri, 21 Sep 2012 23:52:21 -0700
- Subject: Go patch committed: Reject surrogate pairs converting int to string
This patch to the Go frontend and libgo rejects surrogate code points
(the halves of a surrogate pair, 0xd800 through 0xdfff) when converting
an int to a string. They cannot be encoded as valid UTF-8, so they are
turned into the replacement character 0xfffd. The patch also makes the
runtime reject a negative int; the compiler already rejected negative
ints, but the runtime did not. Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu. Committed to mainline and 4.7 branch.
Ian
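
For readers who want to see the checks in one place, here is a minimal
standalone sketch of the int-to-UTF-8 conversion described above. It is
not the libgo code: the function name encode_rune, its signature, and
the use of long for the input are assumptions made for illustration.
Only the three range checks and the 0xfffd substitution are taken from
the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not the libgo function.  Encode V as UTF-8 into
   BUF, which must hold at least 4 bytes, and return the byte count.  */
static int
encode_rune (long v, unsigned char *buf)
{
  assert (buf != NULL);

  /* Negative values, values above 0x10ffff, and surrogate code points
     (0xd800 through 0xdfff) have no valid UTF-8 encoding; turn them
     into the replacement character.  */
  if (v < 0 || v > 0x10ffff || (v >= 0xd800 && v < 0xe000))
    v = 0xfffd;

  if (v <= 0x7f)
    {
      /* One byte: 0xxxxxxx.  */
      buf[0] = v;
      return 1;
    }
  if (v <= 0x7ff)
    {
      /* Two bytes: 110xxxxx 10xxxxxx.  */
      buf[0] = 0xc0 + (v >> 6);
      buf[1] = 0x80 + (v & 0x3f);
      return 2;
    }
  if (v <= 0xffff)
    {
      /* Three bytes: 1110xxxx 10xxxxxx 10xxxxxx.  */
      buf[0] = 0xe0 + (v >> 12);
      buf[1] = 0x80 + ((v >> 6) & 0x3f);
      buf[2] = 0x80 + (v & 0x3f);
      return 3;
    }
  /* Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.  */
  buf[0] = 0xf0 + (v >> 18);
  buf[1] = 0x80 + ((v >> 12) & 0x3f);
  buf[2] = 0x80 + ((v >> 6) & 0x3f);
  buf[3] = 0x80 + (v & 0x3f);
  return 4;
}
```

With this sketch, 0xd800, -1 and 0x110000 all yield the three bytes
0xef 0xbf 0xbd (the encoding of 0xfffd), while 0xe000, the first value
past the surrogate range, is encoded normally as 0xee 0x80 0x80.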
diff -r f16ad4ccc868 go/lex.cc
--- a/go/lex.cc Fri Sep 21 23:32:36 2012 -0700
+++ b/go/lex.cc Fri Sep 21 23:42:31 2012 -0700
@@ -1312,6 +1312,12 @@
// Turn it into the "replacement character".
v = 0xfffd;
}
+ if (v >= 0xd800 && v < 0xe000)
+ {
+ warning_at(location, 0,
+ "unicode code point 0x%x is invalid surrogate pair", v);
+ v = 0xfffd;
+ }
if (v <= 0xffff)
{
buf[0] = 0xe0 + (v >> 12);
diff -r f16ad4ccc868 libgo/runtime/go-int-to-string.c
--- a/libgo/runtime/go-int-to-string.c Fri Sep 21 23:32:36 2012 -0700
+++ b/libgo/runtime/go-int-to-string.c Fri Sep 21 23:42:31 2012 -0700
@@ -17,6 +17,11 @@
unsigned char *retdata;
struct __go_string ret;
+ /* A negative value is not valid UTF-8; turn it into the replacement
+ character. */
+ if (v < 0)
+ v = 0xfffd;
+
if (v <= 0x7f)
{
buf[0] = v;
@@ -34,6 +39,10 @@
"replacement character". */
if (v > 0x10ffff)
v = 0xfffd;
+ /* If the value is a surrogate pair, which is invalid in UTF-8,
+ turn it into the replacement character. */
+ if (v >= 0xd800 && v < 0xe000)
+ v = 0xfffd;
if (v <= 0xffff)
{