Patch: gen-table -vs- Unicode 3.1 (Was: Patch installed for gcc and java dir const-ification)

>>>>> "Kaveh" == Kaveh R Ghazi <> writes:

Kaveh> Sure.  But when I ran 3.0.1 through the output was
Kaveh> vastly different from the old chartables.h.  It had a whole
Kaveh> bunch of (char*)0 extra lines and somehow a parsing error got
Kaveh> inserted so it wouldn't compile.  I think the
Kaveh> script encounters some input that it didn't expect and doesn't
Kaveh> yield valid C code.

I looked into this.
gen-table doesn't expect any characters after \uffff, and the 3.0.1
tables include some entries beyond that point.
I'm checking in the appended patch, which makes the script stop
reading at the first such entry.
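
For illustration, here is a minimal standalone sketch of the same check.
It assumes the input is sorted by code point, so that stopping at the
first entry above \uffff is safe.  The sample lines and the helper name
codes_in_bmp are made up for this example and are not part of gen-table.

```perl
use strict;
use warnings;

# Illustrative sample only: three lines in the semicolon-separated
# layout of the Unicode data table, sorted by code point (field 0, hex).
my @lines = (
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "FFFD;REPLACEMENT CHARACTER;So;0;ON;;;;;N;;;;;",
    "10300;OLD ITALIC LETTER A;Lo;0;L;;;;;N;;;;;",
);

# Hypothetical helper: return the code points that would still be
# handed to the table generator once the range check is in place.
sub codes_in_bmp {
    my @kept;
    foreach my $line (@_) {
        my @fields = split /;/, $line, -1;
        my $code = hex $fields[0];
        # Input is sorted, so every later entry is out of range too.
        last if $code > 0xffff;
        push @kept, $code;
    }
    return @kept;
}

my @kept = codes_in_bmp(@lines);
printf "%04X\n", $_ foreach @kept;
```

With the check in place, only the first two sample entries are kept and
the entry at 10300 never reaches the gap-handling code.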


Index: ChangeLog
from  Tom Tromey  <>

	* Don't process characters after \uffff.  Added
	comment pointing to input file.

RCS file: /cvs/gcc/gcc/gcc/java/,v
retrieving revision 1.3
diff -u -r1.3
--- 2001/12/28 22:27:29 1.3
+++ 2001/12/29 02:15:40
@@ -1,6 +1,6 @@
 #! /usr/bin/perl
-#    Copyright (C) 2000 Free Software Foundation
+#    Copyright (C) 2000, 2001 Free Software Foundation
 #    This program is free software; you can redistribute it and/or modify
 #    it under the terms of the GNU General Public License as published by
@@ -20,10 +20,14 @@
 # - Generate tables for gcj from Unicode data.
 # Usage: perl DATA-FILE
-# A suitable DATA-FILE is available at:
+# You can find the Unicode data file here:
+# Please update this URL when this program is used with a more
+# recent version of the table.  Note that this table cannot be
+# distributed with gcc.
+# This program should not be re-run indiscriminately.  Care must be
+# taken that what it generates is in sync with the Java specification.
 # Names of fields in Unicode data table.
 $CODE = 0;
 $NAME = 1;
@@ -80,6 +84,7 @@
     $code = hex ($fields[$CODE]);
+    last if $code > 0xffff;
     if ($code > $last_code + 1)
 	# Found a gap.

