[PATCH] libstdc++: Implement debug format for strings and character formatters [PR109162]
Michael Welsh Duggan
mwd@md5i.com
Wed Apr 2 16:04:58 GMT 2025
Tomasz Kamiński <tkaminsk@redhat.com> writes:
> This patch implements the part of P2372R3 that specifies the debug
> (escaped) format for strings and character sequences. This includes both
> handling of the '?' format specifier and the set_debug_format member.
P2372R3 refers to "Fixing locale handling in chrono formatters". Some
searching seems to indicate that you actually meant P2733R3, "Fix
handling of empty specifiers in std::format".
> To indicate partial support we define __glibcxx_format_ranges macro
> value 1, without defining __cpp_lib_format_ranges.
>
> We provide two separate escaping routines depending on the literal
> encoding for the corresponding character types. If the character
> encoding is Unicode, we follow the specification from the standard
> (__format::__write_escaped_unicode).
> For other encodings, we escape only characters in range [0x00, 0x80),
> interpreting them as ASCII values: [0x00, 0x20), 0x7f and '\t', '\r',
> '\n', '\\', '"', '\'' are escaped. We assume every character outside
> this range is printable (__format::__write_escaped_ascii).
> In particular we do not yet implement special handling of shift
> sequences.
>
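To check my reading of the non-Unicode rule, here is a minimal sketch of it. The name escape_ascii is hypothetical and this is not the code from the patch. Two assumptions are mine: a quote character is escaped only when it matches the enclosing delimiter (as [format.string.escaped] does for strings versus characters), and control characters are written as \u{hex} with the shortest lowercase hex representation.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper, not the libstdc++ implementation.
// Escapes `in` under the ASCII-only rule and wraps it in `term`.
std::string escape_ascii(std::string_view in, char term = '"')
{
  // Assumption: only the two standard delimiters are meaningful here.
  assert(term == '"' || term == '\'');
  static constexpr char hex[] = "0123456789abcdef";

  std::string out(1, term);
  for (unsigned char c : in)
    {
      switch (c)
        {
        case '\t': out += "\\t"; continue;
        case '\n': out += "\\n"; continue;
        case '\r': out += "\\r"; continue;
        case '\\': out += "\\\\"; continue;
        }
      if (c == static_cast<unsigned char>(term))
        {
          // The delimiter itself must be escaped.
          out += '\\';
          out += term;
        }
      else if (c < 0x20 || c == 0x7f)
        {
          // Remaining ASCII control characters: \u{h} or \u{hh}.
          out += "\\u{";
          if (c >= 0x10)
            out += hex[c >> 4];
          out += hex[c & 0xf];
          out += '}';
        }
      else
        // Printable ASCII, and everything >= 0x80, is copied unchanged.
        out += static_cast<char>(c);
    }
  out += term;
  return out;
}
```

With this sketch, escape_ascii("a\tb") yields "a\tb" with the tab spelled as the two characters \t, and bytes at or above 0x80 pass through untouched, which matches the "assume printable" statement above.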
> For Unicode escaping a new __escape_edges table is introduced,
> that encodes whether a character belongs to a General_Category
> that is escaped by the standard (Control or Other). This table
> is generated from DerivedGeneralCategory.txt provided by Unicode.
> Only a boolean flag is preserved to reduce the number of entries.
> The additional rules for escaping are handled by __should_escape_unicode.
>
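For readers who have not seen the edge-table technique, here is a rough sketch of how such a lookup can work. The layout is my assumption, not taken from the patch: each entry stores (first code point << 1) | flag, and the flag applies up to the next entry. The table is a toy excerpt that only covers group Other (C) below U+0384: the control ranges, U+00AD, and the first two Cn lines of the quoted data file. All names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>

// Toy excerpt, sorted by code point. Low bit: 1 = escaped category.
constexpr std::uint32_t toy_escape_edges[] = {
  (0x0000u << 1) | 1, // C0 controls (Cc)
  (0x0020u << 1) | 0,
  (0x007Fu << 1) | 1, // DEL and C1 controls (Cc)
  (0x00A0u << 1) | 0,
  (0x00ADu << 1) | 1, // SOFT HYPHEN (Cf)
  (0x00AEu << 1) | 0,
  (0x0378u << 1) | 1, // 0378..0379 ; Cn
  (0x037Au << 1) | 0,
  (0x0380u << 1) | 1, // 0380..0383 ; Cn
  (0x0384u << 1) | 0, // end of the toy excerpt
};

// Returns the flag of the last edge whose start is <= cp.
bool toy_should_escape_category(char32_t cp)
{
  assert(cp <= 0x10FFFF);
  // Setting the low bit makes the key compare >= both possible
  // entries for cp itself and < any entry for cp + 1.
  const std::uint32_t key = (static_cast<std::uint32_t>(cp) << 1) | 1;
  auto it = std::upper_bound(std::begin(toy_escape_edges),
                             std::end(toy_escape_edges), key);
  // The table starts at code point 0, so `it` is never begin().
  return (*std::prev(it) & 1u) != 0;
}
```

Storing only a boolean keeps adjacent ranges with the same answer merged into one entry, which I take to be the point of "only a boolean flag is preserved". Anything at or above U+0384 reports false here only because the excerpt stops there.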
> When width or precision is specified, we emit the escaped string
> to a temporary buffer and format the resulting string according
> to the format spec. For characters a fixed-size stack buffer is used,
> for which a new _Fixedbuf_sink is introduced.
>
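As I understand the two-stage approach, the second stage only ever sees the already-escaped text. A simplified sketch of that stage follows, under my own assumptions: the name apply_width_precision is hypothetical, it counts code units rather than estimated display width, and it always left-aligns (the default for strings).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical second stage: `escaped` is the temporary buffer that
// already holds the escaped, quoted text.
std::string apply_width_precision(std::string escaped, std::size_t width,
                                  std::size_t precision = std::string::npos,
                                  char fill = ' ')
{
  // Braces cannot be used as fill characters in a format spec.
  assert(fill != '{' && fill != '}');
  // Precision caps the length of the escaped text...
  if (escaped.size() > precision)
    escaped.resize(precision);
  // ...and width pads whatever remains on the right.
  if (escaped.size() < width)
    escaped.append(width - escaped.size(), fill);
  return escaped;
}
```

Note that in this sketch precision can cut through the middle of an escape sequence or drop the closing quote, because truncation happens after escaping.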
> Finally this patch corrects handling of UTF-32LE and UTF-32BE
> in __unicode::__literal_encoding_is_unicode<_CharT>, and now they
> are properly recognized as Unicode.
>
> contrib/ChangeLog:
>
> * unicode/README:
> Mentioned `DerivedGeneralCategory.txt`
> * unicode/gen_libstdcxx_unicode_data.py:
> Generate __escape_edges table from DerivedGeneralCategory.txt.
> Update file name in comments.
> * unicode/DerivedGeneralCategory.txt:
> Copy of file distributed by Unicode Consortium
> ftp://ftp.unicode.org/Public/UNIDATA/extracted/DerivedGeneralCategory.txt.
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/chrono_io.h (_GLIBCXX_WIDEN_, _GLIBCXX_WIDEN)
> (__detail::_Widen): Moved to std/format file.
> * include/bits/unicode-data.h:
> Regenerate using contrib/unicode/gen_libstdcxx_unicode_data.py.
> * include/bits/unicode.h (__unicode::_Utf_iterator::_M_units)
> (__unicode::__should_escape_category): Define.
> (__unicode::__literal_encoding_is_unicode<_CharT>):
> Corrected handling for UTF-16 and UTF-32 with "LE" or "BE" suffix.
> * include/bits/version.def:
> Define __glibcxx_format_ranges without corresponding std name.
> * include/bits/version.h: Regenerate.
> * include/std/format (_GLIBCXX_WIDEN_, _GLIBCXX_WIDEN):
> Moved from include/bits/chrono_io.h.
> (__format::_Term_char, __format::_Escapes, __format::_Separators)
> (__format::__should_escape_ascii, __format::__should_escape_unicode)
> (__format::__write_escape_seq, __format::__write_escaped_char)
> (__format::__write_escaped_ascii, __format::__write_escaped_unicode)
> (__format::__write_escaped): Define.
> (__formatter_str::_M_format): Extracted non-escaped formatting.
> (__formatter_str::format): Handle _Pres_esc.
> (__formatter_int::_M_do_parse): Parse '?' if __glibcxx_format_ranges
> is set.
> (__formatter_int::_M_format_character_escaped): Define.
> (formatter<_CharT, _CharT>::format, formatter<char, wchar_t>::format):
> Handle _Pres_esc.
> (__formatter_str::set_debug_format, formatter<...>::set_debug_format):
> Guard with __glibcxx_format_ranges.
> (__format::_Fixedbuf_sink): Define.
> * testsuite/std/format/debug.cc: New test.
> * testsuite/std/format/parse_ctx.cc (escaped_strings_supported):
> Define to true if __glibcxx_format_ranges is defined.
> * testsuite/std/format/string.cc (escaped_strings_supported):
> Define to true if __glibcxx_format_ranges is defined.
> ---
> Testing on x86_64-linux. OK for trunk?
>
> For dg-options, could I configure a run with Unicode and non-Unicode
> encodings in the same file? If so, what encoding would be supported
> on most of the platforms we run tests on (value for -fexec-charset=)?
>
>
> contrib/unicode/DerivedGeneralCategory.txt | 4323 +++++++++++++++++
> contrib/unicode/README | 3 +-
> contrib/unicode/gen_libstdcxx_unicode_data.py | 46 +-
> libstdc++-v3/include/bits/chrono_io.h | 17 -
> libstdc++-v3/include/bits/unicode-data.h | 260 +-
> libstdc++-v3/include/bits/unicode.h | 19 +
> libstdc++-v3/include/bits/version.def | 18 +-
> libstdc++-v3/include/bits/version.h | 10 +
> libstdc++-v3/include/std/format | 443 +-
> libstdc++-v3/testsuite/std/format/debug.cc | 419 ++
> .../testsuite/std/format/parse_ctx.cc | 2 +-
> libstdc++-v3/testsuite/std/format/string.cc | 2 +-
> 12 files changed, 5481 insertions(+), 81 deletions(-)
> create mode 100644 contrib/unicode/DerivedGeneralCategory.txt
> create mode 100644 libstdc++-v3/testsuite/std/format/debug.cc
>
> diff --git a/contrib/unicode/DerivedGeneralCategory.txt b/contrib/unicode/DerivedGeneralCategory.txt
> new file mode 100644
> index 00000000000..07bf7bca93d
> --- /dev/null
> +++ b/contrib/unicode/DerivedGeneralCategory.txt
> @@ -0,0 +1,4323 @@
> +# DerivedGeneralCategory-16.0.0.txt
> +# Date: 2024-04-30, 21:48:17 GMT
> +# © 2024 Unicode®, Inc.
> +# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
> +# For terms of use and license, see https://www.unicode.org/terms_of_use.html
> +#
> +# Unicode Character Database
> +# For documentation, see https://www.unicode.org/reports/tr44/
> +
> +# ================================================
> +
> +# Property: General_Category
> +
> +# ================================================
> +
> +# General_Category=Unassigned
> +
> +0378..0379 ; Cn # [2] <reserved-0378>..<reserved-0379>
> +0380..0383 ; Cn # [4] <reserved-0380>..<reserved-0383>
> +038B ; Cn # <reserved-038B>
> +038D ; Cn # <reserved-038D>
> +03A2 ; Cn # <reserved-03A2>
> +0530 ; Cn # <reserved-0530>
> +0557..0558 ; Cn # [2] <reserved-0557>..<reserved-0558>
> +058B..058C ; Cn # [2] <reserved-058B>..<reserved-058C>
> +0590 ; Cn # <reserved-0590>
> +05C8..05CF ; Cn # [8] <reserved-05C8>..<reserved-05CF>
> +05EB..05EE ; Cn # [4] <reserved-05EB>..<reserved-05EE>
> +05F5..05FF ; Cn # [11] <reserved-05F5>..<reserved-05FF>
> +070E ; Cn # <reserved-070E>
> +074B..074C ; Cn # [2] <reserved-074B>..<reserved-074C>
> +07B2..07BF ; Cn # [14] <reserved-07B2>..<reserved-07BF>
> +07FB..07FC ; Cn # [2] <reserved-07FB>..<reserved-07FC>
> +082E..082F ; Cn # [2] <reserved-082E>..<reserved-082F>
> +083F ; Cn # <reserved-083F>
> +085C..085D ; Cn # [2] <reserved-085C>..<reserved-085D>
> +085F ; Cn # <reserved-085F>
> +086B..086F ; Cn # [5] <reserved-086B>..<reserved-086F>
> +088F ; Cn # <reserved-088F>
> +0892..0896 ; Cn # [5] <reserved-0892>..<reserved-0896>
> +0984 ; Cn # <reserved-0984>
> +098D..098E ; Cn # [2] <reserved-098D>..<reserved-098E>
> +0991..0992 ; Cn # [2] <reserved-0991>..<reserved-0992>
> +09A9 ; Cn # <reserved-09A9>
> +09B1 ; Cn # <reserved-09B1>
> +09B3..09B5 ; Cn # [3] <reserved-09B3>..<reserved-09B5>
> +09BA..09BB ; Cn # [2] <reserved-09BA>..<reserved-09BB>
> +09C5..09C6 ; Cn # [2] <reserved-09C5>..<reserved-09C6>
> +09C9..09CA ; Cn # [2] <reserved-09C9>..<reserved-09CA>
> +09CF..09D6 ; Cn # [8] <reserved-09CF>..<reserved-09D6>
> +09D8..09DB ; Cn # [4] <reserved-09D8>..<reserved-09DB>
> +09DE ; Cn # <reserved-09DE>
> +09E4..09E5 ; Cn # [2] <reserved-09E4>..<reserved-09E5>
> +09FF..0A00 ; Cn # [2] <reserved-09FF>..<reserved-0A00>
> +0A04 ; Cn # <reserved-0A04>
> +0A0B..0A0E ; Cn # [4] <reserved-0A0B>..<reserved-0A0E>
> +0A11..0A12 ; Cn # [2] <reserved-0A11>..<reserved-0A12>
> +0A29 ; Cn # <reserved-0A29>
> +0A31 ; Cn # <reserved-0A31>
> +0A34 ; Cn # <reserved-0A34>
> +0A37 ; Cn # <reserved-0A37>
> +0A3A..0A3B ; Cn # [2] <reserved-0A3A>..<reserved-0A3B>
> +0A3D ; Cn # <reserved-0A3D>
> +0A43..0A46 ; Cn # [4] <reserved-0A43>..<reserved-0A46>
> +0A49..0A4A ; Cn # [2] <reserved-0A49>..<reserved-0A4A>
> +0A4E..0A50 ; Cn # [3] <reserved-0A4E>..<reserved-0A50>
> +0A52..0A58 ; Cn # [7] <reserved-0A52>..<reserved-0A58>
> +0A5D ; Cn # <reserved-0A5D>
> +0A5F..0A65 ; Cn # [7] <reserved-0A5F>..<reserved-0A65>
> +0A77..0A80 ; Cn # [10] <reserved-0A77>..<reserved-0A80>
> +0A84 ; Cn # <reserved-0A84>
> +0A8E ; Cn # <reserved-0A8E>
> +0A92 ; Cn # <reserved-0A92>
> +0AA9 ; Cn # <reserved-0AA9>
> +0AB1 ; Cn # <reserved-0AB1>
> +0AB4 ; Cn # <reserved-0AB4>
> +0ABA..0ABB ; Cn # [2] <reserved-0ABA>..<reserved-0ABB>
> +0AC6 ; Cn # <reserved-0AC6>
> +0ACA ; Cn # <reserved-0ACA>
> +0ACE..0ACF ; Cn # [2] <reserved-0ACE>..<reserved-0ACF>
> +0AD1..0ADF ; Cn # [15] <reserved-0AD1>..<reserved-0ADF>
> +0AE4..0AE5 ; Cn # [2] <reserved-0AE4>..<reserved-0AE5>
> +0AF2..0AF8 ; Cn # [7] <reserved-0AF2>..<reserved-0AF8>
> +0B00 ; Cn # <reserved-0B00>
> +0B04 ; Cn # <reserved-0B04>
> +0B0D..0B0E ; Cn # [2] <reserved-0B0D>..<reserved-0B0E>
> +0B11..0B12 ; Cn # [2] <reserved-0B11>..<reserved-0B12>
> +0B29 ; Cn # <reserved-0B29>
> +0B31 ; Cn # <reserved-0B31>
> +0B34 ; Cn # <reserved-0B34>
> +0B3A..0B3B ; Cn # [2] <reserved-0B3A>..<reserved-0B3B>
> +0B45..0B46 ; Cn # [2] <reserved-0B45>..<reserved-0B46>
> +0B49..0B4A ; Cn # [2] <reserved-0B49>..<reserved-0B4A>
> +0B4E..0B54 ; Cn # [7] <reserved-0B4E>..<reserved-0B54>
> +0B58..0B5B ; Cn # [4] <reserved-0B58>..<reserved-0B5B>
> +0B5E ; Cn # <reserved-0B5E>
> +0B64..0B65 ; Cn # [2] <reserved-0B64>..<reserved-0B65>
> +0B78..0B81 ; Cn # [10] <reserved-0B78>..<reserved-0B81>
> +0B84 ; Cn # <reserved-0B84>
> +0B8B..0B8D ; Cn # [3] <reserved-0B8B>..<reserved-0B8D>
> +0B91 ; Cn # <reserved-0B91>
> +0B96..0B98 ; Cn # [3] <reserved-0B96>..<reserved-0B98>
> +0B9B ; Cn # <reserved-0B9B>
> +0B9D ; Cn # <reserved-0B9D>
> +0BA0..0BA2 ; Cn # [3] <reserved-0BA0>..<reserved-0BA2>
> +0BA5..0BA7 ; Cn # [3] <reserved-0BA5>..<reserved-0BA7>
> +0BAB..0BAD ; Cn # [3] <reserved-0BAB>..<reserved-0BAD>
> +0BBA..0BBD ; Cn # [4] <reserved-0BBA>..<reserved-0BBD>
> +0BC3..0BC5 ; Cn # [3] <reserved-0BC3>..<reserved-0BC5>
> +0BC9 ; Cn # <reserved-0BC9>
> +0BCE..0BCF ; Cn # [2] <reserved-0BCE>..<reserved-0BCF>
> +0BD1..0BD6 ; Cn # [6] <reserved-0BD1>..<reserved-0BD6>
> +0BD8..0BE5 ; Cn # [14] <reserved-0BD8>..<reserved-0BE5>
> +0BFB..0BFF ; Cn # [5] <reserved-0BFB>..<reserved-0BFF>
> +0C0D ; Cn # <reserved-0C0D>
> +0C11 ; Cn # <reserved-0C11>
> +0C29 ; Cn # <reserved-0C29>
> +0C3A..0C3B ; Cn # [2] <reserved-0C3A>..<reserved-0C3B>
> +0C45 ; Cn # <reserved-0C45>
> +0C49 ; Cn # <reserved-0C49>
> +0C4E..0C54 ; Cn # [7] <reserved-0C4E>..<reserved-0C54>
> +0C57 ; Cn # <reserved-0C57>
> +0C5B..0C5C ; Cn # [2] <reserved-0C5B>..<reserved-0C5C>
> +0C5E..0C5F ; Cn # [2] <reserved-0C5E>..<reserved-0C5F>
> +0C64..0C65 ; Cn # [2] <reserved-0C64>..<reserved-0C65>
> +0C70..0C76 ; Cn # [7] <reserved-0C70>..<reserved-0C76>
> +0C8D ; Cn # <reserved-0C8D>
> +0C91 ; Cn # <reserved-0C91>
> +0CA9 ; Cn # <reserved-0CA9>
> +0CB4 ; Cn # <reserved-0CB4>
> +0CBA..0CBB ; Cn # [2] <reserved-0CBA>..<reserved-0CBB>
> +0CC5 ; Cn # <reserved-0CC5>
> +0CC9 ; Cn # <reserved-0CC9>
> +0CCE..0CD4 ; Cn # [7] <reserved-0CCE>..<reserved-0CD4>
> +0CD7..0CDC ; Cn # [6] <reserved-0CD7>..<reserved-0CDC>
> +0CDF ; Cn # <reserved-0CDF>
> +0CE4..0CE5 ; Cn # [2] <reserved-0CE4>..<reserved-0CE5>
> +0CF0 ; Cn # <reserved-0CF0>
> +0CF4..0CFF ; Cn # [12] <reserved-0CF4>..<reserved-0CFF>
> +0D0D ; Cn # <reserved-0D0D>
> +0D11 ; Cn # <reserved-0D11>
> +0D45 ; Cn # <reserved-0D45>
> +0D49 ; Cn # <reserved-0D49>
> +0D50..0D53 ; Cn # [4] <reserved-0D50>..<reserved-0D53>
> +0D64..0D65 ; Cn # [2] <reserved-0D64>..<reserved-0D65>
> +0D80 ; Cn # <reserved-0D80>
> +0D84 ; Cn # <reserved-0D84>
> +0D97..0D99 ; Cn # [3] <reserved-0D97>..<reserved-0D99>
> +0DB2 ; Cn # <reserved-0DB2>
> +0DBC ; Cn # <reserved-0DBC>
> +0DBE..0DBF ; Cn # [2] <reserved-0DBE>..<reserved-0DBF>
> +0DC7..0DC9 ; Cn # [3] <reserved-0DC7>..<reserved-0DC9>
> +0DCB..0DCE ; Cn # [4] <reserved-0DCB>..<reserved-0DCE>
> +0DD5 ; Cn # <reserved-0DD5>
> +0DD7 ; Cn # <reserved-0DD7>
> +0DE0..0DE5 ; Cn # [6] <reserved-0DE0>..<reserved-0DE5>
> +0DF0..0DF1 ; Cn # [2] <reserved-0DF0>..<reserved-0DF1>
> +0DF5..0E00 ; Cn # [12] <reserved-0DF5>..<reserved-0E00>
> +0E3B..0E3E ; Cn # [4] <reserved-0E3B>..<reserved-0E3E>
> +0E5C..0E80 ; Cn # [37] <reserved-0E5C>..<reserved-0E80>
> +0E83 ; Cn # <reserved-0E83>
> +0E85 ; Cn # <reserved-0E85>
> +0E8B ; Cn # <reserved-0E8B>
> +0EA4 ; Cn # <reserved-0EA4>
> +0EA6 ; Cn # <reserved-0EA6>
> +0EBE..0EBF ; Cn # [2] <reserved-0EBE>..<reserved-0EBF>
> +0EC5 ; Cn # <reserved-0EC5>
> +0EC7 ; Cn # <reserved-0EC7>
> +0ECF ; Cn # <reserved-0ECF>
> +0EDA..0EDB ; Cn # [2] <reserved-0EDA>..<reserved-0EDB>
> +0EE0..0EFF ; Cn # [32] <reserved-0EE0>..<reserved-0EFF>
> +0F48 ; Cn # <reserved-0F48>
> +0F6D..0F70 ; Cn # [4] <reserved-0F6D>..<reserved-0F70>
> +0F98 ; Cn # <reserved-0F98>
> +0FBD ; Cn # <reserved-0FBD>
> +0FCD ; Cn # <reserved-0FCD>
> +0FDB..0FFF ; Cn # [37] <reserved-0FDB>..<reserved-0FFF>
> +10C6 ; Cn # <reserved-10C6>
> +10C8..10CC ; Cn # [5] <reserved-10C8>..<reserved-10CC>
> +10CE..10CF ; Cn # [2] <reserved-10CE>..<reserved-10CF>
> +1249 ; Cn # <reserved-1249>
> +124E..124F ; Cn # [2] <reserved-124E>..<reserved-124F>
> +1257 ; Cn # <reserved-1257>
> +1259 ; Cn # <reserved-1259>
> +125E..125F ; Cn # [2] <reserved-125E>..<reserved-125F>
> +1289 ; Cn # <reserved-1289>
> +128E..128F ; Cn # [2] <reserved-128E>..<reserved-128F>
> +12B1 ; Cn # <reserved-12B1>
> +12B6..12B7 ; Cn # [2] <reserved-12B6>..<reserved-12B7>
> +12BF ; Cn # <reserved-12BF>
> +12C1 ; Cn # <reserved-12C1>
> +12C6..12C7 ; Cn # [2] <reserved-12C6>..<reserved-12C7>
> +12D7 ; Cn # <reserved-12D7>
> +1311 ; Cn # <reserved-1311>
> +1316..1317 ; Cn # [2] <reserved-1316>..<reserved-1317>
> +135B..135C ; Cn # [2] <reserved-135B>..<reserved-135C>
> +137D..137F ; Cn # [3] <reserved-137D>..<reserved-137F>
> +139A..139F ; Cn # [6] <reserved-139A>..<reserved-139F>
> +13F6..13F7 ; Cn # [2] <reserved-13F6>..<reserved-13F7>
> +13FE..13FF ; Cn # [2] <reserved-13FE>..<reserved-13FF>
> +169D..169F ; Cn # [3] <reserved-169D>..<reserved-169F>
> +16F9..16FF ; Cn # [7] <reserved-16F9>..<reserved-16FF>
> +1716..171E ; Cn # [9] <reserved-1716>..<reserved-171E>
> +1737..173F ; Cn # [9] <reserved-1737>..<reserved-173F>
> +1754..175F ; Cn # [12] <reserved-1754>..<reserved-175F>
> +176D ; Cn # <reserved-176D>
> +1771 ; Cn # <reserved-1771>
> +1774..177F ; Cn # [12] <reserved-1774>..<reserved-177F>
> +17DE..17DF ; Cn # [2] <reserved-17DE>..<reserved-17DF>
> +17EA..17EF ; Cn # [6] <reserved-17EA>..<reserved-17EF>
> +17FA..17FF ; Cn # [6] <reserved-17FA>..<reserved-17FF>
> +181A..181F ; Cn # [6] <reserved-181A>..<reserved-181F>
> +1879..187F ; Cn # [7] <reserved-1879>..<reserved-187F>
> +18AB..18AF ; Cn # [5] <reserved-18AB>..<reserved-18AF>
> +18F6..18FF ; Cn # [10] <reserved-18F6>..<reserved-18FF>
> +191F ; Cn # <reserved-191F>
> +192C..192F ; Cn # [4] <reserved-192C>..<reserved-192F>
> +193C..193F ; Cn # [4] <reserved-193C>..<reserved-193F>
> +1941..1943 ; Cn # [3] <reserved-1941>..<reserved-1943>
> +196E..196F ; Cn # [2] <reserved-196E>..<reserved-196F>
> +1975..197F ; Cn # [11] <reserved-1975>..<reserved-197F>
> +19AC..19AF ; Cn # [4] <reserved-19AC>..<reserved-19AF>
> +19CA..19CF ; Cn # [6] <reserved-19CA>..<reserved-19CF>
> +19DB..19DD ; Cn # [3] <reserved-19DB>..<reserved-19DD>
> +1A1C..1A1D ; Cn # [2] <reserved-1A1C>..<reserved-1A1D>
> +1A5F ; Cn # <reserved-1A5F>
> +1A7D..1A7E ; Cn # [2] <reserved-1A7D>..<reserved-1A7E>
> +1A8A..1A8F ; Cn # [6] <reserved-1A8A>..<reserved-1A8F>
> +1A9A..1A9F ; Cn # [6] <reserved-1A9A>..<reserved-1A9F>
> +1AAE..1AAF ; Cn # [2] <reserved-1AAE>..<reserved-1AAF>
> +1ACF..1AFF ; Cn # [49] <reserved-1ACF>..<reserved-1AFF>
> +1B4D ; Cn # <reserved-1B4D>
> +1BF4..1BFB ; Cn # [8] <reserved-1BF4>..<reserved-1BFB>
> +1C38..1C3A ; Cn # [3] <reserved-1C38>..<reserved-1C3A>
> +1C4A..1C4C ; Cn # [3] <reserved-1C4A>..<reserved-1C4C>
> +1C8B..1C8F ; Cn # [5] <reserved-1C8B>..<reserved-1C8F>
> +1CBB..1CBC ; Cn # [2] <reserved-1CBB>..<reserved-1CBC>
> +1CC8..1CCF ; Cn # [8] <reserved-1CC8>..<reserved-1CCF>
> +1CFB..1CFF ; Cn # [5] <reserved-1CFB>..<reserved-1CFF>
> +1F16..1F17 ; Cn # [2] <reserved-1F16>..<reserved-1F17>
> +1F1E..1F1F ; Cn # [2] <reserved-1F1E>..<reserved-1F1F>
> +1F46..1F47 ; Cn # [2] <reserved-1F46>..<reserved-1F47>
> +1F4E..1F4F ; Cn # [2] <reserved-1F4E>..<reserved-1F4F>
> +1F58 ; Cn # <reserved-1F58>
> +1F5A ; Cn # <reserved-1F5A>
> +1F5C ; Cn # <reserved-1F5C>
> +1F5E ; Cn # <reserved-1F5E>
> +1F7E..1F7F ; Cn # [2] <reserved-1F7E>..<reserved-1F7F>
> +1FB5 ; Cn # <reserved-1FB5>
> +1FC5 ; Cn # <reserved-1FC5>
> +1FD4..1FD5 ; Cn # [2] <reserved-1FD4>..<reserved-1FD5>
> +1FDC ; Cn # <reserved-1FDC>
> +1FF0..1FF1 ; Cn # [2] <reserved-1FF0>..<reserved-1FF1>
> +1FF5 ; Cn # <reserved-1FF5>
> +1FFF ; Cn # <reserved-1FFF>
> +2065 ; Cn # <reserved-2065>
> +2072..2073 ; Cn # [2] <reserved-2072>..<reserved-2073>
> +208F ; Cn # <reserved-208F>
> +209D..209F ; Cn # [3] <reserved-209D>..<reserved-209F>
> +20C1..20CF ; Cn # [15] <reserved-20C1>..<reserved-20CF>
> +20F1..20FF ; Cn # [15] <reserved-20F1>..<reserved-20FF>
> +218C..218F ; Cn # [4] <reserved-218C>..<reserved-218F>
> +242A..243F ; Cn # [22] <reserved-242A>..<reserved-243F>
> +244B..245F ; Cn # [21] <reserved-244B>..<reserved-245F>
> +2B74..2B75 ; Cn # [2] <reserved-2B74>..<reserved-2B75>
> +2B96 ; Cn # <reserved-2B96>
> +2CF4..2CF8 ; Cn # [5] <reserved-2CF4>..<reserved-2CF8>
> +2D26 ; Cn # <reserved-2D26>
> +2D28..2D2C ; Cn # [5] <reserved-2D28>..<reserved-2D2C>
> +2D2E..2D2F ; Cn # [2] <reserved-2D2E>..<reserved-2D2F>
> +2D68..2D6E ; Cn # [7] <reserved-2D68>..<reserved-2D6E>
> +2D71..2D7E ; Cn # [14] <reserved-2D71>..<reserved-2D7E>
> +2D97..2D9F ; Cn # [9] <reserved-2D97>..<reserved-2D9F>
> +2DA7 ; Cn # <reserved-2DA7>
> +2DAF ; Cn # <reserved-2DAF>
> +2DB7 ; Cn # <reserved-2DB7>
> +2DBF ; Cn # <reserved-2DBF>
> +2DC7 ; Cn # <reserved-2DC7>
> +2DCF ; Cn # <reserved-2DCF>
> +2DD7 ; Cn # <reserved-2DD7>
> +2DDF ; Cn # <reserved-2DDF>
> +2E5E..2E7F ; Cn # [34] <reserved-2E5E>..<reserved-2E7F>
> +2E9A ; Cn # <reserved-2E9A>
> +2EF4..2EFF ; Cn # [12] <reserved-2EF4>..<reserved-2EFF>
> +2FD6..2FEF ; Cn # [26] <reserved-2FD6>..<reserved-2FEF>
> +3040 ; Cn # <reserved-3040>
> +3097..3098 ; Cn # [2] <reserved-3097>..<reserved-3098>
> +3100..3104 ; Cn # [5] <reserved-3100>..<reserved-3104>
> +3130 ; Cn # <reserved-3130>
> +318F ; Cn # <reserved-318F>
> +31E6..31EE ; Cn # [9] <reserved-31E6>..<reserved-31EE>
> +321F ; Cn # <reserved-321F>
> +A48D..A48F ; Cn # [3] <reserved-A48D>..<reserved-A48F>
> +A4C7..A4CF ; Cn # [9] <reserved-A4C7>..<reserved-A4CF>
> +A62C..A63F ; Cn # [20] <reserved-A62C>..<reserved-A63F>
> +A6F8..A6FF ; Cn # [8] <reserved-A6F8>..<reserved-A6FF>
> +A7CE..A7CF ; Cn # [2] <reserved-A7CE>..<reserved-A7CF>
> +A7D2 ; Cn # <reserved-A7D2>
> +A7D4 ; Cn # <reserved-A7D4>
> +A7DD..A7F1 ; Cn # [21] <reserved-A7DD>..<reserved-A7F1>
> +A82D..A82F ; Cn # [3] <reserved-A82D>..<reserved-A82F>
> +A83A..A83F ; Cn # [6] <reserved-A83A>..<reserved-A83F>
> +A878..A87F ; Cn # [8] <reserved-A878>..<reserved-A87F>
> +A8C6..A8CD ; Cn # [8] <reserved-A8C6>..<reserved-A8CD>
> +A8DA..A8DF ; Cn # [6] <reserved-A8DA>..<reserved-A8DF>
> +A954..A95E ; Cn # [11] <reserved-A954>..<reserved-A95E>
> +A97D..A97F ; Cn # [3] <reserved-A97D>..<reserved-A97F>
> +A9CE ; Cn # <reserved-A9CE>
> +A9DA..A9DD ; Cn # [4] <reserved-A9DA>..<reserved-A9DD>
> +A9FF ; Cn # <reserved-A9FF>
> +AA37..AA3F ; Cn # [9] <reserved-AA37>..<reserved-AA3F>
> +AA4E..AA4F ; Cn # [2] <reserved-AA4E>..<reserved-AA4F>
> +AA5A..AA5B ; Cn # [2] <reserved-AA5A>..<reserved-AA5B>
> +AAC3..AADA ; Cn # [24] <reserved-AAC3>..<reserved-AADA>
> +AAF7..AB00 ; Cn # [10] <reserved-AAF7>..<reserved-AB00>
> +AB07..AB08 ; Cn # [2] <reserved-AB07>..<reserved-AB08>
> +AB0F..AB10 ; Cn # [2] <reserved-AB0F>..<reserved-AB10>
> +AB17..AB1F ; Cn # [9] <reserved-AB17>..<reserved-AB1F>
> +AB27 ; Cn # <reserved-AB27>
> +AB2F ; Cn # <reserved-AB2F>
> +AB6C..AB6F ; Cn # [4] <reserved-AB6C>..<reserved-AB6F>
> +ABEE..ABEF ; Cn # [2] <reserved-ABEE>..<reserved-ABEF>
> +ABFA..ABFF ; Cn # [6] <reserved-ABFA>..<reserved-ABFF>
> +D7A4..D7AF ; Cn # [12] <reserved-D7A4>..<reserved-D7AF>
> +D7C7..D7CA ; Cn # [4] <reserved-D7C7>..<reserved-D7CA>
> +D7FC..D7FF ; Cn # [4] <reserved-D7FC>..<reserved-D7FF>
> +FA6E..FA6F ; Cn # [2] <reserved-FA6E>..<reserved-FA6F>
> +FADA..FAFF ; Cn # [38] <reserved-FADA>..<reserved-FAFF>
> +FB07..FB12 ; Cn # [12] <reserved-FB07>..<reserved-FB12>
> +FB18..FB1C ; Cn # [5] <reserved-FB18>..<reserved-FB1C>
> +FB37 ; Cn # <reserved-FB37>
> +FB3D ; Cn # <reserved-FB3D>
> +FB3F ; Cn # <reserved-FB3F>
> +FB42 ; Cn # <reserved-FB42>
> +FB45 ; Cn # <reserved-FB45>
> +FBC3..FBD2 ; Cn # [16] <reserved-FBC3>..<reserved-FBD2>
> +FD90..FD91 ; Cn # [2] <reserved-FD90>..<reserved-FD91>
> +FDC8..FDCE ; Cn # [7] <reserved-FDC8>..<reserved-FDCE>
> +FDD0..FDEF ; Cn # [32] <noncharacter-FDD0>..<noncharacter-FDEF>
> +FE1A..FE1F ; Cn # [6] <reserved-FE1A>..<reserved-FE1F>
> +FE53 ; Cn # <reserved-FE53>
> +FE67 ; Cn # <reserved-FE67>
> +FE6C..FE6F ; Cn # [4] <reserved-FE6C>..<reserved-FE6F>
> +FE75 ; Cn # <reserved-FE75>
> +FEFD..FEFE ; Cn # [2] <reserved-FEFD>..<reserved-FEFE>
> +FF00 ; Cn # <reserved-FF00>
> +FFBF..FFC1 ; Cn # [3] <reserved-FFBF>..<reserved-FFC1>
> +FFC8..FFC9 ; Cn # [2] <reserved-FFC8>..<reserved-FFC9>
> +FFD0..FFD1 ; Cn # [2] <reserved-FFD0>..<reserved-FFD1>
> +FFD8..FFD9 ; Cn # [2] <reserved-FFD8>..<reserved-FFD9>
> +FFDD..FFDF ; Cn # [3] <reserved-FFDD>..<reserved-FFDF>
> +FFE7 ; Cn # <reserved-FFE7>
> +FFEF..FFF8 ; Cn # [10] <reserved-FFEF>..<reserved-FFF8>
> +FFFE..FFFF ; Cn # [2] <noncharacter-FFFE>..<noncharacter-FFFF>
> +1000C ; Cn # <reserved-1000C>
> +10027 ; Cn # <reserved-10027>
> +1003B ; Cn # <reserved-1003B>
> +1003E ; Cn # <reserved-1003E>
> +1004E..1004F ; Cn # [2] <reserved-1004E>..<reserved-1004F>
> +1005E..1007F ; Cn # [34] <reserved-1005E>..<reserved-1007F>
> +100FB..100FF ; Cn # [5] <reserved-100FB>..<reserved-100FF>
> +10103..10106 ; Cn # [4] <reserved-10103>..<reserved-10106>
> +10134..10136 ; Cn # [3] <reserved-10134>..<reserved-10136>
> +1018F ; Cn # <reserved-1018F>
> +1019D..1019F ; Cn # [3] <reserved-1019D>..<reserved-1019F>
> +101A1..101CF ; Cn # [47] <reserved-101A1>..<reserved-101CF>
> +101FE..1027F ; Cn # [130] <reserved-101FE>..<reserved-1027F>
> +1029D..1029F ; Cn # [3] <reserved-1029D>..<reserved-1029F>
> +102D1..102DF ; Cn # [15] <reserved-102D1>..<reserved-102DF>
> +102FC..102FF ; Cn # [4] <reserved-102FC>..<reserved-102FF>
> +10324..1032C ; Cn # [9] <reserved-10324>..<reserved-1032C>
> +1034B..1034F ; Cn # [5] <reserved-1034B>..<reserved-1034F>
> +1037B..1037F ; Cn # [5] <reserved-1037B>..<reserved-1037F>
> +1039E ; Cn # <reserved-1039E>
> +103C4..103C7 ; Cn # [4] <reserved-103C4>..<reserved-103C7>
> +103D6..103FF ; Cn # [42] <reserved-103D6>..<reserved-103FF>
> +1049E..1049F ; Cn # [2] <reserved-1049E>..<reserved-1049F>
> +104AA..104AF ; Cn # [6] <reserved-104AA>..<reserved-104AF>
> +104D4..104D7 ; Cn # [4] <reserved-104D4>..<reserved-104D7>
> +104FC..104FF ; Cn # [4] <reserved-104FC>..<reserved-104FF>
> +10528..1052F ; Cn # [8] <reserved-10528>..<reserved-1052F>
> +10564..1056E ; Cn # [11] <reserved-10564>..<reserved-1056E>
> +1057B ; Cn # <reserved-1057B>
> +1058B ; Cn # <reserved-1058B>
> +10593 ; Cn # <reserved-10593>
> +10596 ; Cn # <reserved-10596>
> +105A2 ; Cn # <reserved-105A2>
> +105B2 ; Cn # <reserved-105B2>
> +105BA ; Cn # <reserved-105BA>
> +105BD..105BF ; Cn # [3] <reserved-105BD>..<reserved-105BF>
> +105F4..105FF ; Cn # [12] <reserved-105F4>..<reserved-105FF>
> +10737..1073F ; Cn # [9] <reserved-10737>..<reserved-1073F>
> +10756..1075F ; Cn # [10] <reserved-10756>..<reserved-1075F>
> +10768..1077F ; Cn # [24] <reserved-10768>..<reserved-1077F>
> +10786 ; Cn # <reserved-10786>
> +107B1 ; Cn # <reserved-107B1>
> +107BB..107FF ; Cn # [69] <reserved-107BB>..<reserved-107FF>
> +10806..10807 ; Cn # [2] <reserved-10806>..<reserved-10807>
> +10809 ; Cn # <reserved-10809>
> +10836 ; Cn # <reserved-10836>
> +10839..1083B ; Cn # [3] <reserved-10839>..<reserved-1083B>
> +1083D..1083E ; Cn # [2] <reserved-1083D>..<reserved-1083E>
> +10856 ; Cn # <reserved-10856>
> +1089F..108A6 ; Cn # [8] <reserved-1089F>..<reserved-108A6>
> +108B0..108DF ; Cn # [48] <reserved-108B0>..<reserved-108DF>
> +108F3 ; Cn # <reserved-108F3>
> +108F6..108FA ; Cn # [5] <reserved-108F6>..<reserved-108FA>
> +1091C..1091E ; Cn # [3] <reserved-1091C>..<reserved-1091E>
> +1093A..1093E ; Cn # [5] <reserved-1093A>..<reserved-1093E>
> +10940..1097F ; Cn # [64] <reserved-10940>..<reserved-1097F>
> +109B8..109BB ; Cn # [4] <reserved-109B8>..<reserved-109BB>
> +109D0..109D1 ; Cn # [2] <reserved-109D0>..<reserved-109D1>
> +10A04 ; Cn # <reserved-10A04>
> +10A07..10A0B ; Cn # [5] <reserved-10A07>..<reserved-10A0B>
> +10A14 ; Cn # <reserved-10A14>
> +10A18 ; Cn # <reserved-10A18>
> +10A36..10A37 ; Cn # [2] <reserved-10A36>..<reserved-10A37>
> +10A3B..10A3E ; Cn # [4] <reserved-10A3B>..<reserved-10A3E>
> +10A49..10A4F ; Cn # [7] <reserved-10A49>..<reserved-10A4F>
> +10A59..10A5F ; Cn # [7] <reserved-10A59>..<reserved-10A5F>
> +10AA0..10ABF ; Cn # [32] <reserved-10AA0>..<reserved-10ABF>
> +10AE7..10AEA ; Cn # [4] <reserved-10AE7>..<reserved-10AEA>
> +10AF7..10AFF ; Cn # [9] <reserved-10AF7>..<reserved-10AFF>
> +10B36..10B38 ; Cn # [3] <reserved-10B36>..<reserved-10B38>
> +10B56..10B57 ; Cn # [2] <reserved-10B56>..<reserved-10B57>
> +10B73..10B77 ; Cn # [5] <reserved-10B73>..<reserved-10B77>
> +10B92..10B98 ; Cn # [7] <reserved-10B92>..<reserved-10B98>
> +10B9D..10BA8 ; Cn # [12] <reserved-10B9D>..<reserved-10BA8>
> +10BB0..10BFF ; Cn # [80] <reserved-10BB0>..<reserved-10BFF>
> +10C49..10C7F ; Cn # [55] <reserved-10C49>..<reserved-10C7F>
> +10CB3..10CBF ; Cn # [13] <reserved-10CB3>..<reserved-10CBF>
> +10CF3..10CF9 ; Cn # [7] <reserved-10CF3>..<reserved-10CF9>
> +10D28..10D2F ; Cn # [8] <reserved-10D28>..<reserved-10D2F>
> +10D3A..10D3F ; Cn # [6] <reserved-10D3A>..<reserved-10D3F>
> +10D66..10D68 ; Cn # [3] <reserved-10D66>..<reserved-10D68>
> +10D86..10D8D ; Cn # [8] <reserved-10D86>..<reserved-10D8D>
> +10D90..10E5F ; Cn # [208] <reserved-10D90>..<reserved-10E5F>
> +10E7F ; Cn # <reserved-10E7F>
> +10EAA ; Cn # <reserved-10EAA>
> +10EAE..10EAF ; Cn # [2] <reserved-10EAE>..<reserved-10EAF>
> +10EB2..10EC1 ; Cn # [16] <reserved-10EB2>..<reserved-10EC1>
> +10EC5..10EFB ; Cn # [55] <reserved-10EC5>..<reserved-10EFB>
> +10F28..10F2F ; Cn # [8] <reserved-10F28>..<reserved-10F2F>
> +10F5A..10F6F ; Cn # [22] <reserved-10F5A>..<reserved-10F6F>
> +10F8A..10FAF ; Cn # [38] <reserved-10F8A>..<reserved-10FAF>
> +10FCC..10FDF ; Cn # [20] <reserved-10FCC>..<reserved-10FDF>
> +10FF7..10FFF ; Cn # [9] <reserved-10FF7>..<reserved-10FFF>
> +1104E..11051 ; Cn # [4] <reserved-1104E>..<reserved-11051>
> +11076..1107E ; Cn # [9] <reserved-11076>..<reserved-1107E>
> +110C3..110CC ; Cn # [10] <reserved-110C3>..<reserved-110CC>
> +110CE..110CF ; Cn # [2] <reserved-110CE>..<reserved-110CF>
> +110E9..110EF ; Cn # [7] <reserved-110E9>..<reserved-110EF>
> +110FA..110FF ; Cn # [6] <reserved-110FA>..<reserved-110FF>
> +11135 ; Cn # <reserved-11135>
> +11148..1114F ; Cn # [8] <reserved-11148>..<reserved-1114F>
> +11177..1117F ; Cn # [9] <reserved-11177>..<reserved-1117F>
> +111E0 ; Cn # <reserved-111E0>
> +111F5..111FF ; Cn # [11] <reserved-111F5>..<reserved-111FF>
> +11212 ; Cn # <reserved-11212>
> +11242..1127F ; Cn # [62] <reserved-11242>..<reserved-1127F>
> +11287 ; Cn # <reserved-11287>
> +11289 ; Cn # <reserved-11289>
> +1128E ; Cn # <reserved-1128E>
> +1129E ; Cn # <reserved-1129E>
> +112AA..112AF ; Cn # [6] <reserved-112AA>..<reserved-112AF>
> +112EB..112EF ; Cn # [5] <reserved-112EB>..<reserved-112EF>
> +112FA..112FF ; Cn # [6] <reserved-112FA>..<reserved-112FF>
> +11304 ; Cn # <reserved-11304>
> +1130D..1130E ; Cn # [2] <reserved-1130D>..<reserved-1130E>
> +11311..11312 ; Cn # [2] <reserved-11311>..<reserved-11312>
> +11329 ; Cn # <reserved-11329>
> +11331 ; Cn # <reserved-11331>
> +11334 ; Cn # <reserved-11334>
> +1133A ; Cn # <reserved-1133A>
> +11345..11346 ; Cn # [2] <reserved-11345>..<reserved-11346>
> +11349..1134A ; Cn # [2] <reserved-11349>..<reserved-1134A>
> +1134E..1134F ; Cn # [2] <reserved-1134E>..<reserved-1134F>
> +11351..11356 ; Cn # [6] <reserved-11351>..<reserved-11356>
> +11358..1135C ; Cn # [5] <reserved-11358>..<reserved-1135C>
> +11364..11365 ; Cn # [2] <reserved-11364>..<reserved-11365>
> +1136D..1136F ; Cn # [3] <reserved-1136D>..<reserved-1136F>
> +11375..1137F ; Cn # [11] <reserved-11375>..<reserved-1137F>
> +1138A ; Cn # <reserved-1138A>
> +1138C..1138D ; Cn # [2] <reserved-1138C>..<reserved-1138D>
> +1138F ; Cn # <reserved-1138F>
> +113B6 ; Cn # <reserved-113B6>
> +113C1 ; Cn # <reserved-113C1>
> +113C3..113C4 ; Cn # [2] <reserved-113C3>..<reserved-113C4>
> +113C6 ; Cn # <reserved-113C6>
> +113CB ; Cn # <reserved-113CB>
> +113D6 ; Cn # <reserved-113D6>
> +113D9..113E0 ; Cn # [8] <reserved-113D9>..<reserved-113E0>
> +113E3..113FF ; Cn # [29] <reserved-113E3>..<reserved-113FF>
> +1145C ; Cn # <reserved-1145C>
> +11462..1147F ; Cn # [30] <reserved-11462>..<reserved-1147F>
> +114C8..114CF ; Cn # [8] <reserved-114C8>..<reserved-114CF>
> +114DA..1157F ; Cn # [166] <reserved-114DA>..<reserved-1157F>
> +115B6..115B7 ; Cn # [2] <reserved-115B6>..<reserved-115B7>
> +115DE..115FF ; Cn # [34] <reserved-115DE>..<reserved-115FF>
> +11645..1164F ; Cn # [11] <reserved-11645>..<reserved-1164F>
> +1165A..1165F ; Cn # [6] <reserved-1165A>..<reserved-1165F>
> +1166D..1167F ; Cn # [19] <reserved-1166D>..<reserved-1167F>
> +116BA..116BF ; Cn # [6] <reserved-116BA>..<reserved-116BF>
> +116CA..116CF ; Cn # [6] <reserved-116CA>..<reserved-116CF>
> +116E4..116FF ; Cn # [28] <reserved-116E4>..<reserved-116FF>
> +1171B..1171C ; Cn # [2] <reserved-1171B>..<reserved-1171C>
> +1172C..1172F ; Cn # [4] <reserved-1172C>..<reserved-1172F>
> +11747..117FF ; Cn # [185] <reserved-11747>..<reserved-117FF>
> +1183C..1189F ; Cn # [100] <reserved-1183C>..<reserved-1189F>
> +118F3..118FE ; Cn # [12] <reserved-118F3>..<reserved-118FE>
> +11907..11908 ; Cn # [2] <reserved-11907>..<reserved-11908>
> +1190A..1190B ; Cn # [2] <reserved-1190A>..<reserved-1190B>
> +11914 ; Cn # <reserved-11914>
> +11917 ; Cn # <reserved-11917>
> +11936 ; Cn # <reserved-11936>
> +11939..1193A ; Cn # [2] <reserved-11939>..<reserved-1193A>
> +11947..1194F ; Cn # [9] <reserved-11947>..<reserved-1194F>
> +1195A..1199F ; Cn # [70] <reserved-1195A>..<reserved-1199F>
> +119A8..119A9 ; Cn # [2] <reserved-119A8>..<reserved-119A9>
> +119D8..119D9 ; Cn # [2] <reserved-119D8>..<reserved-119D9>
> +119E5..119FF ; Cn # [27] <reserved-119E5>..<reserved-119FF>
> +11A48..11A4F ; Cn # [8] <reserved-11A48>..<reserved-11A4F>
> +11AA3..11AAF ; Cn # [13] <reserved-11AA3>..<reserved-11AAF>
> +11AF9..11AFF ; Cn # [7] <reserved-11AF9>..<reserved-11AFF>
> +11B0A..11BBF ; Cn # [182] <reserved-11B0A>..<reserved-11BBF>
> +11BE2..11BEF ; Cn # [14] <reserved-11BE2>..<reserved-11BEF>
> +11BFA..11BFF ; Cn # [6] <reserved-11BFA>..<reserved-11BFF>
> +11C09 ; Cn # <reserved-11C09>
> +11C37 ; Cn # <reserved-11C37>
> +11C46..11C4F ; Cn # [10] <reserved-11C46>..<reserved-11C4F>
> +11C6D..11C6F ; Cn # [3] <reserved-11C6D>..<reserved-11C6F>
> +11C90..11C91 ; Cn # [2] <reserved-11C90>..<reserved-11C91>
> +11CA8 ; Cn # <reserved-11CA8>
> +11CB7..11CFF ; Cn # [73] <reserved-11CB7>..<reserved-11CFF>
> +11D07 ; Cn # <reserved-11D07>
> +11D0A ; Cn # <reserved-11D0A>
> +11D37..11D39 ; Cn # [3] <reserved-11D37>..<reserved-11D39>
> +11D3B ; Cn # <reserved-11D3B>
> +11D3E ; Cn # <reserved-11D3E>
> +11D48..11D4F ; Cn # [8] <reserved-11D48>..<reserved-11D4F>
> +11D5A..11D5F ; Cn # [6] <reserved-11D5A>..<reserved-11D5F>
> +11D66 ; Cn # <reserved-11D66>
> +11D69 ; Cn # <reserved-11D69>
> +11D8F ; Cn # <reserved-11D8F>
> +11D92 ; Cn # <reserved-11D92>
> +11D99..11D9F ; Cn # [7] <reserved-11D99>..<reserved-11D9F>
> +11DAA..11EDF ; Cn # [310] <reserved-11DAA>..<reserved-11EDF>
> +11EF9..11EFF ; Cn # [7] <reserved-11EF9>..<reserved-11EFF>
> +11F11 ; Cn # <reserved-11F11>
> +11F3B..11F3D ; Cn # [3] <reserved-11F3B>..<reserved-11F3D>
> +11F5B..11FAF ; Cn # [85] <reserved-11F5B>..<reserved-11FAF>
> +11FB1..11FBF ; Cn # [15] <reserved-11FB1>..<reserved-11FBF>
> +11FF2..11FFE ; Cn # [13] <reserved-11FF2>..<reserved-11FFE>
> +1239A..123FF ; Cn # [102] <reserved-1239A>..<reserved-123FF>
> +1246F ; Cn # <reserved-1246F>
> +12475..1247F ; Cn # [11] <reserved-12475>..<reserved-1247F>
> +12544..12F8F ; Cn # [2636] <reserved-12544>..<reserved-12F8F>
> +12FF3..12FFF ; Cn # [13] <reserved-12FF3>..<reserved-12FFF>
> +13456..1345F ; Cn # [10] <reserved-13456>..<reserved-1345F>
> +143FB..143FF ; Cn # [5] <reserved-143FB>..<reserved-143FF>
> +14647..160FF ; Cn # [6841] <reserved-14647>..<reserved-160FF>
> +1613A..167FF ; Cn # [1734] <reserved-1613A>..<reserved-167FF>
> +16A39..16A3F ; Cn # [7] <reserved-16A39>..<reserved-16A3F>
> +16A5F ; Cn # <reserved-16A5F>
> +16A6A..16A6D ; Cn # [4] <reserved-16A6A>..<reserved-16A6D>
> +16ABF ; Cn # <reserved-16ABF>
> +16ACA..16ACF ; Cn # [6] <reserved-16ACA>..<reserved-16ACF>
> +16AEE..16AEF ; Cn # [2] <reserved-16AEE>..<reserved-16AEF>
> +16AF6..16AFF ; Cn # [10] <reserved-16AF6>..<reserved-16AFF>
> +16B46..16B4F ; Cn # [10] <reserved-16B46>..<reserved-16B4F>
> +16B5A ; Cn # <reserved-16B5A>
> +16B62 ; Cn # <reserved-16B62>
> +16B78..16B7C ; Cn # [5] <reserved-16B78>..<reserved-16B7C>
> +16B90..16D3F ; Cn # [432] <reserved-16B90>..<reserved-16D3F>
> +16D7A..16E3F ; Cn # [198] <reserved-16D7A>..<reserved-16E3F>
> +16E9B..16EFF ; Cn # [101] <reserved-16E9B>..<reserved-16EFF>
> +16F4B..16F4E ; Cn # [4] <reserved-16F4B>..<reserved-16F4E>
> +16F88..16F8E ; Cn # [7] <reserved-16F88>..<reserved-16F8E>
> +16FA0..16FDF ; Cn # [64] <reserved-16FA0>..<reserved-16FDF>
> +16FE5..16FEF ; Cn # [11] <reserved-16FE5>..<reserved-16FEF>
> +16FF2..16FFF ; Cn # [14] <reserved-16FF2>..<reserved-16FFF>
> +187F8..187FF ; Cn # [8] <reserved-187F8>..<reserved-187FF>
> +18CD6..18CFE ; Cn # [41] <reserved-18CD6>..<reserved-18CFE>
> +18D09..1AFEF ; Cn # [8935] <reserved-18D09>..<reserved-1AFEF>
> +1AFF4 ; Cn # <reserved-1AFF4>
> +1AFFC ; Cn # <reserved-1AFFC>
> +1AFFF ; Cn # <reserved-1AFFF>
> +1B123..1B131 ; Cn # [15] <reserved-1B123>..<reserved-1B131>
> +1B133..1B14F ; Cn # [29] <reserved-1B133>..<reserved-1B14F>
> +1B153..1B154 ; Cn # [2] <reserved-1B153>..<reserved-1B154>
> +1B156..1B163 ; Cn # [14] <reserved-1B156>..<reserved-1B163>
> +1B168..1B16F ; Cn # [8] <reserved-1B168>..<reserved-1B16F>
> +1B2FC..1BBFF ; Cn # [2308] <reserved-1B2FC>..<reserved-1BBFF>
> +1BC6B..1BC6F ; Cn # [5] <reserved-1BC6B>..<reserved-1BC6F>
> +1BC7D..1BC7F ; Cn # [3] <reserved-1BC7D>..<reserved-1BC7F>
> +1BC89..1BC8F ; Cn # [7] <reserved-1BC89>..<reserved-1BC8F>
> +1BC9A..1BC9B ; Cn # [2] <reserved-1BC9A>..<reserved-1BC9B>
> +1BCA4..1CBFF ; Cn # [3932] <reserved-1BCA4>..<reserved-1CBFF>
> +1CCFA..1CCFF ; Cn # [6] <reserved-1CCFA>..<reserved-1CCFF>
> +1CEB4..1CEFF ; Cn # [76] <reserved-1CEB4>..<reserved-1CEFF>
> +1CF2E..1CF2F ; Cn # [2] <reserved-1CF2E>..<reserved-1CF2F>
> +1CF47..1CF4F ; Cn # [9] <reserved-1CF47>..<reserved-1CF4F>
> +1CFC4..1CFFF ; Cn # [60] <reserved-1CFC4>..<reserved-1CFFF>
> +1D0F6..1D0FF ; Cn # [10] <reserved-1D0F6>..<reserved-1D0FF>
> +1D127..1D128 ; Cn # [2] <reserved-1D127>..<reserved-1D128>
> +1D1EB..1D1FF ; Cn # [21] <reserved-1D1EB>..<reserved-1D1FF>
> +1D246..1D2BF ; Cn # [122] <reserved-1D246>..<reserved-1D2BF>
> +1D2D4..1D2DF ; Cn # [12] <reserved-1D2D4>..<reserved-1D2DF>
> +1D2F4..1D2FF ; Cn # [12] <reserved-1D2F4>..<reserved-1D2FF>
> +1D357..1D35F ; Cn # [9] <reserved-1D357>..<reserved-1D35F>
> +1D379..1D3FF ; Cn # [135] <reserved-1D379>..<reserved-1D3FF>
> +1D455 ; Cn # <reserved-1D455>
> +1D49D ; Cn # <reserved-1D49D>
> +1D4A0..1D4A1 ; Cn # [2] <reserved-1D4A0>..<reserved-1D4A1>
> +1D4A3..1D4A4 ; Cn # [2] <reserved-1D4A3>..<reserved-1D4A4>
> +1D4A7..1D4A8 ; Cn # [2] <reserved-1D4A7>..<reserved-1D4A8>
> +1D4AD ; Cn # <reserved-1D4AD>
> +1D4BA ; Cn # <reserved-1D4BA>
> +1D4BC ; Cn # <reserved-1D4BC>
> +1D4C4 ; Cn # <reserved-1D4C4>
> +1D506 ; Cn # <reserved-1D506>
> +1D50B..1D50C ; Cn # [2] <reserved-1D50B>..<reserved-1D50C>
> +1D515 ; Cn # <reserved-1D515>
> +1D51D ; Cn # <reserved-1D51D>
> +1D53A ; Cn # <reserved-1D53A>
> +1D53F ; Cn # <reserved-1D53F>
> +1D545 ; Cn # <reserved-1D545>
> +1D547..1D549 ; Cn # [3] <reserved-1D547>..<reserved-1D549>
> +1D551 ; Cn # <reserved-1D551>
> +1D6A6..1D6A7 ; Cn # [2] <reserved-1D6A6>..<reserved-1D6A7>
> +1D7CC..1D7CD ; Cn # [2] <reserved-1D7CC>..<reserved-1D7CD>
> +1DA8C..1DA9A ; Cn # [15] <reserved-1DA8C>..<reserved-1DA9A>
> +1DAA0 ; Cn # <reserved-1DAA0>
> +1DAB0..1DEFF ; Cn # [1104] <reserved-1DAB0>..<reserved-1DEFF>
> +1DF1F..1DF24 ; Cn # [6] <reserved-1DF1F>..<reserved-1DF24>
> +1DF2B..1DFFF ; Cn # [213] <reserved-1DF2B>..<reserved-1DFFF>
> +1E007 ; Cn # <reserved-1E007>
> +1E019..1E01A ; Cn # [2] <reserved-1E019>..<reserved-1E01A>
> +1E022 ; Cn # <reserved-1E022>
> +1E025 ; Cn # <reserved-1E025>
> +1E02B..1E02F ; Cn # [5] <reserved-1E02B>..<reserved-1E02F>
> +1E06E..1E08E ; Cn # [33] <reserved-1E06E>..<reserved-1E08E>
> +1E090..1E0FF ; Cn # [112] <reserved-1E090>..<reserved-1E0FF>
> +1E12D..1E12F ; Cn # [3] <reserved-1E12D>..<reserved-1E12F>
> +1E13E..1E13F ; Cn # [2] <reserved-1E13E>..<reserved-1E13F>
> +1E14A..1E14D ; Cn # [4] <reserved-1E14A>..<reserved-1E14D>
> +1E150..1E28F ; Cn # [320] <reserved-1E150>..<reserved-1E28F>
> +1E2AF..1E2BF ; Cn # [17] <reserved-1E2AF>..<reserved-1E2BF>
> +1E2FA..1E2FE ; Cn # [5] <reserved-1E2FA>..<reserved-1E2FE>
> +1E300..1E4CF ; Cn # [464] <reserved-1E300>..<reserved-1E4CF>
> +1E4FA..1E5CF ; Cn # [214] <reserved-1E4FA>..<reserved-1E5CF>
> +1E5FB..1E5FE ; Cn # [4] <reserved-1E5FB>..<reserved-1E5FE>
> +1E600..1E7DF ; Cn # [480] <reserved-1E600>..<reserved-1E7DF>
> +1E7E7 ; Cn # <reserved-1E7E7>
> +1E7EC ; Cn # <reserved-1E7EC>
> +1E7EF ; Cn # <reserved-1E7EF>
> +1E7FF ; Cn # <reserved-1E7FF>
> +1E8C5..1E8C6 ; Cn # [2] <reserved-1E8C5>..<reserved-1E8C6>
> +1E8D7..1E8FF ; Cn # [41] <reserved-1E8D7>..<reserved-1E8FF>
> +1E94C..1E94F ; Cn # [4] <reserved-1E94C>..<reserved-1E94F>
> +1E95A..1E95D ; Cn # [4] <reserved-1E95A>..<reserved-1E95D>
> +1E960..1EC70 ; Cn # [785] <reserved-1E960>..<reserved-1EC70>
> +1ECB5..1ED00 ; Cn # [76] <reserved-1ECB5>..<reserved-1ED00>
> +1ED3E..1EDFF ; Cn # [194] <reserved-1ED3E>..<reserved-1EDFF>
> +1EE04 ; Cn # <reserved-1EE04>
> +1EE20 ; Cn # <reserved-1EE20>
> +1EE23 ; Cn # <reserved-1EE23>
> +1EE25..1EE26 ; Cn # [2] <reserved-1EE25>..<reserved-1EE26>
> +1EE28 ; Cn # <reserved-1EE28>
> +1EE33 ; Cn # <reserved-1EE33>
> +1EE38 ; Cn # <reserved-1EE38>
> +1EE3A ; Cn # <reserved-1EE3A>
> +1EE3C..1EE41 ; Cn # [6] <reserved-1EE3C>..<reserved-1EE41>
> +1EE43..1EE46 ; Cn # [4] <reserved-1EE43>..<reserved-1EE46>
> +1EE48 ; Cn # <reserved-1EE48>
> +1EE4A ; Cn # <reserved-1EE4A>
> +1EE4C ; Cn # <reserved-1EE4C>
> +1EE50 ; Cn # <reserved-1EE50>
> +1EE53 ; Cn # <reserved-1EE53>
> +1EE55..1EE56 ; Cn # [2] <reserved-1EE55>..<reserved-1EE56>
> +1EE58 ; Cn # <reserved-1EE58>
> +1EE5A ; Cn # <reserved-1EE5A>
> +1EE5C ; Cn # <reserved-1EE5C>
> +1EE5E ; Cn # <reserved-1EE5E>
> +1EE60 ; Cn # <reserved-1EE60>
> +1EE63 ; Cn # <reserved-1EE63>
> +1EE65..1EE66 ; Cn # [2] <reserved-1EE65>..<reserved-1EE66>
> +1EE6B ; Cn # <reserved-1EE6B>
> +1EE73 ; Cn # <reserved-1EE73>
> +1EE78 ; Cn # <reserved-1EE78>
> +1EE7D ; Cn # <reserved-1EE7D>
> +1EE7F ; Cn # <reserved-1EE7F>
> +1EE8A ; Cn # <reserved-1EE8A>
> +1EE9C..1EEA0 ; Cn # [5] <reserved-1EE9C>..<reserved-1EEA0>
> +1EEA4 ; Cn # <reserved-1EEA4>
> +1EEAA ; Cn # <reserved-1EEAA>
> +1EEBC..1EEEF ; Cn # [52] <reserved-1EEBC>..<reserved-1EEEF>
> +1EEF2..1EFFF ; Cn # [270] <reserved-1EEF2>..<reserved-1EFFF>
> +1F02C..1F02F ; Cn # [4] <reserved-1F02C>..<reserved-1F02F>
> +1F094..1F09F ; Cn # [12] <reserved-1F094>..<reserved-1F09F>
> +1F0AF..1F0B0 ; Cn # [2] <reserved-1F0AF>..<reserved-1F0B0>
> +1F0C0 ; Cn # <reserved-1F0C0>
> +1F0D0 ; Cn # <reserved-1F0D0>
> +1F0F6..1F0FF ; Cn # [10] <reserved-1F0F6>..<reserved-1F0FF>
> +1F1AE..1F1E5 ; Cn # [56] <reserved-1F1AE>..<reserved-1F1E5>
> +1F203..1F20F ; Cn # [13] <reserved-1F203>..<reserved-1F20F>
> +1F23C..1F23F ; Cn # [4] <reserved-1F23C>..<reserved-1F23F>
> +1F249..1F24F ; Cn # [7] <reserved-1F249>..<reserved-1F24F>
> +1F252..1F25F ; Cn # [14] <reserved-1F252>..<reserved-1F25F>
> +1F266..1F2FF ; Cn # [154] <reserved-1F266>..<reserved-1F2FF>
> +1F6D8..1F6DB ; Cn # [4] <reserved-1F6D8>..<reserved-1F6DB>
> +1F6ED..1F6EF ; Cn # [3] <reserved-1F6ED>..<reserved-1F6EF>
> +1F6FD..1F6FF ; Cn # [3] <reserved-1F6FD>..<reserved-1F6FF>
> +1F777..1F77A ; Cn # [4] <reserved-1F777>..<reserved-1F77A>
> +1F7DA..1F7DF ; Cn # [6] <reserved-1F7DA>..<reserved-1F7DF>
> +1F7EC..1F7EF ; Cn # [4] <reserved-1F7EC>..<reserved-1F7EF>
> +1F7F1..1F7FF ; Cn # [15] <reserved-1F7F1>..<reserved-1F7FF>
> +1F80C..1F80F ; Cn # [4] <reserved-1F80C>..<reserved-1F80F>
> +1F848..1F84F ; Cn # [8] <reserved-1F848>..<reserved-1F84F>
> +1F85A..1F85F ; Cn # [6] <reserved-1F85A>..<reserved-1F85F>
> +1F888..1F88F ; Cn # [8] <reserved-1F888>..<reserved-1F88F>
> +1F8AE..1F8AF ; Cn # [2] <reserved-1F8AE>..<reserved-1F8AF>
> +1F8BC..1F8BF ; Cn # [4] <reserved-1F8BC>..<reserved-1F8BF>
> +1F8C2..1F8FF ; Cn # [62] <reserved-1F8C2>..<reserved-1F8FF>
> +1FA54..1FA5F ; Cn # [12] <reserved-1FA54>..<reserved-1FA5F>
> +1FA6E..1FA6F ; Cn # [2] <reserved-1FA6E>..<reserved-1FA6F>
> +1FA7D..1FA7F ; Cn # [3] <reserved-1FA7D>..<reserved-1FA7F>
> +1FA8A..1FA8E ; Cn # [5] <reserved-1FA8A>..<reserved-1FA8E>
> +1FAC7..1FACD ; Cn # [7] <reserved-1FAC7>..<reserved-1FACD>
> +1FADD..1FADE ; Cn # [2] <reserved-1FADD>..<reserved-1FADE>
> +1FAEA..1FAEF ; Cn # [6] <reserved-1FAEA>..<reserved-1FAEF>
> +1FAF9..1FAFF ; Cn # [7] <reserved-1FAF9>..<reserved-1FAFF>
> +1FB93 ; Cn # <reserved-1FB93>
> +1FBFA..1FFFF ; Cn # [1030] <reserved-1FBFA>..<noncharacter-1FFFF>
> +2A6E0..2A6FF ; Cn # [32] <reserved-2A6E0>..<reserved-2A6FF>
> +2B73A..2B73F ; Cn # [6] <reserved-2B73A>..<reserved-2B73F>
> +2B81E..2B81F ; Cn # [2] <reserved-2B81E>..<reserved-2B81F>
> +2CEA2..2CEAF ; Cn # [14] <reserved-2CEA2>..<reserved-2CEAF>
> +2EBE1..2EBEF ; Cn # [15] <reserved-2EBE1>..<reserved-2EBEF>
> +2EE5E..2F7FF ; Cn # [2466] <reserved-2EE5E>..<reserved-2F7FF>
> +2FA1E..2FFFF ; Cn # [1506] <reserved-2FA1E>..<noncharacter-2FFFF>
> +3134B..3134F ; Cn # [5] <reserved-3134B>..<reserved-3134F>
> +323B0..E0000 ; Cn # [711761] <reserved-323B0>..<reserved-E0000>
> +E0002..E001F ; Cn # [30] <reserved-E0002>..<reserved-E001F>
> +E0080..E00FF ; Cn # [128] <reserved-E0080>..<reserved-E00FF>
> +E01F0..EFFFF ; Cn # [65040] <reserved-E01F0>..<noncharacter-EFFFF>
> +FFFFE..FFFFF ; Cn # [2] <noncharacter-FFFFE>..<noncharacter-FFFFF>
> +10FFFE..10FFFF; Cn # [2] <noncharacter-10FFFE>..<noncharacter-10FFFF>
> +
> +# Total code points: 819533
> +
> +# ================================================
> +
> +# General_Category=Uppercase_Letter
> +
> +0041..005A ; Lu # [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
> +00C0..00D6 ; Lu # [23] LATIN CAPITAL LETTER A WITH GRAVE..LATIN CAPITAL LETTER O WITH DIAERESIS
> +00D8..00DE ; Lu # [7] LATIN CAPITAL LETTER O WITH STROKE..LATIN CAPITAL LETTER THORN
> +0100 ; Lu # LATIN CAPITAL LETTER A WITH MACRON
> +0102 ; Lu # LATIN CAPITAL LETTER A WITH BREVE
> +0104 ; Lu # LATIN CAPITAL LETTER A WITH OGONEK
> +0106 ; Lu # LATIN CAPITAL LETTER C WITH ACUTE
> +0108 ; Lu # LATIN CAPITAL LETTER C WITH CIRCUMFLEX
> +010A ; Lu # LATIN CAPITAL LETTER C WITH DOT ABOVE
> +010C ; Lu # LATIN CAPITAL LETTER C WITH CARON
> +010E ; Lu # LATIN CAPITAL LETTER D WITH CARON
> +0110 ; Lu # LATIN CAPITAL LETTER D WITH STROKE
> +0112 ; Lu # LATIN CAPITAL LETTER E WITH MACRON
> +0114 ; Lu # LATIN CAPITAL LETTER E WITH BREVE
> +0116 ; Lu # LATIN CAPITAL LETTER E WITH DOT ABOVE
> +0118 ; Lu # LATIN CAPITAL LETTER E WITH OGONEK
> +011A ; Lu # LATIN CAPITAL LETTER E WITH CARON
> +011C ; Lu # LATIN CAPITAL LETTER G WITH CIRCUMFLEX
> +011E ; Lu # LATIN CAPITAL LETTER G WITH BREVE
> +0120 ; Lu # LATIN CAPITAL LETTER G WITH DOT ABOVE
> +0122 ; Lu # LATIN CAPITAL LETTER G WITH CEDILLA
> +0124 ; Lu # LATIN CAPITAL LETTER H WITH CIRCUMFLEX
> +0126 ; Lu # LATIN CAPITAL LETTER H WITH STROKE
> +0128 ; Lu # LATIN CAPITAL LETTER I WITH TILDE
> +012A ; Lu # LATIN CAPITAL LETTER I WITH MACRON
> +012C ; Lu # LATIN CAPITAL LETTER I WITH BREVE
> +012E ; Lu # LATIN CAPITAL LETTER I WITH OGONEK
> +0130 ; Lu # LATIN CAPITAL LETTER I WITH DOT ABOVE
> +0132 ; Lu # LATIN CAPITAL LIGATURE IJ
> +0134 ; Lu # LATIN CAPITAL LETTER J WITH CIRCUMFLEX
> +0136 ; Lu # LATIN CAPITAL LETTER K WITH CEDILLA
> +0139 ; Lu # LATIN CAPITAL LETTER L WITH ACUTE
> +013B ; Lu # LATIN CAPITAL LETTER L WITH CEDILLA
> +013D ; Lu # LATIN CAPITAL LETTER L WITH CARON
> +013F ; Lu # LATIN CAPITAL LETTER L WITH MIDDLE DOT
> +0141 ; Lu # LATIN CAPITAL LETTER L WITH STROKE
> +0143 ; Lu # LATIN CAPITAL LETTER N WITH ACUTE
> +0145 ; Lu # LATIN CAPITAL LETTER N WITH CEDILLA
> +0147 ; Lu # LATIN CAPITAL LETTER N WITH CARON
> +014A ; Lu # LATIN CAPITAL LETTER ENG
> +014C ; Lu # LATIN CAPITAL LETTER O WITH MACRON
> +014E ; Lu # LATIN CAPITAL LETTER O WITH BREVE
> +0150 ; Lu # LATIN CAPITAL LETTER O WITH DOUBLE ACUTE
> +0152 ; Lu # LATIN CAPITAL LIGATURE OE
> +0154 ; Lu # LATIN CAPITAL LETTER R WITH ACUTE
> +0156 ; Lu # LATIN CAPITAL LETTER R WITH CEDILLA
> +0158 ; Lu # LATIN CAPITAL LETTER R WITH CARON
> +015A ; Lu # LATIN CAPITAL LETTER S WITH ACUTE
> +015C ; Lu # LATIN CAPITAL LETTER S WITH CIRCUMFLEX
> +015E ; Lu # LATIN CAPITAL LETTER S WITH CEDILLA
> +0160 ; Lu # LATIN CAPITAL LETTER S WITH CARON
> +0162 ; Lu # LATIN CAPITAL LETTER T WITH CEDILLA
> +0164 ; Lu # LATIN CAPITAL LETTER T WITH CARON
> +0166 ; Lu # LATIN CAPITAL LETTER T WITH STROKE
> +0168 ; Lu # LATIN CAPITAL LETTER U WITH TILDE
> +016A ; Lu # LATIN CAPITAL LETTER U WITH MACRON
> +016C ; Lu # LATIN CAPITAL LETTER U WITH BREVE
> +016E ; Lu # LATIN CAPITAL LETTER U WITH RING ABOVE
> +0170 ; Lu # LATIN CAPITAL LETTER U WITH DOUBLE ACUTE
> +0172 ; Lu # LATIN CAPITAL LETTER U WITH OGONEK
> +0174 ; Lu # LATIN CAPITAL LETTER W WITH CIRCUMFLEX
> +0176 ; Lu # LATIN CAPITAL LETTER Y WITH CIRCUMFLEX
> +0178..0179 ; Lu # [2] LATIN CAPITAL LETTER Y WITH DIAERESIS..LATIN CAPITAL LETTER Z WITH ACUTE
> +017B ; Lu # LATIN CAPITAL LETTER Z WITH DOT ABOVE
> +017D ; Lu # LATIN CAPITAL LETTER Z WITH CARON
> +0181..0182 ; Lu # [2] LATIN CAPITAL LETTER B WITH HOOK..LATIN CAPITAL LETTER B WITH TOPBAR
> +0184 ; Lu # LATIN CAPITAL LETTER TONE SIX
> +0186..0187 ; Lu # [2] LATIN CAPITAL LETTER OPEN O..LATIN CAPITAL LETTER C WITH HOOK
> +0189..018B ; Lu # [3] LATIN CAPITAL LETTER AFRICAN D..LATIN CAPITAL LETTER D WITH TOPBAR
> +018E..0191 ; Lu # [4] LATIN CAPITAL LETTER REVERSED E..LATIN CAPITAL LETTER F WITH HOOK
> +0193..0194 ; Lu # [2] LATIN CAPITAL LETTER G WITH HOOK..LATIN CAPITAL LETTER GAMMA
> +0196..0198 ; Lu # [3] LATIN CAPITAL LETTER IOTA..LATIN CAPITAL LETTER K WITH HOOK
> +019C..019D ; Lu # [2] LATIN CAPITAL LETTER TURNED M..LATIN CAPITAL LETTER N WITH LEFT HOOK
> +019F..01A0 ; Lu # [2] LATIN CAPITAL LETTER O WITH MIDDLE TILDE..LATIN CAPITAL LETTER O WITH HORN
> +01A2 ; Lu # LATIN CAPITAL LETTER OI
> +01A4 ; Lu # LATIN CAPITAL LETTER P WITH HOOK
> +01A6..01A7 ; Lu # [2] LATIN LETTER YR..LATIN CAPITAL LETTER TONE TWO
> +01A9 ; Lu # LATIN CAPITAL LETTER ESH
> +01AC ; Lu # LATIN CAPITAL LETTER T WITH HOOK
> +01AE..01AF ; Lu # [2] LATIN CAPITAL LETTER T WITH RETROFLEX HOOK..LATIN CAPITAL LETTER U WITH HORN
> +01B1..01B3 ; Lu # [3] LATIN CAPITAL LETTER UPSILON..LATIN CAPITAL LETTER Y WITH HOOK
> +01B5 ; Lu # LATIN CAPITAL LETTER Z WITH STROKE
> +01B7..01B8 ; Lu # [2] LATIN CAPITAL LETTER EZH..LATIN CAPITAL LETTER EZH REVERSED
> +01BC ; Lu # LATIN CAPITAL LETTER TONE FIVE
> +01C4 ; Lu # LATIN CAPITAL LETTER DZ WITH CARON
> +01C7 ; Lu # LATIN CAPITAL LETTER LJ
> +01CA ; Lu # LATIN CAPITAL LETTER NJ
> +01CD ; Lu # LATIN CAPITAL LETTER A WITH CARON
> +01CF ; Lu # LATIN CAPITAL LETTER I WITH CARON
> +01D1 ; Lu # LATIN CAPITAL LETTER O WITH CARON
> +01D3 ; Lu # LATIN CAPITAL LETTER U WITH CARON
> +01D5 ; Lu # LATIN CAPITAL LETTER U WITH DIAERESIS AND MACRON
> +01D7 ; Lu # LATIN CAPITAL LETTER U WITH DIAERESIS AND ACUTE
> +01D9 ; Lu # LATIN CAPITAL LETTER U WITH DIAERESIS AND CARON
> +01DB ; Lu # LATIN CAPITAL LETTER U WITH DIAERESIS AND GRAVE
> +01DE ; Lu # LATIN CAPITAL LETTER A WITH DIAERESIS AND MACRON
> +01E0 ; Lu # LATIN CAPITAL LETTER A WITH DOT ABOVE AND MACRON
> +01E2 ; Lu # LATIN CAPITAL LETTER AE WITH MACRON
> +01E4 ; Lu # LATIN CAPITAL LETTER G WITH STROKE
> +01E6 ; Lu # LATIN CAPITAL LETTER G WITH CARON
> +01E8 ; Lu # LATIN CAPITAL LETTER K WITH CARON
> +01EA ; Lu # LATIN CAPITAL LETTER O WITH OGONEK
> +01EC ; Lu # LATIN CAPITAL LETTER O WITH OGONEK AND MACRON
> +01EE ; Lu # LATIN CAPITAL LETTER EZH WITH CARON
> +01F1 ; Lu # LATIN CAPITAL LETTER DZ
> +01F4 ; Lu # LATIN CAPITAL LETTER G WITH ACUTE
> +01F6..01F8 ; Lu # [3] LATIN CAPITAL LETTER HWAIR..LATIN CAPITAL LETTER N WITH GRAVE
> +01FA ; Lu # LATIN CAPITAL LETTER A WITH RING ABOVE AND ACUTE
> +01FC ; Lu # LATIN CAPITAL LETTER AE WITH ACUTE
> +01FE ; Lu # LATIN CAPITAL LETTER O WITH STROKE AND ACUTE
> +0200 ; Lu # LATIN CAPITAL LETTER A WITH DOUBLE GRAVE
> +0202 ; Lu # LATIN CAPITAL LETTER A WITH INVERTED BREVE
> +0204 ; Lu # LATIN CAPITAL LETTER E WITH DOUBLE GRAVE
> +0206 ; Lu # LATIN CAPITAL LETTER E WITH INVERTED BREVE
> +0208 ; Lu # LATIN CAPITAL LETTER I WITH DOUBLE GRAVE
> +020A ; Lu # LATIN CAPITAL LETTER I WITH INVERTED BREVE
> +020C ; Lu # LATIN CAPITAL LETTER O WITH DOUBLE GRAVE
> +020E ; Lu # LATIN CAPITAL LETTER O WITH INVERTED BREVE
> +0210 ; Lu # LATIN CAPITAL LETTER R WITH DOUBLE GRAVE
> +0212 ; Lu # LATIN CAPITAL LETTER R WITH INVERTED BREVE
> +0214 ; Lu # LATIN CAPITAL LETTER U WITH DOUBLE GRAVE
> +0216 ; Lu # LATIN CAPITAL LETTER U WITH INVERTED BREVE
> +0218 ; Lu # LATIN CAPITAL LETTER S WITH COMMA BELOW
> +021A ; Lu # LATIN CAPITAL LETTER T WITH COMMA BELOW
> +021C ; Lu # LATIN CAPITAL LETTER YOGH
> +021E ; Lu # LATIN CAPITAL LETTER H WITH CARON
> +0220 ; Lu # LATIN CAPITAL LETTER N WITH LONG RIGHT LEG
> +0222 ; Lu # LATIN CAPITAL LETTER OU
> +0224 ; Lu # LATIN CAPITAL LETTER Z WITH HOOK
> +0226 ; Lu # LATIN CAPITAL LETTER A WITH DOT ABOVE
> +0228 ; Lu # LATIN CAPITAL LETTER E WITH CEDILLA
> +022A ; Lu # LATIN CAPITAL LETTER O WITH DIAERESIS AND MACRON
> +022C ; Lu # LATIN CAPITAL LETTER O WITH TILDE AND MACRON
> +022E ; Lu # LATIN CAPITAL LETTER O WITH DOT ABOVE
> +0230 ; Lu # LATIN CAPITAL LETTER O WITH DOT ABOVE AND MACRON
> +0232 ; Lu # LATIN CAPITAL LETTER Y WITH MACRON
> +023A..023B ; Lu # [2] LATIN CAPITAL LETTER A WITH STROKE..LATIN CAPITAL LETTER C WITH STROKE
> +023D..023E ; Lu # [2] LATIN CAPITAL LETTER L WITH BAR..LATIN CAPITAL LETTER T WITH DIAGONAL STROKE
> +0241 ; Lu # LATIN CAPITAL LETTER GLOTTAL STOP
> +0243..0246 ; Lu # [4] LATIN CAPITAL LETTER B WITH STROKE..LATIN CAPITAL LETTER E WITH STROKE
> +0248 ; Lu # LATIN CAPITAL LETTER J WITH STROKE
> +024A ; Lu # LATIN CAPITAL LETTER SMALL Q WITH HOOK TAIL
> +024C ; Lu # LATIN CAPITAL LETTER R WITH STROKE
> +024E ; Lu # LATIN CAPITAL LETTER Y WITH STROKE
> +0370 ; Lu # GREEK CAPITAL LETTER HETA
> +0372 ; Lu # GREEK CAPITAL LETTER ARCHAIC SAMPI
> +0376 ; Lu # GREEK CAPITAL LETTER PAMPHYLIAN DIGAMMA
> +037F ; Lu # GREEK CAPITAL LETTER YOT
> +0386 ; Lu # GREEK CAPITAL LETTER ALPHA WITH TONOS
> +0388..038A ; Lu # [3] GREEK CAPITAL LETTER EPSILON WITH TONOS..GREEK CAPITAL LETTER IOTA WITH TONOS
> +038C ; Lu # GREEK CAPITAL LETTER OMICRON WITH TONOS
> +038E..038F ; Lu # [2] GREEK CAPITAL LETTER UPSILON WITH TONOS..GREEK CAPITAL LETTER OMEGA WITH TONOS
> +0391..03A1 ; Lu # [17] GREEK CAPITAL LETTER ALPHA..GREEK CAPITAL LETTER RHO
> +03A3..03AB ; Lu # [9] GREEK CAPITAL LETTER SIGMA..GREEK CAPITAL LETTER UPSILON WITH DIALYTIKA
> +03CF ; Lu # GREEK CAPITAL KAI SYMBOL
> +03D2..03D4 ; Lu # [3] GREEK UPSILON WITH HOOK SYMBOL..GREEK UPSILON WITH DIAERESIS AND HOOK SYMBOL
> +03D8 ; Lu # GREEK LETTER ARCHAIC KOPPA
> +03DA ; Lu # GREEK LETTER STIGMA
> +03DC ; Lu # GREEK LETTER DIGAMMA
> +03DE ; Lu # GREEK LETTER KOPPA
> +03E0 ; Lu # GREEK LETTER SAMPI
> +03E2 ; Lu # COPTIC CAPITAL LETTER SHEI
> +03E4 ; Lu # COPTIC CAPITAL LETTER FEI
> +03E6 ; Lu # COPTIC CAPITAL LETTER KHEI
> +03E8 ; Lu # COPTIC CAPITAL LETTER HORI
> +03EA ; Lu # COPTIC CAPITAL LETTER GANGIA
> +03EC ; Lu # COPTIC CAPITAL LETTER SHIMA
> +03EE ; Lu # COPTIC CAPITAL LETTER DEI
> +03F4 ; Lu # GREEK CAPITAL THETA SYMBOL
> +03F7 ; Lu # GREEK CAPITAL LETTER SHO
> +03F9..03FA ; Lu # [2] GREEK CAPITAL LUNATE SIGMA SYMBOL..GREEK CAPITAL LETTER SAN
> +03FD..042F ; Lu # [51] GREEK CAPITAL REVERSED LUNATE SIGMA SYMBOL..CYRILLIC CAPITAL LETTER YA
> +0460 ; Lu # CYRILLIC CAPITAL LETTER OMEGA
> +0462 ; Lu # CYRILLIC CAPITAL LETTER YAT
> +0464 ; Lu # CYRILLIC CAPITAL LETTER IOTIFIED E
> +0466 ; Lu # CYRILLIC CAPITAL LETTER LITTLE YUS
> +0468 ; Lu # CYRILLIC CAPITAL LETTER IOTIFIED LITTLE YUS
> +046A ; Lu # CYRILLIC CAPITAL LETTER BIG YUS
> +046C ; Lu # CYRILLIC CAPITAL LETTER IOTIFIED BIG YUS
> +046E ; Lu # CYRILLIC CAPITAL LETTER KSI
> +0470 ; Lu # CYRILLIC CAPITAL LETTER PSI
> +0472 ; Lu # CYRILLIC CAPITAL LETTER FITA
> +0474 ; Lu # CYRILLIC CAPITAL LETTER IZHITSA
> +0476 ; Lu # CYRILLIC CAPITAL LETTER IZHITSA WITH DOUBLE GRAVE ACCENT
> +0478 ; Lu # CYRILLIC CAPITAL LETTER UK
> +047A ; Lu # CYRILLIC CAPITAL LETTER ROUND OMEGA
> +047C ; Lu # CYRILLIC CAPITAL LETTER OMEGA WITH TITLO
> +047E ; Lu # CYRILLIC CAPITAL LETTER OT
> +0480 ; Lu # CYRILLIC CAPITAL LETTER KOPPA
> +048A ; Lu # CYRILLIC CAPITAL LETTER SHORT I WITH TAIL
> +048C ; Lu # CYRILLIC CAPITAL LETTER SEMISOFT SIGN
> +048E ; Lu # CYRILLIC CAPITAL LETTER ER WITH TICK
> +0490 ; Lu # CYRILLIC CAPITAL LETTER GHE WITH UPTURN
> +0492 ; Lu # CYRILLIC CAPITAL LETTER GHE WITH STROKE
> +0494 ; Lu # CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
> +0496 ; Lu # CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
> +0498 ; Lu # CYRILLIC CAPITAL LETTER ZE WITH DESCENDER
> +049A ; Lu # CYRILLIC CAPITAL LETTER KA WITH DESCENDER
> +049C ; Lu # CYRILLIC CAPITAL LETTER KA WITH VERTICAL STROKE
> +049E ; Lu # CYRILLIC CAPITAL LETTER KA WITH STROKE
> +04A0 ; Lu # CYRILLIC CAPITAL LETTER BASHKIR KA
> +04A2 ; Lu # CYRILLIC CAPITAL LETTER EN WITH DESCENDER
> +04A4 ; Lu # CYRILLIC CAPITAL LIGATURE EN GHE
> +04A6 ; Lu # CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
> +04A8 ; Lu # CYRILLIC CAPITAL LETTER ABKHASIAN HA
> +04AA ; Lu # CYRILLIC CAPITAL LETTER ES WITH DESCENDER
> +04AC ; Lu # CYRILLIC CAPITAL LETTER TE WITH DESCENDER
> +04AE ; Lu # CYRILLIC CAPITAL LETTER STRAIGHT U
> +04B0 ; Lu # CYRILLIC CAPITAL LETTER STRAIGHT U WITH STROKE
> +04B2 ; Lu # CYRILLIC CAPITAL LETTER HA WITH DESCENDER
> +04B4 ; Lu # CYRILLIC CAPITAL LIGATURE TE TSE
> +04B6 ; Lu # CYRILLIC CAPITAL LETTER CHE WITH DESCENDER
> +04B8 ; Lu # CYRILLIC CAPITAL LETTER CHE WITH VERTICAL STROKE
> +04BA ; Lu # CYRILLIC CAPITAL LETTER SHHA
> +04BC ; Lu # CYRILLIC CAPITAL LETTER ABKHASIAN CHE
> +04BE ; Lu # CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
> +04C0..04C1 ; Lu # [2] CYRILLIC LETTER PALOCHKA..CYRILLIC CAPITAL LETTER ZHE WITH BREVE
> +04C3 ; Lu # CYRILLIC CAPITAL LETTER KA WITH HOOK
> +04C5 ; Lu # CYRILLIC CAPITAL LETTER EL WITH TAIL
> +04C7 ; Lu # CYRILLIC CAPITAL LETTER EN WITH HOOK
> +04C9 ; Lu # CYRILLIC CAPITAL LETTER EN WITH TAIL
> +04CB ; Lu # CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
> +04CD ; Lu # CYRILLIC CAPITAL LETTER EM WITH TAIL
> +04D0 ; Lu # CYRILLIC CAPITAL LETTER A WITH BREVE
> +04D2 ; Lu # CYRILLIC CAPITAL LETTER A WITH DIAERESIS
> +04D4 ; Lu # CYRILLIC CAPITAL LIGATURE A IE
> +04D6 ; Lu # CYRILLIC CAPITAL LETTER IE WITH BREVE
> +04D8 ; Lu # CYRILLIC CAPITAL LETTER SCHWA
> +04DA ; Lu # CYRILLIC CAPITAL LETTER SCHWA WITH DIAERESIS
> +04DC ; Lu # CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
> +04DE ; Lu # CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
> +04E0 ; Lu # CYRILLIC CAPITAL LETTER ABKHASIAN DZE
> +04E2 ; Lu # CYRILLIC CAPITAL LETTER I WITH MACRON
> +04E4 ; Lu # CYRILLIC CAPITAL LETTER I WITH DIAERESIS
> +04E6 ; Lu # CYRILLIC CAPITAL LETTER O WITH DIAERESIS
> +04E8 ; Lu # CYRILLIC CAPITAL LETTER BARRED O
> +04EA ; Lu # CYRILLIC CAPITAL LETTER BARRED O WITH DIAERESIS
> +04EC ; Lu # CYRILLIC CAPITAL LETTER E WITH DIAERESIS
> +04EE ; Lu # CYRILLIC CAPITAL LETTER U WITH MACRON
> +04F0 ; Lu # CYRILLIC CAPITAL LETTER U WITH DIAERESIS
> +04F2 ; Lu # CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
> +04F4 ; Lu # CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
> +04F6 ; Lu # CYRILLIC CAPITAL LETTER GHE WITH DESCENDER
> +04F8 ; Lu # CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
> +04FA ; Lu # CYRILLIC CAPITAL LETTER GHE WITH STROKE AND HOOK
> +04FC ; Lu # CYRILLIC CAPITAL LETTER HA WITH HOOK
> +04FE ; Lu # CYRILLIC CAPITAL LETTER HA WITH STROKE
> +0500 ; Lu # CYRILLIC CAPITAL LETTER KOMI DE
> +0502 ; Lu # CYRILLIC CAPITAL LETTER KOMI DJE
> +0504 ; Lu # CYRILLIC CAPITAL LETTER KOMI ZJE
> +0506 ; Lu # CYRILLIC CAPITAL LETTER KOMI DZJE
> +0508 ; Lu # CYRILLIC CAPITAL LETTER KOMI LJE
> +050A ; Lu # CYRILLIC CAPITAL LETTER KOMI NJE
> +050C ; Lu # CYRILLIC CAPITAL LETTER KOMI SJE
> +050E ; Lu # CYRILLIC CAPITAL LETTER KOMI TJE
> +0510 ; Lu # CYRILLIC CAPITAL LETTER REVERSED ZE
> +0512 ; Lu # CYRILLIC CAPITAL LETTER EL WITH HOOK
> +0514 ; Lu # CYRILLIC CAPITAL LETTER LHA
> +0516 ; Lu # CYRILLIC CAPITAL LETTER RHA
> +0518 ; Lu # CYRILLIC CAPITAL LETTER YAE
> +051A ; Lu # CYRILLIC CAPITAL LETTER QA
> +051C ; Lu # CYRILLIC CAPITAL LETTER WE
> +051E ; Lu # CYRILLIC CAPITAL LETTER ALEUT KA
> +0520 ; Lu # CYRILLIC CAPITAL LETTER EL WITH MIDDLE HOOK
> +0522 ; Lu # CYRILLIC CAPITAL LETTER EN WITH MIDDLE HOOK
> +0524 ; Lu # CYRILLIC CAPITAL LETTER PE WITH DESCENDER
> +0526 ; Lu # CYRILLIC CAPITAL LETTER SHHA WITH DESCENDER
> +0528 ; Lu # CYRILLIC CAPITAL LETTER EN WITH LEFT HOOK
> +052A ; Lu # CYRILLIC CAPITAL LETTER DZZHE
> +052C ; Lu # CYRILLIC CAPITAL LETTER DCHE
> +052E ; Lu # CYRILLIC CAPITAL LETTER EL WITH DESCENDER
> +0531..0556 ; Lu # [38] ARMENIAN CAPITAL LETTER AYB..ARMENIAN CAPITAL LETTER FEH
> +10A0..10C5 ; Lu # [38] GEORGIAN CAPITAL LETTER AN..GEORGIAN CAPITAL LETTER HOE
> +10C7 ; Lu # GEORGIAN CAPITAL LETTER YN
> +10CD ; Lu # GEORGIAN CAPITAL LETTER AEN
> +13A0..13F5 ; Lu # [86] CHEROKEE LETTER A..CHEROKEE LETTER MV
> +1C89 ; Lu # CYRILLIC CAPITAL LETTER TJE
> +1C90..1CBA ; Lu # [43] GEORGIAN MTAVRULI CAPITAL LETTER AN..GEORGIAN MTAVRULI CAPITAL LETTER AIN
> +1CBD..1CBF ; Lu # [3] GEORGIAN MTAVRULI CAPITAL LETTER AEN..GEORGIAN MTAVRULI CAPITAL LETTER LABIAL SIGN
> +1E00 ; Lu # LATIN CAPITAL LETTER A WITH RING BELOW
> +1E02 ; Lu # LATIN CAPITAL LETTER B WITH DOT ABOVE
> +1E04 ; Lu # LATIN CAPITAL LETTER B WITH DOT BELOW
> +1E06 ; Lu # LATIN CAPITAL LETTER B WITH LINE BELOW
> +1E08 ; Lu # LATIN CAPITAL LETTER C WITH CEDILLA AND ACUTE
> +1E0A ; Lu # LATIN CAPITAL LETTER D WITH DOT ABOVE
> +1E0C ; Lu # LATIN CAPITAL LETTER D WITH DOT BELOW
> +1E0E ; Lu # LATIN CAPITAL LETTER D WITH LINE BELOW
> +1E10 ; Lu # LATIN CAPITAL LETTER D WITH CEDILLA
> +1E12 ; Lu # LATIN CAPITAL LETTER D WITH CIRCUMFLEX BELOW
> +1E14 ; Lu # LATIN CAPITAL LETTER E WITH MACRON AND GRAVE
> +1E16 ; Lu # LATIN CAPITAL LETTER E WITH MACRON AND ACUTE
> +1E18 ; Lu # LATIN CAPITAL LETTER E WITH CIRCUMFLEX BELOW
> +1E1A ; Lu # LATIN CAPITAL LETTER E WITH TILDE BELOW
> +1E1C ; Lu # LATIN CAPITAL LETTER E WITH CEDILLA AND BREVE
> +1E1E ; Lu # LATIN CAPITAL LETTER F WITH DOT ABOVE
> +1E20 ; Lu # LATIN CAPITAL LETTER G WITH MACRON
> +1E22 ; Lu # LATIN CAPITAL LETTER H WITH DOT ABOVE
> +1E24 ; Lu # LATIN CAPITAL LETTER H WITH DOT BELOW
> +1E26 ; Lu # LATIN CAPITAL LETTER H WITH DIAERESIS
> +1E28 ; Lu # LATIN CAPITAL LETTER H WITH CEDILLA
> +1E2A ; Lu # LATIN CAPITAL LETTER H WITH BREVE BELOW
> +1E2C ; Lu # LATIN CAPITAL LETTER I WITH TILDE BELOW
> +1E2E ; Lu # LATIN CAPITAL LETTER I WITH DIAERESIS AND ACUTE
> +1E30 ; Lu # LATIN CAPITAL LETTER K WITH ACUTE
> +1E32 ; Lu # LATIN CAPITAL LETTER K WITH DOT BELOW
> +1E34 ; Lu # LATIN CAPITAL LETTER K WITH LINE BELOW
> +1E36 ; Lu # LATIN CAPITAL LETTER L WITH DOT BELOW
> +1E38 ; Lu # LATIN CAPITAL LETTER L WITH DOT BELOW AND MACRON
> +1E3A ; Lu # LATIN CAPITAL LETTER L WITH LINE BELOW
> +1E3C ; Lu # LATIN CAPITAL LETTER L WITH CIRCUMFLEX BELOW
> +1E3E ; Lu # LATIN CAPITAL LETTER M WITH ACUTE
> +1E40 ; Lu # LATIN CAPITAL LETTER M WITH DOT ABOVE
> +1E42 ; Lu # LATIN CAPITAL LETTER M WITH DOT BELOW
> +1E44 ; Lu # LATIN CAPITAL LETTER N WITH DOT ABOVE
> +1E46 ; Lu # LATIN CAPITAL LETTER N WITH DOT BELOW
> +1E48 ; Lu # LATIN CAPITAL LETTER N WITH LINE BELOW
> +1E4A ; Lu # LATIN CAPITAL LETTER N WITH CIRCUMFLEX BELOW
> +1E4C ; Lu # LATIN CAPITAL LETTER O WITH TILDE AND ACUTE
> +1E4E ; Lu # LATIN CAPITAL LETTER O WITH TILDE AND DIAERESIS
> +1E50 ; Lu # LATIN CAPITAL LETTER O WITH MACRON AND GRAVE
> +1E52 ; Lu # LATIN CAPITAL LETTER O WITH MACRON AND ACUTE
> +1E54 ; Lu # LATIN CAPITAL LETTER P WITH ACUTE
> +1E56 ; Lu # LATIN CAPITAL LETTER P WITH DOT ABOVE
> +1E58 ; Lu # LATIN CAPITAL LETTER R WITH DOT ABOVE
> +1E5A ; Lu # LATIN CAPITAL LETTER R WITH DOT BELOW
> +1E5C ; Lu # LATIN CAPITAL LETTER R WITH DOT BELOW AND MACRON
> +1E5E ; Lu # LATIN CAPITAL LETTER R WITH LINE BELOW
> +1E60 ; Lu # LATIN CAPITAL LETTER S WITH DOT ABOVE
> +1E62 ; Lu # LATIN CAPITAL LETTER S WITH DOT BELOW
> +1E64 ; Lu # LATIN CAPITAL LETTER S WITH ACUTE AND DOT ABOVE
> +1E66 ; Lu # LATIN CAPITAL LETTER S WITH CARON AND DOT ABOVE
> +1E68 ; Lu # LATIN CAPITAL LETTER S WITH DOT BELOW AND DOT ABOVE
> +1E6A ; Lu # LATIN CAPITAL LETTER T WITH DOT ABOVE
> +1E6C ; Lu # LATIN CAPITAL LETTER T WITH DOT BELOW
> +1E6E ; Lu # LATIN CAPITAL LETTER T WITH LINE BELOW
> +1E70 ; Lu # LATIN CAPITAL LETTER T WITH CIRCUMFLEX BELOW
> +1E72 ; Lu # LATIN CAPITAL LETTER U WITH DIAERESIS BELOW
> +1E74 ; Lu # LATIN CAPITAL LETTER U WITH TILDE BELOW
> +1E76 ; Lu # LATIN CAPITAL LETTER U WITH CIRCUMFLEX BELOW
> +1E78 ; Lu # LATIN CAPITAL LETTER U WITH TILDE AND ACUTE
> +1E7A ; Lu # LATIN CAPITAL LETTER U WITH MACRON AND DIAERESIS
> +1E7C ; Lu # LATIN CAPITAL LETTER V WITH TILDE
> +1E7E ; Lu # LATIN CAPITAL LETTER V WITH DOT BELOW
> +1E80 ; Lu # LATIN CAPITAL LETTER W WITH GRAVE
> +1E82 ; Lu # LATIN CAPITAL LETTER W WITH ACUTE
> +1E84 ; Lu # LATIN CAPITAL LETTER W WITH DIAERESIS
> +1E86 ; Lu # LATIN CAPITAL LETTER W WITH DOT ABOVE
> +1E88 ; Lu # LATIN CAPITAL LETTER W WITH DOT BELOW
> +1E8A ; Lu # LATIN CAPITAL LETTER X WITH DOT ABOVE
> +1E8C ; Lu # LATIN CAPITAL LETTER X WITH DIAERESIS
> +1E8E ; Lu # LATIN CAPITAL LETTER Y WITH DOT ABOVE
> +1E90 ; Lu # LATIN CAPITAL LETTER Z WITH CIRCUMFLEX
> +1E92 ; Lu # LATIN CAPITAL LETTER Z WITH DOT BELOW
> +1E94 ; Lu # LATIN CAPITAL LETTER Z WITH LINE BELOW
> +1E9E ; Lu # LATIN CAPITAL LETTER SHARP S
> +1EA0 ; Lu # LATIN CAPITAL LETTER A WITH DOT BELOW
> +1EA2 ; Lu # LATIN CAPITAL LETTER A WITH HOOK ABOVE
> +1EA4 ; Lu # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND ACUTE
> +1EA6 ; Lu # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND GRAVE
> +1EA8 ; Lu # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE
> +1EAA ; Lu # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND TILDE
> +1EAC ; Lu # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND DOT BELOW
> +1EAE ; Lu # LATIN CAPITAL LETTER A WITH BREVE AND ACUTE
> +1EB0 ; Lu # LATIN CAPITAL LETTER A WITH BREVE AND GRAVE
> +1EB2 ; Lu # LATIN CAPITAL LETTER A WITH BREVE AND HOOK ABOVE
> +1EB4 ; Lu # LATIN CAPITAL LETTER A WITH BREVE AND TILDE
> +1EB6 ; Lu # LATIN CAPITAL LETTER A WITH BREVE AND DOT BELOW
> +1EB8 ; Lu # LATIN CAPITAL LETTER E WITH DOT BELOW
> +1EBA ; Lu # LATIN CAPITAL LETTER E WITH HOOK ABOVE
> +1EBC ; Lu # LATIN CAPITAL LETTER E WITH TILDE
> +1EBE ; Lu # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND ACUTE
> +1EC0 ; Lu # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND GRAVE
> +1EC2 ; Lu # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE
> +1EC4 ; Lu # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND TILDE
> +1EC6 ; Lu # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND DOT BELOW
> +1EC8 ; Lu # LATIN CAPITAL LETTER I WITH HOOK ABOVE
> +1ECA ; Lu # LATIN CAPITAL LETTER I WITH DOT BELOW
> +1ECC ; Lu # LATIN CAPITAL LETTER O WITH DOT BELOW
> +1ECE ; Lu # LATIN CAPITAL LETTER O WITH HOOK ABOVE
> +1ED0 ; Lu # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND ACUTE
> +1ED2 ; Lu # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND GRAVE
> +1ED4 ; Lu # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE
> +1ED6 ; Lu # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND TILDE
> +1ED8 ; Lu # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND DOT BELOW
> +1EDA ; Lu # LATIN CAPITAL LETTER O WITH HORN AND ACUTE
> +1EDC ; Lu # LATIN CAPITAL LETTER O WITH HORN AND GRAVE
> +1EDE ; Lu # LATIN CAPITAL LETTER O WITH HORN AND HOOK ABOVE
> +1EE0 ; Lu # LATIN CAPITAL LETTER O WITH HORN AND TILDE
> +1EE2 ; Lu # LATIN CAPITAL LETTER O WITH HORN AND DOT BELOW
> +1EE4 ; Lu # LATIN CAPITAL LETTER U WITH DOT BELOW
> +1EE6 ; Lu # LATIN CAPITAL LETTER U WITH HOOK ABOVE
> +1EE8 ; Lu # LATIN CAPITAL LETTER U WITH HORN AND ACUTE
> +1EEA ; Lu # LATIN CAPITAL LETTER U WITH HORN AND GRAVE
> +1EEC ; Lu # LATIN CAPITAL LETTER U WITH HORN AND HOOK ABOVE
> +1EEE ; Lu # LATIN CAPITAL LETTER U WITH HORN AND TILDE
> +1EF0 ; Lu # LATIN CAPITAL LETTER U WITH HORN AND DOT BELOW
> +1EF2 ; Lu # LATIN CAPITAL LETTER Y WITH GRAVE
> +1EF4 ; Lu # LATIN CAPITAL LETTER Y WITH DOT BELOW
> +1EF6 ; Lu # LATIN CAPITAL LETTER Y WITH HOOK ABOVE
> +1EF8 ; Lu # LATIN CAPITAL LETTER Y WITH TILDE
> +1EFA ; Lu # LATIN CAPITAL LETTER MIDDLE-WELSH LL
> +1EFC ; Lu # LATIN CAPITAL LETTER MIDDLE-WELSH V
> +1EFE ; Lu # LATIN CAPITAL LETTER Y WITH LOOP
> +1F08..1F0F ; Lu # [8] GREEK CAPITAL LETTER ALPHA WITH PSILI..GREEK CAPITAL LETTER ALPHA WITH DASIA AND PERISPOMENI
> +1F18..1F1D ; Lu # [6] GREEK CAPITAL LETTER EPSILON WITH PSILI..GREEK CAPITAL LETTER EPSILON WITH DASIA AND OXIA
> +1F28..1F2F ; Lu # [8] GREEK CAPITAL LETTER ETA WITH PSILI..GREEK CAPITAL LETTER ETA WITH DASIA AND PERISPOMENI
> +1F38..1F3F ; Lu # [8] GREEK CAPITAL LETTER IOTA WITH PSILI..GREEK CAPITAL LETTER IOTA WITH DASIA AND PERISPOMENI
> +1F48..1F4D ; Lu # [6] GREEK CAPITAL LETTER OMICRON WITH PSILI..GREEK CAPITAL LETTER OMICRON WITH DASIA AND OXIA
> +1F59 ; Lu # GREEK CAPITAL LETTER UPSILON WITH DASIA
> +1F5B ; Lu # GREEK CAPITAL LETTER UPSILON WITH DASIA AND VARIA
> +1F5D ; Lu # GREEK CAPITAL LETTER UPSILON WITH DASIA AND OXIA
> +1F5F ; Lu # GREEK CAPITAL LETTER UPSILON WITH DASIA AND PERISPOMENI
> +1F68..1F6F ; Lu # [8] GREEK CAPITAL LETTER OMEGA WITH PSILI..GREEK CAPITAL LETTER OMEGA WITH DASIA AND PERISPOMENI
> +1FB8..1FBB ; Lu # [4] GREEK CAPITAL LETTER ALPHA WITH VRACHY..GREEK CAPITAL LETTER ALPHA WITH OXIA
> +1FC8..1FCB ; Lu # [4] GREEK CAPITAL LETTER EPSILON WITH VARIA..GREEK CAPITAL LETTER ETA WITH OXIA
> +1FD8..1FDB ; Lu # [4] GREEK CAPITAL LETTER IOTA WITH VRACHY..GREEK CAPITAL LETTER IOTA WITH OXIA
> +1FE8..1FEC ; Lu # [5] GREEK CAPITAL LETTER UPSILON WITH VRACHY..GREEK CAPITAL LETTER RHO WITH DASIA
> +1FF8..1FFB ; Lu # [4] GREEK CAPITAL LETTER OMICRON WITH VARIA..GREEK CAPITAL LETTER OMEGA WITH OXIA
> +2102 ; Lu # DOUBLE-STRUCK CAPITAL C
> +2107 ; Lu # EULER CONSTANT
> +210B..210D ; Lu # [3] SCRIPT CAPITAL H..DOUBLE-STRUCK CAPITAL H
> +2110..2112 ; Lu # [3] SCRIPT CAPITAL I..SCRIPT CAPITAL L
> +2115 ; Lu # DOUBLE-STRUCK CAPITAL N
> +2119..211D ; Lu # [5] DOUBLE-STRUCK CAPITAL P..DOUBLE-STRUCK CAPITAL R
> +2124 ; Lu # DOUBLE-STRUCK CAPITAL Z
> +2126 ; Lu # OHM SIGN
> +2128 ; Lu # BLACK-LETTER CAPITAL Z
> +212A..212D ; Lu # [4] KELVIN SIGN..BLACK-LETTER CAPITAL C
> +2130..2133 ; Lu # [4] SCRIPT CAPITAL E..SCRIPT CAPITAL M
> +213E..213F ; Lu # [2] DOUBLE-STRUCK CAPITAL GAMMA..DOUBLE-STRUCK CAPITAL PI
> +2145 ; Lu # DOUBLE-STRUCK ITALIC CAPITAL D
> +2183 ; Lu # ROMAN NUMERAL REVERSED ONE HUNDRED
> +2C00..2C2F ; Lu # [48] GLAGOLITIC CAPITAL LETTER AZU..GLAGOLITIC CAPITAL LETTER CAUDATE CHRIVI
> +2C60 ; Lu # LATIN CAPITAL LETTER L WITH DOUBLE BAR
> +2C62..2C64 ; Lu # [3] LATIN CAPITAL LETTER L WITH MIDDLE TILDE..LATIN CAPITAL LETTER R WITH TAIL
> +2C67 ; Lu # LATIN CAPITAL LETTER H WITH DESCENDER
> +2C69 ; Lu # LATIN CAPITAL LETTER K WITH DESCENDER
> +2C6B ; Lu # LATIN CAPITAL LETTER Z WITH DESCENDER
> +2C6D..2C70 ; Lu # [4] LATIN CAPITAL LETTER ALPHA..LATIN CAPITAL LETTER TURNED ALPHA
> +2C72 ; Lu # LATIN CAPITAL LETTER W WITH HOOK
> +2C75 ; Lu # LATIN CAPITAL LETTER HALF H
> +2C7E..2C80 ; Lu # [3] LATIN CAPITAL LETTER S WITH SWASH TAIL..COPTIC CAPITAL LETTER ALFA
> +2C82 ; Lu # COPTIC CAPITAL LETTER VIDA
> +2C84 ; Lu # COPTIC CAPITAL LETTER GAMMA
> +2C86 ; Lu # COPTIC CAPITAL LETTER DALDA
> +2C88 ; Lu # COPTIC CAPITAL LETTER EIE
> +2C8A ; Lu # COPTIC CAPITAL LETTER SOU
> +2C8C ; Lu # COPTIC CAPITAL LETTER ZATA
> +2C8E ; Lu # COPTIC CAPITAL LETTER HATE
> +2C90 ; Lu # COPTIC CAPITAL LETTER THETHE
> +2C92 ; Lu # COPTIC CAPITAL LETTER IAUDA
> +2C94 ; Lu # COPTIC CAPITAL LETTER KAPA
> +2C96 ; Lu # COPTIC CAPITAL LETTER LAULA
> +2C98 ; Lu # COPTIC CAPITAL LETTER MI
> +2C9A ; Lu # COPTIC CAPITAL LETTER NI
> +2C9C ; Lu # COPTIC CAPITAL LETTER KSI
> +2C9E ; Lu # COPTIC CAPITAL LETTER O
> +2CA0 ; Lu # COPTIC CAPITAL LETTER PI
> +2CA2 ; Lu # COPTIC CAPITAL LETTER RO
> +2CA4 ; Lu # COPTIC CAPITAL LETTER SIMA
> +2CA6 ; Lu # COPTIC CAPITAL LETTER TAU
> +2CA8 ; Lu # COPTIC CAPITAL LETTER UA
> +2CAA ; Lu # COPTIC CAPITAL LETTER FI
> +2CAC ; Lu # COPTIC CAPITAL LETTER KHI
> +2CAE ; Lu # COPTIC CAPITAL LETTER PSI
> +2CB0 ; Lu # COPTIC CAPITAL LETTER OOU
> +2CB2 ; Lu # COPTIC CAPITAL LETTER DIALECT-P ALEF
> +2CB4 ; Lu # COPTIC CAPITAL LETTER OLD COPTIC AIN
> +2CB6 ; Lu # COPTIC CAPITAL LETTER CRYPTOGRAMMIC EIE
> +2CB8 ; Lu # COPTIC CAPITAL LETTER DIALECT-P KAPA
> +2CBA ; Lu # COPTIC CAPITAL LETTER DIALECT-P NI
> +2CBC ; Lu # COPTIC CAPITAL LETTER CRYPTOGRAMMIC NI
> +2CBE ; Lu # COPTIC CAPITAL LETTER OLD COPTIC OOU
> +2CC0 ; Lu # COPTIC CAPITAL LETTER SAMPI
> +2CC2 ; Lu # COPTIC CAPITAL LETTER CROSSED SHEI
> +2CC4 ; Lu # COPTIC CAPITAL LETTER OLD COPTIC SHEI
> +2CC6 ; Lu # COPTIC CAPITAL LETTER OLD COPTIC ESH
> +2CC8 ; Lu # COPTIC CAPITAL LETTER AKHMIMIC KHEI
> +2CCA ; Lu # COPTIC CAPITAL LETTER DIALECT-P HORI
> +2CCC ; Lu # COPTIC CAPITAL LETTER OLD COPTIC HORI
> +2CCE ; Lu # COPTIC CAPITAL LETTER OLD COPTIC HA
> +2CD0 ; Lu # COPTIC CAPITAL LETTER L-SHAPED HA
> +2CD2 ; Lu # COPTIC CAPITAL LETTER OLD COPTIC HEI
> +2CD4 ; Lu # COPTIC CAPITAL LETTER OLD COPTIC HAT
> +2CD6 ; Lu # COPTIC CAPITAL LETTER OLD COPTIC GANGIA
> +2CD8 ; Lu # COPTIC CAPITAL LETTER OLD COPTIC DJA
> +2CDA ; Lu # COPTIC CAPITAL LETTER OLD COPTIC SHIMA
> +2CDC ; Lu # COPTIC CAPITAL LETTER OLD NUBIAN SHIMA
> +2CDE ; Lu # COPTIC CAPITAL LETTER OLD NUBIAN NGI
> +2CE0 ; Lu # COPTIC CAPITAL LETTER OLD NUBIAN NYI
> +2CE2 ; Lu # COPTIC CAPITAL LETTER OLD NUBIAN WAU
> +2CEB ; Lu # COPTIC CAPITAL LETTER CRYPTOGRAMMIC SHEI
> +2CED ; Lu # COPTIC CAPITAL LETTER CRYPTOGRAMMIC GANGIA
> +2CF2 ; Lu # COPTIC CAPITAL LETTER BOHAIRIC KHEI
> +A640 ; Lu # CYRILLIC CAPITAL LETTER ZEMLYA
> +A642 ; Lu # CYRILLIC CAPITAL LETTER DZELO
> +A644 ; Lu # CYRILLIC CAPITAL LETTER REVERSED DZE
> +A646 ; Lu # CYRILLIC CAPITAL LETTER IOTA
> +A648 ; Lu # CYRILLIC CAPITAL LETTER DJERV
> +A64A ; Lu # CYRILLIC CAPITAL LETTER MONOGRAPH UK
> +A64C ; Lu # CYRILLIC CAPITAL LETTER BROAD OMEGA
> +A64E ; Lu # CYRILLIC CAPITAL LETTER NEUTRAL YER
> +A650 ; Lu # CYRILLIC CAPITAL LETTER YERU WITH BACK YER
> +A652 ; Lu # CYRILLIC CAPITAL LETTER IOTIFIED YAT
> +A654 ; Lu # CYRILLIC CAPITAL LETTER REVERSED YU
> +A656 ; Lu # CYRILLIC CAPITAL LETTER IOTIFIED A
> +A658 ; Lu # CYRILLIC CAPITAL LETTER CLOSED LITTLE YUS
> +A65A ; Lu # CYRILLIC CAPITAL LETTER BLENDED YUS
> +A65C ; Lu # CYRILLIC CAPITAL LETTER IOTIFIED CLOSED LITTLE YUS
> +A65E ; Lu # CYRILLIC CAPITAL LETTER YN
> +A660 ; Lu # CYRILLIC CAPITAL LETTER REVERSED TSE
> +A662 ; Lu # CYRILLIC CAPITAL LETTER SOFT DE
> +A664 ; Lu # CYRILLIC CAPITAL LETTER SOFT EL
> +A666 ; Lu # CYRILLIC CAPITAL LETTER SOFT EM
> +A668 ; Lu # CYRILLIC CAPITAL LETTER MONOCULAR O
> +A66A ; Lu # CYRILLIC CAPITAL LETTER BINOCULAR O
> +A66C ; Lu # CYRILLIC CAPITAL LETTER DOUBLE MONOCULAR O
> +A680 ; Lu # CYRILLIC CAPITAL LETTER DWE
> +A682 ; Lu # CYRILLIC CAPITAL LETTER DZWE
> +A684 ; Lu # CYRILLIC CAPITAL LETTER ZHWE
> +A686 ; Lu # CYRILLIC CAPITAL LETTER CCHE
> +A688 ; Lu # CYRILLIC CAPITAL LETTER DZZE
> +A68A ; Lu # CYRILLIC CAPITAL LETTER TE WITH MIDDLE HOOK
> +A68C ; Lu # CYRILLIC CAPITAL LETTER TWE
> +A68E ; Lu # CYRILLIC CAPITAL LETTER TSWE
> +A690 ; Lu # CYRILLIC CAPITAL LETTER TSSE
> +A692 ; Lu # CYRILLIC CAPITAL LETTER TCHE
> +A694 ; Lu # CYRILLIC CAPITAL LETTER HWE
> +A696 ; Lu # CYRILLIC CAPITAL LETTER SHWE
> +A698 ; Lu # CYRILLIC CAPITAL LETTER DOUBLE O
> +A69A ; Lu # CYRILLIC CAPITAL LETTER CROSSED O
> +A722 ; Lu # LATIN CAPITAL LETTER EGYPTOLOGICAL ALEF
> +A724 ; Lu # LATIN CAPITAL LETTER EGYPTOLOGICAL AIN
> +A726 ; Lu # LATIN CAPITAL LETTER HENG
> +A728 ; Lu # LATIN CAPITAL LETTER TZ
> +A72A ; Lu # LATIN CAPITAL LETTER TRESILLO
> +A72C ; Lu # LATIN CAPITAL LETTER CUATRILLO
> +A72E ; Lu # LATIN CAPITAL LETTER CUATRILLO WITH COMMA
> +A732 ; Lu # LATIN CAPITAL LETTER AA
> +A734 ; Lu # LATIN CAPITAL LETTER AO
> +A736 ; Lu # LATIN CAPITAL LETTER AU
> +A738 ; Lu # LATIN CAPITAL LETTER AV
> +A73A ; Lu # LATIN CAPITAL LETTER AV WITH HORIZONTAL BAR
> +A73C ; Lu # LATIN CAPITAL LETTER AY
> +A73E ; Lu # LATIN CAPITAL LETTER REVERSED C WITH DOT
> +A740 ; Lu # LATIN CAPITAL LETTER K WITH STROKE
> +A742 ; Lu # LATIN CAPITAL LETTER K WITH DIAGONAL STROKE
> +A744 ; Lu # LATIN CAPITAL LETTER K WITH STROKE AND DIAGONAL STROKE
> +A746 ; Lu # LATIN CAPITAL LETTER BROKEN L
> +A748 ; Lu # LATIN CAPITAL LETTER L WITH HIGH STROKE
> +A74A ; Lu # LATIN CAPITAL LETTER O WITH LONG STROKE OVERLAY
> +A74C ; Lu # LATIN CAPITAL LETTER O WITH LOOP
> +A74E ; Lu # LATIN CAPITAL LETTER OO
> +A750 ; Lu # LATIN CAPITAL LETTER P WITH STROKE THROUGH DESCENDER
> +A752 ; Lu # LATIN CAPITAL LETTER P WITH FLOURISH
> +A754 ; Lu # LATIN CAPITAL LETTER P WITH SQUIRREL TAIL
> +A756 ; Lu # LATIN CAPITAL LETTER Q WITH STROKE THROUGH DESCENDER
> +A758 ; Lu # LATIN CAPITAL LETTER Q WITH DIAGONAL STROKE
> +A75A ; Lu # LATIN CAPITAL LETTER R ROTUNDA
> +A75C ; Lu # LATIN CAPITAL LETTER RUM ROTUNDA
> +A75E ; Lu # LATIN CAPITAL LETTER V WITH DIAGONAL STROKE
> +A760 ; Lu # LATIN CAPITAL LETTER VY
> +A762 ; Lu # LATIN CAPITAL LETTER VISIGOTHIC Z
> +A764 ; Lu # LATIN CAPITAL LETTER THORN WITH STROKE
> +A766 ; Lu # LATIN CAPITAL LETTER THORN WITH STROKE THROUGH DESCENDER
> +A768 ; Lu # LATIN CAPITAL LETTER VEND
> +A76A ; Lu # LATIN CAPITAL LETTER ET
> +A76C ; Lu # LATIN CAPITAL LETTER IS
> +A76E ; Lu # LATIN CAPITAL LETTER CON
> +A779 ; Lu # LATIN CAPITAL LETTER INSULAR D
> +A77B ; Lu # LATIN CAPITAL LETTER INSULAR F
> +A77D..A77E ; Lu # [2] LATIN CAPITAL LETTER INSULAR G..LATIN CAPITAL LETTER TURNED INSULAR G
> +A780 ; Lu # LATIN CAPITAL LETTER TURNED L
> +A782 ; Lu # LATIN CAPITAL LETTER INSULAR R
> +A784 ; Lu # LATIN CAPITAL LETTER INSULAR S
> +A786 ; Lu # LATIN CAPITAL LETTER INSULAR T
> +A78B ; Lu # LATIN CAPITAL LETTER SALTILLO
> +A78D ; Lu # LATIN CAPITAL LETTER TURNED H
> +A790 ; Lu # LATIN CAPITAL LETTER N WITH DESCENDER
> +A792 ; Lu # LATIN CAPITAL LETTER C WITH BAR
> +A796 ; Lu # LATIN CAPITAL LETTER B WITH FLOURISH
> +A798 ; Lu # LATIN CAPITAL LETTER F WITH STROKE
> +A79A ; Lu # LATIN CAPITAL LETTER VOLAPUK AE
> +A79C ; Lu # LATIN CAPITAL LETTER VOLAPUK OE
> +A79E ; Lu # LATIN CAPITAL LETTER VOLAPUK UE
> +A7A0 ; Lu # LATIN CAPITAL LETTER G WITH OBLIQUE STROKE
> +A7A2 ; Lu # LATIN CAPITAL LETTER K WITH OBLIQUE STROKE
> +A7A4 ; Lu # LATIN CAPITAL LETTER N WITH OBLIQUE STROKE
> +A7A6 ; Lu # LATIN CAPITAL LETTER R WITH OBLIQUE STROKE
> +A7A8 ; Lu # LATIN CAPITAL LETTER S WITH OBLIQUE STROKE
> +A7AA..A7AE ; Lu # [5] LATIN CAPITAL LETTER H WITH HOOK..LATIN CAPITAL LETTER SMALL CAPITAL I
> +A7B0..A7B4 ; Lu # [5] LATIN CAPITAL LETTER TURNED K..LATIN CAPITAL LETTER BETA
> +A7B6 ; Lu # LATIN CAPITAL LETTER OMEGA
> +A7B8 ; Lu # LATIN CAPITAL LETTER U WITH STROKE
> +A7BA ; Lu # LATIN CAPITAL LETTER GLOTTAL A
> +A7BC ; Lu # LATIN CAPITAL LETTER GLOTTAL I
> +A7BE ; Lu # LATIN CAPITAL LETTER GLOTTAL U
> +A7C0 ; Lu # LATIN CAPITAL LETTER OLD POLISH O
> +A7C2 ; Lu # LATIN CAPITAL LETTER ANGLICANA W
> +A7C4..A7C7 ; Lu # [4] LATIN CAPITAL LETTER C WITH PALATAL HOOK..LATIN CAPITAL LETTER D WITH SHORT STROKE OVERLAY
> +A7C9 ; Lu # LATIN CAPITAL LETTER S WITH SHORT STROKE OVERLAY
> +A7CB..A7CC ; Lu # [2] LATIN CAPITAL LETTER RAMS HORN..LATIN CAPITAL LETTER S WITH DIAGONAL STROKE
> +A7D0 ; Lu # LATIN CAPITAL LETTER CLOSED INSULAR G
> +A7D6 ; Lu # LATIN CAPITAL LETTER MIDDLE SCOTS S
> +A7D8 ; Lu # LATIN CAPITAL LETTER SIGMOID S
> +A7DA ; Lu # LATIN CAPITAL LETTER LAMBDA
> +A7DC ; Lu # LATIN CAPITAL LETTER LAMBDA WITH STROKE
> +A7F5 ; Lu # LATIN CAPITAL LETTER REVERSED HALF H
> +FF21..FF3A ; Lu # [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LATIN CAPITAL LETTER Z
> +10400..10427 ; Lu # [40] DESERET CAPITAL LETTER LONG I..DESERET CAPITAL LETTER EW
> +104B0..104D3 ; Lu # [36] OSAGE CAPITAL LETTER A..OSAGE CAPITAL LETTER ZHA
> +10570..1057A ; Lu # [11] VITHKUQI CAPITAL LETTER A..VITHKUQI CAPITAL LETTER GA
> +1057C..1058A ; Lu # [15] VITHKUQI CAPITAL LETTER HA..VITHKUQI CAPITAL LETTER RE
> +1058C..10592 ; Lu # [7] VITHKUQI CAPITAL LETTER SE..VITHKUQI CAPITAL LETTER XE
> +10594..10595 ; Lu # [2] VITHKUQI CAPITAL LETTER Y..VITHKUQI CAPITAL LETTER ZE
> +10C80..10CB2 ; Lu # [51] OLD HUNGARIAN CAPITAL LETTER A..OLD HUNGARIAN CAPITAL LETTER US
> +10D50..10D65 ; Lu # [22] GARAY CAPITAL LETTER A..GARAY CAPITAL LETTER OLD NA
> +118A0..118BF ; Lu # [32] WARANG CITI CAPITAL LETTER NGAA..WARANG CITI CAPITAL LETTER VIYO
> +16E40..16E5F ; Lu # [32] MEDEFAIDRIN CAPITAL LETTER M..MEDEFAIDRIN CAPITAL LETTER Y
> +1D400..1D419 ; Lu # [26] MATHEMATICAL BOLD CAPITAL A..MATHEMATICAL BOLD CAPITAL Z
> +1D434..1D44D ; Lu # [26] MATHEMATICAL ITALIC CAPITAL A..MATHEMATICAL ITALIC CAPITAL Z
> +1D468..1D481 ; Lu # [26] MATHEMATICAL BOLD ITALIC CAPITAL A..MATHEMATICAL BOLD ITALIC CAPITAL Z
> +1D49C ; Lu # MATHEMATICAL SCRIPT CAPITAL A
> +1D49E..1D49F ; Lu # [2] MATHEMATICAL SCRIPT CAPITAL C..MATHEMATICAL SCRIPT CAPITAL D
> +1D4A2 ; Lu # MATHEMATICAL SCRIPT CAPITAL G
> +1D4A5..1D4A6 ; Lu # [2] MATHEMATICAL SCRIPT CAPITAL J..MATHEMATICAL SCRIPT CAPITAL K
> +1D4A9..1D4AC ; Lu # [4] MATHEMATICAL SCRIPT CAPITAL N..MATHEMATICAL SCRIPT CAPITAL Q
> +1D4AE..1D4B5 ; Lu # [8] MATHEMATICAL SCRIPT CAPITAL S..MATHEMATICAL SCRIPT CAPITAL Z
> +1D4D0..1D4E9 ; Lu # [26] MATHEMATICAL BOLD SCRIPT CAPITAL A..MATHEMATICAL BOLD SCRIPT CAPITAL Z
> +1D504..1D505 ; Lu # [2] MATHEMATICAL FRAKTUR CAPITAL A..MATHEMATICAL FRAKTUR CAPITAL B
> +1D507..1D50A ; Lu # [4] MATHEMATICAL FRAKTUR CAPITAL D..MATHEMATICAL FRAKTUR CAPITAL G
> +1D50D..1D514 ; Lu # [8] MATHEMATICAL FRAKTUR CAPITAL J..MATHEMATICAL FRAKTUR CAPITAL Q
> +1D516..1D51C ; Lu # [7] MATHEMATICAL FRAKTUR CAPITAL S..MATHEMATICAL FRAKTUR CAPITAL Y
> +1D538..1D539 ; Lu # [2] MATHEMATICAL DOUBLE-STRUCK CAPITAL A..MATHEMATICAL DOUBLE-STRUCK CAPITAL B
> +1D53B..1D53E ; Lu # [4] MATHEMATICAL DOUBLE-STRUCK CAPITAL D..MATHEMATICAL DOUBLE-STRUCK CAPITAL G
> +1D540..1D544 ; Lu # [5] MATHEMATICAL DOUBLE-STRUCK CAPITAL I..MATHEMATICAL DOUBLE-STRUCK CAPITAL M
> +1D546 ; Lu # MATHEMATICAL DOUBLE-STRUCK CAPITAL O
> +1D54A..1D550 ; Lu # [7] MATHEMATICAL DOUBLE-STRUCK CAPITAL S..MATHEMATICAL DOUBLE-STRUCK CAPITAL Y
> +1D56C..1D585 ; Lu # [26] MATHEMATICAL BOLD FRAKTUR CAPITAL A..MATHEMATICAL BOLD FRAKTUR CAPITAL Z
> +1D5A0..1D5B9 ; Lu # [26] MATHEMATICAL SANS-SERIF CAPITAL A..MATHEMATICAL SANS-SERIF CAPITAL Z
> +1D5D4..1D5ED ; Lu # [26] MATHEMATICAL SANS-SERIF BOLD CAPITAL A..MATHEMATICAL SANS-SERIF BOLD CAPITAL Z
> +1D608..1D621 ; Lu # [26] MATHEMATICAL SANS-SERIF ITALIC CAPITAL A..MATHEMATICAL SANS-SERIF ITALIC CAPITAL Z
> +1D63C..1D655 ; Lu # [26] MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL A..MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL Z
> +1D670..1D689 ; Lu # [26] MATHEMATICAL MONOSPACE CAPITAL A..MATHEMATICAL MONOSPACE CAPITAL Z
> +1D6A8..1D6C0 ; Lu # [25] MATHEMATICAL BOLD CAPITAL ALPHA..MATHEMATICAL BOLD CAPITAL OMEGA
> +1D6E2..1D6FA ; Lu # [25] MATHEMATICAL ITALIC CAPITAL ALPHA..MATHEMATICAL ITALIC CAPITAL OMEGA
> +1D71C..1D734 ; Lu # [25] MATHEMATICAL BOLD ITALIC CAPITAL ALPHA..MATHEMATICAL BOLD ITALIC CAPITAL OMEGA
> +1D756..1D76E ; Lu # [25] MATHEMATICAL SANS-SERIF BOLD CAPITAL ALPHA..MATHEMATICAL SANS-SERIF BOLD CAPITAL OMEGA
> +1D790..1D7A8 ; Lu # [25] MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL ALPHA..MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL OMEGA
> +1D7CA ; Lu # MATHEMATICAL BOLD CAPITAL DIGAMMA
> +1E900..1E921 ; Lu # [34] ADLAM CAPITAL LETTER ALIF..ADLAM CAPITAL LETTER SHA
> +
> +# Total code points: 1858
> +
> +# ================================================
> +
> +# General_Category=Lowercase_Letter
> +
> +0061..007A ; Ll # [26] LATIN SMALL LETTER A..LATIN SMALL LETTER Z
> +00B5 ; Ll # MICRO SIGN
> +00DF..00F6 ; Ll # [24] LATIN SMALL LETTER SHARP S..LATIN SMALL LETTER O WITH DIAERESIS
> +00F8..00FF ; Ll # [8] LATIN SMALL LETTER O WITH STROKE..LATIN SMALL LETTER Y WITH DIAERESIS
> +0101 ; Ll # LATIN SMALL LETTER A WITH MACRON
> +0103 ; Ll # LATIN SMALL LETTER A WITH BREVE
> +0105 ; Ll # LATIN SMALL LETTER A WITH OGONEK
> +0107 ; Ll # LATIN SMALL LETTER C WITH ACUTE
> +0109 ; Ll # LATIN SMALL LETTER C WITH CIRCUMFLEX
> +010B ; Ll # LATIN SMALL LETTER C WITH DOT ABOVE
> +010D ; Ll # LATIN SMALL LETTER C WITH CARON
> +010F ; Ll # LATIN SMALL LETTER D WITH CARON
> +0111 ; Ll # LATIN SMALL LETTER D WITH STROKE
> +0113 ; Ll # LATIN SMALL LETTER E WITH MACRON
> +0115 ; Ll # LATIN SMALL LETTER E WITH BREVE
> +0117 ; Ll # LATIN SMALL LETTER E WITH DOT ABOVE
> +0119 ; Ll # LATIN SMALL LETTER E WITH OGONEK
> +011B ; Ll # LATIN SMALL LETTER E WITH CARON
> +011D ; Ll # LATIN SMALL LETTER G WITH CIRCUMFLEX
> +011F ; Ll # LATIN SMALL LETTER G WITH BREVE
> +0121 ; Ll # LATIN SMALL LETTER G WITH DOT ABOVE
> +0123 ; Ll # LATIN SMALL LETTER G WITH CEDILLA
> +0125 ; Ll # LATIN SMALL LETTER H WITH CIRCUMFLEX
> +0127 ; Ll # LATIN SMALL LETTER H WITH STROKE
> +0129 ; Ll # LATIN SMALL LETTER I WITH TILDE
> +012B ; Ll # LATIN SMALL LETTER I WITH MACRON
> +012D ; Ll # LATIN SMALL LETTER I WITH BREVE
> +012F ; Ll # LATIN SMALL LETTER I WITH OGONEK
> +0131 ; Ll # LATIN SMALL LETTER DOTLESS I
> +0133 ; Ll # LATIN SMALL LIGATURE IJ
> +0135 ; Ll # LATIN SMALL LETTER J WITH CIRCUMFLEX
> +0137..0138 ; Ll # [2] LATIN SMALL LETTER K WITH CEDILLA..LATIN SMALL LETTER KRA
> +013A ; Ll # LATIN SMALL LETTER L WITH ACUTE
> +013C ; Ll # LATIN SMALL LETTER L WITH CEDILLA
> +013E ; Ll # LATIN SMALL LETTER L WITH CARON
> +0140 ; Ll # LATIN SMALL LETTER L WITH MIDDLE DOT
> +0142 ; Ll # LATIN SMALL LETTER L WITH STROKE
> +0144 ; Ll # LATIN SMALL LETTER N WITH ACUTE
> +0146 ; Ll # LATIN SMALL LETTER N WITH CEDILLA
> +0148..0149 ; Ll # [2] LATIN SMALL LETTER N WITH CARON..LATIN SMALL LETTER N PRECEDED BY APOSTROPHE
> +014B ; Ll # LATIN SMALL LETTER ENG
> +014D ; Ll # LATIN SMALL LETTER O WITH MACRON
> +014F ; Ll # LATIN SMALL LETTER O WITH BREVE
> +0151 ; Ll # LATIN SMALL LETTER O WITH DOUBLE ACUTE
> +0153 ; Ll # LATIN SMALL LIGATURE OE
> +0155 ; Ll # LATIN SMALL LETTER R WITH ACUTE
> +0157 ; Ll # LATIN SMALL LETTER R WITH CEDILLA
> +0159 ; Ll # LATIN SMALL LETTER R WITH CARON
> +015B ; Ll # LATIN SMALL LETTER S WITH ACUTE
> +015D ; Ll # LATIN SMALL LETTER S WITH CIRCUMFLEX
> +015F ; Ll # LATIN SMALL LETTER S WITH CEDILLA
> +0161 ; Ll # LATIN SMALL LETTER S WITH CARON
> +0163 ; Ll # LATIN SMALL LETTER T WITH CEDILLA
> +0165 ; Ll # LATIN SMALL LETTER T WITH CARON
> +0167 ; Ll # LATIN SMALL LETTER T WITH STROKE
> +0169 ; Ll # LATIN SMALL LETTER U WITH TILDE
> +016B ; Ll # LATIN SMALL LETTER U WITH MACRON
> +016D ; Ll # LATIN SMALL LETTER U WITH BREVE
> +016F ; Ll # LATIN SMALL LETTER U WITH RING ABOVE
> +0171 ; Ll # LATIN SMALL LETTER U WITH DOUBLE ACUTE
> +0173 ; Ll # LATIN SMALL LETTER U WITH OGONEK
> +0175 ; Ll # LATIN SMALL LETTER W WITH CIRCUMFLEX
> +0177 ; Ll # LATIN SMALL LETTER Y WITH CIRCUMFLEX
> +017A ; Ll # LATIN SMALL LETTER Z WITH ACUTE
> +017C ; Ll # LATIN SMALL LETTER Z WITH DOT ABOVE
> +017E..0180 ; Ll # [3] LATIN SMALL LETTER Z WITH CARON..LATIN SMALL LETTER B WITH STROKE
> +0183 ; Ll # LATIN SMALL LETTER B WITH TOPBAR
> +0185 ; Ll # LATIN SMALL LETTER TONE SIX
> +0188 ; Ll # LATIN SMALL LETTER C WITH HOOK
> +018C..018D ; Ll # [2] LATIN SMALL LETTER D WITH TOPBAR..LATIN SMALL LETTER TURNED DELTA
> +0192 ; Ll # LATIN SMALL LETTER F WITH HOOK
> +0195 ; Ll # LATIN SMALL LETTER HV
> +0199..019B ; Ll # [3] LATIN SMALL LETTER K WITH HOOK..LATIN SMALL LETTER LAMBDA WITH STROKE
> +019E ; Ll # LATIN SMALL LETTER N WITH LONG RIGHT LEG
> +01A1 ; Ll # LATIN SMALL LETTER O WITH HORN
> +01A3 ; Ll # LATIN SMALL LETTER OI
> +01A5 ; Ll # LATIN SMALL LETTER P WITH HOOK
> +01A8 ; Ll # LATIN SMALL LETTER TONE TWO
> +01AA..01AB ; Ll # [2] LATIN LETTER REVERSED ESH LOOP..LATIN SMALL LETTER T WITH PALATAL HOOK
> +01AD ; Ll # LATIN SMALL LETTER T WITH HOOK
> +01B0 ; Ll # LATIN SMALL LETTER U WITH HORN
> +01B4 ; Ll # LATIN SMALL LETTER Y WITH HOOK
> +01B6 ; Ll # LATIN SMALL LETTER Z WITH STROKE
> +01B9..01BA ; Ll # [2] LATIN SMALL LETTER EZH REVERSED..LATIN SMALL LETTER EZH WITH TAIL
> +01BD..01BF ; Ll # [3] LATIN SMALL LETTER TONE FIVE..LATIN LETTER WYNN
> +01C6 ; Ll # LATIN SMALL LETTER DZ WITH CARON
> +01C9 ; Ll # LATIN SMALL LETTER LJ
> +01CC ; Ll # LATIN SMALL LETTER NJ
> +01CE ; Ll # LATIN SMALL LETTER A WITH CARON
> +01D0 ; Ll # LATIN SMALL LETTER I WITH CARON
> +01D2 ; Ll # LATIN SMALL LETTER O WITH CARON
> +01D4 ; Ll # LATIN SMALL LETTER U WITH CARON
> +01D6 ; Ll # LATIN SMALL LETTER U WITH DIAERESIS AND MACRON
> +01D8 ; Ll # LATIN SMALL LETTER U WITH DIAERESIS AND ACUTE
> +01DA ; Ll # LATIN SMALL LETTER U WITH DIAERESIS AND CARON
> +01DC..01DD ; Ll # [2] LATIN SMALL LETTER U WITH DIAERESIS AND GRAVE..LATIN SMALL LETTER TURNED E
> +01DF ; Ll # LATIN SMALL LETTER A WITH DIAERESIS AND MACRON
> +01E1 ; Ll # LATIN SMALL LETTER A WITH DOT ABOVE AND MACRON
> +01E3 ; Ll # LATIN SMALL LETTER AE WITH MACRON
> +01E5 ; Ll # LATIN SMALL LETTER G WITH STROKE
> +01E7 ; Ll # LATIN SMALL LETTER G WITH CARON
> +01E9 ; Ll # LATIN SMALL LETTER K WITH CARON
> +01EB ; Ll # LATIN SMALL LETTER O WITH OGONEK
> +01ED ; Ll # LATIN SMALL LETTER O WITH OGONEK AND MACRON
> +01EF..01F0 ; Ll # [2] LATIN SMALL LETTER EZH WITH CARON..LATIN SMALL LETTER J WITH CARON
> +01F3 ; Ll # LATIN SMALL LETTER DZ
> +01F5 ; Ll # LATIN SMALL LETTER G WITH ACUTE
> +01F9 ; Ll # LATIN SMALL LETTER N WITH GRAVE
> +01FB ; Ll # LATIN SMALL LETTER A WITH RING ABOVE AND ACUTE
> +01FD ; Ll # LATIN SMALL LETTER AE WITH ACUTE
> +01FF ; Ll # LATIN SMALL LETTER O WITH STROKE AND ACUTE
> +0201 ; Ll # LATIN SMALL LETTER A WITH DOUBLE GRAVE
> +0203 ; Ll # LATIN SMALL LETTER A WITH INVERTED BREVE
> +0205 ; Ll # LATIN SMALL LETTER E WITH DOUBLE GRAVE
> +0207 ; Ll # LATIN SMALL LETTER E WITH INVERTED BREVE
> +0209 ; Ll # LATIN SMALL LETTER I WITH DOUBLE GRAVE
> +020B ; Ll # LATIN SMALL LETTER I WITH INVERTED BREVE
> +020D ; Ll # LATIN SMALL LETTER O WITH DOUBLE GRAVE
> +020F ; Ll # LATIN SMALL LETTER O WITH INVERTED BREVE
> +0211 ; Ll # LATIN SMALL LETTER R WITH DOUBLE GRAVE
> +0213 ; Ll # LATIN SMALL LETTER R WITH INVERTED BREVE
> +0215 ; Ll # LATIN SMALL LETTER U WITH DOUBLE GRAVE
> +0217 ; Ll # LATIN SMALL LETTER U WITH INVERTED BREVE
> +0219 ; Ll # LATIN SMALL LETTER S WITH COMMA BELOW
> +021B ; Ll # LATIN SMALL LETTER T WITH COMMA BELOW
> +021D ; Ll # LATIN SMALL LETTER YOGH
> +021F ; Ll # LATIN SMALL LETTER H WITH CARON
> +0221 ; Ll # LATIN SMALL LETTER D WITH CURL
> +0223 ; Ll # LATIN SMALL LETTER OU
> +0225 ; Ll # LATIN SMALL LETTER Z WITH HOOK
> +0227 ; Ll # LATIN SMALL LETTER A WITH DOT ABOVE
> +0229 ; Ll # LATIN SMALL LETTER E WITH CEDILLA
> +022B ; Ll # LATIN SMALL LETTER O WITH DIAERESIS AND MACRON
> +022D ; Ll # LATIN SMALL LETTER O WITH TILDE AND MACRON
> +022F ; Ll # LATIN SMALL LETTER O WITH DOT ABOVE
> +0231 ; Ll # LATIN SMALL LETTER O WITH DOT ABOVE AND MACRON
> +0233..0239 ; Ll # [7] LATIN SMALL LETTER Y WITH MACRON..LATIN SMALL LETTER QP DIGRAPH
> +023C ; Ll # LATIN SMALL LETTER C WITH STROKE
> +023F..0240 ; Ll # [2] LATIN SMALL LETTER S WITH SWASH TAIL..LATIN SMALL LETTER Z WITH SWASH TAIL
> +0242 ; Ll # LATIN SMALL LETTER GLOTTAL STOP
> +0247 ; Ll # LATIN SMALL LETTER E WITH STROKE
> +0249 ; Ll # LATIN SMALL LETTER J WITH STROKE
> +024B ; Ll # LATIN SMALL LETTER Q WITH HOOK TAIL
> +024D ; Ll # LATIN SMALL LETTER R WITH STROKE
> +024F..0293 ; Ll # [69] LATIN SMALL LETTER Y WITH STROKE..LATIN SMALL LETTER EZH WITH CURL
> +0295..02AF ; Ll # [27] LATIN LETTER PHARYNGEAL VOICED FRICATIVE..LATIN SMALL LETTER TURNED H WITH FISHHOOK AND TAIL
> +0371 ; Ll # GREEK SMALL LETTER HETA
> +0373 ; Ll # GREEK SMALL LETTER ARCHAIC SAMPI
> +0377 ; Ll # GREEK SMALL LETTER PAMPHYLIAN DIGAMMA
> +037B..037D ; Ll # [3] GREEK SMALL REVERSED LUNATE SIGMA SYMBOL..GREEK SMALL REVERSED DOTTED LUNATE SIGMA SYMBOL
> +0390 ; Ll # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
> +03AC..03CE ; Ll # [35] GREEK SMALL LETTER ALPHA WITH TONOS..GREEK SMALL LETTER OMEGA WITH TONOS
> +03D0..03D1 ; Ll # [2] GREEK BETA SYMBOL..GREEK THETA SYMBOL
> +03D5..03D7 ; Ll # [3] GREEK PHI SYMBOL..GREEK KAI SYMBOL
> +03D9 ; Ll # GREEK SMALL LETTER ARCHAIC KOPPA
> +03DB ; Ll # GREEK SMALL LETTER STIGMA
> +03DD ; Ll # GREEK SMALL LETTER DIGAMMA
> +03DF ; Ll # GREEK SMALL LETTER KOPPA
> +03E1 ; Ll # GREEK SMALL LETTER SAMPI
> +03E3 ; Ll # COPTIC SMALL LETTER SHEI
> +03E5 ; Ll # COPTIC SMALL LETTER FEI
> +03E7 ; Ll # COPTIC SMALL LETTER KHEI
> +03E9 ; Ll # COPTIC SMALL LETTER HORI
> +03EB ; Ll # COPTIC SMALL LETTER GANGIA
> +03ED ; Ll # COPTIC SMALL LETTER SHIMA
> +03EF..03F3 ; Ll # [5] COPTIC SMALL LETTER DEI..GREEK LETTER YOT
> +03F5 ; Ll # GREEK LUNATE EPSILON SYMBOL
> +03F8 ; Ll # GREEK SMALL LETTER SHO
> +03FB..03FC ; Ll # [2] GREEK SMALL LETTER SAN..GREEK RHO WITH STROKE SYMBOL
> +0430..045F ; Ll # [48] CYRILLIC SMALL LETTER A..CYRILLIC SMALL LETTER DZHE
> +0461 ; Ll # CYRILLIC SMALL LETTER OMEGA
> +0463 ; Ll # CYRILLIC SMALL LETTER YAT
> +0465 ; Ll # CYRILLIC SMALL LETTER IOTIFIED E
> +0467 ; Ll # CYRILLIC SMALL LETTER LITTLE YUS
> +0469 ; Ll # CYRILLIC SMALL LETTER IOTIFIED LITTLE YUS
> +046B ; Ll # CYRILLIC SMALL LETTER BIG YUS
> +046D ; Ll # CYRILLIC SMALL LETTER IOTIFIED BIG YUS
> +046F ; Ll # CYRILLIC SMALL LETTER KSI
> +0471 ; Ll # CYRILLIC SMALL LETTER PSI
> +0473 ; Ll # CYRILLIC SMALL LETTER FITA
> +0475 ; Ll # CYRILLIC SMALL LETTER IZHITSA
> +0477 ; Ll # CYRILLIC SMALL LETTER IZHITSA WITH DOUBLE GRAVE ACCENT
> +0479 ; Ll # CYRILLIC SMALL LETTER UK
> +047B ; Ll # CYRILLIC SMALL LETTER ROUND OMEGA
> +047D ; Ll # CYRILLIC SMALL LETTER OMEGA WITH TITLO
> +047F ; Ll # CYRILLIC SMALL LETTER OT
> +0481 ; Ll # CYRILLIC SMALL LETTER KOPPA
> +048B ; Ll # CYRILLIC SMALL LETTER SHORT I WITH TAIL
> +048D ; Ll # CYRILLIC SMALL LETTER SEMISOFT SIGN
> +048F ; Ll # CYRILLIC SMALL LETTER ER WITH TICK
> +0491 ; Ll # CYRILLIC SMALL LETTER GHE WITH UPTURN
> +0493 ; Ll # CYRILLIC SMALL LETTER GHE WITH STROKE
> +0495 ; Ll # CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
> +0497 ; Ll # CYRILLIC SMALL LETTER ZHE WITH DESCENDER
> +0499 ; Ll # CYRILLIC SMALL LETTER ZE WITH DESCENDER
> +049B ; Ll # CYRILLIC SMALL LETTER KA WITH DESCENDER
> +049D ; Ll # CYRILLIC SMALL LETTER KA WITH VERTICAL STROKE
> +049F ; Ll # CYRILLIC SMALL LETTER KA WITH STROKE
> +04A1 ; Ll # CYRILLIC SMALL LETTER BASHKIR KA
> +04A3 ; Ll # CYRILLIC SMALL LETTER EN WITH DESCENDER
> +04A5 ; Ll # CYRILLIC SMALL LIGATURE EN GHE
> +04A7 ; Ll # CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
> +04A9 ; Ll # CYRILLIC SMALL LETTER ABKHASIAN HA
> +04AB ; Ll # CYRILLIC SMALL LETTER ES WITH DESCENDER
> +04AD ; Ll # CYRILLIC SMALL LETTER TE WITH DESCENDER
> +04AF ; Ll # CYRILLIC SMALL LETTER STRAIGHT U
> +04B1 ; Ll # CYRILLIC SMALL LETTER STRAIGHT U WITH STROKE
> +04B3 ; Ll # CYRILLIC SMALL LETTER HA WITH DESCENDER
> +04B5 ; Ll # CYRILLIC SMALL LIGATURE TE TSE
> +04B7 ; Ll # CYRILLIC SMALL LETTER CHE WITH DESCENDER
> +04B9 ; Ll # CYRILLIC SMALL LETTER CHE WITH VERTICAL STROKE
> +04BB ; Ll # CYRILLIC SMALL LETTER SHHA
> +04BD ; Ll # CYRILLIC SMALL LETTER ABKHASIAN CHE
> +04BF ; Ll # CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
> +04C2 ; Ll # CYRILLIC SMALL LETTER ZHE WITH BREVE
> +04C4 ; Ll # CYRILLIC SMALL LETTER KA WITH HOOK
> +04C6 ; Ll # CYRILLIC SMALL LETTER EL WITH TAIL
> +04C8 ; Ll # CYRILLIC SMALL LETTER EN WITH HOOK
> +04CA ; Ll # CYRILLIC SMALL LETTER EN WITH TAIL
> +04CC ; Ll # CYRILLIC SMALL LETTER KHAKASSIAN CHE
> +04CE..04CF ; Ll # [2] CYRILLIC SMALL LETTER EM WITH TAIL..CYRILLIC SMALL LETTER PALOCHKA
> +04D1 ; Ll # CYRILLIC SMALL LETTER A WITH BREVE
> +04D3 ; Ll # CYRILLIC SMALL LETTER A WITH DIAERESIS
> +04D5 ; Ll # CYRILLIC SMALL LIGATURE A IE
> +04D7 ; Ll # CYRILLIC SMALL LETTER IE WITH BREVE
> +04D9 ; Ll # CYRILLIC SMALL LETTER SCHWA
> +04DB ; Ll # CYRILLIC SMALL LETTER SCHWA WITH DIAERESIS
> +04DD ; Ll # CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
> +04DF ; Ll # CYRILLIC SMALL LETTER ZE WITH DIAERESIS
> +04E1 ; Ll # CYRILLIC SMALL LETTER ABKHASIAN DZE
> +04E3 ; Ll # CYRILLIC SMALL LETTER I WITH MACRON
> +04E5 ; Ll # CYRILLIC SMALL LETTER I WITH DIAERESIS
> +04E7 ; Ll # CYRILLIC SMALL LETTER O WITH DIAERESIS
> +04E9 ; Ll # CYRILLIC SMALL LETTER BARRED O
> +04EB ; Ll # CYRILLIC SMALL LETTER BARRED O WITH DIAERESIS
> +04ED ; Ll # CYRILLIC SMALL LETTER E WITH DIAERESIS
> +04EF ; Ll # CYRILLIC SMALL LETTER U WITH MACRON
> +04F1 ; Ll # CYRILLIC SMALL LETTER U WITH DIAERESIS
> +04F3 ; Ll # CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
> +04F5 ; Ll # CYRILLIC SMALL LETTER CHE WITH DIAERESIS
> +04F7 ; Ll # CYRILLIC SMALL LETTER GHE WITH DESCENDER
> +04F9 ; Ll # CYRILLIC SMALL LETTER YERU WITH DIAERESIS
> +04FB ; Ll # CYRILLIC SMALL LETTER GHE WITH STROKE AND HOOK
> +04FD ; Ll # CYRILLIC SMALL LETTER HA WITH HOOK
> +04FF ; Ll # CYRILLIC SMALL LETTER HA WITH STROKE
> +0501 ; Ll # CYRILLIC SMALL LETTER KOMI DE
> +0503 ; Ll # CYRILLIC SMALL LETTER KOMI DJE
> +0505 ; Ll # CYRILLIC SMALL LETTER KOMI ZJE
> +0507 ; Ll # CYRILLIC SMALL LETTER KOMI DZJE
> +0509 ; Ll # CYRILLIC SMALL LETTER KOMI LJE
> +050B ; Ll # CYRILLIC SMALL LETTER KOMI NJE
> +050D ; Ll # CYRILLIC SMALL LETTER KOMI SJE
> +050F ; Ll # CYRILLIC SMALL LETTER KOMI TJE
> +0511 ; Ll # CYRILLIC SMALL LETTER REVERSED ZE
> +0513 ; Ll # CYRILLIC SMALL LETTER EL WITH HOOK
> +0515 ; Ll # CYRILLIC SMALL LETTER LHA
> +0517 ; Ll # CYRILLIC SMALL LETTER RHA
> +0519 ; Ll # CYRILLIC SMALL LETTER YAE
> +051B ; Ll # CYRILLIC SMALL LETTER QA
> +051D ; Ll # CYRILLIC SMALL LETTER WE
> +051F ; Ll # CYRILLIC SMALL LETTER ALEUT KA
> +0521 ; Ll # CYRILLIC SMALL LETTER EL WITH MIDDLE HOOK
> +0523 ; Ll # CYRILLIC SMALL LETTER EN WITH MIDDLE HOOK
> +0525 ; Ll # CYRILLIC SMALL LETTER PE WITH DESCENDER
> +0527 ; Ll # CYRILLIC SMALL LETTER SHHA WITH DESCENDER
> +0529 ; Ll # CYRILLIC SMALL LETTER EN WITH LEFT HOOK
> +052B ; Ll # CYRILLIC SMALL LETTER DZZHE
> +052D ; Ll # CYRILLIC SMALL LETTER DCHE
> +052F ; Ll # CYRILLIC SMALL LETTER EL WITH DESCENDER
> +0560..0588 ; Ll # [41] ARMENIAN SMALL LETTER TURNED AYB..ARMENIAN SMALL LETTER YI WITH STROKE
> +10D0..10FA ; Ll # [43] GEORGIAN LETTER AN..GEORGIAN LETTER AIN
> +10FD..10FF ; Ll # [3] GEORGIAN LETTER AEN..GEORGIAN LETTER LABIAL SIGN
> +13F8..13FD ; Ll # [6] CHEROKEE SMALL LETTER YE..CHEROKEE SMALL LETTER MV
> +1C80..1C88 ; Ll # [9] CYRILLIC SMALL LETTER ROUNDED VE..CYRILLIC SMALL LETTER UNBLENDED UK
> +1C8A ; Ll # CYRILLIC SMALL LETTER TJE
> +1D00..1D2B ; Ll # [44] LATIN LETTER SMALL CAPITAL A..CYRILLIC LETTER SMALL CAPITAL EL
> +1D6B..1D77 ; Ll # [13] LATIN SMALL LETTER UE..LATIN SMALL LETTER TURNED G
> +1D79..1D9A ; Ll # [34] LATIN SMALL LETTER INSULAR G..LATIN SMALL LETTER EZH WITH RETROFLEX HOOK
> +1E01 ; Ll # LATIN SMALL LETTER A WITH RING BELOW
> +1E03 ; Ll # LATIN SMALL LETTER B WITH DOT ABOVE
> +1E05 ; Ll # LATIN SMALL LETTER B WITH DOT BELOW
> +1E07 ; Ll # LATIN SMALL LETTER B WITH LINE BELOW
> +1E09 ; Ll # LATIN SMALL LETTER C WITH CEDILLA AND ACUTE
> +1E0B ; Ll # LATIN SMALL LETTER D WITH DOT ABOVE
> +1E0D ; Ll # LATIN SMALL LETTER D WITH DOT BELOW
> +1E0F ; Ll # LATIN SMALL LETTER D WITH LINE BELOW
> +1E11 ; Ll # LATIN SMALL LETTER D WITH CEDILLA
> +1E13 ; Ll # LATIN SMALL LETTER D WITH CIRCUMFLEX BELOW
> +1E15 ; Ll # LATIN SMALL LETTER E WITH MACRON AND GRAVE
> +1E17 ; Ll # LATIN SMALL LETTER E WITH MACRON AND ACUTE
> +1E19 ; Ll # LATIN SMALL LETTER E WITH CIRCUMFLEX BELOW
> +1E1B ; Ll # LATIN SMALL LETTER E WITH TILDE BELOW
> +1E1D ; Ll # LATIN SMALL LETTER E WITH CEDILLA AND BREVE
> +1E1F ; Ll # LATIN SMALL LETTER F WITH DOT ABOVE
> +1E21 ; Ll # LATIN SMALL LETTER G WITH MACRON
> +1E23 ; Ll # LATIN SMALL LETTER H WITH DOT ABOVE
> +1E25 ; Ll # LATIN SMALL LETTER H WITH DOT BELOW
> +1E27 ; Ll # LATIN SMALL LETTER H WITH DIAERESIS
> +1E29 ; Ll # LATIN SMALL LETTER H WITH CEDILLA
> +1E2B ; Ll # LATIN SMALL LETTER H WITH BREVE BELOW
> +1E2D ; Ll # LATIN SMALL LETTER I WITH TILDE BELOW
> +1E2F ; Ll # LATIN SMALL LETTER I WITH DIAERESIS AND ACUTE
> +1E31 ; Ll # LATIN SMALL LETTER K WITH ACUTE
> +1E33 ; Ll # LATIN SMALL LETTER K WITH DOT BELOW
> +1E35 ; Ll # LATIN SMALL LETTER K WITH LINE BELOW
> +1E37 ; Ll # LATIN SMALL LETTER L WITH DOT BELOW
> +1E39 ; Ll # LATIN SMALL LETTER L WITH DOT BELOW AND MACRON
> +1E3B ; Ll # LATIN SMALL LETTER L WITH LINE BELOW
> +1E3D ; Ll # LATIN SMALL LETTER L WITH CIRCUMFLEX BELOW
> +1E3F ; Ll # LATIN SMALL LETTER M WITH ACUTE
> +1E41 ; Ll # LATIN SMALL LETTER M WITH DOT ABOVE
> +1E43 ; Ll # LATIN SMALL LETTER M WITH DOT BELOW
> +1E45 ; Ll # LATIN SMALL LETTER N WITH DOT ABOVE
> +1E47 ; Ll # LATIN SMALL LETTER N WITH DOT BELOW
> +1E49 ; Ll # LATIN SMALL LETTER N WITH LINE BELOW
> +1E4B ; Ll # LATIN SMALL LETTER N WITH CIRCUMFLEX BELOW
> +1E4D ; Ll # LATIN SMALL LETTER O WITH TILDE AND ACUTE
> +1E4F ; Ll # LATIN SMALL LETTER O WITH TILDE AND DIAERESIS
> +1E51 ; Ll # LATIN SMALL LETTER O WITH MACRON AND GRAVE
> +1E53 ; Ll # LATIN SMALL LETTER O WITH MACRON AND ACUTE
> +1E55 ; Ll # LATIN SMALL LETTER P WITH ACUTE
> +1E57 ; Ll # LATIN SMALL LETTER P WITH DOT ABOVE
> +1E59 ; Ll # LATIN SMALL LETTER R WITH DOT ABOVE
> +1E5B ; Ll # LATIN SMALL LETTER R WITH DOT BELOW
> +1E5D ; Ll # LATIN SMALL LETTER R WITH DOT BELOW AND MACRON
> +1E5F ; Ll # LATIN SMALL LETTER R WITH LINE BELOW
> +1E61 ; Ll # LATIN SMALL LETTER S WITH DOT ABOVE
> +1E63 ; Ll # LATIN SMALL LETTER S WITH DOT BELOW
> +1E65 ; Ll # LATIN SMALL LETTER S WITH ACUTE AND DOT ABOVE
> +1E67 ; Ll # LATIN SMALL LETTER S WITH CARON AND DOT ABOVE
> +1E69 ; Ll # LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE
> +1E6B ; Ll # LATIN SMALL LETTER T WITH DOT ABOVE
> +1E6D ; Ll # LATIN SMALL LETTER T WITH DOT BELOW
> +1E6F ; Ll # LATIN SMALL LETTER T WITH LINE BELOW
> +1E71 ; Ll # LATIN SMALL LETTER T WITH CIRCUMFLEX BELOW
> +1E73 ; Ll # LATIN SMALL LETTER U WITH DIAERESIS BELOW
> +1E75 ; Ll # LATIN SMALL LETTER U WITH TILDE BELOW
> +1E77 ; Ll # LATIN SMALL LETTER U WITH CIRCUMFLEX BELOW
> +1E79 ; Ll # LATIN SMALL LETTER U WITH TILDE AND ACUTE
> +1E7B ; Ll # LATIN SMALL LETTER U WITH MACRON AND DIAERESIS
> +1E7D ; Ll # LATIN SMALL LETTER V WITH TILDE
> +1E7F ; Ll # LATIN SMALL LETTER V WITH DOT BELOW
> +1E81 ; Ll # LATIN SMALL LETTER W WITH GRAVE
> +1E83 ; Ll # LATIN SMALL LETTER W WITH ACUTE
> +1E85 ; Ll # LATIN SMALL LETTER W WITH DIAERESIS
> +1E87 ; Ll # LATIN SMALL LETTER W WITH DOT ABOVE
> +1E89 ; Ll # LATIN SMALL LETTER W WITH DOT BELOW
> +1E8B ; Ll # LATIN SMALL LETTER X WITH DOT ABOVE
> +1E8D ; Ll # LATIN SMALL LETTER X WITH DIAERESIS
> +1E8F ; Ll # LATIN SMALL LETTER Y WITH DOT ABOVE
> +1E91 ; Ll # LATIN SMALL LETTER Z WITH CIRCUMFLEX
> +1E93 ; Ll # LATIN SMALL LETTER Z WITH DOT BELOW
> +1E95..1E9D ; Ll # [9] LATIN SMALL LETTER Z WITH LINE BELOW..LATIN SMALL LETTER LONG S WITH HIGH STROKE
> +1E9F ; Ll # LATIN SMALL LETTER DELTA
> +1EA1 ; Ll # LATIN SMALL LETTER A WITH DOT BELOW
> +1EA3 ; Ll # LATIN SMALL LETTER A WITH HOOK ABOVE
> +1EA5 ; Ll # LATIN SMALL LETTER A WITH CIRCUMFLEX AND ACUTE
> +1EA7 ; Ll # LATIN SMALL LETTER A WITH CIRCUMFLEX AND GRAVE
> +1EA9 ; Ll # LATIN SMALL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE
> +1EAB ; Ll # LATIN SMALL LETTER A WITH CIRCUMFLEX AND TILDE
> +1EAD ; Ll # LATIN SMALL LETTER A WITH CIRCUMFLEX AND DOT BELOW
> +1EAF ; Ll # LATIN SMALL LETTER A WITH BREVE AND ACUTE
> +1EB1 ; Ll # LATIN SMALL LETTER A WITH BREVE AND GRAVE
> +1EB3 ; Ll # LATIN SMALL LETTER A WITH BREVE AND HOOK ABOVE
> +1EB5 ; Ll # LATIN SMALL LETTER A WITH BREVE AND TILDE
> +1EB7 ; Ll # LATIN SMALL LETTER A WITH BREVE AND DOT BELOW
> +1EB9 ; Ll # LATIN SMALL LETTER E WITH DOT BELOW
> +1EBB ; Ll # LATIN SMALL LETTER E WITH HOOK ABOVE
> +1EBD ; Ll # LATIN SMALL LETTER E WITH TILDE
> +1EBF ; Ll # LATIN SMALL LETTER E WITH CIRCUMFLEX AND ACUTE
> +1EC1 ; Ll # LATIN SMALL LETTER E WITH CIRCUMFLEX AND GRAVE
> +1EC3 ; Ll # LATIN SMALL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE
> +1EC5 ; Ll # LATIN SMALL LETTER E WITH CIRCUMFLEX AND TILDE
> +1EC7 ; Ll # LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW
> +1EC9 ; Ll # LATIN SMALL LETTER I WITH HOOK ABOVE
> +1ECB ; Ll # LATIN SMALL LETTER I WITH DOT BELOW
> +1ECD ; Ll # LATIN SMALL LETTER O WITH DOT BELOW
> +1ECF ; Ll # LATIN SMALL LETTER O WITH HOOK ABOVE
> +1ED1 ; Ll # LATIN SMALL LETTER O WITH CIRCUMFLEX AND ACUTE
> +1ED3 ; Ll # LATIN SMALL LETTER O WITH CIRCUMFLEX AND GRAVE
> +1ED5 ; Ll # LATIN SMALL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE
> +1ED7 ; Ll # LATIN SMALL LETTER O WITH CIRCUMFLEX AND TILDE
> +1ED9 ; Ll # LATIN SMALL LETTER O WITH CIRCUMFLEX AND DOT BELOW
> +1EDB ; Ll # LATIN SMALL LETTER O WITH HORN AND ACUTE
> +1EDD ; Ll # LATIN SMALL LETTER O WITH HORN AND GRAVE
> +1EDF ; Ll # LATIN SMALL LETTER O WITH HORN AND HOOK ABOVE
> +1EE1 ; Ll # LATIN SMALL LETTER O WITH HORN AND TILDE
> +1EE3 ; Ll # LATIN SMALL LETTER O WITH HORN AND DOT BELOW
> +1EE5 ; Ll # LATIN SMALL LETTER U WITH DOT BELOW
> +1EE7 ; Ll # LATIN SMALL LETTER U WITH HOOK ABOVE
> +1EE9 ; Ll # LATIN SMALL LETTER U WITH HORN AND ACUTE
> +1EEB ; Ll # LATIN SMALL LETTER U WITH HORN AND GRAVE
> +1EED ; Ll # LATIN SMALL LETTER U WITH HORN AND HOOK ABOVE
> +1EEF ; Ll # LATIN SMALL LETTER U WITH HORN AND TILDE
> +1EF1 ; Ll # LATIN SMALL LETTER U WITH HORN AND DOT BELOW
> +1EF3 ; Ll # LATIN SMALL LETTER Y WITH GRAVE
> +1EF5 ; Ll # LATIN SMALL LETTER Y WITH DOT BELOW
> +1EF7 ; Ll # LATIN SMALL LETTER Y WITH HOOK ABOVE
> +1EF9 ; Ll # LATIN SMALL LETTER Y WITH TILDE
> +1EFB ; Ll # LATIN SMALL LETTER MIDDLE-WELSH LL
> +1EFD ; Ll # LATIN SMALL LETTER MIDDLE-WELSH V
> +1EFF..1F07 ; Ll # [9] LATIN SMALL LETTER Y WITH LOOP..GREEK SMALL LETTER ALPHA WITH DASIA AND PERISPOMENI
> +1F10..1F15 ; Ll # [6] GREEK SMALL LETTER EPSILON WITH PSILI..GREEK SMALL LETTER EPSILON WITH DASIA AND OXIA
> +1F20..1F27 ; Ll # [8] GREEK SMALL LETTER ETA WITH PSILI..GREEK SMALL LETTER ETA WITH DASIA AND PERISPOMENI
> +1F30..1F37 ; Ll # [8] GREEK SMALL LETTER IOTA WITH PSILI..GREEK SMALL LETTER IOTA WITH DASIA AND PERISPOMENI
> +1F40..1F45 ; Ll # [6] GREEK SMALL LETTER OMICRON WITH PSILI..GREEK SMALL LETTER OMICRON WITH DASIA AND OXIA
> +1F50..1F57 ; Ll # [8] GREEK SMALL LETTER UPSILON WITH PSILI..GREEK SMALL LETTER UPSILON WITH DASIA AND PERISPOMENI
> +1F60..1F67 ; Ll # [8] GREEK SMALL LETTER OMEGA WITH PSILI..GREEK SMALL LETTER OMEGA WITH DASIA AND PERISPOMENI
> +1F70..1F7D ; Ll # [14] GREEK SMALL LETTER ALPHA WITH VARIA..GREEK SMALL LETTER OMEGA WITH OXIA
> +1F80..1F87 ; Ll # [8] GREEK SMALL LETTER ALPHA WITH PSILI AND YPOGEGRAMMENI..GREEK SMALL LETTER ALPHA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
> +1F90..1F97 ; Ll # [8] GREEK SMALL LETTER ETA WITH PSILI AND YPOGEGRAMMENI..GREEK SMALL LETTER ETA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
> +1FA0..1FA7 ; Ll # [8] GREEK SMALL LETTER OMEGA WITH PSILI AND YPOGEGRAMMENI..GREEK SMALL LETTER OMEGA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
> +1FB0..1FB4 ; Ll # [5] GREEK SMALL LETTER ALPHA WITH VRACHY..GREEK SMALL LETTER ALPHA WITH OXIA AND YPOGEGRAMMENI
> +1FB6..1FB7 ; Ll # [2] GREEK SMALL LETTER ALPHA WITH PERISPOMENI..GREEK SMALL LETTER ALPHA WITH PERISPOMENI AND YPOGEGRAMMENI
> +1FBE ; Ll # GREEK PROSGEGRAMMENI
> +1FC2..1FC4 ; Ll # [3] GREEK SMALL LETTER ETA WITH VARIA AND YPOGEGRAMMENI..GREEK SMALL LETTER ETA WITH OXIA AND YPOGEGRAMMENI
> +1FC6..1FC7 ; Ll # [2] GREEK SMALL LETTER ETA WITH PERISPOMENI..GREEK SMALL LETTER ETA WITH PERISPOMENI AND YPOGEGRAMMENI
> +1FD0..1FD3 ; Ll # [4] GREEK SMALL LETTER IOTA WITH VRACHY..GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA
> +1FD6..1FD7 ; Ll # [2] GREEK SMALL LETTER IOTA WITH PERISPOMENI..GREEK SMALL LETTER IOTA WITH DIALYTIKA AND PERISPOMENI
> +1FE0..1FE7 ; Ll # [8] GREEK SMALL LETTER UPSILON WITH VRACHY..GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND PERISPOMENI
> +1FF2..1FF4 ; Ll # [3] GREEK SMALL LETTER OMEGA WITH VARIA AND YPOGEGRAMMENI..GREEK SMALL LETTER OMEGA WITH OXIA AND YPOGEGRAMMENI
> +1FF6..1FF7 ; Ll # [2] GREEK SMALL LETTER OMEGA WITH PERISPOMENI..GREEK SMALL LETTER OMEGA WITH PERISPOMENI AND YPOGEGRAMMENI
> +210A ; Ll # SCRIPT SMALL G
> +210E..210F ; Ll # [2] PLANCK CONSTANT..PLANCK CONSTANT OVER TWO PI
> +2113 ; Ll # SCRIPT SMALL L
> +212F ; Ll # SCRIPT SMALL E
> +2134 ; Ll # SCRIPT SMALL O
> +2139 ; Ll # INFORMATION SOURCE
> +213C..213D ; Ll # [2] DOUBLE-STRUCK SMALL PI..DOUBLE-STRUCK SMALL GAMMA
> +2146..2149 ; Ll # [4] DOUBLE-STRUCK ITALIC SMALL D..DOUBLE-STRUCK ITALIC SMALL J
> +214E ; Ll # TURNED SMALL F
> +2184 ; Ll # LATIN SMALL LETTER REVERSED C
> +2C30..2C5F ; Ll # [48] GLAGOLITIC SMALL LETTER AZU..GLAGOLITIC SMALL LETTER CAUDATE CHRIVI
> +2C61 ; Ll # LATIN SMALL LETTER L WITH DOUBLE BAR
> +2C65..2C66 ; Ll # [2] LATIN SMALL LETTER A WITH STROKE..LATIN SMALL LETTER T WITH DIAGONAL STROKE
> +2C68 ; Ll # LATIN SMALL LETTER H WITH DESCENDER
> +2C6A ; Ll # LATIN SMALL LETTER K WITH DESCENDER
> +2C6C ; Ll # LATIN SMALL LETTER Z WITH DESCENDER
> +2C71 ; Ll # LATIN SMALL LETTER V WITH RIGHT HOOK
> +2C73..2C74 ; Ll # [2] LATIN SMALL LETTER W WITH HOOK..LATIN SMALL LETTER V WITH CURL
> +2C76..2C7B ; Ll # [6] LATIN SMALL LETTER HALF H..LATIN LETTER SMALL CAPITAL TURNED E
> +2C81 ; Ll # COPTIC SMALL LETTER ALFA
> +2C83 ; Ll # COPTIC SMALL LETTER VIDA
> +2C85 ; Ll # COPTIC SMALL LETTER GAMMA
> +2C87 ; Ll # COPTIC SMALL LETTER DALDA
> +2C89 ; Ll # COPTIC SMALL LETTER EIE
> +2C8B ; Ll # COPTIC SMALL LETTER SOU
> +2C8D ; Ll # COPTIC SMALL LETTER ZATA
> +2C8F ; Ll # COPTIC SMALL LETTER HATE
> +2C91 ; Ll # COPTIC SMALL LETTER THETHE
> +2C93 ; Ll # COPTIC SMALL LETTER IAUDA
> +2C95 ; Ll # COPTIC SMALL LETTER KAPA
> +2C97 ; Ll # COPTIC SMALL LETTER LAULA
> +2C99 ; Ll # COPTIC SMALL LETTER MI
> +2C9B ; Ll # COPTIC SMALL LETTER NI
> +2C9D ; Ll # COPTIC SMALL LETTER KSI
> +2C9F ; Ll # COPTIC SMALL LETTER O
> +2CA1 ; Ll # COPTIC SMALL LETTER PI
> +2CA3 ; Ll # COPTIC SMALL LETTER RO
> +2CA5 ; Ll # COPTIC SMALL LETTER SIMA
> +2CA7 ; Ll # COPTIC SMALL LETTER TAU
> +2CA9 ; Ll # COPTIC SMALL LETTER UA
> +2CAB ; Ll # COPTIC SMALL LETTER FI
> +2CAD ; Ll # COPTIC SMALL LETTER KHI
> +2CAF ; Ll # COPTIC SMALL LETTER PSI
> +2CB1 ; Ll # COPTIC SMALL LETTER OOU
> +2CB3 ; Ll # COPTIC SMALL LETTER DIALECT-P ALEF
> +2CB5 ; Ll # COPTIC SMALL LETTER OLD COPTIC AIN
> +2CB7 ; Ll # COPTIC SMALL LETTER CRYPTOGRAMMIC EIE
> +2CB9 ; Ll # COPTIC SMALL LETTER DIALECT-P KAPA
> +2CBB ; Ll # COPTIC SMALL LETTER DIALECT-P NI
> +2CBD ; Ll # COPTIC SMALL LETTER CRYPTOGRAMMIC NI
> +2CBF ; Ll # COPTIC SMALL LETTER OLD COPTIC OOU
> +2CC1 ; Ll # COPTIC SMALL LETTER SAMPI
> +2CC3 ; Ll # COPTIC SMALL LETTER CROSSED SHEI
> +2CC5 ; Ll # COPTIC SMALL LETTER OLD COPTIC SHEI
> +2CC7 ; Ll # COPTIC SMALL LETTER OLD COPTIC ESH
> +2CC9 ; Ll # COPTIC SMALL LETTER AKHMIMIC KHEI
> +2CCB ; Ll # COPTIC SMALL LETTER DIALECT-P HORI
> +2CCD ; Ll # COPTIC SMALL LETTER OLD COPTIC HORI
> +2CCF ; Ll # COPTIC SMALL LETTER OLD COPTIC HA
> +2CD1 ; Ll # COPTIC SMALL LETTER L-SHAPED HA
> +2CD3 ; Ll # COPTIC SMALL LETTER OLD COPTIC HEI
> +2CD5 ; Ll # COPTIC SMALL LETTER OLD COPTIC HAT
> +2CD7 ; Ll # COPTIC SMALL LETTER OLD COPTIC GANGIA
> +2CD9 ; Ll # COPTIC SMALL LETTER OLD COPTIC DJA
> +2CDB ; Ll # COPTIC SMALL LETTER OLD COPTIC SHIMA
> +2CDD ; Ll # COPTIC SMALL LETTER OLD NUBIAN SHIMA
> +2CDF ; Ll # COPTIC SMALL LETTER OLD NUBIAN NGI
> +2CE1 ; Ll # COPTIC SMALL LETTER OLD NUBIAN NYI
> +2CE3..2CE4 ; Ll # [2] COPTIC SMALL LETTER OLD NUBIAN WAU..COPTIC SYMBOL KAI
> +2CEC ; Ll # COPTIC SMALL LETTER CRYPTOGRAMMIC SHEI
> +2CEE ; Ll # COPTIC SMALL LETTER CRYPTOGRAMMIC GANGIA
> +2CF3 ; Ll # COPTIC SMALL LETTER BOHAIRIC KHEI
> +2D00..2D25 ; Ll # [38] GEORGIAN SMALL LETTER AN..GEORGIAN SMALL LETTER HOE
> +2D27 ; Ll # GEORGIAN SMALL LETTER YN
> +2D2D ; Ll # GEORGIAN SMALL LETTER AEN
> +A641 ; Ll # CYRILLIC SMALL LETTER ZEMLYA
> +A643 ; Ll # CYRILLIC SMALL LETTER DZELO
> +A645 ; Ll # CYRILLIC SMALL LETTER REVERSED DZE
> +A647 ; Ll # CYRILLIC SMALL LETTER IOTA
> +A649 ; Ll # CYRILLIC SMALL LETTER DJERV
> +A64B ; Ll # CYRILLIC SMALL LETTER MONOGRAPH UK
> +A64D ; Ll # CYRILLIC SMALL LETTER BROAD OMEGA
> +A64F ; Ll # CYRILLIC SMALL LETTER NEUTRAL YER
> +A651 ; Ll # CYRILLIC SMALL LETTER YERU WITH BACK YER
> +A653 ; Ll # CYRILLIC SMALL LETTER IOTIFIED YAT
> +A655 ; Ll # CYRILLIC SMALL LETTER REVERSED YU
> +A657 ; Ll # CYRILLIC SMALL LETTER IOTIFIED A
> +A659 ; Ll # CYRILLIC SMALL LETTER CLOSED LITTLE YUS
> +A65B ; Ll # CYRILLIC SMALL LETTER BLENDED YUS
> +A65D ; Ll # CYRILLIC SMALL LETTER IOTIFIED CLOSED LITTLE YUS
> +A65F ; Ll # CYRILLIC SMALL LETTER YN
> +A661 ; Ll # CYRILLIC SMALL LETTER REVERSED TSE
> +A663 ; Ll # CYRILLIC SMALL LETTER SOFT DE
> +A665 ; Ll # CYRILLIC SMALL LETTER SOFT EL
> +A667 ; Ll # CYRILLIC SMALL LETTER SOFT EM
> +A669 ; Ll # CYRILLIC SMALL LETTER MONOCULAR O
> +A66B ; Ll # CYRILLIC SMALL LETTER BINOCULAR O
> +A66D ; Ll # CYRILLIC SMALL LETTER DOUBLE MONOCULAR O
> +A681 ; Ll # CYRILLIC SMALL LETTER DWE
> +A683 ; Ll # CYRILLIC SMALL LETTER DZWE
> +A685 ; Ll # CYRILLIC SMALL LETTER ZHWE
> +A687 ; Ll # CYRILLIC SMALL LETTER CCHE
> +A689 ; Ll # CYRILLIC SMALL LETTER DZZE
> +A68B ; Ll # CYRILLIC SMALL LETTER TE WITH MIDDLE HOOK
> +A68D ; Ll # CYRILLIC SMALL LETTER TWE
> +A68F ; Ll # CYRILLIC SMALL LETTER TSWE
> +A691 ; Ll # CYRILLIC SMALL LETTER TSSE
> +A693 ; Ll # CYRILLIC SMALL LETTER TCHE
> +A695 ; Ll # CYRILLIC SMALL LETTER HWE
> +A697 ; Ll # CYRILLIC SMALL LETTER SHWE
> +A699 ; Ll # CYRILLIC SMALL LETTER DOUBLE O
> +A69B ; Ll # CYRILLIC SMALL LETTER CROSSED O
> +A723 ; Ll # LATIN SMALL LETTER EGYPTOLOGICAL ALEF
> +A725 ; Ll # LATIN SMALL LETTER EGYPTOLOGICAL AIN
> +A727 ; Ll # LATIN SMALL LETTER HENG
> +A729 ; Ll # LATIN SMALL LETTER TZ
> +A72B ; Ll # LATIN SMALL LETTER TRESILLO
> +A72D ; Ll # LATIN SMALL LETTER CUATRILLO
> +A72F..A731 ; Ll # [3] LATIN SMALL LETTER CUATRILLO WITH COMMA..LATIN LETTER SMALL CAPITAL S
> +A733 ; Ll # LATIN SMALL LETTER AA
> +A735 ; Ll # LATIN SMALL LETTER AO
> +A737 ; Ll # LATIN SMALL LETTER AU
> +A739 ; Ll # LATIN SMALL LETTER AV
> +A73B ; Ll # LATIN SMALL LETTER AV WITH HORIZONTAL BAR
> +A73D ; Ll # LATIN SMALL LETTER AY
> +A73F ; Ll # LATIN SMALL LETTER REVERSED C WITH DOT
> +A741 ; Ll # LATIN SMALL LETTER K WITH STROKE
> +A743 ; Ll # LATIN SMALL LETTER K WITH DIAGONAL STROKE
> +A745 ; Ll # LATIN SMALL LETTER K WITH STROKE AND DIAGONAL STROKE
> +A747 ; Ll # LATIN SMALL LETTER BROKEN L
> +A749 ; Ll # LATIN SMALL LETTER L WITH HIGH STROKE
> +A74B ; Ll # LATIN SMALL LETTER O WITH LONG STROKE OVERLAY
> +A74D ; Ll # LATIN SMALL LETTER O WITH LOOP
> +A74F ; Ll # LATIN SMALL LETTER OO
> +A751 ; Ll # LATIN SMALL LETTER P WITH STROKE THROUGH DESCENDER
> +A753 ; Ll # LATIN SMALL LETTER P WITH FLOURISH
> +A755 ; Ll # LATIN SMALL LETTER P WITH SQUIRREL TAIL
> +A757 ; Ll # LATIN SMALL LETTER Q WITH STROKE THROUGH DESCENDER
> +A759 ; Ll # LATIN SMALL LETTER Q WITH DIAGONAL STROKE
> +A75B ; Ll # LATIN SMALL LETTER R ROTUNDA
> +A75D ; Ll # LATIN SMALL LETTER RUM ROTUNDA
> +A75F ; Ll # LATIN SMALL LETTER V WITH DIAGONAL STROKE
> +A761 ; Ll # LATIN SMALL LETTER VY
> +A763 ; Ll # LATIN SMALL LETTER VISIGOTHIC Z
> +A765 ; Ll # LATIN SMALL LETTER THORN WITH STROKE
> +A767 ; Ll # LATIN SMALL LETTER THORN WITH STROKE THROUGH DESCENDER
> +A769 ; Ll # LATIN SMALL LETTER VEND
> +A76B ; Ll # LATIN SMALL LETTER ET
> +A76D ; Ll # LATIN SMALL LETTER IS
> +A76F ; Ll # LATIN SMALL LETTER CON
> +A771..A778 ; Ll # [8] LATIN SMALL LETTER DUM..LATIN SMALL LETTER UM
> +A77A ; Ll # LATIN SMALL LETTER INSULAR D
> +A77C ; Ll # LATIN SMALL LETTER INSULAR F
> +A77F ; Ll # LATIN SMALL LETTER TURNED INSULAR G
> +A781 ; Ll # LATIN SMALL LETTER TURNED L
> +A783 ; Ll # LATIN SMALL LETTER INSULAR R
> +A785 ; Ll # LATIN SMALL LETTER INSULAR S
> +A787 ; Ll # LATIN SMALL LETTER INSULAR T
> +A78C ; Ll # LATIN SMALL LETTER SALTILLO
> +A78E ; Ll # LATIN SMALL LETTER L WITH RETROFLEX HOOK AND BELT
> +A791 ; Ll # LATIN SMALL LETTER N WITH DESCENDER
> +A793..A795 ; Ll # [3] LATIN SMALL LETTER C WITH BAR..LATIN SMALL LETTER H WITH PALATAL HOOK
> +A797 ; Ll # LATIN SMALL LETTER B WITH FLOURISH
> +A799 ; Ll # LATIN SMALL LETTER F WITH STROKE
> +A79B ; Ll # LATIN SMALL LETTER VOLAPUK AE
> +A79D ; Ll # LATIN SMALL LETTER VOLAPUK OE
> +A79F ; Ll # LATIN SMALL LETTER VOLAPUK UE
> +A7A1 ; Ll # LATIN SMALL LETTER G WITH OBLIQUE STROKE
> +A7A3 ; Ll # LATIN SMALL LETTER K WITH OBLIQUE STROKE
> +A7A5 ; Ll # LATIN SMALL LETTER N WITH OBLIQUE STROKE
> +A7A7 ; Ll # LATIN SMALL LETTER R WITH OBLIQUE STROKE
> +A7A9 ; Ll # LATIN SMALL LETTER S WITH OBLIQUE STROKE
> +A7AF ; Ll # LATIN LETTER SMALL CAPITAL Q
> +A7B5 ; Ll # LATIN SMALL LETTER BETA
> +A7B7 ; Ll # LATIN SMALL LETTER OMEGA
> +A7B9 ; Ll # LATIN SMALL LETTER U WITH STROKE
> +A7BB ; Ll # LATIN SMALL LETTER GLOTTAL A
> +A7BD ; Ll # LATIN SMALL LETTER GLOTTAL I
> +A7BF ; Ll # LATIN SMALL LETTER GLOTTAL U
> +A7C1 ; Ll # LATIN SMALL LETTER OLD POLISH O
> +A7C3 ; Ll # LATIN SMALL LETTER ANGLICANA W
> +A7C8 ; Ll # LATIN SMALL LETTER D WITH SHORT STROKE OVERLAY
> +A7CA ; Ll # LATIN SMALL LETTER S WITH SHORT STROKE OVERLAY
> +A7CD ; Ll # LATIN SMALL LETTER S WITH DIAGONAL STROKE
> +A7D1 ; Ll # LATIN SMALL LETTER CLOSED INSULAR G
> +A7D3 ; Ll # LATIN SMALL LETTER DOUBLE THORN
> +A7D5 ; Ll # LATIN SMALL LETTER DOUBLE WYNN
> +A7D7 ; Ll # LATIN SMALL LETTER MIDDLE SCOTS S
> +A7D9 ; Ll # LATIN SMALL LETTER SIGMOID S
> +A7DB ; Ll # LATIN SMALL LETTER LAMBDA
> +A7F6 ; Ll # LATIN SMALL LETTER REVERSED HALF H
> +A7FA ; Ll # LATIN LETTER SMALL CAPITAL TURNED M
> +AB30..AB5A ; Ll # [43] LATIN SMALL LETTER BARRED ALPHA..LATIN SMALL LETTER Y WITH SHORT RIGHT LEG
> +AB60..AB68 ; Ll # [9] LATIN SMALL LETTER SAKHA YAT..LATIN SMALL LETTER TURNED R WITH MIDDLE TILDE
> +AB70..ABBF ; Ll # [80] CHEROKEE SMALL LETTER A..CHEROKEE SMALL LETTER YA
> +FB00..FB06 ; Ll # [7] LATIN SMALL LIGATURE FF..LATIN SMALL LIGATURE ST
> +FB13..FB17 ; Ll # [5] ARMENIAN SMALL LIGATURE MEN NOW..ARMENIAN SMALL LIGATURE MEN XEH
> +FF41..FF5A ; Ll # [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN SMALL LETTER Z
> +10428..1044F ; Ll # [40] DESERET SMALL LETTER LONG I..DESERET SMALL LETTER EW
> +104D8..104FB ; Ll # [36] OSAGE SMALL LETTER A..OSAGE SMALL LETTER ZHA
> +10597..105A1 ; Ll # [11] VITHKUQI SMALL LETTER A..VITHKUQI SMALL LETTER GA
> +105A3..105B1 ; Ll # [15] VITHKUQI SMALL LETTER HA..VITHKUQI SMALL LETTER RE
> +105B3..105B9 ; Ll # [7] VITHKUQI SMALL LETTER SE..VITHKUQI SMALL LETTER XE
> +105BB..105BC ; Ll # [2] VITHKUQI SMALL LETTER Y..VITHKUQI SMALL LETTER ZE
> +10CC0..10CF2 ; Ll # [51] OLD HUNGARIAN SMALL LETTER A..OLD HUNGARIAN SMALL LETTER US
> +10D70..10D85 ; Ll # [22] GARAY SMALL LETTER A..GARAY SMALL LETTER OLD NA
> +118C0..118DF ; Ll # [32] WARANG CITI SMALL LETTER NGAA..WARANG CITI SMALL LETTER VIYO
> +16E60..16E7F ; Ll # [32] MEDEFAIDRIN SMALL LETTER M..MEDEFAIDRIN SMALL LETTER Y
> +1D41A..1D433 ; Ll # [26] MATHEMATICAL BOLD SMALL A..MATHEMATICAL BOLD SMALL Z
> +1D44E..1D454 ; Ll # [7] MATHEMATICAL ITALIC SMALL A..MATHEMATICAL ITALIC SMALL G
> +1D456..1D467 ; Ll # [18] MATHEMATICAL ITALIC SMALL I..MATHEMATICAL ITALIC SMALL Z
> +1D482..1D49B ; Ll # [26] MATHEMATICAL BOLD ITALIC SMALL A..MATHEMATICAL BOLD ITALIC SMALL Z
> +1D4B6..1D4B9 ; Ll # [4] MATHEMATICAL SCRIPT SMALL A..MATHEMATICAL SCRIPT SMALL D
> +1D4BB ; Ll # MATHEMATICAL SCRIPT SMALL F
> +1D4BD..1D4C3 ; Ll # [7] MATHEMATICAL SCRIPT SMALL H..MATHEMATICAL SCRIPT SMALL N
> +1D4C5..1D4CF ; Ll # [11] MATHEMATICAL SCRIPT SMALL P..MATHEMATICAL SCRIPT SMALL Z
> +1D4EA..1D503 ; Ll # [26] MATHEMATICAL BOLD SCRIPT SMALL A..MATHEMATICAL BOLD SCRIPT SMALL Z
> +1D51E..1D537 ; Ll # [26] MATHEMATICAL FRAKTUR SMALL A..MATHEMATICAL FRAKTUR SMALL Z
> +1D552..1D56B ; Ll # [26] MATHEMATICAL DOUBLE-STRUCK SMALL A..MATHEMATICAL DOUBLE-STRUCK SMALL Z
> +1D586..1D59F ; Ll # [26] MATHEMATICAL BOLD FRAKTUR SMALL A..MATHEMATICAL BOLD FRAKTUR SMALL Z
> +1D5BA..1D5D3 ; Ll # [26] MATHEMATICAL SANS-SERIF SMALL A..MATHEMATICAL SANS-SERIF SMALL Z
> +1D5EE..1D607 ; Ll # [26] MATHEMATICAL SANS-SERIF BOLD SMALL A..MATHEMATICAL SANS-SERIF BOLD SMALL Z
> +1D622..1D63B ; Ll # [26] MATHEMATICAL SANS-SERIF ITALIC SMALL A..MATHEMATICAL SANS-SERIF ITALIC SMALL Z
> +1D656..1D66F ; Ll # [26] MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL A..MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL Z
> +1D68A..1D6A5 ; Ll # [28] MATHEMATICAL MONOSPACE SMALL A..MATHEMATICAL ITALIC SMALL DOTLESS J
> +1D6C2..1D6DA ; Ll # [25] MATHEMATICAL BOLD SMALL ALPHA..MATHEMATICAL BOLD SMALL OMEGA
> +1D6DC..1D6E1 ; Ll # [6] MATHEMATICAL BOLD EPSILON SYMBOL..MATHEMATICAL BOLD PI SYMBOL
> +1D6FC..1D714 ; Ll # [25] MATHEMATICAL ITALIC SMALL ALPHA..MATHEMATICAL ITALIC SMALL OMEGA
> +1D716..1D71B ; Ll # [6] MATHEMATICAL ITALIC EPSILON SYMBOL..MATHEMATICAL ITALIC PI SYMBOL
> +1D736..1D74E ; Ll # [25] MATHEMATICAL BOLD ITALIC SMALL ALPHA..MATHEMATICAL BOLD ITALIC SMALL OMEGA
> +1D750..1D755 ; Ll # [6] MATHEMATICAL BOLD ITALIC EPSILON SYMBOL..MATHEMATICAL BOLD ITALIC PI SYMBOL
> +1D770..1D788 ; Ll # [25] MATHEMATICAL SANS-SERIF BOLD SMALL ALPHA..MATHEMATICAL SANS-SERIF BOLD SMALL OMEGA
> +1D78A..1D78F ; Ll # [6] MATHEMATICAL SANS-SERIF BOLD EPSILON SYMBOL..MATHEMATICAL SANS-SERIF BOLD PI SYMBOL
> +1D7AA..1D7C2 ; Ll # [25] MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL ALPHA..MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL OMEGA
> +1D7C4..1D7C9 ; Ll # [6] MATHEMATICAL SANS-SERIF BOLD ITALIC EPSILON SYMBOL..MATHEMATICAL SANS-SERIF BOLD ITALIC PI SYMBOL
> +1D7CB ; Ll # MATHEMATICAL BOLD SMALL DIGAMMA
> +1DF00..1DF09 ; Ll # [10] LATIN SMALL LETTER FENG DIGRAPH WITH TRILL..LATIN SMALL LETTER T WITH HOOK AND RETROFLEX HOOK
> +1DF0B..1DF1E ; Ll # [20] LATIN SMALL LETTER ESH WITH DOUBLE BAR..LATIN SMALL LETTER S WITH CURL
> +1DF25..1DF2A ; Ll # [6] LATIN SMALL LETTER D WITH MID-HEIGHT LEFT HOOK..LATIN SMALL LETTER T WITH MID-HEIGHT LEFT HOOK
> +1E922..1E943 ; Ll # [34] ADLAM SMALL LETTER ALIF..ADLAM SMALL LETTER SHA
> +
> +# Total code points: 2258
> +
> +# ================================================
> +
> +# General_Category=Titlecase_Letter
> +
> +01C5 ; Lt # LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON
> +01C8 ; Lt # LATIN CAPITAL LETTER L WITH SMALL LETTER J
> +01CB ; Lt # LATIN CAPITAL LETTER N WITH SMALL LETTER J
> +01F2 ; Lt # LATIN CAPITAL LETTER D WITH SMALL LETTER Z
> +1F88..1F8F ; Lt # [8] GREEK CAPITAL LETTER ALPHA WITH PSILI AND PROSGEGRAMMENI..GREEK CAPITAL LETTER ALPHA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
> +1F98..1F9F ; Lt # [8] GREEK CAPITAL LETTER ETA WITH PSILI AND PROSGEGRAMMENI..GREEK CAPITAL LETTER ETA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
> +1FA8..1FAF ; Lt # [8] GREEK CAPITAL LETTER OMEGA WITH PSILI AND PROSGEGRAMMENI..GREEK CAPITAL LETTER OMEGA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
> +1FBC ; Lt # GREEK CAPITAL LETTER ALPHA WITH PROSGEGRAMMENI
> +1FCC ; Lt # GREEK CAPITAL LETTER ETA WITH PROSGEGRAMMENI
> +1FFC ; Lt # GREEK CAPITAL LETTER OMEGA WITH PROSGEGRAMMENI
> +
> +# Total code points: 31
> +
> +# ================================================
> +
> +# General_Category=Modifier_Letter
> +
> +02B0..02C1 ; Lm # [18] MODIFIER LETTER SMALL H..MODIFIER LETTER REVERSED GLOTTAL STOP
> +02C6..02D1 ; Lm # [12] MODIFIER LETTER CIRCUMFLEX ACCENT..MODIFIER LETTER HALF TRIANGULAR COLON
> +02E0..02E4 ; Lm # [5] MODIFIER LETTER SMALL GAMMA..MODIFIER LETTER SMALL REVERSED GLOTTAL STOP
> +02EC ; Lm # MODIFIER LETTER VOICING
> +02EE ; Lm # MODIFIER LETTER DOUBLE APOSTROPHE
> +0374 ; Lm # GREEK NUMERAL SIGN
> +037A ; Lm # GREEK YPOGEGRAMMENI
> +0559 ; Lm # ARMENIAN MODIFIER LETTER LEFT HALF RING
> +0640 ; Lm # ARABIC TATWEEL
> +06E5..06E6 ; Lm # [2] ARABIC SMALL WAW..ARABIC SMALL YEH
> +07F4..07F5 ; Lm # [2] NKO HIGH TONE APOSTROPHE..NKO LOW TONE APOSTROPHE
> +07FA ; Lm # NKO LAJANYALAN
> +081A ; Lm # SAMARITAN MODIFIER LETTER EPENTHETIC YUT
> +0824 ; Lm # SAMARITAN MODIFIER LETTER SHORT A
> +0828 ; Lm # SAMARITAN MODIFIER LETTER I
> +08C9 ; Lm # ARABIC SMALL FARSI YEH
> +0971 ; Lm # DEVANAGARI SIGN HIGH SPACING DOT
> +0E46 ; Lm # THAI CHARACTER MAIYAMOK
> +0EC6 ; Lm # LAO KO LA
> +10FC ; Lm # MODIFIER LETTER GEORGIAN NAR
> +17D7 ; Lm # KHMER SIGN LEK TOO
> +1843 ; Lm # MONGOLIAN LETTER TODO LONG VOWEL SIGN
> +1AA7 ; Lm # TAI THAM SIGN MAI YAMOK
> +1C78..1C7D ; Lm # [6] OL CHIKI MU TTUDDAG..OL CHIKI AHAD
> +1D2C..1D6A ; Lm # [63] MODIFIER LETTER CAPITAL A..GREEK SUBSCRIPT SMALL LETTER CHI
> +1D78 ; Lm # MODIFIER LETTER CYRILLIC EN
> +1D9B..1DBF ; Lm # [37] MODIFIER LETTER SMALL TURNED ALPHA..MODIFIER LETTER SMALL THETA
> +2071 ; Lm # SUPERSCRIPT LATIN SMALL LETTER I
> +207F ; Lm # SUPERSCRIPT LATIN SMALL LETTER N
> +2090..209C ; Lm # [13] LATIN SUBSCRIPT SMALL LETTER A..LATIN SUBSCRIPT SMALL LETTER T
> +2C7C..2C7D ; Lm # [2] LATIN SUBSCRIPT SMALL LETTER J..MODIFIER LETTER CAPITAL V
> +2D6F ; Lm # TIFINAGH MODIFIER LETTER LABIALIZATION MARK
> +2E2F ; Lm # VERTICAL TILDE
> +3005 ; Lm # IDEOGRAPHIC ITERATION MARK
> +3031..3035 ; Lm # [5] VERTICAL KANA REPEAT MARK..VERTICAL KANA REPEAT MARK LOWER HALF
> +303B ; Lm # VERTICAL IDEOGRAPHIC ITERATION MARK
> +309D..309E ; Lm # [2] HIRAGANA ITERATION MARK..HIRAGANA VOICED ITERATION MARK
> +30FC..30FE ; Lm # [3] KATAKANA-HIRAGANA PROLONGED SOUND MARK..KATAKANA VOICED ITERATION MARK
> +A015 ; Lm # YI SYLLABLE WU
> +A4F8..A4FD ; Lm # [6] LISU LETTER TONE MYA TI..LISU LETTER TONE MYA JEU
> +A60C ; Lm # VAI SYLLABLE LENGTHENER
> +A67F ; Lm # CYRILLIC PAYEROK
> +A69C..A69D ; Lm # [2] MODIFIER LETTER CYRILLIC HARD SIGN..MODIFIER LETTER CYRILLIC SOFT SIGN
> +A717..A71F ; Lm # [9] MODIFIER LETTER DOT VERTICAL BAR..MODIFIER LETTER LOW INVERTED EXCLAMATION MARK
> +A770 ; Lm # MODIFIER LETTER US
> +A788 ; Lm # MODIFIER LETTER LOW CIRCUMFLEX ACCENT
> +A7F2..A7F4 ; Lm # [3] MODIFIER LETTER CAPITAL C..MODIFIER LETTER CAPITAL Q
> +A7F8..A7F9 ; Lm # [2] MODIFIER LETTER CAPITAL H WITH STROKE..MODIFIER LETTER SMALL LIGATURE OE
> +A9CF ; Lm # JAVANESE PANGRANGKEP
> +A9E6 ; Lm # MYANMAR MODIFIER LETTER SHAN REDUPLICATION
> +AA70 ; Lm # MYANMAR MODIFIER LETTER KHAMTI REDUPLICATION
> +AADD ; Lm # TAI VIET SYMBOL SAM
> +AAF3..AAF4 ; Lm # [2] MEETEI MAYEK SYLLABLE REPETITION MARK..MEETEI MAYEK WORD REPETITION MARK
> +AB5C..AB5F ; Lm # [4] MODIFIER LETTER SMALL HENG..MODIFIER LETTER SMALL U WITH LEFT HOOK
> +AB69 ; Lm # MODIFIER LETTER SMALL TURNED W
> +FF70 ; Lm # HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK
> +FF9E..FF9F ; Lm # [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK
> +10780..10785 ; Lm # [6] MODIFIER LETTER SMALL CAPITAL AA..MODIFIER LETTER SMALL B WITH HOOK
> +10787..107B0 ; Lm # [42] MODIFIER LETTER SMALL DZ DIGRAPH..MODIFIER LETTER SMALL V WITH RIGHT HOOK
> +107B2..107BA ; Lm # [9] MODIFIER LETTER SMALL CAPITAL Y..MODIFIER LETTER SMALL S WITH CURL
> +10D4E ; Lm # GARAY VOWEL LENGTH MARK
> +10D6F ; Lm # GARAY REDUPLICATION MARK
> +16B40..16B43 ; Lm # [4] PAHAWH HMONG SIGN VOS SEEV..PAHAWH HMONG SIGN IB YAM
> +16D40..16D42 ; Lm # [3] KIRAT RAI SIGN ANUSVARA..KIRAT RAI SIGN VISARGA
> +16D6B..16D6C ; Lm # [2] KIRAT RAI SIGN VIRAMA..KIRAT RAI SIGN SAAT
> +16F93..16F9F ; Lm # [13] MIAO LETTER TONE-2..MIAO LETTER REFORMED TONE-8
> +16FE0..16FE1 ; Lm # [2] TANGUT ITERATION MARK..NUSHU ITERATION MARK
> +16FE3 ; Lm # OLD CHINESE ITERATION MARK
> +1AFF0..1AFF3 ; Lm # [4] KATAKANA LETTER MINNAN TONE-2..KATAKANA LETTER MINNAN TONE-5
> +1AFF5..1AFFB ; Lm # [7] KATAKANA LETTER MINNAN TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-5
> +1AFFD..1AFFE ; Lm # [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8
> +1E030..1E06D ; Lm # [62] MODIFIER LETTER CYRILLIC SMALL A..MODIFIER LETTER CYRILLIC SMALL STRAIGHT U WITH STROKE
> +1E137..1E13D ; Lm # [7] NYIAKENG PUACHUE HMONG SIGN FOR PERSON..NYIAKENG PUACHUE HMONG SYLLABLE LENGTHENER
> +1E4EB ; Lm # NAG MUNDARI SIGN OJOD
> +1E94B ; Lm # ADLAM NASALIZATION MARK
> +
> +# Total code points: 404
> +
> +# ================================================
> +
> +# General_Category=Other_Letter
> +
> +00AA ; Lo # FEMININE ORDINAL INDICATOR
> +00BA ; Lo # MASCULINE ORDINAL INDICATOR
> +01BB ; Lo # LATIN LETTER TWO WITH STROKE
> +01C0..01C3 ; Lo # [4] LATIN LETTER DENTAL CLICK..LATIN LETTER RETROFLEX CLICK
> +0294 ; Lo # LATIN LETTER GLOTTAL STOP
> +05D0..05EA ; Lo # [27] HEBREW LETTER ALEF..HEBREW LETTER TAV
> +05EF..05F2 ; Lo # [4] HEBREW YOD TRIANGLE..HEBREW LIGATURE YIDDISH DOUBLE YOD
> +0620..063F ; Lo # [32] ARABIC LETTER KASHMIRI YEH..ARABIC LETTER FARSI YEH WITH THREE DOTS ABOVE
> +0641..064A ; Lo # [10] ARABIC LETTER FEH..ARABIC LETTER YEH
> +066E..066F ; Lo # [2] ARABIC LETTER DOTLESS BEH..ARABIC LETTER DOTLESS QAF
> +0671..06D3 ; Lo # [99] ARABIC LETTER ALEF WASLA..ARABIC LETTER YEH BARREE WITH HAMZA ABOVE
> +06D5 ; Lo # ARABIC LETTER AE
> +06EE..06EF ; Lo # [2] ARABIC LETTER DAL WITH INVERTED V..ARABIC LETTER REH WITH INVERTED V
> +06FA..06FC ; Lo # [3] ARABIC LETTER SHEEN WITH DOT BELOW..ARABIC LETTER GHAIN WITH DOT BELOW
> +06FF ; Lo # ARABIC LETTER HEH WITH INVERTED V
> +0710 ; Lo # SYRIAC LETTER ALAPH
> +0712..072F ; Lo # [30] SYRIAC LETTER BETH..SYRIAC LETTER PERSIAN DHALATH
> +074D..07A5 ; Lo # [89] SYRIAC LETTER SOGDIAN ZHAIN..THAANA LETTER WAAVU
> +07B1 ; Lo # THAANA LETTER NAA
> +07CA..07EA ; Lo # [33] NKO LETTER A..NKO LETTER JONA RA
> +0800..0815 ; Lo # [22] SAMARITAN LETTER ALAF..SAMARITAN LETTER TAAF
> +0840..0858 ; Lo # [25] MANDAIC LETTER HALQA..MANDAIC LETTER AIN
> +0860..086A ; Lo # [11] SYRIAC LETTER MALAYALAM NGA..SYRIAC LETTER MALAYALAM SSA
> +0870..0887 ; Lo # [24] ARABIC LETTER ALEF WITH ATTACHED FATHA..ARABIC BASELINE ROUND DOT
> +0889..088E ; Lo # [6] ARABIC LETTER NOON WITH INVERTED SMALL V..ARABIC VERTICAL TAIL
> +08A0..08C8 ; Lo # [41] ARABIC LETTER BEH WITH SMALL V BELOW..ARABIC LETTER GRAF
> +0904..0939 ; Lo # [54] DEVANAGARI LETTER SHORT A..DEVANAGARI LETTER HA
> +093D ; Lo # DEVANAGARI SIGN AVAGRAHA
> +0950 ; Lo # DEVANAGARI OM
> +0958..0961 ; Lo # [10] DEVANAGARI LETTER QA..DEVANAGARI LETTER VOCALIC LL
> +0972..0980 ; Lo # [15] DEVANAGARI LETTER CANDRA A..BENGALI ANJI
> +0985..098C ; Lo # [8] BENGALI LETTER A..BENGALI LETTER VOCALIC L
> +098F..0990 ; Lo # [2] BENGALI LETTER E..BENGALI LETTER AI
> +0993..09A8 ; Lo # [22] BENGALI LETTER O..BENGALI LETTER NA
> +09AA..09B0 ; Lo # [7] BENGALI LETTER PA..BENGALI LETTER RA
> +09B2 ; Lo # BENGALI LETTER LA
> +09B6..09B9 ; Lo # [4] BENGALI LETTER SHA..BENGALI LETTER HA
> +09BD ; Lo # BENGALI SIGN AVAGRAHA
> +09CE ; Lo # BENGALI LETTER KHANDA TA
> +09DC..09DD ; Lo # [2] BENGALI LETTER RRA..BENGALI LETTER RHA
> +09DF..09E1 ; Lo # [3] BENGALI LETTER YYA..BENGALI LETTER VOCALIC LL
> +09F0..09F1 ; Lo # [2] BENGALI LETTER RA WITH MIDDLE DIAGONAL..BENGALI LETTER RA WITH LOWER DIAGONAL
> +09FC ; Lo # BENGALI LETTER VEDIC ANUSVARA
> +0A05..0A0A ; Lo # [6] GURMUKHI LETTER A..GURMUKHI LETTER UU
> +0A0F..0A10 ; Lo # [2] GURMUKHI LETTER EE..GURMUKHI LETTER AI
> +0A13..0A28 ; Lo # [22] GURMUKHI LETTER OO..GURMUKHI LETTER NA
> +0A2A..0A30 ; Lo # [7] GURMUKHI LETTER PA..GURMUKHI LETTER RA
> +0A32..0A33 ; Lo # [2] GURMUKHI LETTER LA..GURMUKHI LETTER LLA
> +0A35..0A36 ; Lo # [2] GURMUKHI LETTER VA..GURMUKHI LETTER SHA
> +0A38..0A39 ; Lo # [2] GURMUKHI LETTER SA..GURMUKHI LETTER HA
> +0A59..0A5C ; Lo # [4] GURMUKHI LETTER KHHA..GURMUKHI LETTER RRA
> +0A5E ; Lo # GURMUKHI LETTER FA
> +0A72..0A74 ; Lo # [3] GURMUKHI IRI..GURMUKHI EK ONKAR
> +0A85..0A8D ; Lo # [9] GUJARATI LETTER A..GUJARATI VOWEL CANDRA E
> +0A8F..0A91 ; Lo # [3] GUJARATI LETTER E..GUJARATI VOWEL CANDRA O
> +0A93..0AA8 ; Lo # [22] GUJARATI LETTER O..GUJARATI LETTER NA
> +0AAA..0AB0 ; Lo # [7] GUJARATI LETTER PA..GUJARATI LETTER RA
> +0AB2..0AB3 ; Lo # [2] GUJARATI LETTER LA..GUJARATI LETTER LLA
> +0AB5..0AB9 ; Lo # [5] GUJARATI LETTER VA..GUJARATI LETTER HA
> +0ABD ; Lo # GUJARATI SIGN AVAGRAHA
> +0AD0 ; Lo # GUJARATI OM
> +0AE0..0AE1 ; Lo # [2] GUJARATI LETTER VOCALIC RR..GUJARATI LETTER VOCALIC LL
> +0AF9 ; Lo # GUJARATI LETTER ZHA
> +0B05..0B0C ; Lo # [8] ORIYA LETTER A..ORIYA LETTER VOCALIC L
> +0B0F..0B10 ; Lo # [2] ORIYA LETTER E..ORIYA LETTER AI
> +0B13..0B28 ; Lo # [22] ORIYA LETTER O..ORIYA LETTER NA
> +0B2A..0B30 ; Lo # [7] ORIYA LETTER PA..ORIYA LETTER RA
> +0B32..0B33 ; Lo # [2] ORIYA LETTER LA..ORIYA LETTER LLA
> +0B35..0B39 ; Lo # [5] ORIYA LETTER VA..ORIYA LETTER HA
> +0B3D ; Lo # ORIYA SIGN AVAGRAHA
> +0B5C..0B5D ; Lo # [2] ORIYA LETTER RRA..ORIYA LETTER RHA
> +0B5F..0B61 ; Lo # [3] ORIYA LETTER YYA..ORIYA LETTER VOCALIC LL
> +0B71 ; Lo # ORIYA LETTER WA
> +0B83 ; Lo # TAMIL SIGN VISARGA
> +0B85..0B8A ; Lo # [6] TAMIL LETTER A..TAMIL LETTER UU
> +0B8E..0B90 ; Lo # [3] TAMIL LETTER E..TAMIL LETTER AI
> +0B92..0B95 ; Lo # [4] TAMIL LETTER O..TAMIL LETTER KA
> +0B99..0B9A ; Lo # [2] TAMIL LETTER NGA..TAMIL LETTER CA
> +0B9C ; Lo # TAMIL LETTER JA
> +0B9E..0B9F ; Lo # [2] TAMIL LETTER NYA..TAMIL LETTER TTA
> +0BA3..0BA4 ; Lo # [2] TAMIL LETTER NNA..TAMIL LETTER TA
> +0BA8..0BAA ; Lo # [3] TAMIL LETTER NA..TAMIL LETTER PA
> +0BAE..0BB9 ; Lo # [12] TAMIL LETTER MA..TAMIL LETTER HA
> +0BD0 ; Lo # TAMIL OM
> +0C05..0C0C ; Lo # [8] TELUGU LETTER A..TELUGU LETTER VOCALIC L
> +0C0E..0C10 ; Lo # [3] TELUGU LETTER E..TELUGU LETTER AI
> +0C12..0C28 ; Lo # [23] TELUGU LETTER O..TELUGU LETTER NA
> +0C2A..0C39 ; Lo # [16] TELUGU LETTER PA..TELUGU LETTER HA
> +0C3D ; Lo # TELUGU SIGN AVAGRAHA
> +0C58..0C5A ; Lo # [3] TELUGU LETTER TSA..TELUGU LETTER RRRA
> +0C5D ; Lo # TELUGU LETTER NAKAARA POLLU
> +0C60..0C61 ; Lo # [2] TELUGU LETTER VOCALIC RR..TELUGU LETTER VOCALIC LL
> +0C80 ; Lo # KANNADA SIGN SPACING CANDRABINDU
> +0C85..0C8C ; Lo # [8] KANNADA LETTER A..KANNADA LETTER VOCALIC L
> +0C8E..0C90 ; Lo # [3] KANNADA LETTER E..KANNADA LETTER AI
> +0C92..0CA8 ; Lo # [23] KANNADA LETTER O..KANNADA LETTER NA
> +0CAA..0CB3 ; Lo # [10] KANNADA LETTER PA..KANNADA LETTER LLA
> +0CB5..0CB9 ; Lo # [5] KANNADA LETTER VA..KANNADA LETTER HA
> +0CBD ; Lo # KANNADA SIGN AVAGRAHA
> +0CDD..0CDE ; Lo # [2] KANNADA LETTER NAKAARA POLLU..KANNADA LETTER FA
> +0CE0..0CE1 ; Lo # [2] KANNADA LETTER VOCALIC RR..KANNADA LETTER VOCALIC LL
> +0CF1..0CF2 ; Lo # [2] KANNADA SIGN JIHVAMULIYA..KANNADA SIGN UPADHMANIYA
> +0D04..0D0C ; Lo # [9] MALAYALAM LETTER VEDIC ANUSVARA..MALAYALAM LETTER VOCALIC L
> +0D0E..0D10 ; Lo # [3] MALAYALAM LETTER E..MALAYALAM LETTER AI
> +0D12..0D3A ; Lo # [41] MALAYALAM LETTER O..MALAYALAM LETTER TTTA
> +0D3D ; Lo # MALAYALAM SIGN AVAGRAHA
> +0D4E ; Lo # MALAYALAM LETTER DOT REPH
> +0D54..0D56 ; Lo # [3] MALAYALAM LETTER CHILLU M..MALAYALAM LETTER CHILLU LLL
> +0D5F..0D61 ; Lo # [3] MALAYALAM LETTER ARCHAIC II..MALAYALAM LETTER VOCALIC LL
> +0D7A..0D7F ; Lo # [6] MALAYALAM LETTER CHILLU NN..MALAYALAM LETTER CHILLU K
> +0D85..0D96 ; Lo # [18] SINHALA LETTER AYANNA..SINHALA LETTER AUYANNA
> +0D9A..0DB1 ; Lo # [24] SINHALA LETTER ALPAPRAANA KAYANNA..SINHALA LETTER DANTAJA NAYANNA
> +0DB3..0DBB ; Lo # [9] SINHALA LETTER SANYAKA DAYANNA..SINHALA LETTER RAYANNA
> +0DBD ; Lo # SINHALA LETTER DANTAJA LAYANNA
> +0DC0..0DC6 ; Lo # [7] SINHALA LETTER VAYANNA..SINHALA LETTER FAYANNA
> +0E01..0E30 ; Lo # [48] THAI CHARACTER KO KAI..THAI CHARACTER SARA A
> +0E32..0E33 ; Lo # [2] THAI CHARACTER SARA AA..THAI CHARACTER SARA AM
> +0E40..0E45 ; Lo # [6] THAI CHARACTER SARA E..THAI CHARACTER LAKKHANGYAO
> +0E81..0E82 ; Lo # [2] LAO LETTER KO..LAO LETTER KHO SUNG
> +0E84 ; Lo # LAO LETTER KHO TAM
> +0E86..0E8A ; Lo # [5] LAO LETTER PALI GHA..LAO LETTER SO TAM
> +0E8C..0EA3 ; Lo # [24] LAO LETTER PALI JHA..LAO LETTER LO LING
> +0EA5 ; Lo # LAO LETTER LO LOOT
> +0EA7..0EB0 ; Lo # [10] LAO LETTER WO..LAO VOWEL SIGN A
> +0EB2..0EB3 ; Lo # [2] LAO VOWEL SIGN AA..LAO VOWEL SIGN AM
> +0EBD ; Lo # LAO SEMIVOWEL SIGN NYO
> +0EC0..0EC4 ; Lo # [5] LAO VOWEL SIGN E..LAO VOWEL SIGN AI
> +0EDC..0EDF ; Lo # [4] LAO HO NO..LAO LETTER KHMU NYO
> +0F00 ; Lo # TIBETAN SYLLABLE OM
> +0F40..0F47 ; Lo # [8] TIBETAN LETTER KA..TIBETAN LETTER JA
> +0F49..0F6C ; Lo # [36] TIBETAN LETTER NYA..TIBETAN LETTER RRA
> +0F88..0F8C ; Lo # [5] TIBETAN SIGN LCE TSA CAN..TIBETAN SIGN INVERTED MCHU CAN
> +1000..102A ; Lo # [43] MYANMAR LETTER KA..MYANMAR LETTER AU
> +103F ; Lo # MYANMAR LETTER GREAT SA
> +1050..1055 ; Lo # [6] MYANMAR LETTER SHA..MYANMAR LETTER VOCALIC LL
> +105A..105D ; Lo # [4] MYANMAR LETTER MON NGA..MYANMAR LETTER MON BBE
> +1061 ; Lo # MYANMAR LETTER SGAW KAREN SHA
> +1065..1066 ; Lo # [2] MYANMAR LETTER WESTERN PWO KAREN THA..MYANMAR LETTER WESTERN PWO KAREN PWA
> +106E..1070 ; Lo # [3] MYANMAR LETTER EASTERN PWO KAREN NNA..MYANMAR LETTER EASTERN PWO KAREN GHWA
> +1075..1081 ; Lo # [13] MYANMAR LETTER SHAN KA..MYANMAR LETTER SHAN HA
> +108E ; Lo # MYANMAR LETTER RUMAI PALAUNG FA
> +1100..1248 ; Lo # [329] HANGUL CHOSEONG KIYEOK..ETHIOPIC SYLLABLE QWA
> +124A..124D ; Lo # [4] ETHIOPIC SYLLABLE QWI..ETHIOPIC SYLLABLE QWE
> +1250..1256 ; Lo # [7] ETHIOPIC SYLLABLE QHA..ETHIOPIC SYLLABLE QHO
> +1258 ; Lo # ETHIOPIC SYLLABLE QHWA
> +125A..125D ; Lo # [4] ETHIOPIC SYLLABLE QHWI..ETHIOPIC SYLLABLE QHWE
> +1260..1288 ; Lo # [41] ETHIOPIC SYLLABLE BA..ETHIOPIC SYLLABLE XWA
> +128A..128D ; Lo # [4] ETHIOPIC SYLLABLE XWI..ETHIOPIC SYLLABLE XWE
> +1290..12B0 ; Lo # [33] ETHIOPIC SYLLABLE NA..ETHIOPIC SYLLABLE KWA
> +12B2..12B5 ; Lo # [4] ETHIOPIC SYLLABLE KWI..ETHIOPIC SYLLABLE KWE
> +12B8..12BE ; Lo # [7] ETHIOPIC SYLLABLE KXA..ETHIOPIC SYLLABLE KXO
> +12C0 ; Lo # ETHIOPIC SYLLABLE KXWA
> +12C2..12C5 ; Lo # [4] ETHIOPIC SYLLABLE KXWI..ETHIOPIC SYLLABLE KXWE
> +12C8..12D6 ; Lo # [15] ETHIOPIC SYLLABLE WA..ETHIOPIC SYLLABLE PHARYNGEAL O
> +12D8..1310 ; Lo # [57] ETHIOPIC SYLLABLE ZA..ETHIOPIC SYLLABLE GWA
> +1312..1315 ; Lo # [4] ETHIOPIC SYLLABLE GWI..ETHIOPIC SYLLABLE GWE
> +1318..135A ; Lo # [67] ETHIOPIC SYLLABLE GGA..ETHIOPIC SYLLABLE FYA
> +1380..138F ; Lo # [16] ETHIOPIC SYLLABLE SEBATBEIT MWA..ETHIOPIC SYLLABLE PWE
> +1401..166C ; Lo # [620] CANADIAN SYLLABICS E..CANADIAN SYLLABICS CARRIER TTSA
> +166F..167F ; Lo # [17] CANADIAN SYLLABICS QAI..CANADIAN SYLLABICS BLACKFOOT W
> +1681..169A ; Lo # [26] OGHAM LETTER BEITH..OGHAM LETTER PEITH
> +16A0..16EA ; Lo # [75] RUNIC LETTER FEHU FEOH FE F..RUNIC LETTER X
> +16F1..16F8 ; Lo # [8] RUNIC LETTER K..RUNIC LETTER FRANKS CASKET AESC
> +1700..1711 ; Lo # [18] TAGALOG LETTER A..TAGALOG LETTER HA
> +171F..1731 ; Lo # [19] TAGALOG LETTER ARCHAIC RA..HANUNOO LETTER HA
> +1740..1751 ; Lo # [18] BUHID LETTER A..BUHID LETTER HA
> +1760..176C ; Lo # [13] TAGBANWA LETTER A..TAGBANWA LETTER YA
> +176E..1770 ; Lo # [3] TAGBANWA LETTER LA..TAGBANWA LETTER SA
> +1780..17B3 ; Lo # [52] KHMER LETTER KA..KHMER INDEPENDENT VOWEL QAU
> +17DC ; Lo # KHMER SIGN AVAKRAHASANYA
> +1820..1842 ; Lo # [35] MONGOLIAN LETTER A..MONGOLIAN LETTER CHI
> +1844..1878 ; Lo # [53] MONGOLIAN LETTER TODO E..MONGOLIAN LETTER CHA WITH TWO DOTS
> +1880..1884 ; Lo # [5] MONGOLIAN LETTER ALI GALI ANUSVARA ONE..MONGOLIAN LETTER ALI GALI INVERTED UBADAMA
> +1887..18A8 ; Lo # [34] MONGOLIAN LETTER ALI GALI A..MONGOLIAN LETTER MANCHU ALI GALI BHA
> +18AA ; Lo # MONGOLIAN LETTER MANCHU ALI GALI LHA
> +18B0..18F5 ; Lo # [70] CANADIAN SYLLABICS OY..CANADIAN SYLLABICS CARRIER DENTAL S
> +1900..191E ; Lo # [31] LIMBU VOWEL-CARRIER LETTER..LIMBU LETTER TRA
> +1950..196D ; Lo # [30] TAI LE LETTER KA..TAI LE LETTER AI
> +1970..1974 ; Lo # [5] TAI LE LETTER TONE-2..TAI LE LETTER TONE-6
> +1980..19AB ; Lo # [44] NEW TAI LUE LETTER HIGH QA..NEW TAI LUE LETTER LOW SUA
> +19B0..19C9 ; Lo # [26] NEW TAI LUE VOWEL SIGN VOWEL SHORTENER..NEW TAI LUE TONE MARK-2
> +1A00..1A16 ; Lo # [23] BUGINESE LETTER KA..BUGINESE LETTER HA
> +1A20..1A54 ; Lo # [53] TAI THAM LETTER HIGH KA..TAI THAM LETTER GREAT SA
> +1B05..1B33 ; Lo # [47] BALINESE LETTER AKARA..BALINESE LETTER HA
> +1B45..1B4C ; Lo # [8] BALINESE LETTER KAF SASAK..BALINESE LETTER ARCHAIC JNYA
> +1B83..1BA0 ; Lo # [30] SUNDANESE LETTER A..SUNDANESE LETTER HA
> +1BAE..1BAF ; Lo # [2] SUNDANESE LETTER KHA..SUNDANESE LETTER SYA
> +1BBA..1BE5 ; Lo # [44] SUNDANESE AVAGRAHA..BATAK LETTER U
> +1C00..1C23 ; Lo # [36] LEPCHA LETTER KA..LEPCHA LETTER A
> +1C4D..1C4F ; Lo # [3] LEPCHA LETTER TTA..LEPCHA LETTER DDA
> +1C5A..1C77 ; Lo # [30] OL CHIKI LETTER LA..OL CHIKI LETTER OH
> +1CE9..1CEC ; Lo # [4] VEDIC SIGN ANUSVARA ANTARGOMUKHA..VEDIC SIGN ANUSVARA VAMAGOMUKHA WITH TAIL
> +1CEE..1CF3 ; Lo # [6] VEDIC SIGN HEXIFORM LONG ANUSVARA..VEDIC SIGN ROTATED ARDHAVISARGA
> +1CF5..1CF6 ; Lo # [2] VEDIC SIGN JIHVAMULIYA..VEDIC SIGN UPADHMANIYA
> +1CFA ; Lo # VEDIC SIGN DOUBLE ANUSVARA ANTARGOMUKHA
> +2135..2138 ; Lo # [4] ALEF SYMBOL..DALET SYMBOL
> +2D30..2D67 ; Lo # [56] TIFINAGH LETTER YA..TIFINAGH LETTER YO
> +2D80..2D96 ; Lo # [23] ETHIOPIC SYLLABLE LOA..ETHIOPIC SYLLABLE GGWE
> +2DA0..2DA6 ; Lo # [7] ETHIOPIC SYLLABLE SSA..ETHIOPIC SYLLABLE SSO
> +2DA8..2DAE ; Lo # [7] ETHIOPIC SYLLABLE CCA..ETHIOPIC SYLLABLE CCO
> +2DB0..2DB6 ; Lo # [7] ETHIOPIC SYLLABLE ZZA..ETHIOPIC SYLLABLE ZZO
> +2DB8..2DBE ; Lo # [7] ETHIOPIC SYLLABLE CCHA..ETHIOPIC SYLLABLE CCHO
> +2DC0..2DC6 ; Lo # [7] ETHIOPIC SYLLABLE QYA..ETHIOPIC SYLLABLE QYO
> +2DC8..2DCE ; Lo # [7] ETHIOPIC SYLLABLE KYA..ETHIOPIC SYLLABLE KYO
> +2DD0..2DD6 ; Lo # [7] ETHIOPIC SYLLABLE XYA..ETHIOPIC SYLLABLE XYO
> +2DD8..2DDE ; Lo # [7] ETHIOPIC SYLLABLE GYA..ETHIOPIC SYLLABLE GYO
> +3006 ; Lo # IDEOGRAPHIC CLOSING MARK
> +303C ; Lo # MASU MARK
> +3041..3096 ; Lo # [86] HIRAGANA LETTER SMALL A..HIRAGANA LETTER SMALL KE
> +309F ; Lo # HIRAGANA DIGRAPH YORI
> +30A1..30FA ; Lo # [90] KATAKANA LETTER SMALL A..KATAKANA LETTER VO
> +30FF ; Lo # KATAKANA DIGRAPH KOTO
> +3105..312F ; Lo # [43] BOPOMOFO LETTER B..BOPOMOFO LETTER NN
> +3131..318E ; Lo # [94] HANGUL LETTER KIYEOK..HANGUL LETTER ARAEAE
> +31A0..31BF ; Lo # [32] BOPOMOFO LETTER BU..BOPOMOFO LETTER AH
> +31F0..31FF ; Lo # [16] KATAKANA LETTER SMALL KU..KATAKANA LETTER SMALL RO
> +3400..4DBF ; Lo # [6592] CJK UNIFIED IDEOGRAPH-3400..CJK UNIFIED IDEOGRAPH-4DBF
> +4E00..A014 ; Lo # [21013] CJK UNIFIED IDEOGRAPH-4E00..YI SYLLABLE E
> +A016..A48C ; Lo # [1143] YI SYLLABLE BIT..YI SYLLABLE YYR
> +A4D0..A4F7 ; Lo # [40] LISU LETTER BA..LISU LETTER OE
> +A500..A60B ; Lo # [268] VAI SYLLABLE EE..VAI SYLLABLE NG
> +A610..A61F ; Lo # [16] VAI SYLLABLE NDOLE FA..VAI SYMBOL JONG
> +A62A..A62B ; Lo # [2] VAI SYLLABLE NDOLE MA..VAI SYLLABLE NDOLE DO
> +A66E ; Lo # CYRILLIC LETTER MULTIOCULAR O
> +A6A0..A6E5 ; Lo # [70] BAMUM LETTER A..BAMUM LETTER KI
> +A78F ; Lo # LATIN LETTER SINOLOGICAL DOT
> +A7F7 ; Lo # LATIN EPIGRAPHIC LETTER SIDEWAYS I
> +A7FB..A801 ; Lo # [7] LATIN EPIGRAPHIC LETTER REVERSED F..SYLOTI NAGRI LETTER I
> +A803..A805 ; Lo # [3] SYLOTI NAGRI LETTER U..SYLOTI NAGRI LETTER O
> +A807..A80A ; Lo # [4] SYLOTI NAGRI LETTER KO..SYLOTI NAGRI LETTER GHO
> +A80C..A822 ; Lo # [23] SYLOTI NAGRI LETTER CO..SYLOTI NAGRI LETTER HO
> +A840..A873 ; Lo # [52] PHAGS-PA LETTER KA..PHAGS-PA LETTER CANDRABINDU
> +A882..A8B3 ; Lo # [50] SAURASHTRA LETTER A..SAURASHTRA LETTER LLA
> +A8F2..A8F7 ; Lo # [6] DEVANAGARI SIGN SPACING CANDRABINDU..DEVANAGARI SIGN CANDRABINDU AVAGRAHA
> +A8FB ; Lo # DEVANAGARI HEADSTROKE
> +A8FD..A8FE ; Lo # [2] DEVANAGARI JAIN OM..DEVANAGARI LETTER AY
> +A90A..A925 ; Lo # [28] KAYAH LI LETTER KA..KAYAH LI LETTER OO
> +A930..A946 ; Lo # [23] REJANG LETTER KA..REJANG LETTER A
> +A960..A97C ; Lo # [29] HANGUL CHOSEONG TIKEUT-MIEUM..HANGUL CHOSEONG SSANGYEORINHIEUH
> +A984..A9B2 ; Lo # [47] JAVANESE LETTER A..JAVANESE LETTER HA
> +A9E0..A9E4 ; Lo # [5] MYANMAR LETTER SHAN GHA..MYANMAR LETTER SHAN BHA
> +A9E7..A9EF ; Lo # [9] MYANMAR LETTER TAI LAING NYA..MYANMAR LETTER TAI LAING NNA
> +A9FA..A9FE ; Lo # [5] MYANMAR LETTER TAI LAING LLA..MYANMAR LETTER TAI LAING BHA
> +AA00..AA28 ; Lo # [41] CHAM LETTER A..CHAM LETTER HA
> +AA40..AA42 ; Lo # [3] CHAM LETTER FINAL K..CHAM LETTER FINAL NG
> +AA44..AA4B ; Lo # [8] CHAM LETTER FINAL CH..CHAM LETTER FINAL SS
> +AA60..AA6F ; Lo # [16] MYANMAR LETTER KHAMTI GA..MYANMAR LETTER KHAMTI FA
> +AA71..AA76 ; Lo # [6] MYANMAR LETTER KHAMTI XA..MYANMAR LOGOGRAM KHAMTI HM
> +AA7A ; Lo # MYANMAR LETTER AITON RA
> +AA7E..AAAF ; Lo # [50] MYANMAR LETTER SHWE PALAUNG CHA..TAI VIET LETTER HIGH O
> +AAB1 ; Lo # TAI VIET VOWEL AA
> +AAB5..AAB6 ; Lo # [2] TAI VIET VOWEL E..TAI VIET VOWEL O
> +AAB9..AABD ; Lo # [5] TAI VIET VOWEL UEA..TAI VIET VOWEL AN
> +AAC0 ; Lo # TAI VIET TONE MAI NUENG
> +AAC2 ; Lo # TAI VIET TONE MAI SONG
> +AADB..AADC ; Lo # [2] TAI VIET SYMBOL KON..TAI VIET SYMBOL NUENG
> +AAE0..AAEA ; Lo # [11] MEETEI MAYEK LETTER E..MEETEI MAYEK LETTER SSA
> +AAF2 ; Lo # MEETEI MAYEK ANJI
> +AB01..AB06 ; Lo # [6] ETHIOPIC SYLLABLE TTHU..ETHIOPIC SYLLABLE TTHO
> +AB09..AB0E ; Lo # [6] ETHIOPIC SYLLABLE DDHU..ETHIOPIC SYLLABLE DDHO
> +AB11..AB16 ; Lo # [6] ETHIOPIC SYLLABLE DZU..ETHIOPIC SYLLABLE DZO
> +AB20..AB26 ; Lo # [7] ETHIOPIC SYLLABLE CCHHA..ETHIOPIC SYLLABLE CCHHO
> +AB28..AB2E ; Lo # [7] ETHIOPIC SYLLABLE BBA..ETHIOPIC SYLLABLE BBO
> +ABC0..ABE2 ; Lo # [35] MEETEI MAYEK LETTER KOK..MEETEI MAYEK LETTER I LONSUM
> +AC00..D7A3 ; Lo # [11172] HANGUL SYLLABLE GA..HANGUL SYLLABLE HIH
> +D7B0..D7C6 ; Lo # [23] HANGUL JUNGSEONG O-YEO..HANGUL JUNGSEONG ARAEA-E
> +D7CB..D7FB ; Lo # [49] HANGUL JONGSEONG NIEUN-RIEUL..HANGUL JONGSEONG PHIEUPH-THIEUTH
> +F900..FA6D ; Lo # [366] CJK COMPATIBILITY IDEOGRAPH-F900..CJK COMPATIBILITY IDEOGRAPH-FA6D
> +FA70..FAD9 ; Lo # [106] CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COMPATIBILITY IDEOGRAPH-FAD9
> +FB1D ; Lo # HEBREW LETTER YOD WITH HIRIQ
> +FB1F..FB28 ; Lo # [10] HEBREW LIGATURE YIDDISH YOD YOD PATAH..HEBREW LETTER WIDE TAV
> +FB2A..FB36 ; Lo # [13] HEBREW LETTER SHIN WITH SHIN DOT..HEBREW LETTER ZAYIN WITH DAGESH
> +FB38..FB3C ; Lo # [5] HEBREW LETTER TET WITH DAGESH..HEBREW LETTER LAMED WITH DAGESH
> +FB3E ; Lo # HEBREW LETTER MEM WITH DAGESH
> +FB40..FB41 ; Lo # [2] HEBREW LETTER NUN WITH DAGESH..HEBREW LETTER SAMEKH WITH DAGESH
> +FB43..FB44 ; Lo # [2] HEBREW LETTER FINAL PE WITH DAGESH..HEBREW LETTER PE WITH DAGESH
> +FB46..FBB1 ; Lo # [108] HEBREW LETTER TSADI WITH DAGESH..ARABIC LETTER YEH BARREE WITH HAMZA ABOVE FINAL FORM
> +FBD3..FD3D ; Lo # [363] ARABIC LETTER NG ISOLATED FORM..ARABIC LIGATURE ALEF WITH FATHATAN ISOLATED FORM
> +FD50..FD8F ; Lo # [64] ARABIC LIGATURE TEH WITH JEEM WITH MEEM INITIAL FORM..ARABIC LIGATURE MEEM WITH KHAH WITH MEEM INITIAL FORM
> +FD92..FDC7 ; Lo # [54] ARABIC LIGATURE MEEM WITH JEEM WITH KHAH INITIAL FORM..ARABIC LIGATURE NOON WITH JEEM WITH YEH FINAL FORM
> +FDF0..FDFB ; Lo # [12] ARABIC LIGATURE SALLA USED AS KORANIC STOP SIGN ISOLATED FORM..ARABIC LIGATURE JALLAJALALOUHOU
> +FE70..FE74 ; Lo # [5] ARABIC FATHATAN ISOLATED FORM..ARABIC KASRATAN ISOLATED FORM
> +FE76..FEFC ; Lo # [135] ARABIC FATHA ISOLATED FORM..ARABIC LIGATURE LAM WITH ALEF FINAL FORM
> +FF66..FF6F ; Lo # [10] HALFWIDTH KATAKANA LETTER WO..HALFWIDTH KATAKANA LETTER SMALL TU
> +FF71..FF9D ; Lo # [45] HALFWIDTH KATAKANA LETTER A..HALFWIDTH KATAKANA LETTER N
> +FFA0..FFBE ; Lo # [31] HALFWIDTH HANGUL FILLER..HALFWIDTH HANGUL LETTER HIEUH
> +FFC2..FFC7 ; Lo # [6] HALFWIDTH HANGUL LETTER A..HALFWIDTH HANGUL LETTER E
> +FFCA..FFCF ; Lo # [6] HALFWIDTH HANGUL LETTER YEO..HALFWIDTH HANGUL LETTER OE
> +FFD2..FFD7 ; Lo # [6] HALFWIDTH HANGUL LETTER YO..HALFWIDTH HANGUL LETTER YU
> +FFDA..FFDC ; Lo # [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER I
> +10000..1000B ; Lo # [12] LINEAR B SYLLABLE B008 A..LINEAR B SYLLABLE B046 JE
> +1000D..10026 ; Lo # [26] LINEAR B SYLLABLE B036 JO..LINEAR B SYLLABLE B032 QO
> +10028..1003A ; Lo # [19] LINEAR B SYLLABLE B060 RA..LINEAR B SYLLABLE B042 WO
> +1003C..1003D ; Lo # [2] LINEAR B SYLLABLE B017 ZA..LINEAR B SYLLABLE B074 ZE
> +1003F..1004D ; Lo # [15] LINEAR B SYLLABLE B020 ZO..LINEAR B SYLLABLE B091 TWO
> +10050..1005D ; Lo # [14] LINEAR B SYMBOL B018..LINEAR B SYMBOL B089
> +10080..100FA ; Lo # [123] LINEAR B IDEOGRAM B100 MAN..LINEAR B IDEOGRAM VESSEL B305
> +10280..1029C ; Lo # [29] LYCIAN LETTER A..LYCIAN LETTER X
> +102A0..102D0 ; Lo # [49] CARIAN LETTER A..CARIAN LETTER UUU3
> +10300..1031F ; Lo # [32] OLD ITALIC LETTER A..OLD ITALIC LETTER ESS
> +1032D..10340 ; Lo # [20] OLD ITALIC LETTER YE..GOTHIC LETTER PAIRTHRA
> +10342..10349 ; Lo # [8] GOTHIC LETTER RAIDA..GOTHIC LETTER OTHAL
> +10350..10375 ; Lo # [38] OLD PERMIC LETTER AN..OLD PERMIC LETTER IA
> +10380..1039D ; Lo # [30] UGARITIC LETTER ALPA..UGARITIC LETTER SSU
> +103A0..103C3 ; Lo # [36] OLD PERSIAN SIGN A..OLD PERSIAN SIGN HA
> +103C8..103CF ; Lo # [8] OLD PERSIAN SIGN AURAMAZDAA..OLD PERSIAN SIGN BUUMISH
> +10450..1049D ; Lo # [78] SHAVIAN LETTER PEEP..OSMANYA LETTER OO
> +10500..10527 ; Lo # [40] ELBASAN LETTER A..ELBASAN LETTER KHE
> +10530..10563 ; Lo # [52] CAUCASIAN ALBANIAN LETTER ALT..CAUCASIAN ALBANIAN LETTER KIW
> +105C0..105F3 ; Lo # [52] TODHRI LETTER A..TODHRI LETTER OO
> +10600..10736 ; Lo # [311] LINEAR A SIGN AB001..LINEAR A SIGN A664
> +10740..10755 ; Lo # [22] LINEAR A SIGN A701 A..LINEAR A SIGN A732 JE
> +10760..10767 ; Lo # [8] LINEAR A SIGN A800..LINEAR A SIGN A807
> +10800..10805 ; Lo # [6] CYPRIOT SYLLABLE A..CYPRIOT SYLLABLE JA
> +10808 ; Lo # CYPRIOT SYLLABLE JO
> +1080A..10835 ; Lo # [44] CYPRIOT SYLLABLE KA..CYPRIOT SYLLABLE WO
> +10837..10838 ; Lo # [2] CYPRIOT SYLLABLE XA..CYPRIOT SYLLABLE XE
> +1083C ; Lo # CYPRIOT SYLLABLE ZA
> +1083F..10855 ; Lo # [23] CYPRIOT SYLLABLE ZO..IMPERIAL ARAMAIC LETTER TAW
> +10860..10876 ; Lo # [23] PALMYRENE LETTER ALEPH..PALMYRENE LETTER TAW
> +10880..1089E ; Lo # [31] NABATAEAN LETTER FINAL ALEPH..NABATAEAN LETTER TAW
> +108E0..108F2 ; Lo # [19] HATRAN LETTER ALEPH..HATRAN LETTER QOPH
> +108F4..108F5 ; Lo # [2] HATRAN LETTER SHIN..HATRAN LETTER TAW
> +10900..10915 ; Lo # [22] PHOENICIAN LETTER ALF..PHOENICIAN LETTER TAU
> +10920..10939 ; Lo # [26] LYDIAN LETTER A..LYDIAN LETTER C
> +10980..109B7 ; Lo # [56] MEROITIC HIEROGLYPHIC LETTER A..MEROITIC CURSIVE LETTER DA
> +109BE..109BF ; Lo # [2] MEROITIC CURSIVE LOGOGRAM RMT..MEROITIC CURSIVE LOGOGRAM IMN
> +10A00 ; Lo # KHAROSHTHI LETTER A
> +10A10..10A13 ; Lo # [4] KHAROSHTHI LETTER KA..KHAROSHTHI LETTER GHA
> +10A15..10A17 ; Lo # [3] KHAROSHTHI LETTER CA..KHAROSHTHI LETTER JA
> +10A19..10A35 ; Lo # [29] KHAROSHTHI LETTER NYA..KHAROSHTHI LETTER VHA
> +10A60..10A7C ; Lo # [29] OLD SOUTH ARABIAN LETTER HE..OLD SOUTH ARABIAN LETTER THETH
> +10A80..10A9C ; Lo # [29] OLD NORTH ARABIAN LETTER HEH..OLD NORTH ARABIAN LETTER ZAH
> +10AC0..10AC7 ; Lo # [8] MANICHAEAN LETTER ALEPH..MANICHAEAN LETTER WAW
> +10AC9..10AE4 ; Lo # [28] MANICHAEAN LETTER ZAYIN..MANICHAEAN LETTER TAW
> +10B00..10B35 ; Lo # [54] AVESTAN LETTER A..AVESTAN LETTER HE
> +10B40..10B55 ; Lo # [22] INSCRIPTIONAL PARTHIAN LETTER ALEPH..INSCRIPTIONAL PARTHIAN LETTER TAW
> +10B60..10B72 ; Lo # [19] INSCRIPTIONAL PAHLAVI LETTER ALEPH..INSCRIPTIONAL PAHLAVI LETTER TAW
> +10B80..10B91 ; Lo # [18] PSALTER PAHLAVI LETTER ALEPH..PSALTER PAHLAVI LETTER TAW
> +10C00..10C48 ; Lo # [73] OLD TURKIC LETTER ORKHON A..OLD TURKIC LETTER ORKHON BASH
> +10D00..10D23 ; Lo # [36] HANIFI ROHINGYA LETTER A..HANIFI ROHINGYA MARK NA KHONNA
> +10D4A..10D4D ; Lo # [4] GARAY VOWEL SIGN A..GARAY VOWEL SIGN EE
> +10D4F ; Lo # GARAY SUKUN
> +10E80..10EA9 ; Lo # [42] YEZIDI LETTER ELIF..YEZIDI LETTER ET
> +10EB0..10EB1 ; Lo # [2] YEZIDI LETTER LAM WITH DOT ABOVE..YEZIDI LETTER YOT WITH CIRCUMFLEX ABOVE
> +10EC2..10EC4 ; Lo # [3] ARABIC LETTER DAL WITH TWO DOTS VERTICALLY BELOW..ARABIC LETTER KAF WITH TWO DOTS VERTICALLY BELOW
> +10F00..10F1C ; Lo # [29] OLD SOGDIAN LETTER ALEPH..OLD SOGDIAN LETTER FINAL TAW WITH VERTICAL TAIL
> +10F27 ; Lo # OLD SOGDIAN LIGATURE AYIN-DALETH
> +10F30..10F45 ; Lo # [22] SOGDIAN LETTER ALEPH..SOGDIAN INDEPENDENT SHIN
> +10F70..10F81 ; Lo # [18] OLD UYGHUR LETTER ALEPH..OLD UYGHUR LETTER LESH
> +10FB0..10FC4 ; Lo # [21] CHORASMIAN LETTER ALEPH..CHORASMIAN LETTER TAW
> +10FE0..10FF6 ; Lo # [23] ELYMAIC LETTER ALEPH..ELYMAIC LIGATURE ZAYIN-YODH
> +11003..11037 ; Lo # [53] BRAHMI SIGN JIHVAMULIYA..BRAHMI LETTER OLD TAMIL NNNA
> +11071..11072 ; Lo # [2] BRAHMI LETTER OLD TAMIL SHORT E..BRAHMI LETTER OLD TAMIL SHORT O
> +11075 ; Lo # BRAHMI LETTER OLD TAMIL LLA
> +11083..110AF ; Lo # [45] KAITHI LETTER A..KAITHI LETTER HA
> +110D0..110E8 ; Lo # [25] SORA SOMPENG LETTER SAH..SORA SOMPENG LETTER MAE
> +11103..11126 ; Lo # [36] CHAKMA LETTER AA..CHAKMA LETTER HAA
> +11144 ; Lo # CHAKMA LETTER LHAA
> +11147 ; Lo # CHAKMA LETTER VAA
> +11150..11172 ; Lo # [35] MAHAJANI LETTER A..MAHAJANI LETTER RRA
> +11176 ; Lo # MAHAJANI LIGATURE SHRI
> +11183..111B2 ; Lo # [48] SHARADA LETTER A..SHARADA LETTER HA
> +111C1..111C4 ; Lo # [4] SHARADA SIGN AVAGRAHA..SHARADA OM
> +111DA ; Lo # SHARADA EKAM
> +111DC ; Lo # SHARADA HEADSTROKE
> +11200..11211 ; Lo # [18] KHOJKI LETTER A..KHOJKI LETTER JJA
> +11213..1122B ; Lo # [25] KHOJKI LETTER NYA..KHOJKI LETTER LLA
> +1123F..11240 ; Lo # [2] KHOJKI LETTER QA..KHOJKI LETTER SHORT I
> +11280..11286 ; Lo # [7] MULTANI LETTER A..MULTANI LETTER GA
> +11288 ; Lo # MULTANI LETTER GHA
> +1128A..1128D ; Lo # [4] MULTANI LETTER CA..MULTANI LETTER JJA
> +1128F..1129D ; Lo # [15] MULTANI LETTER NYA..MULTANI LETTER BA
> +1129F..112A8 ; Lo # [10] MULTANI LETTER BHA..MULTANI LETTER RHA
> +112B0..112DE ; Lo # [47] KHUDAWADI LETTER A..KHUDAWADI LETTER HA
> +11305..1130C ; Lo # [8] GRANTHA LETTER A..GRANTHA LETTER VOCALIC L
> +1130F..11310 ; Lo # [2] GRANTHA LETTER EE..GRANTHA LETTER AI
> +11313..11328 ; Lo # [22] GRANTHA LETTER OO..GRANTHA LETTER NA
> +1132A..11330 ; Lo # [7] GRANTHA LETTER PA..GRANTHA LETTER RA
> +11332..11333 ; Lo # [2] GRANTHA LETTER LA..GRANTHA LETTER LLA
> +11335..11339 ; Lo # [5] GRANTHA LETTER VA..GRANTHA LETTER HA
> +1133D ; Lo # GRANTHA SIGN AVAGRAHA
> +11350 ; Lo # GRANTHA OM
> +1135D..11361 ; Lo # [5] GRANTHA SIGN PLUTA..GRANTHA LETTER VOCALIC LL
> +11380..11389 ; Lo # [10] TULU-TIGALARI LETTER A..TULU-TIGALARI LETTER VOCALIC LL
> +1138B ; Lo # TULU-TIGALARI LETTER EE
> +1138E ; Lo # TULU-TIGALARI LETTER AI
> +11390..113B5 ; Lo # [38] TULU-TIGALARI LETTER OO..TULU-TIGALARI LETTER LLLA
> +113B7 ; Lo # TULU-TIGALARI SIGN AVAGRAHA
> +113D1 ; Lo # TULU-TIGALARI REPHA
> +113D3 ; Lo # TULU-TIGALARI SIGN PLUTA
> +11400..11434 ; Lo # [53] NEWA LETTER A..NEWA LETTER HA
> +11447..1144A ; Lo # [4] NEWA SIGN AVAGRAHA..NEWA SIDDHI
> +1145F..11461 ; Lo # [3] NEWA LETTER VEDIC ANUSVARA..NEWA SIGN UPADHMANIYA
> +11480..114AF ; Lo # [48] TIRHUTA ANJI..TIRHUTA LETTER HA
> +114C4..114C5 ; Lo # [2] TIRHUTA SIGN AVAGRAHA..TIRHUTA GVANG
> +114C7 ; Lo # TIRHUTA OM
> +11580..115AE ; Lo # [47] SIDDHAM LETTER A..SIDDHAM LETTER HA
> +115D8..115DB ; Lo # [4] SIDDHAM LETTER THREE-CIRCLE ALTERNATE I..SIDDHAM LETTER ALTERNATE U
> +11600..1162F ; Lo # [48] MODI LETTER A..MODI LETTER LLA
> +11644 ; Lo # MODI SIGN HUVA
> +11680..116AA ; Lo # [43] TAKRI LETTER A..TAKRI LETTER RRA
> +116B8 ; Lo # TAKRI LETTER ARCHAIC KHA
> +11700..1171A ; Lo # [27] AHOM LETTER KA..AHOM LETTER ALTERNATE BA
> +11740..11746 ; Lo # [7] AHOM LETTER CA..AHOM LETTER LLA
> +11800..1182B ; Lo # [44] DOGRA LETTER A..DOGRA LETTER RRA
> +118FF..11906 ; Lo # [8] WARANG CITI OM..DIVES AKURU LETTER E
> +11909 ; Lo # DIVES AKURU LETTER O
> +1190C..11913 ; Lo # [8] DIVES AKURU LETTER KA..DIVES AKURU LETTER JA
> +11915..11916 ; Lo # [2] DIVES AKURU LETTER NYA..DIVES AKURU LETTER TTA
> +11918..1192F ; Lo # [24] DIVES AKURU LETTER DDA..DIVES AKURU LETTER ZA
> +1193F ; Lo # DIVES AKURU PREFIXED NASAL SIGN
> +11941 ; Lo # DIVES AKURU INITIAL RA
> +119A0..119A7 ; Lo # [8] NANDINAGARI LETTER A..NANDINAGARI LETTER VOCALIC RR
> +119AA..119D0 ; Lo # [39] NANDINAGARI LETTER E..NANDINAGARI LETTER RRA
> +119E1 ; Lo # NANDINAGARI SIGN AVAGRAHA
> +119E3 ; Lo # NANDINAGARI HEADSTROKE
> +11A00 ; Lo # ZANABAZAR SQUARE LETTER A
> +11A0B..11A32 ; Lo # [40] ZANABAZAR SQUARE LETTER KA..ZANABAZAR SQUARE LETTER KSSA
> +11A3A ; Lo # ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA
> +11A50 ; Lo # SOYOMBO LETTER A
> +11A5C..11A89 ; Lo # [46] SOYOMBO LETTER KA..SOYOMBO CLUSTER-INITIAL LETTER SA
> +11A9D ; Lo # SOYOMBO MARK PLUTA
> +11AB0..11AF8 ; Lo # [73] CANADIAN SYLLABICS NATTILIK HI..PAU CIN HAU GLOTTAL STOP FINAL
> +11BC0..11BE0 ; Lo # [33] SUNUWAR LETTER DEVI..SUNUWAR LETTER KLOKO
> +11C00..11C08 ; Lo # [9] BHAIKSUKI LETTER A..BHAIKSUKI LETTER VOCALIC L
> +11C0A..11C2E ; Lo # [37] BHAIKSUKI LETTER E..BHAIKSUKI LETTER HA
> +11C40 ; Lo # BHAIKSUKI SIGN AVAGRAHA
> +11C72..11C8F ; Lo # [30] MARCHEN LETTER KA..MARCHEN LETTER A
> +11D00..11D06 ; Lo # [7] MASARAM GONDI LETTER A..MASARAM GONDI LETTER E
> +11D08..11D09 ; Lo # [2] MASARAM GONDI LETTER AI..MASARAM GONDI LETTER O
> +11D0B..11D30 ; Lo # [38] MASARAM GONDI LETTER AU..MASARAM GONDI LETTER TRA
> +11D46 ; Lo # MASARAM GONDI REPHA
> +11D60..11D65 ; Lo # [6] GUNJALA GONDI LETTER A..GUNJALA GONDI LETTER UU
> +11D67..11D68 ; Lo # [2] GUNJALA GONDI LETTER EE..GUNJALA GONDI LETTER AI
> +11D6A..11D89 ; Lo # [32] GUNJALA GONDI LETTER OO..GUNJALA GONDI LETTER SA
> +11D98 ; Lo # GUNJALA GONDI OM
> +11EE0..11EF2 ; Lo # [19] MAKASAR LETTER KA..MAKASAR ANGKA
> +11F02 ; Lo # KAWI SIGN REPHA
> +11F04..11F10 ; Lo # [13] KAWI LETTER A..KAWI LETTER O
> +11F12..11F33 ; Lo # [34] KAWI LETTER KA..KAWI LETTER JNYA
> +11FB0 ; Lo # LISU LETTER YHA
> +12000..12399 ; Lo # [922] CUNEIFORM SIGN A..CUNEIFORM SIGN U U
> +12480..12543 ; Lo # [196] CUNEIFORM SIGN AB TIMES NUN TENU..CUNEIFORM SIGN ZU5 TIMES THREE DISH TENU
> +12F90..12FF0 ; Lo # [97] CYPRO-MINOAN SIGN CM001..CYPRO-MINOAN SIGN CM114
> +13000..1342F ; Lo # [1072] EGYPTIAN HIEROGLYPH A001..EGYPTIAN HIEROGLYPH V011D
> +13441..13446 ; Lo # [6] EGYPTIAN HIEROGLYPH FULL BLANK..EGYPTIAN HIEROGLYPH WIDE LOST SIGN
> +13460..143FA ; Lo # [3995] EGYPTIAN HIEROGLYPH-13460..EGYPTIAN HIEROGLYPH-143FA
> +14400..14646 ; Lo # [583] ANATOLIAN HIEROGLYPH A001..ANATOLIAN HIEROGLYPH A530
> +16100..1611D ; Lo # [30] GURUNG KHEMA LETTER A..GURUNG KHEMA LETTER SA
> +16800..16A38 ; Lo # [569] BAMUM LETTER PHASE-A NGKUE MFON..BAMUM LETTER PHASE-F VUEQ
> +16A40..16A5E ; Lo # [31] MRO LETTER TA..MRO LETTER TEK
> +16A70..16ABE ; Lo # [79] TANGSA LETTER OZ..TANGSA LETTER ZA
> +16AD0..16AED ; Lo # [30] BASSA VAH LETTER ENNI..BASSA VAH LETTER I
> +16B00..16B2F ; Lo # [48] PAHAWH HMONG VOWEL KEEB..PAHAWH HMONG CONSONANT CAU
> +16B63..16B77 ; Lo # [21] PAHAWH HMONG SIGN VOS LUB..PAHAWH HMONG SIGN CIM NRES TOS
> +16B7D..16B8F ; Lo # [19] PAHAWH HMONG CLAN SIGN TSHEEJ..PAHAWH HMONG CLAN SIGN VWJ
> +16D43..16D6A ; Lo # [40] KIRAT RAI LETTER A..KIRAT RAI VOWEL SIGN AU
> +16F00..16F4A ; Lo # [75] MIAO LETTER PA..MIAO LETTER RTE
> +16F50 ; Lo # MIAO LETTER NASALIZATION
> +17000..187F7 ; Lo # [6136] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187F7
> +18800..18CD5 ; Lo # [1238] TANGUT COMPONENT-001..KHITAN SMALL SCRIPT CHARACTER-18CD5
> +18CFF..18D08 ; Lo # [10] KHITAN SMALL SCRIPT CHARACTER-18CFF..TANGUT IDEOGRAPH-18D08
> +1B000..1B122 ; Lo # [291] KATAKANA LETTER ARCHAIC E..KATAKANA LETTER ARCHAIC WU
> +1B132 ; Lo # HIRAGANA LETTER SMALL KO
> +1B150..1B152 ; Lo # [3] HIRAGANA LETTER SMALL WI..HIRAGANA LETTER SMALL WO
> +1B155 ; Lo # KATAKANA LETTER SMALL KO
> +1B164..1B167 ; Lo # [4] KATAKANA LETTER SMALL WI..KATAKANA LETTER SMALL N
> +1B170..1B2FB ; Lo # [396] NUSHU CHARACTER-1B170..NUSHU CHARACTER-1B2FB
> +1BC00..1BC6A ; Lo # [107] DUPLOYAN LETTER H..DUPLOYAN LETTER VOCALIC M
> +1BC70..1BC7C ; Lo # [13] DUPLOYAN AFFIX LEFT HORIZONTAL SECANT..DUPLOYAN AFFIX ATTACHED TANGENT HOOK
> +1BC80..1BC88 ; Lo # [9] DUPLOYAN AFFIX HIGH ACUTE..DUPLOYAN AFFIX HIGH VERTICAL
> +1BC90..1BC99 ; Lo # [10] DUPLOYAN AFFIX LOW ACUTE..DUPLOYAN AFFIX LOW ARROW
> +1DF0A ; Lo # LATIN LETTER RETROFLEX CLICK WITH RETROFLEX HOOK
> +1E100..1E12C ; Lo # [45] NYIAKENG PUACHUE HMONG LETTER MA..NYIAKENG PUACHUE HMONG LETTER W
> +1E14E ; Lo # NYIAKENG PUACHUE HMONG LOGOGRAM NYAJ
> +1E290..1E2AD ; Lo # [30] TOTO LETTER PA..TOTO LETTER A
> +1E2C0..1E2EB ; Lo # [44] WANCHO LETTER AA..WANCHO LETTER YIH
> +1E4D0..1E4EA ; Lo # [27] NAG MUNDARI LETTER O..NAG MUNDARI LETTER ELL
> +1E5D0..1E5ED ; Lo # [30] OL ONAL LETTER O..OL ONAL LETTER EG
> +1E5F0 ; Lo # OL ONAL SIGN HODDOND
> +1E7E0..1E7E6 ; Lo # [7] ETHIOPIC SYLLABLE HHYA..ETHIOPIC SYLLABLE HHYO
> +1E7E8..1E7EB ; Lo # [4] ETHIOPIC SYLLABLE GURAGE HHWA..ETHIOPIC SYLLABLE HHWE
> +1E7ED..1E7EE ; Lo # [2] ETHIOPIC SYLLABLE GURAGE MWI..ETHIOPIC SYLLABLE GURAGE MWEE
> +1E7F0..1E7FE ; Lo # [15] ETHIOPIC SYLLABLE GURAGE QWI..ETHIOPIC SYLLABLE GURAGE PWEE
> +1E800..1E8C4 ; Lo # [197] MENDE KIKAKUI SYLLABLE M001 KI..MENDE KIKAKUI SYLLABLE M060 NYON
> +1EE00..1EE03 ; Lo # [4] ARABIC MATHEMATICAL ALEF..ARABIC MATHEMATICAL DAL
> +1EE05..1EE1F ; Lo # [27] ARABIC MATHEMATICAL WAW..ARABIC MATHEMATICAL DOTLESS QAF
> +1EE21..1EE22 ; Lo # [2] ARABIC MATHEMATICAL INITIAL BEH..ARABIC MATHEMATICAL INITIAL JEEM
> +1EE24 ; Lo # ARABIC MATHEMATICAL INITIAL HEH
> +1EE27 ; Lo # ARABIC MATHEMATICAL INITIAL HAH
> +1EE29..1EE32 ; Lo # [10] ARABIC MATHEMATICAL INITIAL YEH..ARABIC MATHEMATICAL INITIAL QAF
> +1EE34..1EE37 ; Lo # [4] ARABIC MATHEMATICAL INITIAL SHEEN..ARABIC MATHEMATICAL INITIAL KHAH
> +1EE39 ; Lo # ARABIC MATHEMATICAL INITIAL DAD
> +1EE3B ; Lo # ARABIC MATHEMATICAL INITIAL GHAIN
> +1EE42 ; Lo # ARABIC MATHEMATICAL TAILED JEEM
> +1EE47 ; Lo # ARABIC MATHEMATICAL TAILED HAH
> +1EE49 ; Lo # ARABIC MATHEMATICAL TAILED YEH
> +1EE4B ; Lo # ARABIC MATHEMATICAL TAILED LAM
> +1EE4D..1EE4F ; Lo # [3] ARABIC MATHEMATICAL TAILED NOON..ARABIC MATHEMATICAL TAILED AIN
> +1EE51..1EE52 ; Lo # [2] ARABIC MATHEMATICAL TAILED SAD..ARABIC MATHEMATICAL TAILED QAF
> +1EE54 ; Lo # ARABIC MATHEMATICAL TAILED SHEEN
> +1EE57 ; Lo # ARABIC MATHEMATICAL TAILED KHAH
> +1EE59 ; Lo # ARABIC MATHEMATICAL TAILED DAD
> +1EE5B ; Lo # ARABIC MATHEMATICAL TAILED GHAIN
> +1EE5D ; Lo # ARABIC MATHEMATICAL TAILED DOTLESS NOON
> +1EE5F ; Lo # ARABIC MATHEMATICAL TAILED DOTLESS QAF
> +1EE61..1EE62 ; Lo # [2] ARABIC MATHEMATICAL STRETCHED BEH..ARABIC MATHEMATICAL STRETCHED JEEM
> +1EE64 ; Lo # ARABIC MATHEMATICAL STRETCHED HEH
> +1EE67..1EE6A ; Lo # [4] ARABIC MATHEMATICAL STRETCHED HAH..ARABIC MATHEMATICAL STRETCHED KAF
> +1EE6C..1EE72 ; Lo # [7] ARABIC MATHEMATICAL STRETCHED MEEM..ARABIC MATHEMATICAL STRETCHED QAF
> +1EE74..1EE77 ; Lo # [4] ARABIC MATHEMATICAL STRETCHED SHEEN..ARABIC MATHEMATICAL STRETCHED KHAH
> +1EE79..1EE7C ; Lo # [4] ARABIC MATHEMATICAL STRETCHED DAD..ARABIC MATHEMATICAL STRETCHED DOTLESS BEH
> +1EE7E ; Lo # ARABIC MATHEMATICAL STRETCHED DOTLESS FEH
> +1EE80..1EE89 ; Lo # [10] ARABIC MATHEMATICAL LOOPED ALEF..ARABIC MATHEMATICAL LOOPED YEH
> +1EE8B..1EE9B ; Lo # [17] ARABIC MATHEMATICAL LOOPED LAM..ARABIC MATHEMATICAL LOOPED GHAIN
> +1EEA1..1EEA3 ; Lo # [3] ARABIC MATHEMATICAL DOUBLE-STRUCK BEH..ARABIC MATHEMATICAL DOUBLE-STRUCK DAL
> +1EEA5..1EEA9 ; Lo # [5] ARABIC MATHEMATICAL DOUBLE-STRUCK WAW..ARABIC MATHEMATICAL DOUBLE-STRUCK YEH
> +1EEAB..1EEBB ; Lo # [17] ARABIC MATHEMATICAL DOUBLE-STRUCK LAM..ARABIC MATHEMATICAL DOUBLE-STRUCK GHAIN
> +20000..2A6DF ; Lo # [42720] CJK UNIFIED IDEOGRAPH-20000..CJK UNIFIED IDEOGRAPH-2A6DF
> +2A700..2B739 ; Lo # [4154] CJK UNIFIED IDEOGRAPH-2A700..CJK UNIFIED IDEOGRAPH-2B739
> +2B740..2B81D ; Lo # [222] CJK UNIFIED IDEOGRAPH-2B740..CJK UNIFIED IDEOGRAPH-2B81D
> +2B820..2CEA1 ; Lo # [5762] CJK UNIFIED IDEOGRAPH-2B820..CJK UNIFIED IDEOGRAPH-2CEA1
> +2CEB0..2EBE0 ; Lo # [7473] CJK UNIFIED IDEOGRAPH-2CEB0..CJK UNIFIED IDEOGRAPH-2EBE0
> +2EBF0..2EE5D ; Lo # [622] CJK UNIFIED IDEOGRAPH-2EBF0..CJK UNIFIED IDEOGRAPH-2EE5D
> +2F800..2FA1D ; Lo # [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D
> +30000..3134A ; Lo # [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
> +31350..323AF ; Lo # [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF
> +
> +# Total code points: 136477
> +
> +# ================================================
> +
> +# General_Category=Nonspacing_Mark
> +
> +0300..036F ; Mn # [112] COMBINING GRAVE ACCENT..COMBINING LATIN SMALL LETTER X
> +0483..0487 ; Mn # [5] COMBINING CYRILLIC TITLO..COMBINING CYRILLIC POKRYTIE
> +0591..05BD ; Mn # [45] HEBREW ACCENT ETNAHTA..HEBREW POINT METEG
> +05BF ; Mn # HEBREW POINT RAFE
> +05C1..05C2 ; Mn # [2] HEBREW POINT SHIN DOT..HEBREW POINT SIN DOT
> +05C4..05C5 ; Mn # [2] HEBREW MARK UPPER DOT..HEBREW MARK LOWER DOT
> +05C7 ; Mn # HEBREW POINT QAMATS QATAN
> +0610..061A ; Mn # [11] ARABIC SIGN SALLALLAHOU ALAYHE WASSALLAM..ARABIC SMALL KASRA
> +064B..065F ; Mn # [21] ARABIC FATHATAN..ARABIC WAVY HAMZA BELOW
> +0670 ; Mn # ARABIC LETTER SUPERSCRIPT ALEF
> +06D6..06DC ; Mn # [7] ARABIC SMALL HIGH LIGATURE SAD WITH LAM WITH ALEF MAKSURA..ARABIC SMALL HIGH SEEN
> +06DF..06E4 ; Mn # [6] ARABIC SMALL HIGH ROUNDED ZERO..ARABIC SMALL HIGH MADDA
> +06E7..06E8 ; Mn # [2] ARABIC SMALL HIGH YEH..ARABIC SMALL HIGH NOON
> +06EA..06ED ; Mn # [4] ARABIC EMPTY CENTRE LOW STOP..ARABIC SMALL LOW MEEM
> +0711 ; Mn # SYRIAC LETTER SUPERSCRIPT ALAPH
> +0730..074A ; Mn # [27] SYRIAC PTHAHA ABOVE..SYRIAC BARREKH
> +07A6..07B0 ; Mn # [11] THAANA ABAFILI..THAANA SUKUN
> +07EB..07F3 ; Mn # [9] NKO COMBINING SHORT HIGH TONE..NKO COMBINING DOUBLE DOT ABOVE
> +07FD ; Mn # NKO DANTAYALAN
> +0816..0819 ; Mn # [4] SAMARITAN MARK IN..SAMARITAN MARK DAGESH
> +081B..0823 ; Mn # [9] SAMARITAN MARK EPENTHETIC YUT..SAMARITAN VOWEL SIGN A
> +0825..0827 ; Mn # [3] SAMARITAN VOWEL SIGN SHORT A..SAMARITAN VOWEL SIGN U
> +0829..082D ; Mn # [5] SAMARITAN VOWEL SIGN LONG I..SAMARITAN MARK NEQUDAA
> +0859..085B ; Mn # [3] MANDAIC AFFRICATION MARK..MANDAIC GEMINATION MARK
> +0897..089F ; Mn # [9] ARABIC PEPET..ARABIC HALF MADDA OVER MADDA
> +08CA..08E1 ; Mn # [24] ARABIC SMALL HIGH FARSI YEH..ARABIC SMALL HIGH SIGN SAFHA
> +08E3..0902 ; Mn # [32] ARABIC TURNED DAMMA BELOW..DEVANAGARI SIGN ANUSVARA
> +093A ; Mn # DEVANAGARI VOWEL SIGN OE
> +093C ; Mn # DEVANAGARI SIGN NUKTA
> +0941..0948 ; Mn # [8] DEVANAGARI VOWEL SIGN U..DEVANAGARI VOWEL SIGN AI
> +094D ; Mn # DEVANAGARI SIGN VIRAMA
> +0951..0957 ; Mn # [7] DEVANAGARI STRESS SIGN UDATTA..DEVANAGARI VOWEL SIGN UUE
> +0962..0963 ; Mn # [2] DEVANAGARI VOWEL SIGN VOCALIC L..DEVANAGARI VOWEL SIGN VOCALIC LL
> +0981 ; Mn # BENGALI SIGN CANDRABINDU
> +09BC ; Mn # BENGALI SIGN NUKTA
> +09C1..09C4 ; Mn # [4] BENGALI VOWEL SIGN U..BENGALI VOWEL SIGN VOCALIC RR
> +09CD ; Mn # BENGALI SIGN VIRAMA
> +09E2..09E3 ; Mn # [2] BENGALI VOWEL SIGN VOCALIC L..BENGALI VOWEL SIGN VOCALIC LL
> +09FE ; Mn # BENGALI SANDHI MARK
> +0A01..0A02 ; Mn # [2] GURMUKHI SIGN ADAK BINDI..GURMUKHI SIGN BINDI
> +0A3C ; Mn # GURMUKHI SIGN NUKTA
> +0A41..0A42 ; Mn # [2] GURMUKHI VOWEL SIGN U..GURMUKHI VOWEL SIGN UU
> +0A47..0A48 ; Mn # [2] GURMUKHI VOWEL SIGN EE..GURMUKHI VOWEL SIGN AI
> +0A4B..0A4D ; Mn # [3] GURMUKHI VOWEL SIGN OO..GURMUKHI SIGN VIRAMA
> +0A51 ; Mn # GURMUKHI SIGN UDAAT
> +0A70..0A71 ; Mn # [2] GURMUKHI TIPPI..GURMUKHI ADDAK
> +0A75 ; Mn # GURMUKHI SIGN YAKASH
> +0A81..0A82 ; Mn # [2] GUJARATI SIGN CANDRABINDU..GUJARATI SIGN ANUSVARA
> +0ABC ; Mn # GUJARATI SIGN NUKTA
> +0AC1..0AC5 ; Mn # [5] GUJARATI VOWEL SIGN U..GUJARATI VOWEL SIGN CANDRA E
> +0AC7..0AC8 ; Mn # [2] GUJARATI VOWEL SIGN E..GUJARATI VOWEL SIGN AI
> +0ACD ; Mn # GUJARATI SIGN VIRAMA
> +0AE2..0AE3 ; Mn # [2] GUJARATI VOWEL SIGN VOCALIC L..GUJARATI VOWEL SIGN VOCALIC LL
> +0AFA..0AFF ; Mn # [6] GUJARATI SIGN SUKUN..GUJARATI SIGN TWO-CIRCLE NUKTA ABOVE
> +0B01 ; Mn # ORIYA SIGN CANDRABINDU
> +0B3C ; Mn # ORIYA SIGN NUKTA
> +0B3F ; Mn # ORIYA VOWEL SIGN I
> +0B41..0B44 ; Mn # [4] ORIYA VOWEL SIGN U..ORIYA VOWEL SIGN VOCALIC RR
> +0B4D ; Mn # ORIYA SIGN VIRAMA
> +0B55..0B56 ; Mn # [2] ORIYA SIGN OVERLINE..ORIYA AI LENGTH MARK
> +0B62..0B63 ; Mn # [2] ORIYA VOWEL SIGN VOCALIC L..ORIYA VOWEL SIGN VOCALIC LL
> +0B82 ; Mn # TAMIL SIGN ANUSVARA
> +0BC0 ; Mn # TAMIL VOWEL SIGN II
> +0BCD ; Mn # TAMIL SIGN VIRAMA
> +0C00 ; Mn # TELUGU SIGN COMBINING CANDRABINDU ABOVE
> +0C04 ; Mn # TELUGU SIGN COMBINING ANUSVARA ABOVE
> +0C3C ; Mn # TELUGU SIGN NUKTA
> +0C3E..0C40 ; Mn # [3] TELUGU VOWEL SIGN AA..TELUGU VOWEL SIGN II
> +0C46..0C48 ; Mn # [3] TELUGU VOWEL SIGN E..TELUGU VOWEL SIGN AI
> +0C4A..0C4D ; Mn # [4] TELUGU VOWEL SIGN O..TELUGU SIGN VIRAMA
> +0C55..0C56 ; Mn # [2] TELUGU LENGTH MARK..TELUGU AI LENGTH MARK
> +0C62..0C63 ; Mn # [2] TELUGU VOWEL SIGN VOCALIC L..TELUGU VOWEL SIGN VOCALIC LL
> +0C81 ; Mn # KANNADA SIGN CANDRABINDU
> +0CBC ; Mn # KANNADA SIGN NUKTA
> +0CBF ; Mn # KANNADA VOWEL SIGN I
> +0CC6 ; Mn # KANNADA VOWEL SIGN E
> +0CCC..0CCD ; Mn # [2] KANNADA VOWEL SIGN AU..KANNADA SIGN VIRAMA
> +0CE2..0CE3 ; Mn # [2] KANNADA VOWEL SIGN VOCALIC L..KANNADA VOWEL SIGN VOCALIC LL
> +0D00..0D01 ; Mn # [2] MALAYALAM SIGN COMBINING ANUSVARA ABOVE..MALAYALAM SIGN CANDRABINDU
> +0D3B..0D3C ; Mn # [2] MALAYALAM SIGN VERTICAL BAR VIRAMA..MALAYALAM SIGN CIRCULAR VIRAMA
> +0D41..0D44 ; Mn # [4] MALAYALAM VOWEL SIGN U..MALAYALAM VOWEL SIGN VOCALIC RR
> +0D4D ; Mn # MALAYALAM SIGN VIRAMA
> +0D62..0D63 ; Mn # [2] MALAYALAM VOWEL SIGN VOCALIC L..MALAYALAM VOWEL SIGN VOCALIC LL
> +0D81 ; Mn # SINHALA SIGN CANDRABINDU
> +0DCA ; Mn # SINHALA SIGN AL-LAKUNA
> +0DD2..0DD4 ; Mn # [3] SINHALA VOWEL SIGN KETTI IS-PILLA..SINHALA VOWEL SIGN KETTI PAA-PILLA
> +0DD6 ; Mn # SINHALA VOWEL SIGN DIGA PAA-PILLA
> +0E31 ; Mn # THAI CHARACTER MAI HAN-AKAT
> +0E34..0E3A ; Mn # [7] THAI CHARACTER SARA I..THAI CHARACTER PHINTHU
> +0E47..0E4E ; Mn # [8] THAI CHARACTER MAITAIKHU..THAI CHARACTER YAMAKKAN
> +0EB1 ; Mn # LAO VOWEL SIGN MAI KAN
> +0EB4..0EBC ; Mn # [9] LAO VOWEL SIGN I..LAO SEMIVOWEL SIGN LO
> +0EC8..0ECE ; Mn # [7] LAO TONE MAI EK..LAO YAMAKKAN
> +0F18..0F19 ; Mn # [2] TIBETAN ASTROLOGICAL SIGN -KHYUD PA..TIBETAN ASTROLOGICAL SIGN SDONG TSHUGS
> +0F35 ; Mn # TIBETAN MARK NGAS BZUNG NYI ZLA
> +0F37 ; Mn # TIBETAN MARK NGAS BZUNG SGOR RTAGS
> +0F39 ; Mn # TIBETAN MARK TSA -PHRU
> +0F71..0F7E ; Mn # [14] TIBETAN VOWEL SIGN AA..TIBETAN SIGN RJES SU NGA RO
> +0F80..0F84 ; Mn # [5] TIBETAN VOWEL SIGN REVERSED I..TIBETAN MARK HALANTA
> +0F86..0F87 ; Mn # [2] TIBETAN SIGN LCI RTAGS..TIBETAN SIGN YANG RTAGS
> +0F8D..0F97 ; Mn # [11] TIBETAN SUBJOINED SIGN LCE TSA CAN..TIBETAN SUBJOINED LETTER JA
> +0F99..0FBC ; Mn # [36] TIBETAN SUBJOINED LETTER NYA..TIBETAN SUBJOINED LETTER FIXED-FORM RA
> +0FC6 ; Mn # TIBETAN SYMBOL PADMA GDAN
> +102D..1030 ; Mn # [4] MYANMAR VOWEL SIGN I..MYANMAR VOWEL SIGN UU
> +1032..1037 ; Mn # [6] MYANMAR VOWEL SIGN AI..MYANMAR SIGN DOT BELOW
> +1039..103A ; Mn # [2] MYANMAR SIGN VIRAMA..MYANMAR SIGN ASAT
> +103D..103E ; Mn # [2] MYANMAR CONSONANT SIGN MEDIAL WA..MYANMAR CONSONANT SIGN MEDIAL HA
> +1058..1059 ; Mn # [2] MYANMAR VOWEL SIGN VOCALIC L..MYANMAR VOWEL SIGN VOCALIC LL
> +105E..1060 ; Mn # [3] MYANMAR CONSONANT SIGN MON MEDIAL NA..MYANMAR CONSONANT SIGN MON MEDIAL LA
> +1071..1074 ; Mn # [4] MYANMAR VOWEL SIGN GEBA KAREN I..MYANMAR VOWEL SIGN KAYAH EE
> +1082 ; Mn # MYANMAR CONSONANT SIGN SHAN MEDIAL WA
> +1085..1086 ; Mn # [2] MYANMAR VOWEL SIGN SHAN E ABOVE..MYANMAR VOWEL SIGN SHAN FINAL Y
> +108D ; Mn # MYANMAR SIGN SHAN COUNCIL EMPHATIC TONE
> +109D ; Mn # MYANMAR VOWEL SIGN AITON AI
> +135D..135F ; Mn # [3] ETHIOPIC COMBINING GEMINATION AND VOWEL LENGTH MARK..ETHIOPIC COMBINING GEMINATION MARK
> +1712..1714 ; Mn # [3] TAGALOG VOWEL SIGN I..TAGALOG SIGN VIRAMA
> +1732..1733 ; Mn # [2] HANUNOO VOWEL SIGN I..HANUNOO VOWEL SIGN U
> +1752..1753 ; Mn # [2] BUHID VOWEL SIGN I..BUHID VOWEL SIGN U
> +1772..1773 ; Mn # [2] TAGBANWA VOWEL SIGN I..TAGBANWA VOWEL SIGN U
> +17B4..17B5 ; Mn # [2] KHMER VOWEL INHERENT AQ..KHMER VOWEL INHERENT AA
> +17B7..17BD ; Mn # [7] KHMER VOWEL SIGN I..KHMER VOWEL SIGN UA
> +17C6 ; Mn # KHMER SIGN NIKAHIT
> +17C9..17D3 ; Mn # [11] KHMER SIGN MUUSIKATOAN..KHMER SIGN BATHAMASAT
> +17DD ; Mn # KHMER SIGN ATTHACAN
> +180B..180D ; Mn # [3] MONGOLIAN FREE VARIATION SELECTOR ONE..MONGOLIAN FREE VARIATION SELECTOR THREE
> +180F ; Mn # MONGOLIAN FREE VARIATION SELECTOR FOUR
> +1885..1886 ; Mn # [2] MONGOLIAN LETTER ALI GALI BALUDA..MONGOLIAN LETTER ALI GALI THREE BALUDA
> +18A9 ; Mn # MONGOLIAN LETTER ALI GALI DAGALGA
> +1920..1922 ; Mn # [3] LIMBU VOWEL SIGN A..LIMBU VOWEL SIGN U
> +1927..1928 ; Mn # [2] LIMBU VOWEL SIGN E..LIMBU VOWEL SIGN O
> +1932 ; Mn # LIMBU SMALL LETTER ANUSVARA
> +1939..193B ; Mn # [3] LIMBU SIGN MUKPHRENG..LIMBU SIGN SA-I
> +1A17..1A18 ; Mn # [2] BUGINESE VOWEL SIGN I..BUGINESE VOWEL SIGN U
> +1A1B ; Mn # BUGINESE VOWEL SIGN AE
> +1A56 ; Mn # TAI THAM CONSONANT SIGN MEDIAL LA
> +1A58..1A5E ; Mn # [7] TAI THAM SIGN MAI KANG LAI..TAI THAM CONSONANT SIGN SA
> +1A60 ; Mn # TAI THAM SIGN SAKOT
> +1A62 ; Mn # TAI THAM VOWEL SIGN MAI SAT
> +1A65..1A6C ; Mn # [8] TAI THAM VOWEL SIGN I..TAI THAM VOWEL SIGN OA BELOW
> +1A73..1A7C ; Mn # [10] TAI THAM VOWEL SIGN OA ABOVE..TAI THAM SIGN KHUEN-LUE KARAN
> +1A7F ; Mn # TAI THAM COMBINING CRYPTOGRAMMIC DOT
> +1AB0..1ABD ; Mn # [14] COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBINING PARENTHESES BELOW
> +1ABF..1ACE ; Mn # [16] COMBINING LATIN SMALL LETTER W BELOW..COMBINING LATIN SMALL LETTER INSULAR T
> +1B00..1B03 ; Mn # [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG
> +1B34 ; Mn # BALINESE SIGN REREKAN
> +1B36..1B3A ; Mn # [5] BALINESE VOWEL SIGN ULU..BALINESE VOWEL SIGN RA REPA
> +1B3C ; Mn # BALINESE VOWEL SIGN LA LENGA
> +1B42 ; Mn # BALINESE VOWEL SIGN PEPET
> +1B6B..1B73 ; Mn # [9] BALINESE MUSICAL SYMBOL COMBINING TEGEH..BALINESE MUSICAL SYMBOL COMBINING GONG
> +1B80..1B81 ; Mn # [2] SUNDANESE SIGN PANYECEK..SUNDANESE SIGN PANGLAYAR
> +1BA2..1BA5 ; Mn # [4] SUNDANESE CONSONANT SIGN PANYAKRA..SUNDANESE VOWEL SIGN PANYUKU
> +1BA8..1BA9 ; Mn # [2] SUNDANESE VOWEL SIGN PAMEPET..SUNDANESE VOWEL SIGN PANEULEUNG
> +1BAB..1BAD ; Mn # [3] SUNDANESE SIGN VIRAMA..SUNDANESE CONSONANT SIGN PASANGAN WA
> +1BE6 ; Mn # BATAK SIGN TOMPI
> +1BE8..1BE9 ; Mn # [2] BATAK VOWEL SIGN PAKPAK E..BATAK VOWEL SIGN EE
> +1BED ; Mn # BATAK VOWEL SIGN KARO O
> +1BEF..1BF1 ; Mn # [3] BATAK VOWEL SIGN U FOR SIMALUNGUN SA..BATAK CONSONANT SIGN H
> +1C2C..1C33 ; Mn # [8] LEPCHA VOWEL SIGN E..LEPCHA CONSONANT SIGN T
> +1C36..1C37 ; Mn # [2] LEPCHA SIGN RAN..LEPCHA SIGN NUKTA
> +1CD0..1CD2 ; Mn # [3] VEDIC TONE KARSHANA..VEDIC TONE PRENKHA
> +1CD4..1CE0 ; Mn # [13] VEDIC SIGN YAJURVEDIC MIDLINE SVARITA..VEDIC TONE RIGVEDIC KASHMIRI INDEPENDENT SVARITA
> +1CE2..1CE8 ; Mn # [7] VEDIC SIGN VISARGA SVARITA..VEDIC SIGN VISARGA ANUDATTA WITH TAIL
> +1CED ; Mn # VEDIC SIGN TIRYAK
> +1CF4 ; Mn # VEDIC TONE CANDRA ABOVE
> +1CF8..1CF9 ; Mn # [2] VEDIC TONE RING ABOVE..VEDIC TONE DOUBLE RING ABOVE
> +1DC0..1DFF ; Mn # [64] COMBINING DOTTED GRAVE ACCENT..COMBINING RIGHT ARROWHEAD AND DOWN ARROWHEAD BELOW
> +20D0..20DC ; Mn # [13] COMBINING LEFT HARPOON ABOVE..COMBINING FOUR DOTS ABOVE
> +20E1 ; Mn # COMBINING LEFT RIGHT ARROW ABOVE
> +20E5..20F0 ; Mn # [12] COMBINING REVERSE SOLIDUS OVERLAY..COMBINING ASTERISK ABOVE
> +2CEF..2CF1 ; Mn # [3] COPTIC COMBINING NI ABOVE..COPTIC COMBINING SPIRITUS LENIS
> +2D7F ; Mn # TIFINAGH CONSONANT JOINER
> +2DE0..2DFF ; Mn # [32] COMBINING CYRILLIC LETTER BE..COMBINING CYRILLIC LETTER IOTIFIED BIG YUS
> +302A..302D ; Mn # [4] IDEOGRAPHIC LEVEL TONE MARK..IDEOGRAPHIC ENTERING TONE MARK
> +3099..309A ; Mn # [2] COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK..COMBINING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
> +A66F ; Mn # COMBINING CYRILLIC VZMET
> +A674..A67D ; Mn # [10] COMBINING CYRILLIC LETTER UKRAINIAN IE..COMBINING CYRILLIC PAYEROK
> +A69E..A69F ; Mn # [2] COMBINING CYRILLIC LETTER EF..COMBINING CYRILLIC LETTER IOTIFIED E
> +A6F0..A6F1 ; Mn # [2] BAMUM COMBINING MARK KOQNDON..BAMUM COMBINING MARK TUKWENTIS
> +A802 ; Mn # SYLOTI NAGRI SIGN DVISVARA
> +A806 ; Mn # SYLOTI NAGRI SIGN HASANTA
> +A80B ; Mn # SYLOTI NAGRI SIGN ANUSVARA
> +A825..A826 ; Mn # [2] SYLOTI NAGRI VOWEL SIGN U..SYLOTI NAGRI VOWEL SIGN E
> +A82C ; Mn # SYLOTI NAGRI SIGN ALTERNATE HASANTA
> +A8C4..A8C5 ; Mn # [2] SAURASHTRA SIGN VIRAMA..SAURASHTRA SIGN CANDRABINDU
> +A8E0..A8F1 ; Mn # [18] COMBINING DEVANAGARI DIGIT ZERO..COMBINING DEVANAGARI SIGN AVAGRAHA
> +A8FF ; Mn # DEVANAGARI VOWEL SIGN AY
> +A926..A92D ; Mn # [8] KAYAH LI VOWEL UE..KAYAH LI TONE CALYA PLOPHU
> +A947..A951 ; Mn # [11] REJANG VOWEL SIGN I..REJANG CONSONANT SIGN R
> +A980..A982 ; Mn # [3] JAVANESE SIGN PANYANGGA..JAVANESE SIGN LAYAR
> +A9B3 ; Mn # JAVANESE SIGN CECAK TELU
> +A9B6..A9B9 ; Mn # [4] JAVANESE VOWEL SIGN WULU..JAVANESE VOWEL SIGN SUKU MENDUT
> +A9BC..A9BD ; Mn # [2] JAVANESE VOWEL SIGN PEPET..JAVANESE CONSONANT SIGN KERET
> +A9E5 ; Mn # MYANMAR SIGN SHAN SAW
> +AA29..AA2E ; Mn # [6] CHAM VOWEL SIGN AA..CHAM VOWEL SIGN OE
> +AA31..AA32 ; Mn # [2] CHAM VOWEL SIGN AU..CHAM VOWEL SIGN UE
> +AA35..AA36 ; Mn # [2] CHAM CONSONANT SIGN LA..CHAM CONSONANT SIGN WA
> +AA43 ; Mn # CHAM CONSONANT SIGN FINAL NG
> +AA4C ; Mn # CHAM CONSONANT SIGN FINAL M
> +AA7C ; Mn # MYANMAR SIGN TAI LAING TONE-2
> +AAB0 ; Mn # TAI VIET MAI KANG
> +AAB2..AAB4 ; Mn # [3] TAI VIET VOWEL I..TAI VIET VOWEL U
> +AAB7..AAB8 ; Mn # [2] TAI VIET MAI KHIT..TAI VIET VOWEL IA
> +AABE..AABF ; Mn # [2] TAI VIET VOWEL AM..TAI VIET TONE MAI EK
> +AAC1 ; Mn # TAI VIET TONE MAI THO
> +AAEC..AAED ; Mn # [2] MEETEI MAYEK VOWEL SIGN UU..MEETEI MAYEK VOWEL SIGN AAI
> +AAF6 ; Mn # MEETEI MAYEK VIRAMA
> +ABE5 ; Mn # MEETEI MAYEK VOWEL SIGN ANAP
> +ABE8 ; Mn # MEETEI MAYEK VOWEL SIGN UNAP
> +ABED ; Mn # MEETEI MAYEK APUN IYEK
> +FB1E ; Mn # HEBREW POINT JUDEO-SPANISH VARIKA
> +FE00..FE0F ; Mn # [16] VARIATION SELECTOR-1..VARIATION SELECTOR-16
> +FE20..FE2F ; Mn # [16] COMBINING LIGATURE LEFT HALF..COMBINING CYRILLIC TITLO RIGHT HALF
> +101FD ; Mn # PHAISTOS DISC SIGN COMBINING OBLIQUE STROKE
> +102E0 ; Mn # COPTIC EPACT THOUSANDS MARK
> +10376..1037A ; Mn # [5] COMBINING OLD PERMIC LETTER AN..COMBINING OLD PERMIC LETTER SII
> +10A01..10A03 ; Mn # [3] KHAROSHTHI VOWEL SIGN I..KHAROSHTHI VOWEL SIGN VOCALIC R
> +10A05..10A06 ; Mn # [2] KHAROSHTHI VOWEL SIGN E..KHAROSHTHI VOWEL SIGN O
> +10A0C..10A0F ; Mn # [4] KHAROSHTHI VOWEL LENGTH MARK..KHAROSHTHI SIGN VISARGA
> +10A38..10A3A ; Mn # [3] KHAROSHTHI SIGN BAR ABOVE..KHAROSHTHI SIGN DOT BELOW
> +10A3F ; Mn # KHAROSHTHI VIRAMA
> +10AE5..10AE6 ; Mn # [2] MANICHAEAN ABBREVIATION MARK ABOVE..MANICHAEAN ABBREVIATION MARK BELOW
> +10D24..10D27 ; Mn # [4] HANIFI ROHINGYA SIGN HARBAHAY..HANIFI ROHINGYA SIGN TASSI
> +10D69..10D6D ; Mn # [5] GARAY VOWEL SIGN E..GARAY CONSONANT NASALIZATION MARK
> +10EAB..10EAC ; Mn # [2] YEZIDI COMBINING HAMZA MARK..YEZIDI COMBINING MADDA MARK
> +10EFC..10EFF ; Mn # [4] ARABIC COMBINING ALEF OVERLAY..ARABIC SMALL LOW WORD MADDA
> +10F46..10F50 ; Mn # [11] SOGDIAN COMBINING DOT BELOW..SOGDIAN COMBINING STROKE BELOW
> +10F82..10F85 ; Mn # [4] OLD UYGHUR COMBINING DOT ABOVE..OLD UYGHUR COMBINING TWO DOTS BELOW
> +11001 ; Mn # BRAHMI SIGN ANUSVARA
> +11038..11046 ; Mn # [15] BRAHMI VOWEL SIGN AA..BRAHMI VIRAMA
> +11070 ; Mn # BRAHMI SIGN OLD TAMIL VIRAMA
> +11073..11074 ; Mn # [2] BRAHMI VOWEL SIGN OLD TAMIL SHORT E..BRAHMI VOWEL SIGN OLD TAMIL SHORT O
> +1107F..11081 ; Mn # [3] BRAHMI NUMBER JOINER..KAITHI SIGN ANUSVARA
> +110B3..110B6 ; Mn # [4] KAITHI VOWEL SIGN U..KAITHI VOWEL SIGN AI
> +110B9..110BA ; Mn # [2] KAITHI SIGN VIRAMA..KAITHI SIGN NUKTA
> +110C2 ; Mn # KAITHI VOWEL SIGN VOCALIC R
> +11100..11102 ; Mn # [3] CHAKMA SIGN CANDRABINDU..CHAKMA SIGN VISARGA
> +11127..1112B ; Mn # [5] CHAKMA VOWEL SIGN A..CHAKMA VOWEL SIGN UU
> +1112D..11134 ; Mn # [8] CHAKMA VOWEL SIGN AI..CHAKMA MAAYYAA
> +11173 ; Mn # MAHAJANI SIGN NUKTA
> +11180..11181 ; Mn # [2] SHARADA SIGN CANDRABINDU..SHARADA SIGN ANUSVARA
> +111B6..111BE ; Mn # [9] SHARADA VOWEL SIGN U..SHARADA VOWEL SIGN O
> +111C9..111CC ; Mn # [4] SHARADA SANDHI MARK..SHARADA EXTRA SHORT VOWEL MARK
> +111CF ; Mn # SHARADA SIGN INVERTED CANDRABINDU
> +1122F..11231 ; Mn # [3] KHOJKI VOWEL SIGN U..KHOJKI VOWEL SIGN AI
> +11234 ; Mn # KHOJKI SIGN ANUSVARA
> +11236..11237 ; Mn # [2] KHOJKI SIGN NUKTA..KHOJKI SIGN SHADDA
> +1123E ; Mn # KHOJKI SIGN SUKUN
> +11241 ; Mn # KHOJKI VOWEL SIGN VOCALIC R
> +112DF ; Mn # KHUDAWADI SIGN ANUSVARA
> +112E3..112EA ; Mn # [8] KHUDAWADI VOWEL SIGN U..KHUDAWADI SIGN VIRAMA
> +11300..11301 ; Mn # [2] GRANTHA SIGN COMBINING ANUSVARA ABOVE..GRANTHA SIGN CANDRABINDU
> +1133B..1133C ; Mn # [2] COMBINING BINDU BELOW..GRANTHA SIGN NUKTA
> +11340 ; Mn # GRANTHA VOWEL SIGN II
> +11366..1136C ; Mn # [7] COMBINING GRANTHA DIGIT ZERO..COMBINING GRANTHA DIGIT SIX
> +11370..11374 ; Mn # [5] COMBINING GRANTHA LETTER A..COMBINING GRANTHA LETTER PA
> +113BB..113C0 ; Mn # [6] TULU-TIGALARI VOWEL SIGN U..TULU-TIGALARI VOWEL SIGN VOCALIC LL
> +113CE ; Mn # TULU-TIGALARI SIGN VIRAMA
> +113D0 ; Mn # TULU-TIGALARI CONJOINER
> +113D2 ; Mn # TULU-TIGALARI GEMINATION MARK
> +113E1..113E2 ; Mn # [2] TULU-TIGALARI VEDIC TONE SVARITA..TULU-TIGALARI VEDIC TONE ANUDATTA
> +11438..1143F ; Mn # [8] NEWA VOWEL SIGN U..NEWA VOWEL SIGN AI
> +11442..11444 ; Mn # [3] NEWA SIGN VIRAMA..NEWA SIGN ANUSVARA
> +11446 ; Mn # NEWA SIGN NUKTA
> +1145E ; Mn # NEWA SANDHI MARK
> +114B3..114B8 ; Mn # [6] TIRHUTA VOWEL SIGN U..TIRHUTA VOWEL SIGN VOCALIC LL
> +114BA ; Mn # TIRHUTA VOWEL SIGN SHORT E
> +114BF..114C0 ; Mn # [2] TIRHUTA SIGN CANDRABINDU..TIRHUTA SIGN ANUSVARA
> +114C2..114C3 ; Mn # [2] TIRHUTA SIGN VIRAMA..TIRHUTA SIGN NUKTA
> +115B2..115B5 ; Mn # [4] SIDDHAM VOWEL SIGN U..SIDDHAM VOWEL SIGN VOCALIC RR
> +115BC..115BD ; Mn # [2] SIDDHAM SIGN CANDRABINDU..SIDDHAM SIGN ANUSVARA
> +115BF..115C0 ; Mn # [2] SIDDHAM SIGN VIRAMA..SIDDHAM SIGN NUKTA
> +115DC..115DD ; Mn # [2] SIDDHAM VOWEL SIGN ALTERNATE U..SIDDHAM VOWEL SIGN ALTERNATE UU
> +11633..1163A ; Mn # [8] MODI VOWEL SIGN U..MODI VOWEL SIGN AI
> +1163D ; Mn # MODI SIGN ANUSVARA
> +1163F..11640 ; Mn # [2] MODI SIGN VIRAMA..MODI SIGN ARDHACANDRA
> +116AB ; Mn # TAKRI SIGN ANUSVARA
> +116AD ; Mn # TAKRI VOWEL SIGN AA
> +116B0..116B5 ; Mn # [6] TAKRI VOWEL SIGN U..TAKRI VOWEL SIGN AU
> +116B7 ; Mn # TAKRI SIGN NUKTA
> +1171D ; Mn # AHOM CONSONANT SIGN MEDIAL LA
> +1171F ; Mn # AHOM CONSONANT SIGN MEDIAL LIGATING RA
> +11722..11725 ; Mn # [4] AHOM VOWEL SIGN I..AHOM VOWEL SIGN UU
> +11727..1172B ; Mn # [5] AHOM VOWEL SIGN AW..AHOM SIGN KILLER
> +1182F..11837 ; Mn # [9] DOGRA VOWEL SIGN U..DOGRA SIGN ANUSVARA
> +11839..1183A ; Mn # [2] DOGRA SIGN VIRAMA..DOGRA SIGN NUKTA
> +1193B..1193C ; Mn # [2] DIVES AKURU SIGN ANUSVARA..DIVES AKURU SIGN CANDRABINDU
> +1193E ; Mn # DIVES AKURU VIRAMA
> +11943 ; Mn # DIVES AKURU SIGN NUKTA
> +119D4..119D7 ; Mn # [4] NANDINAGARI VOWEL SIGN U..NANDINAGARI VOWEL SIGN VOCALIC RR
> +119DA..119DB ; Mn # [2] NANDINAGARI VOWEL SIGN E..NANDINAGARI VOWEL SIGN AI
> +119E0 ; Mn # NANDINAGARI SIGN VIRAMA
> +11A01..11A0A ; Mn # [10] ZANABAZAR SQUARE VOWEL SIGN I..ZANABAZAR SQUARE VOWEL LENGTH MARK
> +11A33..11A38 ; Mn # [6] ZANABAZAR SQUARE FINAL CONSONANT MARK..ZANABAZAR SQUARE SIGN ANUSVARA
> +11A3B..11A3E ; Mn # [4] ZANABAZAR SQUARE CLUSTER-FINAL LETTER YA..ZANABAZAR SQUARE CLUSTER-FINAL LETTER VA
> +11A47 ; Mn # ZANABAZAR SQUARE SUBJOINER
> +11A51..11A56 ; Mn # [6] SOYOMBO VOWEL SIGN I..SOYOMBO VOWEL SIGN OE
> +11A59..11A5B ; Mn # [3] SOYOMBO VOWEL SIGN VOCALIC R..SOYOMBO VOWEL LENGTH MARK
> +11A8A..11A96 ; Mn # [13] SOYOMBO FINAL CONSONANT SIGN G..SOYOMBO SIGN ANUSVARA
> +11A98..11A99 ; Mn # [2] SOYOMBO GEMINATION MARK..SOYOMBO SUBJOINER
> +11C30..11C36 ; Mn # [7] BHAIKSUKI VOWEL SIGN I..BHAIKSUKI VOWEL SIGN VOCALIC L
> +11C38..11C3D ; Mn # [6] BHAIKSUKI VOWEL SIGN E..BHAIKSUKI SIGN ANUSVARA
> +11C3F ; Mn # BHAIKSUKI SIGN VIRAMA
> +11C92..11CA7 ; Mn # [22] MARCHEN SUBJOINED LETTER KA..MARCHEN SUBJOINED LETTER ZA
> +11CAA..11CB0 ; Mn # [7] MARCHEN SUBJOINED LETTER RA..MARCHEN VOWEL SIGN AA
> +11CB2..11CB3 ; Mn # [2] MARCHEN VOWEL SIGN U..MARCHEN VOWEL SIGN E
> +11CB5..11CB6 ; Mn # [2] MARCHEN SIGN ANUSVARA..MARCHEN SIGN CANDRABINDU
> +11D31..11D36 ; Mn # [6] MASARAM GONDI VOWEL SIGN AA..MASARAM GONDI VOWEL SIGN VOCALIC R
> +11D3A ; Mn # MASARAM GONDI VOWEL SIGN E
> +11D3C..11D3D ; Mn # [2] MASARAM GONDI VOWEL SIGN AI..MASARAM GONDI VOWEL SIGN O
> +11D3F..11D45 ; Mn # [7] MASARAM GONDI VOWEL SIGN AU..MASARAM GONDI VIRAMA
> +11D47 ; Mn # MASARAM GONDI RA-KARA
> +11D90..11D91 ; Mn # [2] GUNJALA GONDI VOWEL SIGN EE..GUNJALA GONDI VOWEL SIGN AI
> +11D95 ; Mn # GUNJALA GONDI SIGN ANUSVARA
> +11D97 ; Mn # GUNJALA GONDI VIRAMA
> +11EF3..11EF4 ; Mn # [2] MAKASAR VOWEL SIGN I..MAKASAR VOWEL SIGN U
> +11F00..11F01 ; Mn # [2] KAWI SIGN CANDRABINDU..KAWI SIGN ANUSVARA
> +11F36..11F3A ; Mn # [5] KAWI VOWEL SIGN I..KAWI VOWEL SIGN VOCALIC R
> +11F40 ; Mn # KAWI VOWEL SIGN EU
> +11F42 ; Mn # KAWI CONJOINER
> +11F5A ; Mn # KAWI SIGN NUKTA
> +13440 ; Mn # EGYPTIAN HIEROGLYPH MIRROR HORIZONTALLY
> +13447..13455 ; Mn # [15] EGYPTIAN HIEROGLYPH MODIFIER DAMAGED AT TOP START..EGYPTIAN HIEROGLYPH MODIFIER DAMAGED
> +1611E..16129 ; Mn # [12] GURUNG KHEMA VOWEL SIGN AA..GURUNG KHEMA VOWEL LENGTH MARK
> +1612D..1612F ; Mn # [3] GURUNG KHEMA SIGN ANUSVARA..GURUNG KHEMA SIGN THOLHOMA
> +16AF0..16AF4 ; Mn # [5] BASSA VAH COMBINING HIGH TONE..BASSA VAH COMBINING HIGH-LOW TONE
> +16B30..16B36 ; Mn # [7] PAHAWH HMONG MARK CIM TUB..PAHAWH HMONG MARK CIM TAUM
> +16F4F ; Mn # MIAO SIGN CONSONANT MODIFIER BAR
> +16F8F..16F92 ; Mn # [4] MIAO TONE RIGHT..MIAO TONE BELOW
> +16FE4 ; Mn # KHITAN SMALL SCRIPT FILLER
> +1BC9D..1BC9E ; Mn # [2] DUPLOYAN THICK LETTER SELECTOR..DUPLOYAN DOUBLE MARK
> +1CF00..1CF2D ; Mn # [46] ZNAMENNY COMBINING MARK GORAZDO NIZKO S KRYZHEM ON LEFT..ZNAMENNY COMBINING MARK KRYZH ON LEFT
> +1CF30..1CF46 ; Mn # [23] ZNAMENNY COMBINING TONAL RANGE MARK MRACHNO..ZNAMENNY PRIZNAK MODIFIER ROG
> +1D167..1D169 ; Mn # [3] MUSICAL SYMBOL COMBINING TREMOLO-1..MUSICAL SYMBOL COMBINING TREMOLO-3
> +1D17B..1D182 ; Mn # [8] MUSICAL SYMBOL COMBINING ACCENT..MUSICAL SYMBOL COMBINING LOURE
> +1D185..1D18B ; Mn # [7] MUSICAL SYMBOL COMBINING DOIT..MUSICAL SYMBOL COMBINING TRIPLE TONGUE
> +1D1AA..1D1AD ; Mn # [4] MUSICAL SYMBOL COMBINING DOWN BOW..MUSICAL SYMBOL COMBINING SNAP PIZZICATO
> +1D242..1D244 ; Mn # [3] COMBINING GREEK MUSICAL TRISEME..COMBINING GREEK MUSICAL PENTASEME
> +1DA00..1DA36 ; Mn # [55] SIGNWRITING HEAD RIM..SIGNWRITING AIR SUCKING IN
> +1DA3B..1DA6C ; Mn # [50] SIGNWRITING MOUTH CLOSED NEUTRAL..SIGNWRITING EXCITEMENT
> +1DA75 ; Mn # SIGNWRITING UPPER BODY TILTING FROM HIP JOINTS
> +1DA84 ; Mn # SIGNWRITING LOCATION HEAD NECK
> +1DA9B..1DA9F ; Mn # [5] SIGNWRITING FILL MODIFIER-2..SIGNWRITING FILL MODIFIER-6
> +1DAA1..1DAAF ; Mn # [15] SIGNWRITING ROTATION MODIFIER-2..SIGNWRITING ROTATION MODIFIER-16
> +1E000..1E006 ; Mn # [7] COMBINING GLAGOLITIC LETTER AZU..COMBINING GLAGOLITIC LETTER ZHIVETE
> +1E008..1E018 ; Mn # [17] COMBINING GLAGOLITIC LETTER ZEMLJA..COMBINING GLAGOLITIC LETTER HERU
> +1E01B..1E021 ; Mn # [7] COMBINING GLAGOLITIC LETTER SHTA..COMBINING GLAGOLITIC LETTER YATI
> +1E023..1E024 ; Mn # [2] COMBINING GLAGOLITIC LETTER YU..COMBINING GLAGOLITIC LETTER SMALL YUS
> +1E026..1E02A ; Mn # [5] COMBINING GLAGOLITIC LETTER YO..COMBINING GLAGOLITIC LETTER FITA
> +1E08F ; Mn # COMBINING CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
> +1E130..1E136 ; Mn # [7] NYIAKENG PUACHUE HMONG TONE-B..NYIAKENG PUACHUE HMONG TONE-D
> +1E2AE ; Mn # TOTO SIGN RISING TONE
> +1E2EC..1E2EF ; Mn # [4] WANCHO TONE TUP..WANCHO TONE KOINI
> +1E4EC..1E4EF ; Mn # [4] NAG MUNDARI SIGN MUHOR..NAG MUNDARI SIGN SUTUH
> +1E5EE..1E5EF ; Mn # [2] OL ONAL SIGN MU..OL ONAL SIGN IKIR
> +1E8D0..1E8D6 ; Mn # [7] MENDE KIKAKUI COMBINING NUMBER TEENS..MENDE KIKAKUI COMBINING NUMBER MILLIONS
> +1E944..1E94A ; Mn # [7] ADLAM ALIF LENGTHENER..ADLAM NUKTA
> +E0100..E01EF ; Mn # [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
> +
> +# Total code points: 2020
> +
> +# ================================================
> +
> +# General_Category=Enclosing_Mark
> +
> +0488..0489 ; Me # [2] COMBINING CYRILLIC HUNDRED THOUSANDS SIGN..COMBINING CYRILLIC MILLIONS SIGN
> +1ABE ; Me # COMBINING PARENTHESES OVERLAY
> +20DD..20E0 ; Me # [4] COMBINING ENCLOSING CIRCLE..COMBINING ENCLOSING CIRCLE BACKSLASH
> +20E2..20E4 ; Me # [3] COMBINING ENCLOSING SCREEN..COMBINING ENCLOSING UPWARD POINTING TRIANGLE
> +A670..A672 ; Me # [3] COMBINING CYRILLIC TEN MILLIONS SIGN..COMBINING CYRILLIC THOUSAND MILLIONS SIGN
> +
> +# Total code points: 13
> +
> +# ================================================
> +
> +# General_Category=Spacing_Mark
> +
> +0903 ; Mc # DEVANAGARI SIGN VISARGA
> +093B ; Mc # DEVANAGARI VOWEL SIGN OOE
> +093E..0940 ; Mc # [3] DEVANAGARI VOWEL SIGN AA..DEVANAGARI VOWEL SIGN II
> +0949..094C ; Mc # [4] DEVANAGARI VOWEL SIGN CANDRA O..DEVANAGARI VOWEL SIGN AU
> +094E..094F ; Mc # [2] DEVANAGARI VOWEL SIGN PRISHTHAMATRA E..DEVANAGARI VOWEL SIGN AW
> +0982..0983 ; Mc # [2] BENGALI SIGN ANUSVARA..BENGALI SIGN VISARGA
> +09BE..09C0 ; Mc # [3] BENGALI VOWEL SIGN AA..BENGALI VOWEL SIGN II
> +09C7..09C8 ; Mc # [2] BENGALI VOWEL SIGN E..BENGALI VOWEL SIGN AI
> +09CB..09CC ; Mc # [2] BENGALI VOWEL SIGN O..BENGALI VOWEL SIGN AU
> +09D7 ; Mc # BENGALI AU LENGTH MARK
> +0A03 ; Mc # GURMUKHI SIGN VISARGA
> +0A3E..0A40 ; Mc # [3] GURMUKHI VOWEL SIGN AA..GURMUKHI VOWEL SIGN II
> +0A83 ; Mc # GUJARATI SIGN VISARGA
> +0ABE..0AC0 ; Mc # [3] GUJARATI VOWEL SIGN AA..GUJARATI VOWEL SIGN II
> +0AC9 ; Mc # GUJARATI VOWEL SIGN CANDRA O
> +0ACB..0ACC ; Mc # [2] GUJARATI VOWEL SIGN O..GUJARATI VOWEL SIGN AU
> +0B02..0B03 ; Mc # [2] ORIYA SIGN ANUSVARA..ORIYA SIGN VISARGA
> +0B3E ; Mc # ORIYA VOWEL SIGN AA
> +0B40 ; Mc # ORIYA VOWEL SIGN II
> +0B47..0B48 ; Mc # [2] ORIYA VOWEL SIGN E..ORIYA VOWEL SIGN AI
> +0B4B..0B4C ; Mc # [2] ORIYA VOWEL SIGN O..ORIYA VOWEL SIGN AU
> +0B57 ; Mc # ORIYA AU LENGTH MARK
> +0BBE..0BBF ; Mc # [2] TAMIL VOWEL SIGN AA..TAMIL VOWEL SIGN I
> +0BC1..0BC2 ; Mc # [2] TAMIL VOWEL SIGN U..TAMIL VOWEL SIGN UU
> +0BC6..0BC8 ; Mc # [3] TAMIL VOWEL SIGN E..TAMIL VOWEL SIGN AI
> +0BCA..0BCC ; Mc # [3] TAMIL VOWEL SIGN O..TAMIL VOWEL SIGN AU
> +0BD7 ; Mc # TAMIL AU LENGTH MARK
> +0C01..0C03 ; Mc # [3] TELUGU SIGN CANDRABINDU..TELUGU SIGN VISARGA
> +0C41..0C44 ; Mc # [4] TELUGU VOWEL SIGN U..TELUGU VOWEL SIGN VOCALIC RR
> +0C82..0C83 ; Mc # [2] KANNADA SIGN ANUSVARA..KANNADA SIGN VISARGA
> +0CBE ; Mc # KANNADA VOWEL SIGN AA
> +0CC0..0CC4 ; Mc # [5] KANNADA VOWEL SIGN II..KANNADA VOWEL SIGN VOCALIC RR
> +0CC7..0CC8 ; Mc # [2] KANNADA VOWEL SIGN EE..KANNADA VOWEL SIGN AI
> +0CCA..0CCB ; Mc # [2] KANNADA VOWEL SIGN O..KANNADA VOWEL SIGN OO
> +0CD5..0CD6 ; Mc # [2] KANNADA LENGTH MARK..KANNADA AI LENGTH MARK
> +0CF3 ; Mc # KANNADA SIGN COMBINING ANUSVARA ABOVE RIGHT
> +0D02..0D03 ; Mc # [2] MALAYALAM SIGN ANUSVARA..MALAYALAM SIGN VISARGA
> +0D3E..0D40 ; Mc # [3] MALAYALAM VOWEL SIGN AA..MALAYALAM VOWEL SIGN II
> +0D46..0D48 ; Mc # [3] MALAYALAM VOWEL SIGN E..MALAYALAM VOWEL SIGN AI
> +0D4A..0D4C ; Mc # [3] MALAYALAM VOWEL SIGN O..MALAYALAM VOWEL SIGN AU
> +0D57 ; Mc # MALAYALAM AU LENGTH MARK
> +0D82..0D83 ; Mc # [2] SINHALA SIGN ANUSVARAYA..SINHALA SIGN VISARGAYA
> +0DCF..0DD1 ; Mc # [3] SINHALA VOWEL SIGN AELA-PILLA..SINHALA VOWEL SIGN DIGA AEDA-PILLA
> +0DD8..0DDF ; Mc # [8] SINHALA VOWEL SIGN GAETTA-PILLA..SINHALA VOWEL SIGN GAYANUKITTA
> +0DF2..0DF3 ; Mc # [2] SINHALA VOWEL SIGN DIGA GAETTA-PILLA..SINHALA VOWEL SIGN DIGA GAYANUKITTA
> +0F3E..0F3F ; Mc # [2] TIBETAN SIGN YAR TSHES..TIBETAN SIGN MAR TSHES
> +0F7F ; Mc # TIBETAN SIGN RNAM BCAD
> +102B..102C ; Mc # [2] MYANMAR VOWEL SIGN TALL AA..MYANMAR VOWEL SIGN AA
> +1031 ; Mc # MYANMAR VOWEL SIGN E
> +1038 ; Mc # MYANMAR SIGN VISARGA
> +103B..103C ; Mc # [2] MYANMAR CONSONANT SIGN MEDIAL YA..MYANMAR CONSONANT SIGN MEDIAL RA
> +1056..1057 ; Mc # [2] MYANMAR VOWEL SIGN VOCALIC R..MYANMAR VOWEL SIGN VOCALIC RR
> +1062..1064 ; Mc # [3] MYANMAR VOWEL SIGN SGAW KAREN EU..MYANMAR TONE MARK SGAW KAREN KE PHO
> +1067..106D ; Mc # [7] MYANMAR VOWEL SIGN WESTERN PWO KAREN EU..MYANMAR SIGN WESTERN PWO KAREN TONE-5
> +1083..1084 ; Mc # [2] MYANMAR VOWEL SIGN SHAN AA..MYANMAR VOWEL SIGN SHAN E
> +1087..108C ; Mc # [6] MYANMAR SIGN SHAN TONE-2..MYANMAR SIGN SHAN COUNCIL TONE-3
> +108F ; Mc # MYANMAR SIGN RUMAI PALAUNG TONE-5
> +109A..109C ; Mc # [3] MYANMAR SIGN KHAMTI TONE-1..MYANMAR VOWEL SIGN AITON A
> +1715 ; Mc # TAGALOG SIGN PAMUDPOD
> +1734 ; Mc # HANUNOO SIGN PAMUDPOD
> +17B6 ; Mc # KHMER VOWEL SIGN AA
> +17BE..17C5 ; Mc # [8] KHMER VOWEL SIGN OE..KHMER VOWEL SIGN AU
> +17C7..17C8 ; Mc # [2] KHMER SIGN REAHMUK..KHMER SIGN YUUKALEAPINTU
> +1923..1926 ; Mc # [4] LIMBU VOWEL SIGN EE..LIMBU VOWEL SIGN AU
> +1929..192B ; Mc # [3] LIMBU SUBJOINED LETTER YA..LIMBU SUBJOINED LETTER WA
> +1930..1931 ; Mc # [2] LIMBU SMALL LETTER KA..LIMBU SMALL LETTER NGA
> +1933..1938 ; Mc # [6] LIMBU SMALL LETTER TA..LIMBU SMALL LETTER LA
> +1A19..1A1A ; Mc # [2] BUGINESE VOWEL SIGN E..BUGINESE VOWEL SIGN O
> +1A55 ; Mc # TAI THAM CONSONANT SIGN MEDIAL RA
> +1A57 ; Mc # TAI THAM CONSONANT SIGN LA TANG LAI
> +1A61 ; Mc # TAI THAM VOWEL SIGN A
> +1A63..1A64 ; Mc # [2] TAI THAM VOWEL SIGN AA..TAI THAM VOWEL SIGN TALL AA
> +1A6D..1A72 ; Mc # [6] TAI THAM VOWEL SIGN OY..TAI THAM VOWEL SIGN THAM AI
> +1B04 ; Mc # BALINESE SIGN BISAH
> +1B35 ; Mc # BALINESE VOWEL SIGN TEDUNG
> +1B3B ; Mc # BALINESE VOWEL SIGN RA REPA TEDUNG
> +1B3D..1B41 ; Mc # [5] BALINESE VOWEL SIGN LA LENGA TEDUNG..BALINESE VOWEL SIGN TALING REPA TEDUNG
> +1B43..1B44 ; Mc # [2] BALINESE VOWEL SIGN PEPET TEDUNG..BALINESE ADEG ADEG
> +1B82 ; Mc # SUNDANESE SIGN PANGWISAD
> +1BA1 ; Mc # SUNDANESE CONSONANT SIGN PAMINGKAL
> +1BA6..1BA7 ; Mc # [2] SUNDANESE VOWEL SIGN PANAELAENG..SUNDANESE VOWEL SIGN PANOLONG
> +1BAA ; Mc # SUNDANESE SIGN PAMAAEH
> +1BE7 ; Mc # BATAK VOWEL SIGN E
> +1BEA..1BEC ; Mc # [3] BATAK VOWEL SIGN I..BATAK VOWEL SIGN O
> +1BEE ; Mc # BATAK VOWEL SIGN U
> +1BF2..1BF3 ; Mc # [2] BATAK PANGOLAT..BATAK PANONGONAN
> +1C24..1C2B ; Mc # [8] LEPCHA SUBJOINED LETTER YA..LEPCHA VOWEL SIGN UU
> +1C34..1C35 ; Mc # [2] LEPCHA CONSONANT SIGN NYIN-DO..LEPCHA CONSONANT SIGN KANG
> +1CE1 ; Mc # VEDIC TONE ATHARVAVEDIC INDEPENDENT SVARITA
> +1CF7 ; Mc # VEDIC SIGN ATIKRAMA
> +302E..302F ; Mc # [2] HANGUL SINGLE DOT TONE MARK..HANGUL DOUBLE DOT TONE MARK
> +A823..A824 ; Mc # [2] SYLOTI NAGRI VOWEL SIGN A..SYLOTI NAGRI VOWEL SIGN I
> +A827 ; Mc # SYLOTI NAGRI VOWEL SIGN OO
> +A880..A881 ; Mc # [2] SAURASHTRA SIGN ANUSVARA..SAURASHTRA SIGN VISARGA
> +A8B4..A8C3 ; Mc # [16] SAURASHTRA CONSONANT SIGN HAARU..SAURASHTRA VOWEL SIGN AU
> +A952..A953 ; Mc # [2] REJANG CONSONANT SIGN H..REJANG VIRAMA
> +A983 ; Mc # JAVANESE SIGN WIGNYAN
> +A9B4..A9B5 ; Mc # [2] JAVANESE VOWEL SIGN TARUNG..JAVANESE VOWEL SIGN TOLONG
> +A9BA..A9BB ; Mc # [2] JAVANESE VOWEL SIGN TALING..JAVANESE VOWEL SIGN DIRGA MURE
> +A9BE..A9C0 ; Mc # [3] JAVANESE CONSONANT SIGN PENGKAL..JAVANESE PANGKON
> +AA2F..AA30 ; Mc # [2] CHAM VOWEL SIGN O..CHAM VOWEL SIGN AI
> +AA33..AA34 ; Mc # [2] CHAM CONSONANT SIGN YA..CHAM CONSONANT SIGN RA
> +AA4D ; Mc # CHAM CONSONANT SIGN FINAL H
> +AA7B ; Mc # MYANMAR SIGN PAO KAREN TONE
> +AA7D ; Mc # MYANMAR SIGN TAI LAING TONE-5
> +AAEB ; Mc # MEETEI MAYEK VOWEL SIGN II
> +AAEE..AAEF ; Mc # [2] MEETEI MAYEK VOWEL SIGN AU..MEETEI MAYEK VOWEL SIGN AAU
> +AAF5 ; Mc # MEETEI MAYEK VOWEL SIGN VISARGA
> +ABE3..ABE4 ; Mc # [2] MEETEI MAYEK VOWEL SIGN ONAP..MEETEI MAYEK VOWEL SIGN INAP
> +ABE6..ABE7 ; Mc # [2] MEETEI MAYEK VOWEL SIGN YENAP..MEETEI MAYEK VOWEL SIGN SOUNAP
> +ABE9..ABEA ; Mc # [2] MEETEI MAYEK VOWEL SIGN CHEINAP..MEETEI MAYEK VOWEL SIGN NUNG
> +ABEC ; Mc # MEETEI MAYEK LUM IYEK
> +11000 ; Mc # BRAHMI SIGN CANDRABINDU
> +11002 ; Mc # BRAHMI SIGN VISARGA
> +11082 ; Mc # KAITHI SIGN VISARGA
> +110B0..110B2 ; Mc # [3] KAITHI VOWEL SIGN AA..KAITHI VOWEL SIGN II
> +110B7..110B8 ; Mc # [2] KAITHI VOWEL SIGN O..KAITHI VOWEL SIGN AU
> +1112C ; Mc # CHAKMA VOWEL SIGN E
> +11145..11146 ; Mc # [2] CHAKMA VOWEL SIGN AA..CHAKMA VOWEL SIGN EI
> +11182 ; Mc # SHARADA SIGN VISARGA
> +111B3..111B5 ; Mc # [3] SHARADA VOWEL SIGN AA..SHARADA VOWEL SIGN II
> +111BF..111C0 ; Mc # [2] SHARADA VOWEL SIGN AU..SHARADA SIGN VIRAMA
> +111CE ; Mc # SHARADA VOWEL SIGN PRISHTHAMATRA E
> +1122C..1122E ; Mc # [3] KHOJKI VOWEL SIGN AA..KHOJKI VOWEL SIGN II
> +11232..11233 ; Mc # [2] KHOJKI VOWEL SIGN O..KHOJKI VOWEL SIGN AU
> +11235 ; Mc # KHOJKI SIGN VIRAMA
> +112E0..112E2 ; Mc # [3] KHUDAWADI VOWEL SIGN AA..KHUDAWADI VOWEL SIGN II
> +11302..11303 ; Mc # [2] GRANTHA SIGN ANUSVARA..GRANTHA SIGN VISARGA
> +1133E..1133F ; Mc # [2] GRANTHA VOWEL SIGN AA..GRANTHA VOWEL SIGN I
> +11341..11344 ; Mc # [4] GRANTHA VOWEL SIGN U..GRANTHA VOWEL SIGN VOCALIC RR
> +11347..11348 ; Mc # [2] GRANTHA VOWEL SIGN EE..GRANTHA VOWEL SIGN AI
> +1134B..1134D ; Mc # [3] GRANTHA VOWEL SIGN OO..GRANTHA SIGN VIRAMA
> +11357 ; Mc # GRANTHA AU LENGTH MARK
> +11362..11363 ; Mc # [2] GRANTHA VOWEL SIGN VOCALIC L..GRANTHA VOWEL SIGN VOCALIC LL
> +113B8..113BA ; Mc # [3] TULU-TIGALARI VOWEL SIGN AA..TULU-TIGALARI VOWEL SIGN II
> +113C2 ; Mc # TULU-TIGALARI VOWEL SIGN EE
> +113C5 ; Mc # TULU-TIGALARI VOWEL SIGN AI
> +113C7..113CA ; Mc # [4] TULU-TIGALARI VOWEL SIGN OO..TULU-TIGALARI SIGN CANDRA ANUNASIKA
> +113CC..113CD ; Mc # [2] TULU-TIGALARI SIGN ANUSVARA..TULU-TIGALARI SIGN VISARGA
> +113CF ; Mc # TULU-TIGALARI SIGN LOOPED VIRAMA
> +11435..11437 ; Mc # [3] NEWA VOWEL SIGN AA..NEWA VOWEL SIGN II
> +11440..11441 ; Mc # [2] NEWA VOWEL SIGN O..NEWA VOWEL SIGN AU
> +11445 ; Mc # NEWA SIGN VISARGA
> +114B0..114B2 ; Mc # [3] TIRHUTA VOWEL SIGN AA..TIRHUTA VOWEL SIGN II
> +114B9 ; Mc # TIRHUTA VOWEL SIGN E
> +114BB..114BE ; Mc # [4] TIRHUTA VOWEL SIGN AI..TIRHUTA VOWEL SIGN AU
> +114C1 ; Mc # TIRHUTA SIGN VISARGA
> +115AF..115B1 ; Mc # [3] SIDDHAM VOWEL SIGN AA..SIDDHAM VOWEL SIGN II
> +115B8..115BB ; Mc # [4] SIDDHAM VOWEL SIGN E..SIDDHAM VOWEL SIGN AU
> +115BE ; Mc # SIDDHAM SIGN VISARGA
> +11630..11632 ; Mc # [3] MODI VOWEL SIGN AA..MODI VOWEL SIGN II
> +1163B..1163C ; Mc # [2] MODI VOWEL SIGN O..MODI VOWEL SIGN AU
> +1163E ; Mc # MODI SIGN VISARGA
> +116AC ; Mc # TAKRI SIGN VISARGA
> +116AE..116AF ; Mc # [2] TAKRI VOWEL SIGN I..TAKRI VOWEL SIGN II
> +116B6 ; Mc # TAKRI SIGN VIRAMA
> +1171E ; Mc # AHOM CONSONANT SIGN MEDIAL RA
> +11720..11721 ; Mc # [2] AHOM VOWEL SIGN A..AHOM VOWEL SIGN AA
> +11726 ; Mc # AHOM VOWEL SIGN E
> +1182C..1182E ; Mc # [3] DOGRA VOWEL SIGN AA..DOGRA VOWEL SIGN II
> +11838 ; Mc # DOGRA SIGN VISARGA
> +11930..11935 ; Mc # [6] DIVES AKURU VOWEL SIGN AA..DIVES AKURU VOWEL SIGN E
> +11937..11938 ; Mc # [2] DIVES AKURU VOWEL SIGN AI..DIVES AKURU VOWEL SIGN O
> +1193D ; Mc # DIVES AKURU SIGN HALANTA
> +11940 ; Mc # DIVES AKURU MEDIAL YA
> +11942 ; Mc # DIVES AKURU MEDIAL RA
> +119D1..119D3 ; Mc # [3] NANDINAGARI VOWEL SIGN AA..NANDINAGARI VOWEL SIGN II
> +119DC..119DF ; Mc # [4] NANDINAGARI VOWEL SIGN O..NANDINAGARI SIGN VISARGA
> +119E4 ; Mc # NANDINAGARI VOWEL SIGN PRISHTHAMATRA E
> +11A39 ; Mc # ZANABAZAR SQUARE SIGN VISARGA
> +11A57..11A58 ; Mc # [2] SOYOMBO VOWEL SIGN AI..SOYOMBO VOWEL SIGN AU
> +11A97 ; Mc # SOYOMBO SIGN VISARGA
> +11C2F ; Mc # BHAIKSUKI VOWEL SIGN AA
> +11C3E ; Mc # BHAIKSUKI SIGN VISARGA
> +11CA9 ; Mc # MARCHEN SUBJOINED LETTER YA
> +11CB1 ; Mc # MARCHEN VOWEL SIGN I
> +11CB4 ; Mc # MARCHEN VOWEL SIGN O
> +11D8A..11D8E ; Mc # [5] GUNJALA GONDI VOWEL SIGN AA..GUNJALA GONDI VOWEL SIGN UU
> +11D93..11D94 ; Mc # [2] GUNJALA GONDI VOWEL SIGN OO..GUNJALA GONDI VOWEL SIGN AU
> +11D96 ; Mc # GUNJALA GONDI SIGN VISARGA
> +11EF5..11EF6 ; Mc # [2] MAKASAR VOWEL SIGN E..MAKASAR VOWEL SIGN O
> +11F03 ; Mc # KAWI SIGN VISARGA
> +11F34..11F35 ; Mc # [2] KAWI VOWEL SIGN AA..KAWI VOWEL SIGN ALTERNATE AA
> +11F3E..11F3F ; Mc # [2] KAWI VOWEL SIGN E..KAWI VOWEL SIGN AI
> +11F41 ; Mc # KAWI SIGN KILLER
> +1612A..1612C ; Mc # [3] GURUNG KHEMA CONSONANT SIGN MEDIAL YA..GURUNG KHEMA CONSONANT SIGN MEDIAL HA
> +16F51..16F87 ; Mc # [55] MIAO SIGN ASPIRATION..MIAO VOWEL SIGN UI
> +16FF0..16FF1 ; Mc # [2] VIETNAMESE ALTERNATE READING MARK CA..VIETNAMESE ALTERNATE READING MARK NHAY
> +1D165..1D166 ; Mc # [2] MUSICAL SYMBOL COMBINING STEM..MUSICAL SYMBOL COMBINING SPRECHGESANG STEM
> +1D16D..1D172 ; Mc # [6] MUSICAL SYMBOL COMBINING AUGMENTATION DOT..MUSICAL SYMBOL COMBINING FLAG-5
> +
> +# Total code points: 468
> +
> +# ================================================
> +
> +# General_Category=Decimal_Number
> +
> +0030..0039 ; Nd # [10] DIGIT ZERO..DIGIT NINE
> +0660..0669 ; Nd # [10] ARABIC-INDIC DIGIT ZERO..ARABIC-INDIC DIGIT NINE
> +06F0..06F9 ; Nd # [10] EXTENDED ARABIC-INDIC DIGIT ZERO..EXTENDED ARABIC-INDIC DIGIT NINE
> +07C0..07C9 ; Nd # [10] NKO DIGIT ZERO..NKO DIGIT NINE
> +0966..096F ; Nd # [10] DEVANAGARI DIGIT ZERO..DEVANAGARI DIGIT NINE
> +09E6..09EF ; Nd # [10] BENGALI DIGIT ZERO..BENGALI DIGIT NINE
> +0A66..0A6F ; Nd # [10] GURMUKHI DIGIT ZERO..GURMUKHI DIGIT NINE
> +0AE6..0AEF ; Nd # [10] GUJARATI DIGIT ZERO..GUJARATI DIGIT NINE
> +0B66..0B6F ; Nd # [10] ORIYA DIGIT ZERO..ORIYA DIGIT NINE
> +0BE6..0BEF ; Nd # [10] TAMIL DIGIT ZERO..TAMIL DIGIT NINE
> +0C66..0C6F ; Nd # [10] TELUGU DIGIT ZERO..TELUGU DIGIT NINE
> +0CE6..0CEF ; Nd # [10] KANNADA DIGIT ZERO..KANNADA DIGIT NINE
> +0D66..0D6F ; Nd # [10] MALAYALAM DIGIT ZERO..MALAYALAM DIGIT NINE
> +0DE6..0DEF ; Nd # [10] SINHALA LITH DIGIT ZERO..SINHALA LITH DIGIT NINE
> +0E50..0E59 ; Nd # [10] THAI DIGIT ZERO..THAI DIGIT NINE
> +0ED0..0ED9 ; Nd # [10] LAO DIGIT ZERO..LAO DIGIT NINE
> +0F20..0F29 ; Nd # [10] TIBETAN DIGIT ZERO..TIBETAN DIGIT NINE
> +1040..1049 ; Nd # [10] MYANMAR DIGIT ZERO..MYANMAR DIGIT NINE
> +1090..1099 ; Nd # [10] MYANMAR SHAN DIGIT ZERO..MYANMAR SHAN DIGIT NINE
> +17E0..17E9 ; Nd # [10] KHMER DIGIT ZERO..KHMER DIGIT NINE
> +1810..1819 ; Nd # [10] MONGOLIAN DIGIT ZERO..MONGOLIAN DIGIT NINE
> +1946..194F ; Nd # [10] LIMBU DIGIT ZERO..LIMBU DIGIT NINE
> +19D0..19D9 ; Nd # [10] NEW TAI LUE DIGIT ZERO..NEW TAI LUE DIGIT NINE
> +1A80..1A89 ; Nd # [10] TAI THAM HORA DIGIT ZERO..TAI THAM HORA DIGIT NINE
> +1A90..1A99 ; Nd # [10] TAI THAM THAM DIGIT ZERO..TAI THAM THAM DIGIT NINE
> +1B50..1B59 ; Nd # [10] BALINESE DIGIT ZERO..BALINESE DIGIT NINE
> +1BB0..1BB9 ; Nd # [10] SUNDANESE DIGIT ZERO..SUNDANESE DIGIT NINE
> +1C40..1C49 ; Nd # [10] LEPCHA DIGIT ZERO..LEPCHA DIGIT NINE
> +1C50..1C59 ; Nd # [10] OL CHIKI DIGIT ZERO..OL CHIKI DIGIT NINE
> +A620..A629 ; Nd # [10] VAI DIGIT ZERO..VAI DIGIT NINE
> +A8D0..A8D9 ; Nd # [10] SAURASHTRA DIGIT ZERO..SAURASHTRA DIGIT NINE
> +A900..A909 ; Nd # [10] KAYAH LI DIGIT ZERO..KAYAH LI DIGIT NINE
> +A9D0..A9D9 ; Nd # [10] JAVANESE DIGIT ZERO..JAVANESE DIGIT NINE
> +A9F0..A9F9 ; Nd # [10] MYANMAR TAI LAING DIGIT ZERO..MYANMAR TAI LAING DIGIT NINE
> +AA50..AA59 ; Nd # [10] CHAM DIGIT ZERO..CHAM DIGIT NINE
> +ABF0..ABF9 ; Nd # [10] MEETEI MAYEK DIGIT ZERO..MEETEI MAYEK DIGIT NINE
> +FF10..FF19 ; Nd # [10] FULLWIDTH DIGIT ZERO..FULLWIDTH DIGIT NINE
> +104A0..104A9 ; Nd # [10] OSMANYA DIGIT ZERO..OSMANYA DIGIT NINE
> +10D30..10D39 ; Nd # [10] HANIFI ROHINGYA DIGIT ZERO..HANIFI ROHINGYA DIGIT NINE
> +10D40..10D49 ; Nd # [10] GARAY DIGIT ZERO..GARAY DIGIT NINE
> +11066..1106F ; Nd # [10] BRAHMI DIGIT ZERO..BRAHMI DIGIT NINE
> +110F0..110F9 ; Nd # [10] SORA SOMPENG DIGIT ZERO..SORA SOMPENG DIGIT NINE
> +11136..1113F ; Nd # [10] CHAKMA DIGIT ZERO..CHAKMA DIGIT NINE
> +111D0..111D9 ; Nd # [10] SHARADA DIGIT ZERO..SHARADA DIGIT NINE
> +112F0..112F9 ; Nd # [10] KHUDAWADI DIGIT ZERO..KHUDAWADI DIGIT NINE
> +11450..11459 ; Nd # [10] NEWA DIGIT ZERO..NEWA DIGIT NINE
> +114D0..114D9 ; Nd # [10] TIRHUTA DIGIT ZERO..TIRHUTA DIGIT NINE
> +11650..11659 ; Nd # [10] MODI DIGIT ZERO..MODI DIGIT NINE
> +116C0..116C9 ; Nd # [10] TAKRI DIGIT ZERO..TAKRI DIGIT NINE
> +116D0..116E3 ; Nd # [20] MYANMAR PAO DIGIT ZERO..MYANMAR EASTERN PWO KAREN DIGIT NINE
> +11730..11739 ; Nd # [10] AHOM DIGIT ZERO..AHOM DIGIT NINE
> +118E0..118E9 ; Nd # [10] WARANG CITI DIGIT ZERO..WARANG CITI DIGIT NINE
> +11950..11959 ; Nd # [10] DIVES AKURU DIGIT ZERO..DIVES AKURU DIGIT NINE
> +11BF0..11BF9 ; Nd # [10] SUNUWAR DIGIT ZERO..SUNUWAR DIGIT NINE
> +11C50..11C59 ; Nd # [10] BHAIKSUKI DIGIT ZERO..BHAIKSUKI DIGIT NINE
> +11D50..11D59 ; Nd # [10] MASARAM GONDI DIGIT ZERO..MASARAM GONDI DIGIT NINE
> +11DA0..11DA9 ; Nd # [10] GUNJALA GONDI DIGIT ZERO..GUNJALA GONDI DIGIT NINE
> +11F50..11F59 ; Nd # [10] KAWI DIGIT ZERO..KAWI DIGIT NINE
> +16130..16139 ; Nd # [10] GURUNG KHEMA DIGIT ZERO..GURUNG KHEMA DIGIT NINE
> +16A60..16A69 ; Nd # [10] MRO DIGIT ZERO..MRO DIGIT NINE
> +16AC0..16AC9 ; Nd # [10] TANGSA DIGIT ZERO..TANGSA DIGIT NINE
> +16B50..16B59 ; Nd # [10] PAHAWH HMONG DIGIT ZERO..PAHAWH HMONG DIGIT NINE
> +16D70..16D79 ; Nd # [10] KIRAT RAI DIGIT ZERO..KIRAT RAI DIGIT NINE
> +1CCF0..1CCF9 ; Nd # [10] OUTLINED DIGIT ZERO..OUTLINED DIGIT NINE
> +1D7CE..1D7FF ; Nd # [50] MATHEMATICAL BOLD DIGIT ZERO..MATHEMATICAL MONOSPACE DIGIT NINE
> +1E140..1E149 ; Nd # [10] NYIAKENG PUACHUE HMONG DIGIT ZERO..NYIAKENG PUACHUE HMONG DIGIT NINE
> +1E2F0..1E2F9 ; Nd # [10] WANCHO DIGIT ZERO..WANCHO DIGIT NINE
> +1E4F0..1E4F9 ; Nd # [10] NAG MUNDARI DIGIT ZERO..NAG MUNDARI DIGIT NINE
> +1E5F1..1E5FA ; Nd # [10] OL ONAL DIGIT ZERO..OL ONAL DIGIT NINE
> +1E950..1E959 ; Nd # [10] ADLAM DIGIT ZERO..ADLAM DIGIT NINE
> +1FBF0..1FBF9 ; Nd # [10] SEGMENTED DIGIT ZERO..SEGMENTED DIGIT NINE
> +
> +# Total code points: 760
> +
> +# ================================================
> +
> +# General_Category=Letter_Number
> +
> +16EE..16F0 ; Nl # [3] RUNIC ARLAUG SYMBOL..RUNIC BELGTHOR SYMBOL
> +2160..2182 ; Nl # [35] ROMAN NUMERAL ONE..ROMAN NUMERAL TEN THOUSAND
> +2185..2188 ; Nl # [4] ROMAN NUMERAL SIX LATE FORM..ROMAN NUMERAL ONE HUNDRED THOUSAND
> +3007 ; Nl # IDEOGRAPHIC NUMBER ZERO
> +3021..3029 ; Nl # [9] HANGZHOU NUMERAL ONE..HANGZHOU NUMERAL NINE
> +3038..303A ; Nl # [3] HANGZHOU NUMERAL TEN..HANGZHOU NUMERAL THIRTY
> +A6E6..A6EF ; Nl # [10] BAMUM LETTER MO..BAMUM LETTER KOGHOM
> +10140..10174 ; Nl # [53] GREEK ACROPHONIC ATTIC ONE QUARTER..GREEK ACROPHONIC STRATIAN FIFTY MNAS
> +10341 ; Nl # GOTHIC LETTER NINETY
> +1034A ; Nl # GOTHIC LETTER NINE HUNDRED
> +103D1..103D5 ; Nl # [5] OLD PERSIAN NUMBER ONE..OLD PERSIAN NUMBER HUNDRED
> +12400..1246E ; Nl # [111] CUNEIFORM NUMERIC SIGN TWO ASH..CUNEIFORM NUMERIC SIGN NINE U VARIANT FORM
> +
> +# Total code points: 236
> +
> +# ================================================
> +
> +# General_Category=Other_Number
> +
> +00B2..00B3 ; No # [2] SUPERSCRIPT TWO..SUPERSCRIPT THREE
> +00B9 ; No # SUPERSCRIPT ONE
> +00BC..00BE ; No # [3] VULGAR FRACTION ONE QUARTER..VULGAR FRACTION THREE QUARTERS
> +09F4..09F9 ; No # [6] BENGALI CURRENCY NUMERATOR ONE..BENGALI CURRENCY DENOMINATOR SIXTEEN
> +0B72..0B77 ; No # [6] ORIYA FRACTION ONE QUARTER..ORIYA FRACTION THREE SIXTEENTHS
> +0BF0..0BF2 ; No # [3] TAMIL NUMBER TEN..TAMIL NUMBER ONE THOUSAND
> +0C78..0C7E ; No # [7] TELUGU FRACTION DIGIT ZERO FOR ODD POWERS OF FOUR..TELUGU FRACTION DIGIT THREE FOR EVEN POWERS OF FOUR
> +0D58..0D5E ; No # [7] MALAYALAM FRACTION ONE ONE-HUNDRED-AND-SIXTIETH..MALAYALAM FRACTION ONE FIFTH
> +0D70..0D78 ; No # [9] MALAYALAM NUMBER TEN..MALAYALAM FRACTION THREE SIXTEENTHS
> +0F2A..0F33 ; No # [10] TIBETAN DIGIT HALF ONE..TIBETAN DIGIT HALF ZERO
> +1369..137C ; No # [20] ETHIOPIC DIGIT ONE..ETHIOPIC NUMBER TEN THOUSAND
> +17F0..17F9 ; No # [10] KHMER SYMBOL LEK ATTAK SON..KHMER SYMBOL LEK ATTAK PRAM-BUON
> +19DA ; No # NEW TAI LUE THAM DIGIT ONE
> +2070 ; No # SUPERSCRIPT ZERO
> +2074..2079 ; No # [6] SUPERSCRIPT FOUR..SUPERSCRIPT NINE
> +2080..2089 ; No # [10] SUBSCRIPT ZERO..SUBSCRIPT NINE
> +2150..215F ; No # [16] VULGAR FRACTION ONE SEVENTH..FRACTION NUMERATOR ONE
> +2189 ; No # VULGAR FRACTION ZERO THIRDS
> +2460..249B ; No # [60] CIRCLED DIGIT ONE..NUMBER TWENTY FULL STOP
> +24EA..24FF ; No # [22] CIRCLED DIGIT ZERO..NEGATIVE CIRCLED DIGIT ZERO
> +2776..2793 ; No # [30] DINGBAT NEGATIVE CIRCLED DIGIT ONE..DINGBAT NEGATIVE CIRCLED SANS-SERIF NUMBER TEN
> +2CFD ; No # COPTIC FRACTION ONE HALF
> +3192..3195 ; No # [4] IDEOGRAPHIC ANNOTATION ONE MARK..IDEOGRAPHIC ANNOTATION FOUR MARK
> +3220..3229 ; No # [10] PARENTHESIZED IDEOGRAPH ONE..PARENTHESIZED IDEOGRAPH TEN
> +3248..324F ; No # [8] CIRCLED NUMBER TEN ON BLACK SQUARE..CIRCLED NUMBER EIGHTY ON BLACK SQUARE
> +3251..325F ; No # [15] CIRCLED NUMBER TWENTY ONE..CIRCLED NUMBER THIRTY FIVE
> +3280..3289 ; No # [10] CIRCLED IDEOGRAPH ONE..CIRCLED IDEOGRAPH TEN
> +32B1..32BF ; No # [15] CIRCLED NUMBER THIRTY SIX..CIRCLED NUMBER FIFTY
> +A830..A835 ; No # [6] NORTH INDIC FRACTION ONE QUARTER..NORTH INDIC FRACTION THREE SIXTEENTHS
> +10107..10133 ; No # [45] AEGEAN NUMBER ONE..AEGEAN NUMBER NINETY THOUSAND
> +10175..10178 ; No # [4] GREEK ONE HALF SIGN..GREEK THREE QUARTERS SIGN
> +1018A..1018B ; No # [2] GREEK ZERO SIGN..GREEK ONE QUARTER SIGN
> +102E1..102FB ; No # [27] COPTIC EPACT DIGIT ONE..COPTIC EPACT NUMBER NINE HUNDRED
> +10320..10323 ; No # [4] OLD ITALIC NUMERAL ONE..OLD ITALIC NUMERAL FIFTY
> +10858..1085F ; No # [8] IMPERIAL ARAMAIC NUMBER ONE..IMPERIAL ARAMAIC NUMBER TEN THOUSAND
> +10879..1087F ; No # [7] PALMYRENE NUMBER ONE..PALMYRENE NUMBER TWENTY
> +108A7..108AF ; No # [9] NABATAEAN NUMBER ONE..NABATAEAN NUMBER ONE HUNDRED
> +108FB..108FF ; No # [5] HATRAN NUMBER ONE..HATRAN NUMBER ONE HUNDRED
> +10916..1091B ; No # [6] PHOENICIAN NUMBER ONE..PHOENICIAN NUMBER THREE
> +109BC..109BD ; No # [2] MEROITIC CURSIVE FRACTION ELEVEN TWELFTHS..MEROITIC CURSIVE FRACTION ONE HALF
> +109C0..109CF ; No # [16] MEROITIC CURSIVE NUMBER ONE..MEROITIC CURSIVE NUMBER SEVENTY
> +109D2..109FF ; No # [46] MEROITIC CURSIVE NUMBER ONE HUNDRED..MEROITIC CURSIVE FRACTION TEN TWELFTHS
> +10A40..10A48 ; No # [9] KHAROSHTHI DIGIT ONE..KHAROSHTHI FRACTION ONE HALF
> +10A7D..10A7E ; No # [2] OLD SOUTH ARABIAN NUMBER ONE..OLD SOUTH ARABIAN NUMBER FIFTY
> +10A9D..10A9F ; No # [3] OLD NORTH ARABIAN NUMBER ONE..OLD NORTH ARABIAN NUMBER TWENTY
> +10AEB..10AEF ; No # [5] MANICHAEAN NUMBER ONE..MANICHAEAN NUMBER ONE HUNDRED
> +10B58..10B5F ; No # [8] INSCRIPTIONAL PARTHIAN NUMBER ONE..INSCRIPTIONAL PARTHIAN NUMBER ONE THOUSAND
> +10B78..10B7F ; No # [8] INSCRIPTIONAL PAHLAVI NUMBER ONE..INSCRIPTIONAL PAHLAVI NUMBER ONE THOUSAND
> +10BA9..10BAF ; No # [7] PSALTER PAHLAVI NUMBER ONE..PSALTER PAHLAVI NUMBER ONE HUNDRED
> +10CFA..10CFF ; No # [6] OLD HUNGARIAN NUMBER ONE..OLD HUNGARIAN NUMBER ONE THOUSAND
> +10E60..10E7E ; No # [31] RUMI DIGIT ONE..RUMI FRACTION TWO THIRDS
> +10F1D..10F26 ; No # [10] OLD SOGDIAN NUMBER ONE..OLD SOGDIAN FRACTION ONE HALF
> +10F51..10F54 ; No # [4] SOGDIAN NUMBER ONE..SOGDIAN NUMBER ONE HUNDRED
> +10FC5..10FCB ; No # [7] CHORASMIAN NUMBER ONE..CHORASMIAN NUMBER ONE HUNDRED
> +11052..11065 ; No # [20] BRAHMI NUMBER ONE..BRAHMI NUMBER ONE THOUSAND
> +111E1..111F4 ; No # [20] SINHALA ARCHAIC DIGIT ONE..SINHALA ARCHAIC NUMBER ONE THOUSAND
> +1173A..1173B ; No # [2] AHOM NUMBER TEN..AHOM NUMBER TWENTY
> +118EA..118F2 ; No # [9] WARANG CITI NUMBER TEN..WARANG CITI NUMBER NINETY
> +11C5A..11C6C ; No # [19] BHAIKSUKI NUMBER ONE..BHAIKSUKI HUNDREDS UNIT MARK
> +11FC0..11FD4 ; No # [21] TAMIL FRACTION ONE THREE-HUNDRED-AND-TWENTIETH..TAMIL FRACTION DOWNSCALING FACTOR KIIZH
> +16B5B..16B61 ; No # [7] PAHAWH HMONG NUMBER TENS..PAHAWH HMONG NUMBER TRILLIONS
> +16E80..16E96 ; No # [23] MEDEFAIDRIN DIGIT ZERO..MEDEFAIDRIN DIGIT THREE ALTERNATE FORM
> +1D2C0..1D2D3 ; No # [20] KAKTOVIK NUMERAL ZERO..KAKTOVIK NUMERAL NINETEEN
> +1D2E0..1D2F3 ; No # [20] MAYAN NUMERAL ZERO..MAYAN NUMERAL NINETEEN
> +1D360..1D378 ; No # [25] COUNTING ROD UNIT DIGIT ONE..TALLY MARK FIVE
> +1E8C7..1E8CF ; No # [9] MENDE KIKAKUI DIGIT ONE..MENDE KIKAKUI DIGIT NINE
> +1EC71..1ECAB ; No # [59] INDIC SIYAQ NUMBER ONE..INDIC SIYAQ NUMBER PREFIXED NINE
> +1ECAD..1ECAF ; No # [3] INDIC SIYAQ FRACTION ONE QUARTER..INDIC SIYAQ FRACTION THREE QUARTERS
> +1ECB1..1ECB4 ; No # [4] INDIC SIYAQ NUMBER ALTERNATE ONE..INDIC SIYAQ ALTERNATE LAKH MARK
> +1ED01..1ED2D ; No # [45] OTTOMAN SIYAQ NUMBER ONE..OTTOMAN SIYAQ NUMBER NINETY THOUSAND
> +1ED2F..1ED3D ; No # [15] OTTOMAN SIYAQ ALTERNATE NUMBER TWO..OTTOMAN SIYAQ FRACTION ONE SIXTH
> +1F100..1F10C ; No # [13] DIGIT ZERO FULL STOP..DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT ZERO
> +
> +# Total code points: 915
> +
> +# ================================================
> +
> +# General_Category=Space_Separator
> +
> +0020 ; Zs # SPACE
> +00A0 ; Zs # NO-BREAK SPACE
> +1680 ; Zs # OGHAM SPACE MARK
> +2000..200A ; Zs # [11] EN QUAD..HAIR SPACE
> +202F ; Zs # NARROW NO-BREAK SPACE
> +205F ; Zs # MEDIUM MATHEMATICAL SPACE
> +3000 ; Zs # IDEOGRAPHIC SPACE
> +
> +# Total code points: 17
> +
> +# ================================================
> +
> +# General_Category=Line_Separator
> +
> +2028 ; Zl # LINE SEPARATOR
> +
> +# Total code points: 1
> +
> +# ================================================
> +
> +# General_Category=Paragraph_Separator
> +
> +2029 ; Zp # PARAGRAPH SEPARATOR
> +
> +# Total code points: 1
> +
> +# ================================================
> +
> +# General_Category=Control
> +
> +0000..001F ; Cc # [32] <control-0000>..<control-001F>
> +007F..009F ; Cc # [33] <control-007F>..<control-009F>
> +
> +# Total code points: 65
> +
> +# ================================================
> +
> +# General_Category=Format
> +
> +00AD ; Cf # SOFT HYPHEN
> +0600..0605 ; Cf # [6] ARABIC NUMBER SIGN..ARABIC NUMBER MARK ABOVE
> +061C ; Cf # ARABIC LETTER MARK
> +06DD ; Cf # ARABIC END OF AYAH
> +070F ; Cf # SYRIAC ABBREVIATION MARK
> +0890..0891 ; Cf # [2] ARABIC POUND MARK ABOVE..ARABIC PIASTRE MARK ABOVE
> +08E2 ; Cf # ARABIC DISPUTED END OF AYAH
> +180E ; Cf # MONGOLIAN VOWEL SEPARATOR
> +200B..200F ; Cf # [5] ZERO WIDTH SPACE..RIGHT-TO-LEFT MARK
> +202A..202E ; Cf # [5] LEFT-TO-RIGHT EMBEDDING..RIGHT-TO-LEFT OVERRIDE
> +2060..2064 ; Cf # [5] WORD JOINER..INVISIBLE PLUS
> +2066..206F ; Cf # [10] LEFT-TO-RIGHT ISOLATE..NOMINAL DIGIT SHAPES
> +FEFF ; Cf # ZERO WIDTH NO-BREAK SPACE
> +FFF9..FFFB ; Cf # [3] INTERLINEAR ANNOTATION ANCHOR..INTERLINEAR ANNOTATION TERMINATOR
> +110BD ; Cf # KAITHI NUMBER SIGN
> +110CD ; Cf # KAITHI NUMBER SIGN ABOVE
> +13430..1343F ; Cf # [16] EGYPTIAN HIEROGLYPH VERTICAL JOINER..EGYPTIAN HIEROGLYPH END WALLED ENCLOSURE
> +1BCA0..1BCA3 ; Cf # [4] SHORTHAND FORMAT LETTER OVERLAP..SHORTHAND FORMAT UP STEP
> +1D173..1D17A ; Cf # [8] MUSICAL SYMBOL BEGIN BEAM..MUSICAL SYMBOL END PHRASE
> +E0001 ; Cf # LANGUAGE TAG
> +E0020..E007F ; Cf # [96] TAG SPACE..CANCEL TAG
> +
> +# Total code points: 170
> +
> +# ================================================
> +
> +# General_Category=Private_Use
> +
> +E000..F8FF ; Co # [6400] <private-use-E000>..<private-use-F8FF>
> +F0000..FFFFD ; Co # [65534] <private-use-F0000>..<private-use-FFFFD>
> +100000..10FFFD; Co # [65534] <private-use-100000>..<private-use-10FFFD>
> +
> +# Total code points: 137468
> +
> +# ================================================
> +
> +# General_Category=Surrogate
> +
> +D800..DFFF ; Cs # [2048] <surrogate-D800>..<surrogate-DFFF>
> +
> +# Total code points: 2048
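
For readers following how a data file in this format could feed a boolean "should escape" table, here is a minimal sketch. It is not the generator from the patch. The names `parse_line` and `escaped_ranges` are hypothetical, and it assumes that the escaped categories are exactly those whose two-letter code starts with `Z` or `C`. It works on raw file lines, without the `> +` quoting prefix, and leaves out special cases such as the plain space character, which the patch description says are handled separately in `__should_escape_unicode`.

```python
import re

# Hypothetical helper names; not taken from the patch.
# Matches "XXXX ; Gc" and "XXXX..YYYY ; Gc" data lines, with or without
# a space before the semicolon.
LINE_RE = re.compile(r"^([0-9A-F]+)(?:\.\.([0-9A-F]+))?\s*;\s*(\w+)")

def parse_line(line):
    """Return (first, last, category), or None for comment/blank lines."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    first = int(m.group(1), 16)
    last = int(m.group(2), 16) if m.group(2) else first
    return first, last, m.group(3)

def escaped_ranges(lines):
    """Merge adjacent ranges whose category is in group Z or C (assumption)."""
    ranges = sorted(
        (first, last)
        for first, last, cat in filter(None, map(parse_line, lines))
        if cat[0] in "ZC"
    )
    merged = []
    for first, last in ranges:
        if merged and first <= merged[-1][1] + 1:
            # Overlapping or directly adjacent: extend the previous range.
            merged[-1][1] = max(merged[-1][1], last)
        else:
            merged.append([first, last])
    return [tuple(r) for r in merged]

sample = [
    "0020          ; Zs #       SPACE",
    "0000..001F    ; Cc #  [32] <control-0000>..<control-001F>",
    "007F..009F    ; Cc #  [33] <control-007F>..<control-009F>",
    "00A0          ; Zs #       NO-BREAK SPACE",
    "100000..10FFFD; Co # [65534] <private-use-100000>..<private-use-10FFFD>",
    "0030..0039    ; Nd #  [10] DIGIT ZERO..DIGIT NINE",
    "# Total code points: 760",
]

for first, last in escaped_ranges(sample):
    print(f"{first:04X}..{last:04X}")
```

Merging adjacent ranges before emitting them is what keeps such a table small: only the points where the flag changes need to be stored, which matches the stated goal of preserving only a boolean flag to reduce the number of entries.
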
> +
> +# ================================================
> +
> +# General_Category=Dash_Punctuation
> +
> +002D ; Pd # HYPHEN-MINUS
> +058A ; Pd # ARMENIAN HYPHEN
> +05BE ; Pd # HEBREW PUNCTUATION MAQAF
> +1400 ; Pd # CANADIAN SYLLABICS HYPHEN
> +1806 ; Pd # MONGOLIAN TODO SOFT HYPHEN
> +2010..2015 ; Pd # [6] HYPHEN..HORIZONTAL BAR
> +2E17 ; Pd # DOUBLE OBLIQUE HYPHEN
> +2E1A ; Pd # HYPHEN WITH DIAERESIS
> +2E3A..2E3B ; Pd # [2] TWO-EM DASH..THREE-EM DASH
> +2E40 ; Pd # DOUBLE HYPHEN
> +2E5D ; Pd # OBLIQUE HYPHEN
> +301C ; Pd # WAVE DASH
> +3030 ; Pd # WAVY DASH
> +30A0 ; Pd # KATAKANA-HIRAGANA DOUBLE HYPHEN
> +FE31..FE32 ; Pd # [2] PRESENTATION FORM FOR VERTICAL EM DASH..PRESENTATION FORM FOR VERTICAL EN DASH
> +FE58 ; Pd # SMALL EM DASH
> +FE63 ; Pd # SMALL HYPHEN-MINUS
> +FF0D ; Pd # FULLWIDTH HYPHEN-MINUS
> +10D6E ; Pd # GARAY HYPHEN
> +10EAD ; Pd # YEZIDI HYPHENATION MARK
> +
> +# Total code points: 27
> +
> +# ================================================
> +
> +# General_Category=Open_Punctuation
> +
> +0028 ; Ps # LEFT PARENTHESIS
> +005B ; Ps # LEFT SQUARE BRACKET
> +007B ; Ps # LEFT CURLY BRACKET
> +0F3A ; Ps # TIBETAN MARK GUG RTAGS GYON
> +0F3C ; Ps # TIBETAN MARK ANG KHANG GYON
> +169B ; Ps # OGHAM FEATHER MARK
> +201A ; Ps # SINGLE LOW-9 QUOTATION MARK
> +201E ; Ps # DOUBLE LOW-9 QUOTATION MARK
> +2045 ; Ps # LEFT SQUARE BRACKET WITH QUILL
> +207D ; Ps # SUPERSCRIPT LEFT PARENTHESIS
> +208D ; Ps # SUBSCRIPT LEFT PARENTHESIS
> +2308 ; Ps # LEFT CEILING
> +230A ; Ps # LEFT FLOOR
> +2329 ; Ps # LEFT-POINTING ANGLE BRACKET
> +2768 ; Ps # MEDIUM LEFT PARENTHESIS ORNAMENT
> +276A ; Ps # MEDIUM FLATTENED LEFT PARENTHESIS ORNAMENT
> +276C ; Ps # MEDIUM LEFT-POINTING ANGLE BRACKET ORNAMENT
> +276E ; Ps # HEAVY LEFT-POINTING ANGLE QUOTATION MARK ORNAMENT
> +2770 ; Ps # HEAVY LEFT-POINTING ANGLE BRACKET ORNAMENT
> +2772 ; Ps # LIGHT LEFT TORTOISE SHELL BRACKET ORNAMENT
> +2774 ; Ps # MEDIUM LEFT CURLY BRACKET ORNAMENT
> +27C5 ; Ps # LEFT S-SHAPED BAG DELIMITER
> +27E6 ; Ps # MATHEMATICAL LEFT WHITE SQUARE BRACKET
> +27E8 ; Ps # MATHEMATICAL LEFT ANGLE BRACKET
> +27EA ; Ps # MATHEMATICAL LEFT DOUBLE ANGLE BRACKET
> +27EC ; Ps # MATHEMATICAL LEFT WHITE TORTOISE SHELL BRACKET
> +27EE ; Ps # MATHEMATICAL LEFT FLATTENED PARENTHESIS
> +2983 ; Ps # LEFT WHITE CURLY BRACKET
> +2985 ; Ps # LEFT WHITE PARENTHESIS
> +2987 ; Ps # Z NOTATION LEFT IMAGE BRACKET
> +2989 ; Ps # Z NOTATION LEFT BINDING BRACKET
> +298B ; Ps # LEFT SQUARE BRACKET WITH UNDERBAR
> +298D ; Ps # LEFT SQUARE BRACKET WITH TICK IN TOP CORNER
> +298F ; Ps # LEFT SQUARE BRACKET WITH TICK IN BOTTOM CORNER
> +2991 ; Ps # LEFT ANGLE BRACKET WITH DOT
> +2993 ; Ps # LEFT ARC LESS-THAN BRACKET
> +2995 ; Ps # DOUBLE LEFT ARC GREATER-THAN BRACKET
> +2997 ; Ps # LEFT BLACK TORTOISE SHELL BRACKET
> +29D8 ; Ps # LEFT WIGGLY FENCE
> +29DA ; Ps # LEFT DOUBLE WIGGLY FENCE
> +29FC ; Ps # LEFT-POINTING CURVED ANGLE BRACKET
> +2E22 ; Ps # TOP LEFT HALF BRACKET
> +2E24 ; Ps # BOTTOM LEFT HALF BRACKET
> +2E26 ; Ps # LEFT SIDEWAYS U BRACKET
> +2E28 ; Ps # LEFT DOUBLE PARENTHESIS
> +2E42 ; Ps # DOUBLE LOW-REVERSED-9 QUOTATION MARK
> +2E55 ; Ps # LEFT SQUARE BRACKET WITH STROKE
> +2E57 ; Ps # LEFT SQUARE BRACKET WITH DOUBLE STROKE
> +2E59 ; Ps # TOP HALF LEFT PARENTHESIS
> +2E5B ; Ps # BOTTOM HALF LEFT PARENTHESIS
> +3008 ; Ps # LEFT ANGLE BRACKET
> +300A ; Ps # LEFT DOUBLE ANGLE BRACKET
> +300C ; Ps # LEFT CORNER BRACKET
> +300E ; Ps # LEFT WHITE CORNER BRACKET
> +3010 ; Ps # LEFT BLACK LENTICULAR BRACKET
> +3014 ; Ps # LEFT TORTOISE SHELL BRACKET
> +3016 ; Ps # LEFT WHITE LENTICULAR BRACKET
> +3018 ; Ps # LEFT WHITE TORTOISE SHELL BRACKET
> +301A ; Ps # LEFT WHITE SQUARE BRACKET
> +301D ; Ps # REVERSED DOUBLE PRIME QUOTATION MARK
> +FD3F ; Ps # ORNATE RIGHT PARENTHESIS
> +FE17 ; Ps # PRESENTATION FORM FOR VERTICAL LEFT WHITE LENTICULAR BRACKET
> +FE35 ; Ps # PRESENTATION FORM FOR VERTICAL LEFT PARENTHESIS
> +FE37 ; Ps # PRESENTATION FORM FOR VERTICAL LEFT CURLY BRACKET
> +FE39 ; Ps # PRESENTATION FORM FOR VERTICAL LEFT TORTOISE SHELL BRACKET
> +FE3B ; Ps # PRESENTATION FORM FOR VERTICAL LEFT BLACK LENTICULAR BRACKET
> +FE3D ; Ps # PRESENTATION FORM FOR VERTICAL LEFT DOUBLE ANGLE BRACKET
> +FE3F ; Ps # PRESENTATION FORM FOR VERTICAL LEFT ANGLE BRACKET
> +FE41 ; Ps # PRESENTATION FORM FOR VERTICAL LEFT CORNER BRACKET
> +FE43 ; Ps # PRESENTATION FORM FOR VERTICAL LEFT WHITE CORNER BRACKET
> +FE47 ; Ps # PRESENTATION FORM FOR VERTICAL LEFT SQUARE BRACKET
> +FE59 ; Ps # SMALL LEFT PARENTHESIS
> +FE5B ; Ps # SMALL LEFT CURLY BRACKET
> +FE5D ; Ps # SMALL LEFT TORTOISE SHELL BRACKET
> +FF08 ; Ps # FULLWIDTH LEFT PARENTHESIS
> +FF3B ; Ps # FULLWIDTH LEFT SQUARE BRACKET
> +FF5B ; Ps # FULLWIDTH LEFT CURLY BRACKET
> +FF5F ; Ps # FULLWIDTH LEFT WHITE PARENTHESIS
> +FF62 ; Ps # HALFWIDTH LEFT CORNER BRACKET
> +
> +# Total code points: 79
> +
> +# ================================================
> +
> +# General_Category=Close_Punctuation
> +
> +0029 ; Pe # RIGHT PARENTHESIS
> +005D ; Pe # RIGHT SQUARE BRACKET
> +007D ; Pe # RIGHT CURLY BRACKET
> +0F3B ; Pe # TIBETAN MARK GUG RTAGS GYAS
> +0F3D ; Pe # TIBETAN MARK ANG KHANG GYAS
> +169C ; Pe # OGHAM REVERSED FEATHER MARK
> +2046 ; Pe # RIGHT SQUARE BRACKET WITH QUILL
> +207E ; Pe # SUPERSCRIPT RIGHT PARENTHESIS
> +208E ; Pe # SUBSCRIPT RIGHT PARENTHESIS
> +2309 ; Pe # RIGHT CEILING
> +230B ; Pe # RIGHT FLOOR
> +232A ; Pe # RIGHT-POINTING ANGLE BRACKET
> +2769 ; Pe # MEDIUM RIGHT PARENTHESIS ORNAMENT
> +276B ; Pe # MEDIUM FLATTENED RIGHT PARENTHESIS ORNAMENT
> +276D ; Pe # MEDIUM RIGHT-POINTING ANGLE BRACKET ORNAMENT
> +276F ; Pe # HEAVY RIGHT-POINTING ANGLE QUOTATION MARK ORNAMENT
> +2771 ; Pe # HEAVY RIGHT-POINTING ANGLE BRACKET ORNAMENT
> +2773 ; Pe # LIGHT RIGHT TORTOISE SHELL BRACKET ORNAMENT
> +2775 ; Pe # MEDIUM RIGHT CURLY BRACKET ORNAMENT
> +27C6 ; Pe # RIGHT S-SHAPED BAG DELIMITER
> +27E7 ; Pe # MATHEMATICAL RIGHT WHITE SQUARE BRACKET
> +27E9 ; Pe # MATHEMATICAL RIGHT ANGLE BRACKET
> +27EB ; Pe # MATHEMATICAL RIGHT DOUBLE ANGLE BRACKET
> +27ED ; Pe # MATHEMATICAL RIGHT WHITE TORTOISE SHELL BRACKET
> +27EF ; Pe # MATHEMATICAL RIGHT FLATTENED PARENTHESIS
> +2984 ; Pe # RIGHT WHITE CURLY BRACKET
> +2986 ; Pe # RIGHT WHITE PARENTHESIS
> +2988 ; Pe # Z NOTATION RIGHT IMAGE BRACKET
> +298A ; Pe # Z NOTATION RIGHT BINDING BRACKET
> +298C ; Pe # RIGHT SQUARE BRACKET WITH UNDERBAR
> +298E ; Pe # RIGHT SQUARE BRACKET WITH TICK IN BOTTOM CORNER
> +2990 ; Pe # RIGHT SQUARE BRACKET WITH TICK IN TOP CORNER
> +2992 ; Pe # RIGHT ANGLE BRACKET WITH DOT
> +2994 ; Pe # RIGHT ARC GREATER-THAN BRACKET
> +2996 ; Pe # DOUBLE RIGHT ARC LESS-THAN BRACKET
> +2998 ; Pe # RIGHT BLACK TORTOISE SHELL BRACKET
> +29D9 ; Pe # RIGHT WIGGLY FENCE
> +29DB ; Pe # RIGHT DOUBLE WIGGLY FENCE
> +29FD ; Pe # RIGHT-POINTING CURVED ANGLE BRACKET
> +2E23 ; Pe # TOP RIGHT HALF BRACKET
> +2E25 ; Pe # BOTTOM RIGHT HALF BRACKET
> +2E27 ; Pe # RIGHT SIDEWAYS U BRACKET
> +2E29 ; Pe # RIGHT DOUBLE PARENTHESIS
> +2E56 ; Pe # RIGHT SQUARE BRACKET WITH STROKE
> +2E58 ; Pe # RIGHT SQUARE BRACKET WITH DOUBLE STROKE
> +2E5A ; Pe # TOP HALF RIGHT PARENTHESIS
> +2E5C ; Pe # BOTTOM HALF RIGHT PARENTHESIS
> +3009 ; Pe # RIGHT ANGLE BRACKET
> +300B ; Pe # RIGHT DOUBLE ANGLE BRACKET
> +300D ; Pe # RIGHT CORNER BRACKET
> +300F ; Pe # RIGHT WHITE CORNER BRACKET
> +3011 ; Pe # RIGHT BLACK LENTICULAR BRACKET
> +3015 ; Pe # RIGHT TORTOISE SHELL BRACKET
> +3017 ; Pe # RIGHT WHITE LENTICULAR BRACKET
> +3019 ; Pe # RIGHT WHITE TORTOISE SHELL BRACKET
> +301B ; Pe # RIGHT WHITE SQUARE BRACKET
> +301E..301F ; Pe # [2] DOUBLE PRIME QUOTATION MARK..LOW DOUBLE PRIME QUOTATION MARK
> +FD3E ; Pe # ORNATE LEFT PARENTHESIS
> +FE18 ; Pe # PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET
> +FE36 ; Pe # PRESENTATION FORM FOR VERTICAL RIGHT PARENTHESIS
> +FE38 ; Pe # PRESENTATION FORM FOR VERTICAL RIGHT CURLY BRACKET
> +FE3A ; Pe # PRESENTATION FORM FOR VERTICAL RIGHT TORTOISE SHELL BRACKET
> +FE3C ; Pe # PRESENTATION FORM FOR VERTICAL RIGHT BLACK LENTICULAR BRACKET
> +FE3E ; Pe # PRESENTATION FORM FOR VERTICAL RIGHT DOUBLE ANGLE BRACKET
> +FE40 ; Pe # PRESENTATION FORM FOR VERTICAL RIGHT ANGLE BRACKET
> +FE42 ; Pe # PRESENTATION FORM FOR VERTICAL RIGHT CORNER BRACKET
> +FE44 ; Pe # PRESENTATION FORM FOR VERTICAL RIGHT WHITE CORNER BRACKET
> +FE48 ; Pe # PRESENTATION FORM FOR VERTICAL RIGHT SQUARE BRACKET
> +FE5A ; Pe # SMALL RIGHT PARENTHESIS
> +FE5C ; Pe # SMALL RIGHT CURLY BRACKET
> +FE5E ; Pe # SMALL RIGHT TORTOISE SHELL BRACKET
> +FF09 ; Pe # FULLWIDTH RIGHT PARENTHESIS
> +FF3D ; Pe # FULLWIDTH RIGHT SQUARE BRACKET
> +FF5D ; Pe # FULLWIDTH RIGHT CURLY BRACKET
> +FF60 ; Pe # FULLWIDTH RIGHT WHITE PARENTHESIS
> +FF63 ; Pe # HALFWIDTH RIGHT CORNER BRACKET
> +
> +# Total code points: 77
> +
> +# ================================================
> +
> +# General_Category=Connector_Punctuation
> +
> +005F ; Pc # LOW LINE
> +203F..2040 ; Pc # [2] UNDERTIE..CHARACTER TIE
> +2054 ; Pc # INVERTED UNDERTIE
> +FE33..FE34 ; Pc # [2] PRESENTATION FORM FOR VERTICAL LOW LINE..PRESENTATION FORM FOR VERTICAL WAVY LOW LINE
> +FE4D..FE4F ; Pc # [3] DASHED LOW LINE..WAVY LOW LINE
> +FF3F ; Pc # FULLWIDTH LOW LINE
> +
> +# Total code points: 10
> +
> +# ================================================
> +
> +# General_Category=Other_Punctuation
> +
> +0021..0023 ; Po # [3] EXCLAMATION MARK..NUMBER SIGN
> +0025..0027 ; Po # [3] PERCENT SIGN..APOSTROPHE
> +002A ; Po # ASTERISK
> +002C ; Po # COMMA
> +002E..002F ; Po # [2] FULL STOP..SOLIDUS
> +003A..003B ; Po # [2] COLON..SEMICOLON
> +003F..0040 ; Po # [2] QUESTION MARK..COMMERCIAL AT
> +005C ; Po # REVERSE SOLIDUS
> +00A1 ; Po # INVERTED EXCLAMATION MARK
> +00A7 ; Po # SECTION SIGN
> +00B6..00B7 ; Po # [2] PILCROW SIGN..MIDDLE DOT
> +00BF ; Po # INVERTED QUESTION MARK
> +037E ; Po # GREEK QUESTION MARK
> +0387 ; Po # GREEK ANO TELEIA
> +055A..055F ; Po # [6] ARMENIAN APOSTROPHE..ARMENIAN ABBREVIATION MARK
> +0589 ; Po # ARMENIAN FULL STOP
> +05C0 ; Po # HEBREW PUNCTUATION PASEQ
> +05C3 ; Po # HEBREW PUNCTUATION SOF PASUQ
> +05C6 ; Po # HEBREW PUNCTUATION NUN HAFUKHA
> +05F3..05F4 ; Po # [2] HEBREW PUNCTUATION GERESH..HEBREW PUNCTUATION GERSHAYIM
> +0609..060A ; Po # [2] ARABIC-INDIC PER MILLE SIGN..ARABIC-INDIC PER TEN THOUSAND SIGN
> +060C..060D ; Po # [2] ARABIC COMMA..ARABIC DATE SEPARATOR
> +061B ; Po # ARABIC SEMICOLON
> +061D..061F ; Po # [3] ARABIC END OF TEXT MARK..ARABIC QUESTION MARK
> +066A..066D ; Po # [4] ARABIC PERCENT SIGN..ARABIC FIVE POINTED STAR
> +06D4 ; Po # ARABIC FULL STOP
> +0700..070D ; Po # [14] SYRIAC END OF PARAGRAPH..SYRIAC HARKLEAN ASTERISCUS
> +07F7..07F9 ; Po # [3] NKO SYMBOL GBAKURUNEN..NKO EXCLAMATION MARK
> +0830..083E ; Po # [15] SAMARITAN PUNCTUATION NEQUDAA..SAMARITAN PUNCTUATION ANNAAU
> +085E ; Po # MANDAIC PUNCTUATION
> +0964..0965 ; Po # [2] DEVANAGARI DANDA..DEVANAGARI DOUBLE DANDA
> +0970 ; Po # DEVANAGARI ABBREVIATION SIGN
> +09FD ; Po # BENGALI ABBREVIATION SIGN
> +0A76 ; Po # GURMUKHI ABBREVIATION SIGN
> +0AF0 ; Po # GUJARATI ABBREVIATION SIGN
> +0C77 ; Po # TELUGU SIGN SIDDHAM
> +0C84 ; Po # KANNADA SIGN SIDDHAM
> +0DF4 ; Po # SINHALA PUNCTUATION KUNDDALIYA
> +0E4F ; Po # THAI CHARACTER FONGMAN
> +0E5A..0E5B ; Po # [2] THAI CHARACTER ANGKHANKHU..THAI CHARACTER KHOMUT
> +0F04..0F12 ; Po # [15] TIBETAN MARK INITIAL YIG MGO MDUN MA..TIBETAN MARK RGYA GRAM SHAD
> +0F14 ; Po # TIBETAN MARK GTER TSHEG
> +0F85 ; Po # TIBETAN MARK PALUTA
> +0FD0..0FD4 ; Po # [5] TIBETAN MARK BSKA- SHOG GI MGO RGYAN..TIBETAN MARK CLOSING BRDA RNYING YIG MGO SGAB MA
> +0FD9..0FDA ; Po # [2] TIBETAN MARK LEADING MCHAN RTAGS..TIBETAN MARK TRAILING MCHAN RTAGS
> +104A..104F ; Po # [6] MYANMAR SIGN LITTLE SECTION..MYANMAR SYMBOL GENITIVE
> +10FB ; Po # GEORGIAN PARAGRAPH SEPARATOR
> +1360..1368 ; Po # [9] ETHIOPIC SECTION MARK..ETHIOPIC PARAGRAPH SEPARATOR
> +166E ; Po # CANADIAN SYLLABICS FULL STOP
> +16EB..16ED ; Po # [3] RUNIC SINGLE PUNCTUATION..RUNIC CROSS PUNCTUATION
> +1735..1736 ; Po # [2] PHILIPPINE SINGLE PUNCTUATION..PHILIPPINE DOUBLE PUNCTUATION
> +17D4..17D6 ; Po # [3] KHMER SIGN KHAN..KHMER SIGN CAMNUC PII KUUH
> +17D8..17DA ; Po # [3] KHMER SIGN BEYYAL..KHMER SIGN KOOMUUT
> +1800..1805 ; Po # [6] MONGOLIAN BIRGA..MONGOLIAN FOUR DOTS
> +1807..180A ; Po # [4] MONGOLIAN SIBE SYLLABLE BOUNDARY MARKER..MONGOLIAN NIRUGU
> +1944..1945 ; Po # [2] LIMBU EXCLAMATION MARK..LIMBU QUESTION MARK
> +1A1E..1A1F ; Po # [2] BUGINESE PALLAWA..BUGINESE END OF SECTION
> +1AA0..1AA6 ; Po # [7] TAI THAM SIGN WIANG..TAI THAM SIGN REVERSED ROTATED RANA
> +1AA8..1AAD ; Po # [6] TAI THAM SIGN KAAN..TAI THAM SIGN CAANG
> +1B4E..1B4F ; Po # [2] BALINESE INVERTED CARIK SIKI..BALINESE INVERTED CARIK PAREREN
> +1B5A..1B60 ; Po # [7] BALINESE PANTI..BALINESE PAMENENG
> +1B7D..1B7F ; Po # [3] BALINESE PANTI LANTANG..BALINESE PANTI BAWAK
> +1BFC..1BFF ; Po # [4] BATAK SYMBOL BINDU NA METEK..BATAK SYMBOL BINDU PANGOLAT
> +1C3B..1C3F ; Po # [5] LEPCHA PUNCTUATION TA-ROL..LEPCHA PUNCTUATION TSHOOK
> +1C7E..1C7F ; Po # [2] OL CHIKI PUNCTUATION MUCAAD..OL CHIKI PUNCTUATION DOUBLE MUCAAD
> +1CC0..1CC7 ; Po # [8] SUNDANESE PUNCTUATION BINDU SURYA..SUNDANESE PUNCTUATION BINDU BA SATANGA
> +1CD3 ; Po # VEDIC SIGN NIHSHVASA
> +2016..2017 ; Po # [2] DOUBLE VERTICAL LINE..DOUBLE LOW LINE
> +2020..2027 ; Po # [8] DAGGER..HYPHENATION POINT
> +2030..2038 ; Po # [9] PER MILLE SIGN..CARET
> +203B..203E ; Po # [4] REFERENCE MARK..OVERLINE
> +2041..2043 ; Po # [3] CARET INSERTION POINT..HYPHEN BULLET
> +2047..2051 ; Po # [11] DOUBLE QUESTION MARK..TWO ASTERISKS ALIGNED VERTICALLY
> +2053 ; Po # SWUNG DASH
> +2055..205E ; Po # [10] FLOWER PUNCTUATION MARK..VERTICAL FOUR DOTS
> +2CF9..2CFC ; Po # [4] COPTIC OLD NUBIAN FULL STOP..COPTIC OLD NUBIAN VERSE DIVIDER
> +2CFE..2CFF ; Po # [2] COPTIC FULL STOP..COPTIC MORPHOLOGICAL DIVIDER
> +2D70 ; Po # TIFINAGH SEPARATOR MARK
> +2E00..2E01 ; Po # [2] RIGHT ANGLE SUBSTITUTION MARKER..RIGHT ANGLE DOTTED SUBSTITUTION MARKER
> +2E06..2E08 ; Po # [3] RAISED INTERPOLATION MARKER..DOTTED TRANSPOSITION MARKER
> +2E0B ; Po # RAISED SQUARE
> +2E0E..2E16 ; Po # [9] EDITORIAL CORONIS..DOTTED RIGHT-POINTING ANGLE
> +2E18..2E19 ; Po # [2] INVERTED INTERROBANG..PALM BRANCH
> +2E1B ; Po # TILDE WITH RING ABOVE
> +2E1E..2E1F ; Po # [2] TILDE WITH DOT ABOVE..TILDE WITH DOT BELOW
> +2E2A..2E2E ; Po # [5] TWO DOTS OVER ONE DOT PUNCTUATION..REVERSED QUESTION MARK
> +2E30..2E39 ; Po # [10] RING POINT..TOP HALF SECTION SIGN
> +2E3C..2E3F ; Po # [4] STENOGRAPHIC FULL STOP..CAPITULUM
> +2E41 ; Po # REVERSED COMMA
> +2E43..2E4F ; Po # [13] DASH WITH LEFT UPTURN..CORNISH VERSE DIVIDER
> +2E52..2E54 ; Po # [3] TIRONIAN SIGN CAPITAL ET..MEDIEVAL QUESTION MARK
> +3001..3003 ; Po # [3] IDEOGRAPHIC COMMA..DITTO MARK
> +303D ; Po # PART ALTERNATION MARK
> +30FB ; Po # KATAKANA MIDDLE DOT
> +A4FE..A4FF ; Po # [2] LISU PUNCTUATION COMMA..LISU PUNCTUATION FULL STOP
> +A60D..A60F ; Po # [3] VAI COMMA..VAI QUESTION MARK
> +A673 ; Po # SLAVONIC ASTERISK
> +A67E ; Po # CYRILLIC KAVYKA
> +A6F2..A6F7 ; Po # [6] BAMUM NJAEMLI..BAMUM QUESTION MARK
> +A874..A877 ; Po # [4] PHAGS-PA SINGLE HEAD MARK..PHAGS-PA MARK DOUBLE SHAD
> +A8CE..A8CF ; Po # [2] SAURASHTRA DANDA..SAURASHTRA DOUBLE DANDA
> +A8F8..A8FA ; Po # [3] DEVANAGARI SIGN PUSHPIKA..DEVANAGARI CARET
> +A8FC ; Po # DEVANAGARI SIGN SIDDHAM
> +A92E..A92F ; Po # [2] KAYAH LI SIGN CWI..KAYAH LI SIGN SHYA
> +A95F ; Po # REJANG SECTION MARK
> +A9C1..A9CD ; Po # [13] JAVANESE LEFT RERENGGAN..JAVANESE TURNED PADA PISELEH
> +A9DE..A9DF ; Po # [2] JAVANESE PADA TIRTA TUMETES..JAVANESE PADA ISEN-ISEN
> +AA5C..AA5F ; Po # [4] CHAM PUNCTUATION SPIRAL..CHAM PUNCTUATION TRIPLE DANDA
> +AADE..AADF ; Po # [2] TAI VIET SYMBOL HO HOI..TAI VIET SYMBOL KOI KOI
> +AAF0..AAF1 ; Po # [2] MEETEI MAYEK CHEIKHAN..MEETEI MAYEK AHANG KHUDAM
> +ABEB ; Po # MEETEI MAYEK CHEIKHEI
> +FE10..FE16 ; Po # [7] PRESENTATION FORM FOR VERTICAL COMMA..PRESENTATION FORM FOR VERTICAL QUESTION MARK
> +FE19 ; Po # PRESENTATION FORM FOR VERTICAL HORIZONTAL ELLIPSIS
> +FE30 ; Po # PRESENTATION FORM FOR VERTICAL TWO DOT LEADER
> +FE45..FE46 ; Po # [2] SESAME DOT..WHITE SESAME DOT
> +FE49..FE4C ; Po # [4] DASHED OVERLINE..DOUBLE WAVY OVERLINE
> +FE50..FE52 ; Po # [3] SMALL COMMA..SMALL FULL STOP
> +FE54..FE57 ; Po # [4] SMALL SEMICOLON..SMALL EXCLAMATION MARK
> +FE5F..FE61 ; Po # [3] SMALL NUMBER SIGN..SMALL ASTERISK
> +FE68 ; Po # SMALL REVERSE SOLIDUS
> +FE6A..FE6B ; Po # [2] SMALL PERCENT SIGN..SMALL COMMERCIAL AT
> +FF01..FF03 ; Po # [3] FULLWIDTH EXCLAMATION MARK..FULLWIDTH NUMBER SIGN
> +FF05..FF07 ; Po # [3] FULLWIDTH PERCENT SIGN..FULLWIDTH APOSTROPHE
> +FF0A ; Po # FULLWIDTH ASTERISK
> +FF0C ; Po # FULLWIDTH COMMA
> +FF0E..FF0F ; Po # [2] FULLWIDTH FULL STOP..FULLWIDTH SOLIDUS
> +FF1A..FF1B ; Po # [2] FULLWIDTH COLON..FULLWIDTH SEMICOLON
> +FF1F..FF20 ; Po # [2] FULLWIDTH QUESTION MARK..FULLWIDTH COMMERCIAL AT
> +FF3C ; Po # FULLWIDTH REVERSE SOLIDUS
> +FF61 ; Po # HALFWIDTH IDEOGRAPHIC FULL STOP
> +FF64..FF65 ; Po # [2] HALFWIDTH IDEOGRAPHIC COMMA..HALFWIDTH KATAKANA MIDDLE DOT
> +10100..10102 ; Po # [3] AEGEAN WORD SEPARATOR LINE..AEGEAN CHECK MARK
> +1039F ; Po # UGARITIC WORD DIVIDER
> +103D0 ; Po # OLD PERSIAN WORD DIVIDER
> +1056F ; Po # CAUCASIAN ALBANIAN CITATION MARK
> +10857 ; Po # IMPERIAL ARAMAIC SECTION SIGN
> +1091F ; Po # PHOENICIAN WORD SEPARATOR
> +1093F ; Po # LYDIAN TRIANGULAR MARK
> +10A50..10A58 ; Po # [9] KHAROSHTHI PUNCTUATION DOT..KHAROSHTHI PUNCTUATION LINES
> +10A7F ; Po # OLD SOUTH ARABIAN NUMERIC INDICATOR
> +10AF0..10AF6 ; Po # [7] MANICHAEAN PUNCTUATION STAR..MANICHAEAN PUNCTUATION LINE FILLER
> +10B39..10B3F ; Po # [7] AVESTAN ABBREVIATION MARK..LARGE ONE RING OVER TWO RINGS PUNCTUATION
> +10B99..10B9C ; Po # [4] PSALTER PAHLAVI SECTION MARK..PSALTER PAHLAVI FOUR DOTS WITH DOT
> +10F55..10F59 ; Po # [5] SOGDIAN PUNCTUATION TWO VERTICAL BARS..SOGDIAN PUNCTUATION HALF CIRCLE WITH DOT
> +10F86..10F89 ; Po # [4] OLD UYGHUR PUNCTUATION BAR..OLD UYGHUR PUNCTUATION FOUR DOTS
> +11047..1104D ; Po # [7] BRAHMI DANDA..BRAHMI PUNCTUATION LOTUS
> +110BB..110BC ; Po # [2] KAITHI ABBREVIATION SIGN..KAITHI ENUMERATION SIGN
> +110BE..110C1 ; Po # [4] KAITHI SECTION MARK..KAITHI DOUBLE DANDA
> +11140..11143 ; Po # [4] CHAKMA SECTION MARK..CHAKMA QUESTION MARK
> +11174..11175 ; Po # [2] MAHAJANI ABBREVIATION SIGN..MAHAJANI SECTION MARK
> +111C5..111C8 ; Po # [4] SHARADA DANDA..SHARADA SEPARATOR
> +111CD ; Po # SHARADA SUTRA MARK
> +111DB ; Po # SHARADA SIGN SIDDHAM
> +111DD..111DF ; Po # [3] SHARADA CONTINUATION SIGN..SHARADA SECTION MARK-2
> +11238..1123D ; Po # [6] KHOJKI DANDA..KHOJKI ABBREVIATION SIGN
> +112A9 ; Po # MULTANI SECTION MARK
> +113D4..113D5 ; Po # [2] TULU-TIGALARI DANDA..TULU-TIGALARI DOUBLE DANDA
> +113D7..113D8 ; Po # [2] TULU-TIGALARI SIGN OM PUSHPIKA..TULU-TIGALARI SIGN SHRII PUSHPIKA
> +1144B..1144F ; Po # [5] NEWA DANDA..NEWA ABBREVIATION SIGN
> +1145A..1145B ; Po # [2] NEWA DOUBLE COMMA..NEWA PLACEHOLDER MARK
> +1145D ; Po # NEWA INSERTION SIGN
> +114C6 ; Po # TIRHUTA ABBREVIATION SIGN
> +115C1..115D7 ; Po # [23] SIDDHAM SIGN SIDDHAM..SIDDHAM SECTION MARK WITH CIRCLES AND FOUR ENCLOSURES
> +11641..11643 ; Po # [3] MODI DANDA..MODI ABBREVIATION SIGN
> +11660..1166C ; Po # [13] MONGOLIAN BIRGA WITH ORNAMENT..MONGOLIAN TURNED SWIRL BIRGA WITH DOUBLE ORNAMENT
> +116B9 ; Po # TAKRI ABBREVIATION SIGN
> +1173C..1173E ; Po # [3] AHOM SIGN SMALL SECTION..AHOM SIGN RULAI
> +1183B ; Po # DOGRA ABBREVIATION SIGN
> +11944..11946 ; Po # [3] DIVES AKURU DOUBLE DANDA..DIVES AKURU END OF TEXT MARK
> +119E2 ; Po # NANDINAGARI SIGN SIDDHAM
> +11A3F..11A46 ; Po # [8] ZANABAZAR SQUARE INITIAL HEAD MARK..ZANABAZAR SQUARE CLOSING DOUBLE-LINED HEAD MARK
> +11A9A..11A9C ; Po # [3] SOYOMBO MARK TSHEG..SOYOMBO MARK DOUBLE SHAD
> +11A9E..11AA2 ; Po # [5] SOYOMBO HEAD MARK WITH MOON AND SUN AND TRIPLE FLAME..SOYOMBO TERMINAL MARK-2
> +11B00..11B09 ; Po # [10] DEVANAGARI HEAD MARK..DEVANAGARI SIGN MINDU
> +11BE1 ; Po # SUNUWAR SIGN PVO
> +11C41..11C45 ; Po # [5] BHAIKSUKI DANDA..BHAIKSUKI GAP FILLER-2
> +11C70..11C71 ; Po # [2] MARCHEN HEAD MARK..MARCHEN MARK SHAD
> +11EF7..11EF8 ; Po # [2] MAKASAR PASSIMBANG..MAKASAR END OF SECTION
> +11F43..11F4F ; Po # [13] KAWI DANDA..KAWI PUNCTUATION CLOSING SPIRAL
> +11FFF ; Po # TAMIL PUNCTUATION END OF TEXT
> +12470..12474 ; Po # [5] CUNEIFORM PUNCTUATION SIGN OLD ASSYRIAN WORD DIVIDER..CUNEIFORM PUNCTUATION SIGN DIAGONAL QUADCOLON
> +12FF1..12FF2 ; Po # [2] CYPRO-MINOAN SIGN CM301..CYPRO-MINOAN SIGN CM302
> +16A6E..16A6F ; Po # [2] MRO DANDA..MRO DOUBLE DANDA
> +16AF5 ; Po # BASSA VAH FULL STOP
> +16B37..16B3B ; Po # [5] PAHAWH HMONG SIGN VOS THOM..PAHAWH HMONG SIGN VOS FEEM
> +16B44 ; Po # PAHAWH HMONG SIGN XAUS
> +16D6D..16D6F ; Po # [3] KIRAT RAI SIGN YUPI..KIRAT RAI DOUBLE DANDA
> +16E97..16E9A ; Po # [4] MEDEFAIDRIN COMMA..MEDEFAIDRIN EXCLAMATION OH
> +16FE2 ; Po # OLD CHINESE HOOK MARK
> +1BC9F ; Po # DUPLOYAN PUNCTUATION CHINOOK FULL STOP
> +1DA87..1DA8B ; Po # [5] SIGNWRITING COMMA..SIGNWRITING PARENTHESIS
> +1E5FF ; Po # OL ONAL ABBREVIATION SIGN
> +1E95E..1E95F ; Po # [2] ADLAM INITIAL EXCLAMATION MARK..ADLAM INITIAL QUESTION MARK
> +
> +# Total code points: 640
> +
> +# ================================================
> +
> +# General_Category=Math_Symbol
> +
> +002B ; Sm # PLUS SIGN
> +003C..003E ; Sm # [3] LESS-THAN SIGN..GREATER-THAN SIGN
> +007C ; Sm # VERTICAL LINE
> +007E ; Sm # TILDE
> +00AC ; Sm # NOT SIGN
> +00B1 ; Sm # PLUS-MINUS SIGN
> +00D7 ; Sm # MULTIPLICATION SIGN
> +00F7 ; Sm # DIVISION SIGN
> +03F6 ; Sm # GREEK REVERSED LUNATE EPSILON SYMBOL
> +0606..0608 ; Sm # [3] ARABIC-INDIC CUBE ROOT..ARABIC RAY
> +2044 ; Sm # FRACTION SLASH
> +2052 ; Sm # COMMERCIAL MINUS SIGN
> +207A..207C ; Sm # [3] SUPERSCRIPT PLUS SIGN..SUPERSCRIPT EQUALS SIGN
> +208A..208C ; Sm # [3] SUBSCRIPT PLUS SIGN..SUBSCRIPT EQUALS SIGN
> +2118 ; Sm # SCRIPT CAPITAL P
> +2140..2144 ; Sm # [5] DOUBLE-STRUCK N-ARY SUMMATION..TURNED SANS-SERIF CAPITAL Y
> +214B ; Sm # TURNED AMPERSAND
> +2190..2194 ; Sm # [5] LEFTWARDS ARROW..LEFT RIGHT ARROW
> +219A..219B ; Sm # [2] LEFTWARDS ARROW WITH STROKE..RIGHTWARDS ARROW WITH STROKE
> +21A0 ; Sm # RIGHTWARDS TWO HEADED ARROW
> +21A3 ; Sm # RIGHTWARDS ARROW WITH TAIL
> +21A6 ; Sm # RIGHTWARDS ARROW FROM BAR
> +21AE ; Sm # LEFT RIGHT ARROW WITH STROKE
> +21CE..21CF ; Sm # [2] LEFT RIGHT DOUBLE ARROW WITH STROKE..RIGHTWARDS DOUBLE ARROW WITH STROKE
> +21D2 ; Sm # RIGHTWARDS DOUBLE ARROW
> +21D4 ; Sm # LEFT RIGHT DOUBLE ARROW
> +21F4..22FF ; Sm # [268] RIGHT ARROW WITH SMALL CIRCLE..Z NOTATION BAG MEMBERSHIP
> +2320..2321 ; Sm # [2] TOP HALF INTEGRAL..BOTTOM HALF INTEGRAL
> +237C ; Sm # RIGHT ANGLE WITH DOWNWARDS ZIGZAG ARROW
> +239B..23B3 ; Sm # [25] LEFT PARENTHESIS UPPER HOOK..SUMMATION BOTTOM
> +23DC..23E1 ; Sm # [6] TOP PARENTHESIS..BOTTOM TORTOISE SHELL BRACKET
> +25B7 ; Sm # WHITE RIGHT-POINTING TRIANGLE
> +25C1 ; Sm # WHITE LEFT-POINTING TRIANGLE
> +25F8..25FF ; Sm # [8] UPPER LEFT TRIANGLE..LOWER RIGHT TRIANGLE
> +266F ; Sm # MUSIC SHARP SIGN
> +27C0..27C4 ; Sm # [5] THREE DIMENSIONAL ANGLE..OPEN SUPERSET
> +27C7..27E5 ; Sm # [31] OR WITH DOT INSIDE..WHITE SQUARE WITH RIGHTWARDS TICK
> +27F0..27FF ; Sm # [16] UPWARDS QUADRUPLE ARROW..LONG RIGHTWARDS SQUIGGLE ARROW
> +2900..2982 ; Sm # [131] RIGHTWARDS TWO-HEADED ARROW WITH VERTICAL STROKE..Z NOTATION TYPE COLON
> +2999..29D7 ; Sm # [63] DOTTED FENCE..BLACK HOURGLASS
> +29DC..29FB ; Sm # [32] INCOMPLETE INFINITY..TRIPLE PLUS
> +29FE..2AFF ; Sm # [258] TINY..N-ARY WHITE VERTICAL BAR
> +2B30..2B44 ; Sm # [21] LEFT ARROW WITH SMALL CIRCLE..RIGHTWARDS ARROW THROUGH SUPERSET
> +2B47..2B4C ; Sm # [6] REVERSE TILDE OPERATOR ABOVE RIGHTWARDS ARROW..RIGHTWARDS ARROW ABOVE REVERSE TILDE OPERATOR
> +FB29 ; Sm # HEBREW LETTER ALTERNATIVE PLUS SIGN
> +FE62 ; Sm # SMALL PLUS SIGN
> +FE64..FE66 ; Sm # [3] SMALL LESS-THAN SIGN..SMALL EQUALS SIGN
> +FF0B ; Sm # FULLWIDTH PLUS SIGN
> +FF1C..FF1E ; Sm # [3] FULLWIDTH LESS-THAN SIGN..FULLWIDTH GREATER-THAN SIGN
> +FF5C ; Sm # FULLWIDTH VERTICAL LINE
> +FF5E ; Sm # FULLWIDTH TILDE
> +FFE2 ; Sm # FULLWIDTH NOT SIGN
> +FFE9..FFEC ; Sm # [4] HALFWIDTH LEFTWARDS ARROW..HALFWIDTH DOWNWARDS ARROW
> +10D8E..10D8F ; Sm # [2] GARAY PLUS SIGN..GARAY MINUS SIGN
> +1D6C1 ; Sm # MATHEMATICAL BOLD NABLA
> +1D6DB ; Sm # MATHEMATICAL BOLD PARTIAL DIFFERENTIAL
> +1D6FB ; Sm # MATHEMATICAL ITALIC NABLA
> +1D715 ; Sm # MATHEMATICAL ITALIC PARTIAL DIFFERENTIAL
> +1D735 ; Sm # MATHEMATICAL BOLD ITALIC NABLA
> +1D74F ; Sm # MATHEMATICAL BOLD ITALIC PARTIAL DIFFERENTIAL
> +1D76F ; Sm # MATHEMATICAL SANS-SERIF BOLD NABLA
> +1D789 ; Sm # MATHEMATICAL SANS-SERIF BOLD PARTIAL DIFFERENTIAL
> +1D7A9 ; Sm # MATHEMATICAL SANS-SERIF BOLD ITALIC NABLA
> +1D7C3 ; Sm # MATHEMATICAL SANS-SERIF BOLD ITALIC PARTIAL DIFFERENTIAL
> +1EEF0..1EEF1 ; Sm # [2] ARABIC MATHEMATICAL OPERATOR MEEM WITH HAH WITH TATWEEL..ARABIC MATHEMATICAL OPERATOR HAH WITH DAL
> +
> +# Total code points: 950
> +
> +# ================================================
> +
> +# General_Category=Currency_Symbol
> +
> +0024 ; Sc # DOLLAR SIGN
> +00A2..00A5 ; Sc # [4] CENT SIGN..YEN SIGN
> +058F ; Sc # ARMENIAN DRAM SIGN
> +060B ; Sc # AFGHANI SIGN
> +07FE..07FF ; Sc # [2] NKO DOROME SIGN..NKO TAMAN SIGN
> +09F2..09F3 ; Sc # [2] BENGALI RUPEE MARK..BENGALI RUPEE SIGN
> +09FB ; Sc # BENGALI GANDA MARK
> +0AF1 ; Sc # GUJARATI RUPEE SIGN
> +0BF9 ; Sc # TAMIL RUPEE SIGN
> +0E3F ; Sc # THAI CURRENCY SYMBOL BAHT
> +17DB ; Sc # KHMER CURRENCY SYMBOL RIEL
> +20A0..20C0 ; Sc # [33] EURO-CURRENCY SIGN..SOM SIGN
> +A838 ; Sc # NORTH INDIC RUPEE MARK
> +FDFC ; Sc # RIAL SIGN
> +FE69 ; Sc # SMALL DOLLAR SIGN
> +FF04 ; Sc # FULLWIDTH DOLLAR SIGN
> +FFE0..FFE1 ; Sc # [2] FULLWIDTH CENT SIGN..FULLWIDTH POUND SIGN
> +FFE5..FFE6 ; Sc # [2] FULLWIDTH YEN SIGN..FULLWIDTH WON SIGN
> +11FDD..11FE0 ; Sc # [4] TAMIL SIGN KAACU..TAMIL SIGN VARAAKAN
> +1E2FF ; Sc # WANCHO NGUN SIGN
> +1ECB0 ; Sc # INDIC SIYAQ RUPEE MARK
> +
> +# Total code points: 63
> +
> +# ================================================
> +
> +# General_Category=Modifier_Symbol
> +
> +005E ; Sk # CIRCUMFLEX ACCENT
> +0060 ; Sk # GRAVE ACCENT
> +00A8 ; Sk # DIAERESIS
> +00AF ; Sk # MACRON
> +00B4 ; Sk # ACUTE ACCENT
> +00B8 ; Sk # CEDILLA
> +02C2..02C5 ; Sk # [4] MODIFIER LETTER LEFT ARROWHEAD..MODIFIER LETTER DOWN ARROWHEAD
> +02D2..02DF ; Sk # [14] MODIFIER LETTER CENTRED RIGHT HALF RING..MODIFIER LETTER CROSS ACCENT
> +02E5..02EB ; Sk # [7] MODIFIER LETTER EXTRA-HIGH TONE BAR..MODIFIER LETTER YANG DEPARTING TONE MARK
> +02ED ; Sk # MODIFIER LETTER UNASPIRATED
> +02EF..02FF ; Sk # [17] MODIFIER LETTER LOW DOWN ARROWHEAD..MODIFIER LETTER LOW LEFT ARROW
> +0375 ; Sk # GREEK LOWER NUMERAL SIGN
> +0384..0385 ; Sk # [2] GREEK TONOS..GREEK DIALYTIKA TONOS
> +0888 ; Sk # ARABIC RAISED ROUND DOT
> +1FBD ; Sk # GREEK KORONIS
> +1FBF..1FC1 ; Sk # [3] GREEK PSILI..GREEK DIALYTIKA AND PERISPOMENI
> +1FCD..1FCF ; Sk # [3] GREEK PSILI AND VARIA..GREEK PSILI AND PERISPOMENI
> +1FDD..1FDF ; Sk # [3] GREEK DASIA AND VARIA..GREEK DASIA AND PERISPOMENI
> +1FED..1FEF ; Sk # [3] GREEK DIALYTIKA AND VARIA..GREEK VARIA
> +1FFD..1FFE ; Sk # [2] GREEK OXIA..GREEK DASIA
> +309B..309C ; Sk # [2] KATAKANA-HIRAGANA VOICED SOUND MARK..KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
> +A700..A716 ; Sk # [23] MODIFIER LETTER CHINESE TONE YIN PING..MODIFIER LETTER EXTRA-LOW LEFT-STEM TONE BAR
> +A720..A721 ; Sk # [2] MODIFIER LETTER STRESS AND HIGH TONE..MODIFIER LETTER STRESS AND LOW TONE
> +A789..A78A ; Sk # [2] MODIFIER LETTER COLON..MODIFIER LETTER SHORT EQUALS SIGN
> +AB5B ; Sk # MODIFIER BREVE WITH INVERTED BREVE
> +AB6A..AB6B ; Sk # [2] MODIFIER LETTER LEFT TACK..MODIFIER LETTER RIGHT TACK
> +FBB2..FBC2 ; Sk # [17] ARABIC SYMBOL DOT ABOVE..ARABIC SYMBOL WASLA ABOVE
> +FF3E ; Sk # FULLWIDTH CIRCUMFLEX ACCENT
> +FF40 ; Sk # FULLWIDTH GRAVE ACCENT
> +FFE3 ; Sk # FULLWIDTH MACRON
> +1F3FB..1F3FF ; Sk # [5] EMOJI MODIFIER FITZPATRICK TYPE-1-2..EMOJI MODIFIER FITZPATRICK TYPE-6
> +
> +# Total code points: 125
> +
> +# ================================================
> +
> +# General_Category=Other_Symbol
> +
> +00A6 ; So # BROKEN BAR
> +00A9 ; So # COPYRIGHT SIGN
> +00AE ; So # REGISTERED SIGN
> +00B0 ; So # DEGREE SIGN
> +0482 ; So # CYRILLIC THOUSANDS SIGN
> +058D..058E ; So # [2] RIGHT-FACING ARMENIAN ETERNITY SIGN..LEFT-FACING ARMENIAN ETERNITY SIGN
> +060E..060F ; So # [2] ARABIC POETIC VERSE SIGN..ARABIC SIGN MISRA
> +06DE ; So # ARABIC START OF RUB EL HIZB
> +06E9 ; So # ARABIC PLACE OF SAJDAH
> +06FD..06FE ; So # [2] ARABIC SIGN SINDHI AMPERSAND..ARABIC SIGN SINDHI POSTPOSITION MEN
> +07F6 ; So # NKO SYMBOL OO DENNEN
> +09FA ; So # BENGALI ISSHAR
> +0B70 ; So # ORIYA ISSHAR
> +0BF3..0BF8 ; So # [6] TAMIL DAY SIGN..TAMIL AS ABOVE SIGN
> +0BFA ; So # TAMIL NUMBER SIGN
> +0C7F ; So # TELUGU SIGN TUUMU
> +0D4F ; So # MALAYALAM SIGN PARA
> +0D79 ; So # MALAYALAM DATE MARK
> +0F01..0F03 ; So # [3] TIBETAN MARK GTER YIG MGO TRUNCATED A..TIBETAN MARK GTER YIG MGO -UM GTER TSHEG MA
> +0F13 ; So # TIBETAN MARK CARET -DZUD RTAGS ME LONG CAN
> +0F15..0F17 ; So # [3] TIBETAN LOGOTYPE SIGN CHAD RTAGS..TIBETAN ASTROLOGICAL SIGN SGRA GCAN -CHAR RTAGS
> +0F1A..0F1F ; So # [6] TIBETAN SIGN RDEL DKAR GCIG..TIBETAN SIGN RDEL DKAR RDEL NAG
> +0F34 ; So # TIBETAN MARK BSDUS RTAGS
> +0F36 ; So # TIBETAN MARK CARET -DZUD RTAGS BZHI MIG CAN
> +0F38 ; So # TIBETAN MARK CHE MGO
> +0FBE..0FC5 ; So # [8] TIBETAN KU RU KHA..TIBETAN SYMBOL RDO RJE
> +0FC7..0FCC ; So # [6] TIBETAN SYMBOL RDO RJE RGYA GRAM..TIBETAN SYMBOL NOR BU BZHI -KHYIL
> +0FCE..0FCF ; So # [2] TIBETAN SIGN RDEL NAG RDEL DKAR..TIBETAN SIGN RDEL NAG GSUM
> +0FD5..0FD8 ; So # [4] RIGHT-FACING SVASTI SIGN..LEFT-FACING SVASTI SIGN WITH DOTS
> +109E..109F ; So # [2] MYANMAR SYMBOL SHAN ONE..MYANMAR SYMBOL SHAN EXCLAMATION
> +1390..1399 ; So # [10] ETHIOPIC TONAL MARK YIZET..ETHIOPIC TONAL MARK KURT
> +166D ; So # CANADIAN SYLLABICS CHI SIGN
> +1940 ; So # LIMBU SIGN LOO
> +19DE..19FF ; So # [34] NEW TAI LUE SIGN LAE..KHMER SYMBOL DAP-PRAM ROC
> +1B61..1B6A ; So # [10] BALINESE MUSICAL SYMBOL DONG..BALINESE MUSICAL SYMBOL DANG GEDE
> +1B74..1B7C ; So # [9] BALINESE MUSICAL SYMBOL RIGHT-HAND OPEN DUG..BALINESE MUSICAL SYMBOL LEFT-HAND OPEN PING
> +2100..2101 ; So # [2] ACCOUNT OF..ADDRESSED TO THE SUBJECT
> +2103..2106 ; So # [4] DEGREE CELSIUS..CADA UNA
> +2108..2109 ; So # [2] SCRUPLE..DEGREE FAHRENHEIT
> +2114 ; So # L B BAR SYMBOL
> +2116..2117 ; So # [2] NUMERO SIGN..SOUND RECORDING COPYRIGHT
> +211E..2123 ; So # [6] PRESCRIPTION TAKE..VERSICLE
> +2125 ; So # OUNCE SIGN
> +2127 ; So # INVERTED OHM SIGN
> +2129 ; So # TURNED GREEK SMALL LETTER IOTA
> +212E ; So # ESTIMATED SYMBOL
> +213A..213B ; So # [2] ROTATED CAPITAL Q..FACSIMILE SIGN
> +214A ; So # PROPERTY LINE
> +214C..214D ; So # [2] PER SIGN..AKTIESELSKAB
> +214F ; So # SYMBOL FOR SAMARITAN SOURCE
> +218A..218B ; So # [2] TURNED DIGIT TWO..TURNED DIGIT THREE
> +2195..2199 ; So # [5] UP DOWN ARROW..SOUTH WEST ARROW
> +219C..219F ; So # [4] LEFTWARDS WAVE ARROW..UPWARDS TWO HEADED ARROW
> +21A1..21A2 ; So # [2] DOWNWARDS TWO HEADED ARROW..LEFTWARDS ARROW WITH TAIL
> +21A4..21A5 ; So # [2] LEFTWARDS ARROW FROM BAR..UPWARDS ARROW FROM BAR
> +21A7..21AD ; So # [7] DOWNWARDS ARROW FROM BAR..LEFT RIGHT WAVE ARROW
> +21AF..21CD ; So # [31] DOWNWARDS ZIGZAG ARROW..LEFTWARDS DOUBLE ARROW WITH STROKE
> +21D0..21D1 ; So # [2] LEFTWARDS DOUBLE ARROW..UPWARDS DOUBLE ARROW
> +21D3 ; So # DOWNWARDS DOUBLE ARROW
> +21D5..21F3 ; So # [31] UP DOWN DOUBLE ARROW..UP DOWN WHITE ARROW
> +2300..2307 ; So # [8] DIAMETER SIGN..WAVY LINE
> +230C..231F ; So # [20] BOTTOM RIGHT CROP..BOTTOM RIGHT CORNER
> +2322..2328 ; So # [7] FROWN..KEYBOARD
> +232B..237B ; So # [81] ERASE TO THE LEFT..NOT CHECK MARK
> +237D..239A ; So # [30] SHOULDERED OPEN BOX..CLEAR SCREEN SYMBOL
> +23B4..23DB ; So # [40] TOP SQUARE BRACKET..FUSE
> +23E2..2429 ; So # [72] WHITE TRAPEZIUM..SYMBOL FOR DELETE MEDIUM SHADE FORM
> +2440..244A ; So # [11] OCR HOOK..OCR DOUBLE BACKSLASH
> +249C..24E9 ; So # [78] PARENTHESIZED LATIN SMALL LETTER A..CIRCLED LATIN SMALL LETTER Z
> +2500..25B6 ; So # [183] BOX DRAWINGS LIGHT HORIZONTAL..BLACK RIGHT-POINTING TRIANGLE
> +25B8..25C0 ; So # [9] BLACK RIGHT-POINTING SMALL TRIANGLE..BLACK LEFT-POINTING TRIANGLE
> +25C2..25F7 ; So # [54] BLACK LEFT-POINTING SMALL TRIANGLE..WHITE CIRCLE WITH UPPER RIGHT QUADRANT
> +2600..266E ; So # [111] BLACK SUN WITH RAYS..MUSIC NATURAL SIGN
> +2670..2767 ; So # [248] WEST SYRIAC CROSS..ROTATED FLORAL HEART BULLET
> +2794..27BF ; So # [44] HEAVY WIDE-HEADED RIGHTWARDS ARROW..DOUBLE CURLY LOOP
> +2800..28FF ; So # [256] BRAILLE PATTERN BLANK..BRAILLE PATTERN DOTS-12345678
> +2B00..2B2F ; So # [48] NORTH EAST WHITE ARROW..WHITE VERTICAL ELLIPSE
> +2B45..2B46 ; So # [2] LEFTWARDS QUADRUPLE ARROW..RIGHTWARDS QUADRUPLE ARROW
> +2B4D..2B73 ; So # [39] DOWNWARDS TRIANGLE-HEADED ZIGZAG ARROW..DOWNWARDS TRIANGLE-HEADED ARROW TO BAR
> +2B76..2B95 ; So # [32] NORTH WEST TRIANGLE-HEADED ARROW TO BAR..RIGHTWARDS BLACK ARROW
> +2B97..2BFF ; So # [105] SYMBOL FOR TYPE A ELECTRONICS..HELLSCHREIBER PAUSE SYMBOL
> +2CE5..2CEA ; So # [6] COPTIC SYMBOL MI RO..COPTIC SYMBOL SHIMA SIMA
> +2E50..2E51 ; So # [2] CROSS PATTY WITH RIGHT CROSSBAR..CROSS PATTY WITH LEFT CROSSBAR
> +2E80..2E99 ; So # [26] CJK RADICAL REPEAT..CJK RADICAL RAP
> +2E9B..2EF3 ; So # [89] CJK RADICAL CHOKE..CJK RADICAL C-SIMPLIFIED TURTLE
> +2F00..2FD5 ; So # [214] KANGXI RADICAL ONE..KANGXI RADICAL FLUTE
> +2FF0..2FFF ; So # [16] IDEOGRAPHIC DESCRIPTION CHARACTER LEFT TO RIGHT..IDEOGRAPHIC DESCRIPTION CHARACTER ROTATION
> +3004 ; So # JAPANESE INDUSTRIAL STANDARD SYMBOL
> +3012..3013 ; So # [2] POSTAL MARK..GETA MARK
> +3020 ; So # POSTAL MARK FACE
> +3036..3037 ; So # [2] CIRCLED POSTAL MARK..IDEOGRAPHIC TELEGRAPH LINE FEED SEPARATOR SYMBOL
> +303E..303F ; So # [2] IDEOGRAPHIC VARIATION INDICATOR..IDEOGRAPHIC HALF FILL SPACE
> +3190..3191 ; So # [2] IDEOGRAPHIC ANNOTATION LINKING MARK..IDEOGRAPHIC ANNOTATION REVERSE MARK
> +3196..319F ; So # [10] IDEOGRAPHIC ANNOTATION TOP MARK..IDEOGRAPHIC ANNOTATION MAN MARK
> +31C0..31E5 ; So # [38] CJK STROKE T..CJK STROKE SZP
> +31EF ; So # IDEOGRAPHIC DESCRIPTION CHARACTER SUBTRACTION
> +3200..321E ; So # [31] PARENTHESIZED HANGUL KIYEOK..PARENTHESIZED KOREAN CHARACTER O HU
> +322A..3247 ; So # [30] PARENTHESIZED IDEOGRAPH MOON..CIRCLED IDEOGRAPH KOTO
> +3250 ; So # PARTNERSHIP SIGN
> +3260..327F ; So # [32] CIRCLED HANGUL KIYEOK..KOREAN STANDARD SYMBOL
> +328A..32B0 ; So # [39] CIRCLED IDEOGRAPH MOON..CIRCLED IDEOGRAPH NIGHT
> +32C0..33FF ; So # [320] IDEOGRAPHIC TELEGRAPH SYMBOL FOR JANUARY..SQUARE GAL
> +4DC0..4DFF ; So # [64] HEXAGRAM FOR THE CREATIVE HEAVEN..HEXAGRAM FOR BEFORE COMPLETION
> +A490..A4C6 ; So # [55] YI RADICAL QOT..YI RADICAL KE
> +A828..A82B ; So # [4] SYLOTI NAGRI POETRY MARK-1..SYLOTI NAGRI POETRY MARK-4
> +A836..A837 ; So # [2] NORTH INDIC QUARTER MARK..NORTH INDIC PLACEHOLDER MARK
> +A839 ; So # NORTH INDIC QUANTITY MARK
> +AA77..AA79 ; So # [3] MYANMAR SYMBOL AITON EXCLAMATION..MYANMAR SYMBOL AITON TWO
> +FD40..FD4F ; So # [16] ARABIC LIGATURE RAHIMAHU ALLAAH..ARABIC LIGATURE RAHIMAHUM ALLAAH
> +FDCF ; So # ARABIC LIGATURE SALAAMUHU ALAYNAA
> +FDFD..FDFF ; So # [3] ARABIC LIGATURE BISMILLAH AR-RAHMAN AR-RAHEEM..ARABIC LIGATURE AZZA WA JALL
> +FFE4 ; So # FULLWIDTH BROKEN BAR
> +FFE8 ; So # HALFWIDTH FORMS LIGHT VERTICAL
> +FFED..FFEE ; So # [2] HALFWIDTH BLACK SQUARE..HALFWIDTH WHITE CIRCLE
> +FFFC..FFFD ; So # [2] OBJECT REPLACEMENT CHARACTER..REPLACEMENT CHARACTER
> +10137..1013F ; So # [9] AEGEAN WEIGHT BASE UNIT..AEGEAN MEASURE THIRD SUBUNIT
> +10179..10189 ; So # [17] GREEK YEAR SIGN..GREEK TRYBLION BASE SIGN
> +1018C..1018E ; So # [3] GREEK SINUSOID SIGN..NOMISMA SIGN
> +10190..1019C ; So # [13] ROMAN SEXTANS SIGN..ASCIA SYMBOL
> +101A0 ; So # GREEK SYMBOL TAU RHO
> +101D0..101FC ; So # [45] PHAISTOS DISC SIGN PEDESTRIAN..PHAISTOS DISC SIGN WAVY BAND
> +10877..10878 ; So # [2] PALMYRENE LEFT-POINTING FLEURON..PALMYRENE RIGHT-POINTING FLEURON
> +10AC8 ; So # MANICHAEAN SIGN UD
> +1173F ; So # AHOM SYMBOL VI
> +11FD5..11FDC ; So # [8] TAMIL SIGN NEL..TAMIL SIGN MUKKURUNI
> +11FE1..11FF1 ; So # [17] TAMIL SIGN PAARAM..TAMIL SIGN VAKAIYARAA
> +16B3C..16B3F ; So # [4] PAHAWH HMONG SIGN XYEEM NTXIV..PAHAWH HMONG SIGN XYEEM FAIB
> +16B45 ; So # PAHAWH HMONG SIGN CIM TSOV ROG
> +1BC9C ; So # DUPLOYAN SIGN O WITH CROSS
> +1CC00..1CCEF ; So # [240] UP-POINTING GO-KART..OUTLINED LATIN CAPITAL LETTER Z
> +1CD00..1CEB3 ; So # [436] BLOCK OCTANT-3..BLACK RIGHT TRIANGLE CARET
> +1CF50..1CFC3 ; So # [116] ZNAMENNY NEUME KRYUK..ZNAMENNY NEUME PAUK
> +1D000..1D0F5 ; So # [246] BYZANTINE MUSICAL SYMBOL PSILI..BYZANTINE MUSICAL SYMBOL GORGON NEO KATO
> +1D100..1D126 ; So # [39] MUSICAL SYMBOL SINGLE BARLINE..MUSICAL SYMBOL DRUM CLEF-2
> +1D129..1D164 ; So # [60] MUSICAL SYMBOL MULTIPLE MEASURE REST..MUSICAL SYMBOL ONE HUNDRED TWENTY-EIGHTH NOTE
> +1D16A..1D16C ; So # [3] MUSICAL SYMBOL FINGERED TREMOLO-1..MUSICAL SYMBOL FINGERED TREMOLO-3
> +1D183..1D184 ; So # [2] MUSICAL SYMBOL ARPEGGIATO UP..MUSICAL SYMBOL ARPEGGIATO DOWN
> +1D18C..1D1A9 ; So # [30] MUSICAL SYMBOL RINFORZANDO..MUSICAL SYMBOL DEGREE SLASH
> +1D1AE..1D1EA ; So # [61] MUSICAL SYMBOL PEDAL MARK..MUSICAL SYMBOL KORON
> +1D200..1D241 ; So # [66] GREEK VOCAL NOTATION SYMBOL-1..GREEK INSTRUMENTAL NOTATION SYMBOL-54
> +1D245 ; So # GREEK MUSICAL LEIMMA
> +1D300..1D356 ; So # [87] MONOGRAM FOR EARTH..TETRAGRAM FOR FOSTERING
> +1D800..1D9FF ; So # [512] SIGNWRITING HAND-FIST INDEX..SIGNWRITING HEAD
> +1DA37..1DA3A ; So # [4] SIGNWRITING AIR BLOW SMALL ROTATIONS..SIGNWRITING BREATH EXHALE
> +1DA6D..1DA74 ; So # [8] SIGNWRITING SHOULDER HIP SPINE..SIGNWRITING TORSO-FLOORPLANE TWISTING
> +1DA76..1DA83 ; So # [14] SIGNWRITING LIMB COMBINATION..SIGNWRITING LOCATION DEPTH
> +1DA85..1DA86 ; So # [2] SIGNWRITING LOCATION TORSO..SIGNWRITING LOCATION LIMBS DIGITS
> +1E14F ; So # NYIAKENG PUACHUE HMONG CIRCLED CA
> +1ECAC ; So # INDIC SIYAQ PLACEHOLDER
> +1ED2E ; So # OTTOMAN SIYAQ MARRATAN
> +1F000..1F02B ; So # [44] MAHJONG TILE EAST WIND..MAHJONG TILE BACK
> +1F030..1F093 ; So # [100] DOMINO TILE HORIZONTAL BACK..DOMINO TILE VERTICAL-06-06
> +1F0A0..1F0AE ; So # [15] PLAYING CARD BACK..PLAYING CARD KING OF SPADES
> +1F0B1..1F0BF ; So # [15] PLAYING CARD ACE OF HEARTS..PLAYING CARD RED JOKER
> +1F0C1..1F0CF ; So # [15] PLAYING CARD ACE OF DIAMONDS..PLAYING CARD BLACK JOKER
> +1F0D1..1F0F5 ; So # [37] PLAYING CARD ACE OF CLUBS..PLAYING CARD TRUMP-21
> +1F10D..1F1AD ; So # [161] CIRCLED ZERO WITH SLASH..MASK WORK SYMBOL
> +1F1E6..1F202 ; So # [29] REGIONAL INDICATOR SYMBOL LETTER A..SQUARED KATAKANA SA
> +1F210..1F23B ; So # [44] SQUARED CJK UNIFIED IDEOGRAPH-624B..SQUARED CJK UNIFIED IDEOGRAPH-914D
> +1F240..1F248 ; So # [9] TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-672C..TORTOISE SHELL BRACKETED CJK UNIFIED IDEOGRAPH-6557
> +1F250..1F251 ; So # [2] CIRCLED IDEOGRAPH ADVANTAGE..CIRCLED IDEOGRAPH ACCEPT
> +1F260..1F265 ; So # [6] ROUNDED SYMBOL FOR FU..ROUNDED SYMBOL FOR CAI
> +1F300..1F3FA ; So # [251] CYCLONE..AMPHORA
> +1F400..1F6D7 ; So # [728] RAT..ELEVATOR
> +1F6DC..1F6EC ; So # [17] WIRELESS..AIRPLANE ARRIVING
> +1F6F0..1F6FC ; So # [13] SATELLITE..ROLLER SKATE
> +1F700..1F776 ; So # [119] ALCHEMICAL SYMBOL FOR QUINTESSENCE..LUNAR ECLIPSE
> +1F77B..1F7D9 ; So # [95] HAUMEA..NINE POINTED WHITE STAR
> +1F7E0..1F7EB ; So # [12] LARGE ORANGE CIRCLE..LARGE BROWN SQUARE
> +1F7F0 ; So # HEAVY EQUALS SIGN
> +1F800..1F80B ; So # [12] LEFTWARDS ARROW WITH SMALL TRIANGLE ARROWHEAD..DOWNWARDS ARROW WITH LARGE TRIANGLE ARROWHEAD
> +1F810..1F847 ; So # [56] LEFTWARDS ARROW WITH SMALL EQUILATERAL ARROWHEAD..DOWNWARDS HEAVY ARROW
> +1F850..1F859 ; So # [10] LEFTWARDS SANS-SERIF ARROW..UP DOWN SANS-SERIF ARROW
> +1F860..1F887 ; So # [40] WIDE-HEADED LEFTWARDS LIGHT BARB ARROW..WIDE-HEADED SOUTH WEST VERY HEAVY BARB ARROW
> +1F890..1F8AD ; So # [30] LEFTWARDS TRIANGLE ARROWHEAD..WHITE ARROW SHAFT WIDTH TWO THIRDS
> +1F8B0..1F8BB ; So # [12] ARROW POINTING UPWARDS THEN NORTH WEST..SOUTH WEST ARROW FROM BAR
> +1F8C0..1F8C1 ; So # [2] LEFTWARDS ARROW FROM DOWNWARDS ARROW..RIGHTWARDS ARROW FROM DOWNWARDS ARROW
> +1F900..1FA53 ; So # [340] CIRCLED CROSS FORMEE WITH FOUR DOTS..BLACK CHESS KNIGHT-BISHOP
> +1FA60..1FA6D ; So # [14] XIANGQI RED GENERAL..XIANGQI BLACK SOLDIER
> +1FA70..1FA7C ; So # [13] BALLET SHOES..CRUTCH
> +1FA80..1FA89 ; So # [10] YO-YO..HARP
> +1FA8F..1FAC6 ; So # [56] SHOVEL..FINGERPRINT
> +1FACE..1FADC ; So # [15] MOOSE..ROOT VEGETABLE
> +1FADF..1FAE9 ; So # [11] SPLATTER..FACE WITH BAGS UNDER EYES
> +1FAF0..1FAF8 ; So # [9] HAND WITH INDEX FINGER AND THUMB CROSSED..RIGHTWARDS PUSHING HAND
> +1FB00..1FB92 ; So # [147] BLOCK SEXTANT-1..UPPER HALF INVERSE MEDIUM SHADE AND LOWER HALF BLOCK
> +1FB94..1FBEF ; So # [92] LEFT HALF INVERSE MEDIUM SHADE AND RIGHT HALF BLOCK..TOP LEFT JUSTIFIED LOWER RIGHT QUARTER BLACK CIRCLE
> +
> +# Total code points: 7376
> +
> +# ================================================
> +
> +# General_Category=Initial_Punctuation
> +
> +00AB ; Pi # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
> +2018 ; Pi # LEFT SINGLE QUOTATION MARK
> +201B..201C ; Pi # [2] SINGLE HIGH-REVERSED-9 QUOTATION MARK..LEFT DOUBLE QUOTATION MARK
> +201F ; Pi # DOUBLE HIGH-REVERSED-9 QUOTATION MARK
> +2039 ; Pi # SINGLE LEFT-POINTING ANGLE QUOTATION MARK
> +2E02 ; Pi # LEFT SUBSTITUTION BRACKET
> +2E04 ; Pi # LEFT DOTTED SUBSTITUTION BRACKET
> +2E09 ; Pi # LEFT TRANSPOSITION BRACKET
> +2E0C ; Pi # LEFT RAISED OMISSION BRACKET
> +2E1C ; Pi # LEFT LOW PARAPHRASE BRACKET
> +2E20 ; Pi # LEFT VERTICAL BAR WITH QUILL
> +
> +# Total code points: 12
> +
> +# ================================================
> +
> +# General_Category=Final_Punctuation
> +
> +00BB ; Pf # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
> +2019 ; Pf # RIGHT SINGLE QUOTATION MARK
> +201D ; Pf # RIGHT DOUBLE QUOTATION MARK
> +203A ; Pf # SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
> +2E03 ; Pf # RIGHT SUBSTITUTION BRACKET
> +2E05 ; Pf # RIGHT DOTTED SUBSTITUTION BRACKET
> +2E0A ; Pf # RIGHT TRANSPOSITION BRACKET
> +2E0D ; Pf # RIGHT RAISED OMISSION BRACKET
> +2E1D ; Pf # RIGHT LOW PARAPHRASE BRACKET
> +2E21 ; Pf # RIGHT VERTICAL BAR WITH QUILL
> +
> +# Total code points: 10
> +
> +# EOF
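
For anyone skimming the data file above, here is a minimal sketch of how one of its lines breaks down into a code point range and a General_Category value. It is only an illustration: `LINE_RE` and `parse_line` are hypothetical names of mine and are not part of the patch, which does its own parsing in gen_libstdcxx_unicode_data.py.

```python
import re

# Hypothetical helper, not part of the patch.
# Matches "XXXX ; Gc" or "XXXX..YYYY ; Gc" once the trailing comment is removed.
LINE_RE = re.compile(r'^([0-9A-Fa-f]+)(?:\.\.([0-9A-Fa-f]+))?\s*;\s*(\w+)')

def parse_line(line):
    """Return (first, last, category) for a data line, or None otherwise."""
    # Everything after '#' is a comment (count and character names).
    line = line.split("#")[0]
    m = LINE_RE.match(line)
    if not m:
        return None  # blank line or pure comment line
    first = int(m.group(1), 16)
    # A single code point is treated as a range of length one.
    last = int(m.group(2), 16) if m.group(2) else first
    return first, last, m.group(3)

print(parse_line("05F3..05F4 ; Po # [2] HEBREW PUNCTUATION GERESH.."))
print(parse_line("061B ; Po # ARABIC SEMICOLON"))
print(parse_line("# Total code points: 640"))
```

The bracketed count in the comment column, such as `[2]`, is just `last - first + 1`, so nothing is lost by discarding the comment before parsing.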
> diff --git a/contrib/unicode/README b/contrib/unicode/README
> index 9ef80fbb9b2..a459f3496e0 100644
> --- a/contrib/unicode/README
> +++ b/contrib/unicode/README
> @@ -16,10 +16,11 @@ ftp://ftp.unicode.org/Public/UNIDATA/DerivedNormalizationProps.txt
> ftp://ftp.unicode.org/Public/UNIDATA/DerivedCoreProperties.txt
> ftp://ftp.unicode.org/Public/UNIDATA/NameAliases.txt
>
> -Two additional files are needed for lookup tables in libstdc++:
> +Three additional files are needed for lookup tables in libstdc++:
>
> ftp://ftp.unicode.org/Public/UNIDATA/auxiliary/GraphemeBreakProperty.txt
> ftp://ftp.unicode.org/Public/UNIDATA/emoji/emoji-data.txt
> +ftp://ftp.unicode.org/Public/UNIDATA/extracted/DerivedGeneralCategory.txt
>
> All these files have been added to source control in this directory;
> please see unicode-license.txt for the relevant copyright information.
> diff --git a/contrib/unicode/gen_libstdcxx_unicode_data.py b/contrib/unicode/gen_libstdcxx_unicode_data.py
> index ff4bee4d1ba..d580d205083 100755
> --- a/contrib/unicode/gen_libstdcxx_unicode_data.py
> +++ b/contrib/unicode/gen_libstdcxx_unicode_data.py
> @@ -126,7 +126,7 @@ edges = find_edges(all_code_points, 1)
>
> # Table for std::__unicode::__format_width(char32_t)
>
> -print(" // Table generated by contrib/unicode/gen_std_format_width.py,")
> +print(" // Table generated by contrib/unicode/gen_libstdcxx_unicode_data.py,")
> print(" // from EastAsianWidth.txt from the Unicode standard.");
> print(" inline constexpr char32_t __width_edges[] = {", end="")
> for i, e in enumerate(edges):
> @@ -138,6 +138,44 @@ for i, e in enumerate(edges):
> print("{:#x},".format(c), end="")
> print("\n };\n")
>
> +# By default escape each code point
> +all_code_points = [True] * (1 + 0x10FFFF)
> +
> +escaped_general_categories = {
> +    # Separator (Z)
> +    "Zs", "Zl", "Zp",
> +    # Other (C)
> +    "Cc", "Cf", "Cs", "Co", "Cn",
> +}
> +
> +# Extract General_Category property for all code points.
> +for line in open("DerivedGeneralCategory.txt", "r"):
> + # Example lines:
> + # 0530 ; Cn # <reserved-0530>
> + # 0557..0558 ; Cn # [2] <reserved-0557>..<reserved-0558>
> + line = line.split("#")[0]
> + if re.match(r'^[\dA-Fa-f][^;]+;', line):
> + code_points, general_category = line.split(";")
> + gc_escaped = general_category.strip() in escaped_general_categories
> + process_code_points(code_points, gc_escaped)
> +
> +edges = find_edges(all_code_points)
> +
> +shift_bits = 1
> +print(" // Values generated by contrib/unicode/gen_libstdcxx_unicode_data.py,")
> +print(" // from DerivedGeneralCategory.txt from the Unicode standard.");
> +print(" // Entries are (code_point << 1) + escape.")
> +print(" inline constexpr uint32_t __escape_edges[] = {", end="")
> +for i, e in enumerate(edges):
> + if i % 6:
> + print(" ", end="")
> + else:
> + print("\n ", end="")
> + c, p = e
> + x = (c << shift_bits) + (1 if p else 0)
> + print("{0:#x},".format(x), end="")
> +print("\n };\n")
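
For anyone following the generator: as I read it, the table keeps only the code points at which the escape flag flips, packed as (code_point << 1) + flag. Below is a minimal Python sketch of that idea on a toy map covering U+0000..U+00FF. Note that find_edges and encode_edges here are hypothetical stand-ins written for this illustration, not the helpers from gen_libstdcxx_unicode_data.py, whose definitions are outside this hunk.

```python
def find_edges(flags):
    """Return (code_point, flag) pairs at every position where the flag changes.

    Hypothetical stand-in for the script's own helper.
    """
    edges = []
    prev = None
    for cp, flag in enumerate(flags):
        if flag != prev:
            edges.append((cp, flag))
            prev = flag
    return edges


def encode_edges(edges, shift_bits=1):
    """Pack each edge as (code_point << shift_bits) + flag."""
    return [(cp << shift_bits) + (1 if flag else 0) for cp, flag in edges]


# Toy map: escape U+0000..U+0020 (Cc and space), U+007F..U+00A0 (Cc and NBSP)
# and U+00AD (soft hyphen, Cf); everything else below U+0100 is printable.
flags = [False] * 0x100
for cp in list(range(0x00, 0x21)) + list(range(0x7F, 0xA1)) + [0xAD]:
    flags[cp] = True

table = encode_edges(find_edges(flags))
print([hex(x) for x in table])
```

With this toy input the result is 0x1, 0x42, 0xff, 0x142, 0x15b, 0x15c, which matches the first six entries of __escape_edges in the patch.
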
> +
> # By default every code point has Grapheme_Cluster_Break=Other.
> all_code_points = ["Other"] * (1 + 0x10FFFF)
>
> @@ -167,7 +205,7 @@ print(" };\n")
>
> # Tables for std::__unicode::_Grapheme_cluster_state
>
> -print(" // Values generated by contrib/unicode/gen_std_format_width.py,")
> +print(" // Values generated by contrib/unicode/gen_libstdcxx_unicode_data.py,")
> print(" // from GraphemeBreakProperty.txt from the Unicode standard.");
> print(" // Entries are (code_point << shift_bits) + property.")
> print(" inline constexpr int __gcb_shift_bits = {:#x};".format(shift_bits))
> @@ -209,7 +247,7 @@ edges = find_edges(all_code_points)
> incb_props = {None:0, "Consonant":1, "Extend":2}
> print(" enum class _InCB { _Consonant = 1, _Extend = 2 };\n")
> # Table for std::__unicode::__incb_property
> -print(" // Values generated by contrib/unicode/gen_std_format_width.py,")
> +print(" // Values generated by contrib/unicode/gen_libstdcxx_unicode_data.py,")
> print(" // from DerivedCoreProperties.txt from the Unicode standard.");
> print(" // Entries are (code_point << 2) + property.")
> print(" inline constexpr uint32_t __incb_edges[] = {", end="")
> @@ -238,7 +276,7 @@ for line in open("emoji-data.txt", "r"):
> edges = find_edges(all_code_points, False)
>
> # Table for std::__unicode::__is_extended_pictographic
> -print(" // Table generated by contrib/unicode/gen_std_format_width.py,")
> +print(" // Table generated by contrib/unicode/gen_libstdcxx_unicode_data.py,")
> print(" // from emoji-data.txt from the Unicode standard.");
> print(" inline constexpr char32_t __xpicto_edges[] = {", end="")
> for i, e in enumerate(edges):
> diff --git a/libstdc++-v3/include/bits/chrono_io.h b/libstdc++-v3/include/bits/chrono_io.h
> index d8721093706..311df4541b8 100644
> --- a/libstdc++-v3/include/bits/chrono_io.h
> +++ b/libstdc++-v3/include/bits/chrono_io.h
> @@ -57,23 +57,6 @@ namespace chrono
> /// @cond undocumented
> namespace __detail
> {
> - // STATICALLY-WIDEN, see C++20 [time.general]
> - // It doesn't matter for format strings (which can only be char or wchar_t)
> - // but this returns the narrow string for anything that isn't wchar_t. This
> - // is done because const char* can be inserted into any ostream type, and
> - // will be widened at runtime if necessary.
> - template<typename _CharT>
> - consteval auto
> - _Widen(const char* __narrow, const wchar_t* __wide)
> - {
> - if constexpr (is_same_v<_CharT, wchar_t>)
> - return __wide;
> - else
> - return __narrow;
> - }
> -#define _GLIBCXX_WIDEN_(C, S) ::std::chrono::__detail::_Widen<C>(S, L##S)
> -#define _GLIBCXX_WIDEN(S) _GLIBCXX_WIDEN_(_CharT, S)
> -
> template<typename _Period, typename _CharT>
> constexpr basic_string_view<_CharT>
> __units_suffix() noexcept
> diff --git a/libstdc++-v3/include/bits/unicode-data.h b/libstdc++-v3/include/bits/unicode-data.h
> index fc0a4b3a271..0ab5ecb56a7 100644
> --- a/libstdc++-v3/include/bits/unicode-data.h
> +++ b/libstdc++-v3/include/bits/unicode-data.h
> @@ -33,7 +33,7 @@
> # error "Version mismatch for Unicode static data"
> #endif
>
> - // Table generated by contrib/unicode/gen_std_format_width.py,
> + // Table generated by contrib/unicode/gen_libstdcxx_unicode_data.py,
> // from EastAsianWidth.txt from the Unicode standard.
> inline constexpr char32_t __width_edges[] = {
> 0x1100, 0x1160, 0x231a, 0x231c, 0x2329, 0x232b, 0x23e9, 0x23ed,
> @@ -64,6 +64,258 @@
> 0x1faf0, 0x1faf9, 0x20000, 0x2fffe, 0x30000, 0x3fffe,
> };
>
> + // Values generated by contrib/unicode/gen_libstdcxx_unicode_data.py,
> + // from DerivedGeneralCategory.txt from the Unicode standard.
> + // Entries are (code_point << 1) + escape.
> + inline constexpr uint32_t __escape_edges[] = {
> + 0x1, 0x42, 0xff, 0x142, 0x15b, 0x15c,
> + 0x6f1, 0x6f4, 0x701, 0x708, 0x717, 0x718,
> + 0x71b, 0x71c, 0x745, 0x746, 0xa61, 0xa62,
> + 0xaaf, 0xab2, 0xb17, 0xb1a, 0xb21, 0xb22,
> + 0xb91, 0xba0, 0xbd7, 0xbde, 0xbeb, 0xc0c,
> + 0xc39, 0xc3a, 0xdbb, 0xdbc, 0xe1d, 0xe20,
> + 0xe97, 0xe9a, 0xf65, 0xf80, 0xff7, 0xffa,
> + 0x105d, 0x1060, 0x107f, 0x1080, 0x10b9, 0x10bc,
> + 0x10bf, 0x10c0, 0x10d7, 0x10e0, 0x111f, 0x112e,
> + 0x11c5, 0x11c6, 0x1309, 0x130a, 0x131b, 0x131e,
> + 0x1323, 0x1326, 0x1353, 0x1354, 0x1363, 0x1364,
> + 0x1367, 0x136c, 0x1375, 0x1378, 0x138b, 0x138e,
> + 0x1393, 0x1396, 0x139f, 0x13ae, 0x13b1, 0x13b8,
> + 0x13bd, 0x13be, 0x13c9, 0x13cc, 0x13ff, 0x1402,
> + 0x1409, 0x140a, 0x1417, 0x141e, 0x1423, 0x1426,
> + 0x1453, 0x1454, 0x1463, 0x1464, 0x1469, 0x146a,
> + 0x146f, 0x1470, 0x1475, 0x1478, 0x147b, 0x147c,
> + 0x1487, 0x148e, 0x1493, 0x1496, 0x149d, 0x14a2,
> + 0x14a5, 0x14b2, 0x14bb, 0x14bc, 0x14bf, 0x14cc,
> + 0x14ef, 0x1502, 0x1509, 0x150a, 0x151d, 0x151e,
> + 0x1525, 0x1526, 0x1553, 0x1554, 0x1563, 0x1564,
> + 0x1569, 0x156a, 0x1575, 0x1578, 0x158d, 0x158e,
> + 0x1595, 0x1596, 0x159d, 0x15a0, 0x15a3, 0x15c0,
> + 0x15c9, 0x15cc, 0x15e5, 0x15f2, 0x1601, 0x1602,
> + 0x1609, 0x160a, 0x161b, 0x161e, 0x1623, 0x1626,
> + 0x1653, 0x1654, 0x1663, 0x1664, 0x1669, 0x166a,
> + 0x1675, 0x1678, 0x168b, 0x168e, 0x1693, 0x1696,
> + 0x169d, 0x16aa, 0x16b1, 0x16b8, 0x16bd, 0x16be,
> + 0x16c9, 0x16cc, 0x16f1, 0x1704, 0x1709, 0x170a,
> + 0x1717, 0x171c, 0x1723, 0x1724, 0x172d, 0x1732,
> + 0x1737, 0x1738, 0x173b, 0x173c, 0x1741, 0x1746,
> + 0x174b, 0x1750, 0x1757, 0x175c, 0x1775, 0x177c,
> + 0x1787, 0x178c, 0x1793, 0x1794, 0x179d, 0x17a0,
> + 0x17a3, 0x17ae, 0x17b1, 0x17cc, 0x17f7, 0x1800,
> + 0x181b, 0x181c, 0x1823, 0x1824, 0x1853, 0x1854,
> + 0x1875, 0x1878, 0x188b, 0x188c, 0x1893, 0x1894,
> + 0x189d, 0x18aa, 0x18af, 0x18b0, 0x18b7, 0x18ba,
> + 0x18bd, 0x18c0, 0x18c9, 0x18cc, 0x18e1, 0x18ee,
> + 0x191b, 0x191c, 0x1923, 0x1924, 0x1953, 0x1954,
> + 0x1969, 0x196a, 0x1975, 0x1978, 0x198b, 0x198c,
> + 0x1993, 0x1994, 0x199d, 0x19aa, 0x19af, 0x19ba,
> + 0x19bf, 0x19c0, 0x19c9, 0x19cc, 0x19e1, 0x19e2,
> + 0x19e9, 0x1a00, 0x1a1b, 0x1a1c, 0x1a23, 0x1a24,
> + 0x1a8b, 0x1a8c, 0x1a93, 0x1a94, 0x1aa1, 0x1aa8,
> + 0x1ac9, 0x1acc, 0x1b01, 0x1b02, 0x1b09, 0x1b0a,
> + 0x1b2f, 0x1b34, 0x1b65, 0x1b66, 0x1b79, 0x1b7a,
> + 0x1b7d, 0x1b80, 0x1b8f, 0x1b94, 0x1b97, 0x1b9e,
> + 0x1bab, 0x1bac, 0x1baf, 0x1bb0, 0x1bc1, 0x1bcc,
> + 0x1be1, 0x1be4, 0x1beb, 0x1c02, 0x1c77, 0x1c7e,
> + 0x1cb9, 0x1d02, 0x1d07, 0x1d08, 0x1d0b, 0x1d0c,
> + 0x1d17, 0x1d18, 0x1d49, 0x1d4a, 0x1d4d, 0x1d4e,
> + 0x1d7d, 0x1d80, 0x1d8b, 0x1d8c, 0x1d8f, 0x1d90,
> + 0x1d9f, 0x1da0, 0x1db5, 0x1db8, 0x1dc1, 0x1e00,
> + 0x1e91, 0x1e92, 0x1edb, 0x1ee2, 0x1f31, 0x1f32,
> + 0x1f7b, 0x1f7c, 0x1f9b, 0x1f9c, 0x1fb7, 0x2000,
> + 0x218d, 0x218e, 0x2191, 0x219a, 0x219d, 0x21a0,
> + 0x2493, 0x2494, 0x249d, 0x24a0, 0x24af, 0x24b0,
> + 0x24b3, 0x24b4, 0x24bd, 0x24c0, 0x2513, 0x2514,
> + 0x251d, 0x2520, 0x2563, 0x2564, 0x256d, 0x2570,
> + 0x257f, 0x2580, 0x2583, 0x2584, 0x258d, 0x2590,
> + 0x25af, 0x25b0, 0x2623, 0x2624, 0x262d, 0x2630,
> + 0x26b7, 0x26ba, 0x26fb, 0x2700, 0x2735, 0x2740,
> + 0x27ed, 0x27f0, 0x27fd, 0x2800, 0x2d01, 0x2d02,
> + 0x2d3b, 0x2d40, 0x2df3, 0x2e00, 0x2e2d, 0x2e3e,
> + 0x2e6f, 0x2e80, 0x2ea9, 0x2ec0, 0x2edb, 0x2edc,
> + 0x2ee3, 0x2ee4, 0x2ee9, 0x2f00, 0x2fbd, 0x2fc0,
> + 0x2fd5, 0x2fe0, 0x2ff5, 0x3000, 0x301d, 0x301e,
> + 0x3035, 0x3040, 0x30f3, 0x3100, 0x3157, 0x3160,
> + 0x31ed, 0x3200, 0x323f, 0x3240, 0x3259, 0x3260,
> + 0x3279, 0x3280, 0x3283, 0x3288, 0x32dd, 0x32e0,
> + 0x32eb, 0x3300, 0x3359, 0x3360, 0x3395, 0x33a0,
> + 0x33b7, 0x33bc, 0x3439, 0x343c, 0x34bf, 0x34c0,
> + 0x34fb, 0x34fe, 0x3515, 0x3520, 0x3535, 0x3540,
> + 0x355d, 0x3560, 0x359f, 0x3600, 0x369b, 0x369c,
> + 0x37e9, 0x37f8, 0x3871, 0x3876, 0x3895, 0x389a,
> + 0x3917, 0x3920, 0x3977, 0x397a, 0x3991, 0x39a0,
> + 0x39f7, 0x3a00, 0x3e2d, 0x3e30, 0x3e3d, 0x3e40,
> + 0x3e8d, 0x3e90, 0x3e9d, 0x3ea0, 0x3eb1, 0x3eb2,
> + 0x3eb5, 0x3eb6, 0x3eb9, 0x3eba, 0x3ebd, 0x3ebe,
> + 0x3efd, 0x3f00, 0x3f6b, 0x3f6c, 0x3f8b, 0x3f8c,
> + 0x3fa9, 0x3fac, 0x3fb9, 0x3fba, 0x3fe1, 0x3fe4,
> + 0x3feb, 0x3fec, 0x3fff, 0x4020, 0x4051, 0x4060,
> + 0x40bf, 0x40e0, 0x40e5, 0x40e8, 0x411f, 0x4120,
> + 0x413b, 0x4140, 0x4183, 0x41a0, 0x41e3, 0x4200,
> + 0x4319, 0x4320, 0x4855, 0x4880, 0x4897, 0x48c0,
> + 0x56e9, 0x56ec, 0x572d, 0x572e, 0x59e9, 0x59f2,
> + 0x5a4d, 0x5a4e, 0x5a51, 0x5a5a, 0x5a5d, 0x5a60,
> + 0x5ad1, 0x5ade, 0x5ae3, 0x5afe, 0x5b2f, 0x5b40,
> + 0x5b4f, 0x5b50, 0x5b5f, 0x5b60, 0x5b6f, 0x5b70,
> + 0x5b7f, 0x5b80, 0x5b8f, 0x5b90, 0x5b9f, 0x5ba0,
> + 0x5baf, 0x5bb0, 0x5bbf, 0x5bc0, 0x5cbd, 0x5d00,
> + 0x5d35, 0x5d36, 0x5de9, 0x5e00, 0x5fad, 0x5fe0,
> + 0x6001, 0x6002, 0x6081, 0x6082, 0x612f, 0x6132,
> + 0x6201, 0x620a, 0x6261, 0x6262, 0x631f, 0x6320,
> + 0x63cd, 0x63de, 0x643f, 0x6440, 0x1491b, 0x14920,
> + 0x1498f, 0x149a0, 0x14c59, 0x14c80, 0x14df1, 0x14e00,
> + 0x14f9d, 0x14fa0, 0x14fa5, 0x14fa6, 0x14fa9, 0x14faa,
> + 0x14fbb, 0x14fe4, 0x1505b, 0x15060, 0x15075, 0x15080,
> + 0x150f1, 0x15100, 0x1518d, 0x1519c, 0x151b5, 0x151c0,
> + 0x152a9, 0x152be, 0x152fb, 0x15300, 0x1539d, 0x1539e,
> + 0x153b5, 0x153bc, 0x153ff, 0x15400, 0x1546f, 0x15480,
> + 0x1549d, 0x154a0, 0x154b5, 0x154b8, 0x15587, 0x155b6,
> + 0x155ef, 0x15602, 0x1560f, 0x15612, 0x1561f, 0x15622,
> + 0x1562f, 0x15640, 0x1564f, 0x15650, 0x1565f, 0x15660,
> + 0x156d9, 0x156e0, 0x157dd, 0x157e0, 0x157f5, 0x15800,
> + 0x1af49, 0x1af60, 0x1af8f, 0x1af96, 0x1aff9, 0x1f200,
> + 0x1f4dd, 0x1f4e0, 0x1f5b5, 0x1f600, 0x1f60f, 0x1f626,
> + 0x1f631, 0x1f63a, 0x1f66f, 0x1f670, 0x1f67b, 0x1f67c,
> + 0x1f67f, 0x1f680, 0x1f685, 0x1f686, 0x1f68b, 0x1f68c,
> + 0x1f787, 0x1f7a6, 0x1fb21, 0x1fb24, 0x1fb91, 0x1fb9e,
> + 0x1fba1, 0x1fbe0, 0x1fc35, 0x1fc40, 0x1fca7, 0x1fca8,
> + 0x1fccf, 0x1fcd0, 0x1fcd9, 0x1fce0, 0x1fceb, 0x1fcec,
> + 0x1fdfb, 0x1fe02, 0x1ff7f, 0x1ff84, 0x1ff91, 0x1ff94,
> + 0x1ffa1, 0x1ffa4, 0x1ffb1, 0x1ffb4, 0x1ffbb, 0x1ffc0,
> + 0x1ffcf, 0x1ffd0, 0x1ffdf, 0x1fff8, 0x1fffd, 0x20000,
> + 0x20019, 0x2001a, 0x2004f, 0x20050, 0x20077, 0x20078,
> + 0x2007d, 0x2007e, 0x2009d, 0x200a0, 0x200bd, 0x20100,
> + 0x201f7, 0x20200, 0x20207, 0x2020e, 0x20269, 0x2026e,
> + 0x2031f, 0x20320, 0x2033b, 0x20340, 0x20343, 0x203a0,
> + 0x203fd, 0x20500, 0x2053b, 0x20540, 0x205a3, 0x205c0,
> + 0x205f9, 0x20600, 0x20649, 0x2065a, 0x20697, 0x206a0,
> + 0x206f7, 0x20700, 0x2073d, 0x2073e, 0x20789, 0x20790,
> + 0x207ad, 0x20800, 0x2093d, 0x20940, 0x20955, 0x20960,
> + 0x209a9, 0x209b0, 0x209f9, 0x20a00, 0x20a51, 0x20a60,
> + 0x20ac9, 0x20ade, 0x20af7, 0x20af8, 0x20b17, 0x20b18,
> + 0x20b27, 0x20b28, 0x20b2d, 0x20b2e, 0x20b45, 0x20b46,
> + 0x20b65, 0x20b66, 0x20b75, 0x20b76, 0x20b7b, 0x20b80,
> + 0x20be9, 0x20c00, 0x20e6f, 0x20e80, 0x20ead, 0x20ec0,
> + 0x20ed1, 0x20f00, 0x20f0d, 0x20f0e, 0x20f63, 0x20f64,
> + 0x20f77, 0x21000, 0x2100d, 0x21010, 0x21013, 0x21014,
> + 0x2106d, 0x2106e, 0x21073, 0x21078, 0x2107b, 0x2107e,
> + 0x210ad, 0x210ae, 0x2113f, 0x2114e, 0x21161, 0x211c0,
> + 0x211e7, 0x211e8, 0x211ed, 0x211f6, 0x21239, 0x2123e,
> + 0x21275, 0x2127e, 0x21281, 0x21300, 0x21371, 0x21378,
> + 0x213a1, 0x213a4, 0x21409, 0x2140a, 0x2140f, 0x21418,
> + 0x21429, 0x2142a, 0x21431, 0x21432, 0x2146d, 0x21470,
> + 0x21477, 0x2147e, 0x21493, 0x214a0, 0x214b3, 0x214c0,
> + 0x21541, 0x21580, 0x215cf, 0x215d6, 0x215ef, 0x21600,
> + 0x2166d, 0x21672, 0x216ad, 0x216b0, 0x216e7, 0x216f0,
> + 0x21725, 0x21732, 0x2173b, 0x21752, 0x21761, 0x21800,
> + 0x21893, 0x21900, 0x21967, 0x21980, 0x219e7, 0x219f4,
> + 0x21a51, 0x21a60, 0x21a75, 0x21a80, 0x21acd, 0x21ad2,
> + 0x21b0d, 0x21b1c, 0x21b21, 0x21cc0, 0x21cff, 0x21d00,
> + 0x21d55, 0x21d56, 0x21d5d, 0x21d60, 0x21d65, 0x21d84,
> + 0x21d8b, 0x21df8, 0x21e51, 0x21e60, 0x21eb5, 0x21ee0,
> + 0x21f15, 0x21f60, 0x21f99, 0x21fc0, 0x21fef, 0x22000,
> + 0x2209d, 0x220a4, 0x220ed, 0x220fe, 0x2217b, 0x2217c,
> + 0x22187, 0x221a0, 0x221d3, 0x221e0, 0x221f5, 0x22200,
> + 0x2226b, 0x2226c, 0x22291, 0x222a0, 0x222ef, 0x22300,
> + 0x223c1, 0x223c2, 0x223eb, 0x22400, 0x22425, 0x22426,
> + 0x22485, 0x22500, 0x2250f, 0x22510, 0x22513, 0x22514,
> + 0x2251d, 0x2251e, 0x2253d, 0x2253e, 0x22555, 0x22560,
> + 0x225d7, 0x225e0, 0x225f5, 0x22600, 0x22609, 0x2260a,
> + 0x2261b, 0x2261e, 0x22623, 0x22626, 0x22653, 0x22654,
> + 0x22663, 0x22664, 0x22669, 0x2266a, 0x22675, 0x22676,
> + 0x2268b, 0x2268e, 0x22693, 0x22696, 0x2269d, 0x226a0,
> + 0x226a3, 0x226ae, 0x226b1, 0x226ba, 0x226c9, 0x226cc,
> + 0x226db, 0x226e0, 0x226eb, 0x22700, 0x22715, 0x22716,
> + 0x22719, 0x2271c, 0x2271f, 0x22720, 0x2276d, 0x2276e,
> + 0x22783, 0x22784, 0x22787, 0x2278a, 0x2278d, 0x2278e,
> + 0x22797, 0x22798, 0x227ad, 0x227ae, 0x227b3, 0x227c2,
> + 0x227c7, 0x22800, 0x228b9, 0x228ba, 0x228c5, 0x22900,
> + 0x22991, 0x229a0, 0x229b5, 0x22b00, 0x22b6d, 0x22b70,
> + 0x22bbd, 0x22c00, 0x22c8b, 0x22ca0, 0x22cb5, 0x22cc0,
> + 0x22cdb, 0x22d00, 0x22d75, 0x22d80, 0x22d95, 0x22da0,
> + 0x22dc9, 0x22e00, 0x22e37, 0x22e3a, 0x22e59, 0x22e60,
> + 0x22e8f, 0x23000, 0x23079, 0x23140, 0x231e7, 0x231fe,
> + 0x2320f, 0x23212, 0x23215, 0x23218, 0x23229, 0x2322a,
> + 0x2322f, 0x23230, 0x2326d, 0x2326e, 0x23273, 0x23276,
> + 0x2328f, 0x232a0, 0x232b5, 0x23340, 0x23351, 0x23354,
> + 0x233b1, 0x233b4, 0x233cb, 0x23400, 0x23491, 0x234a0,
> + 0x23547, 0x23560, 0x235f3, 0x23600, 0x23615, 0x23780,
> + 0x237c5, 0x237e0, 0x237f5, 0x23800, 0x23813, 0x23814,
> + 0x2386f, 0x23870, 0x2388d, 0x238a0, 0x238db, 0x238e0,
> + 0x23921, 0x23924, 0x23951, 0x23952, 0x2396f, 0x23a00,
> + 0x23a0f, 0x23a10, 0x23a15, 0x23a16, 0x23a6f, 0x23a74,
> + 0x23a77, 0x23a78, 0x23a7d, 0x23a7e, 0x23a91, 0x23aa0,
> + 0x23ab5, 0x23ac0, 0x23acd, 0x23ace, 0x23ad3, 0x23ad4,
> + 0x23b1f, 0x23b20, 0x23b25, 0x23b26, 0x23b33, 0x23b40,
> + 0x23b55, 0x23dc0, 0x23df3, 0x23e00, 0x23e23, 0x23e24,
> + 0x23e77, 0x23e7c, 0x23eb7, 0x23f60, 0x23f63, 0x23f80,
> + 0x23fe5, 0x23ffe, 0x24735, 0x24800, 0x248df, 0x248e0,
> + 0x248eb, 0x24900, 0x24a89, 0x25f20, 0x25fe7, 0x26000,
> + 0x26861, 0x26880, 0x268ad, 0x268c0, 0x287f7, 0x28800,
> + 0x28c8f, 0x2c200, 0x2c275, 0x2d000, 0x2d473, 0x2d480,
> + 0x2d4bf, 0x2d4c0, 0x2d4d5, 0x2d4dc, 0x2d57f, 0x2d580,
> + 0x2d595, 0x2d5a0, 0x2d5dd, 0x2d5e0, 0x2d5ed, 0x2d600,
> + 0x2d68d, 0x2d6a0, 0x2d6b5, 0x2d6b6, 0x2d6c5, 0x2d6c6,
> + 0x2d6f1, 0x2d6fa, 0x2d721, 0x2da80, 0x2daf5, 0x2dc80,
> + 0x2dd37, 0x2de00, 0x2de97, 0x2de9e, 0x2df11, 0x2df1e,
> + 0x2df41, 0x2dfc0, 0x2dfcb, 0x2dfe0, 0x2dfe5, 0x2e000,
> + 0x30ff1, 0x31000, 0x319ad, 0x319fe, 0x31a13, 0x35fe0,
> + 0x35fe9, 0x35fea, 0x35ff9, 0x35ffa, 0x35fff, 0x36000,
> + 0x36247, 0x36264, 0x36267, 0x362a0, 0x362a7, 0x362aa,
> + 0x362ad, 0x362c8, 0x362d1, 0x362e0, 0x365f9, 0x37800,
> + 0x378d7, 0x378e0, 0x378fb, 0x37900, 0x37913, 0x37920,
> + 0x37935, 0x37938, 0x37941, 0x39800, 0x399f5, 0x39a00,
> + 0x39d69, 0x39e00, 0x39e5d, 0x39e60, 0x39e8f, 0x39ea0,
> + 0x39f89, 0x3a000, 0x3a1ed, 0x3a200, 0x3a24f, 0x3a252,
> + 0x3a2e7, 0x3a2f6, 0x3a3d7, 0x3a400, 0x3a48d, 0x3a580,
> + 0x3a5a9, 0x3a5c0, 0x3a5e9, 0x3a600, 0x3a6af, 0x3a6c0,
> + 0x3a6f3, 0x3a800, 0x3a8ab, 0x3a8ac, 0x3a93b, 0x3a93c,
> + 0x3a941, 0x3a944, 0x3a947, 0x3a94a, 0x3a94f, 0x3a952,
> + 0x3a95b, 0x3a95c, 0x3a975, 0x3a976, 0x3a979, 0x3a97a,
> + 0x3a989, 0x3a98a, 0x3aa0d, 0x3aa0e, 0x3aa17, 0x3aa1a,
> + 0x3aa2b, 0x3aa2c, 0x3aa3b, 0x3aa3c, 0x3aa75, 0x3aa76,
> + 0x3aa7f, 0x3aa80, 0x3aa8b, 0x3aa8c, 0x3aa8f, 0x3aa94,
> + 0x3aaa3, 0x3aaa4, 0x3ad4d, 0x3ad50, 0x3af99, 0x3af9c,
> + 0x3b519, 0x3b536, 0x3b541, 0x3b542, 0x3b561, 0x3be00,
> + 0x3be3f, 0x3be4a, 0x3be57, 0x3c000, 0x3c00f, 0x3c010,
> + 0x3c033, 0x3c036, 0x3c045, 0x3c046, 0x3c04b, 0x3c04c,
> + 0x3c057, 0x3c060, 0x3c0dd, 0x3c11e, 0x3c121, 0x3c200,
> + 0x3c25b, 0x3c260, 0x3c27d, 0x3c280, 0x3c295, 0x3c29c,
> + 0x3c2a1, 0x3c520, 0x3c55f, 0x3c580, 0x3c5f5, 0x3c5fe,
> + 0x3c601, 0x3c9a0, 0x3c9f5, 0x3cba0, 0x3cbf7, 0x3cbfe,
> + 0x3cc01, 0x3cfc0, 0x3cfcf, 0x3cfd0, 0x3cfd9, 0x3cfda,
> + 0x3cfdf, 0x3cfe0, 0x3cfff, 0x3d000, 0x3d18b, 0x3d18e,
> + 0x3d1af, 0x3d200, 0x3d299, 0x3d2a0, 0x3d2b5, 0x3d2bc,
> + 0x3d2c1, 0x3d8e2, 0x3d96b, 0x3da02, 0x3da7d, 0x3dc00,
> + 0x3dc09, 0x3dc0a, 0x3dc41, 0x3dc42, 0x3dc47, 0x3dc48,
> + 0x3dc4b, 0x3dc4e, 0x3dc51, 0x3dc52, 0x3dc67, 0x3dc68,
> + 0x3dc71, 0x3dc72, 0x3dc75, 0x3dc76, 0x3dc79, 0x3dc84,
> + 0x3dc87, 0x3dc8e, 0x3dc91, 0x3dc92, 0x3dc95, 0x3dc96,
> + 0x3dc99, 0x3dc9a, 0x3dca1, 0x3dca2, 0x3dca7, 0x3dca8,
> + 0x3dcab, 0x3dcae, 0x3dcb1, 0x3dcb2, 0x3dcb5, 0x3dcb6,
> + 0x3dcb9, 0x3dcba, 0x3dcbd, 0x3dcbe, 0x3dcc1, 0x3dcc2,
> + 0x3dcc7, 0x3dcc8, 0x3dccb, 0x3dcce, 0x3dcd7, 0x3dcd8,
> + 0x3dce7, 0x3dce8, 0x3dcf1, 0x3dcf2, 0x3dcfb, 0x3dcfc,
> + 0x3dcff, 0x3dd00, 0x3dd15, 0x3dd16, 0x3dd39, 0x3dd42,
> + 0x3dd49, 0x3dd4a, 0x3dd55, 0x3dd56, 0x3dd79, 0x3dde0,
> + 0x3dde5, 0x3e000, 0x3e059, 0x3e060, 0x3e129, 0x3e140,
> + 0x3e15f, 0x3e162, 0x3e181, 0x3e182, 0x3e1a1, 0x3e1a2,
> + 0x3e1ed, 0x3e200, 0x3e35d, 0x3e3cc, 0x3e407, 0x3e420,
> + 0x3e479, 0x3e480, 0x3e493, 0x3e4a0, 0x3e4a5, 0x3e4c0,
> + 0x3e4cd, 0x3e600, 0x3edb1, 0x3edb8, 0x3eddb, 0x3ede0,
> + 0x3edfb, 0x3ee00, 0x3eeef, 0x3eef6, 0x3efb5, 0x3efc0,
> + 0x3efd9, 0x3efe0, 0x3efe3, 0x3f000, 0x3f019, 0x3f020,
> + 0x3f091, 0x3f0a0, 0x3f0b5, 0x3f0c0, 0x3f111, 0x3f120,
> + 0x3f15d, 0x3f160, 0x3f179, 0x3f180, 0x3f185, 0x3f200,
> + 0x3f4a9, 0x3f4c0, 0x3f4dd, 0x3f4e0, 0x3f4fb, 0x3f500,
> + 0x3f515, 0x3f51e, 0x3f58f, 0x3f59c, 0x3f5bb, 0x3f5be,
> + 0x3f5d5, 0x3f5e0, 0x3f5f3, 0x3f600, 0x3f727, 0x3f728,
> + 0x3f7f5, 0x40000, 0x54dc1, 0x54e00, 0x56e75, 0x56e80,
> + 0x5703d, 0x57040, 0x59d45, 0x59d60, 0x5d7c3, 0x5d7e0,
> + 0x5dcbd, 0x5f000, 0x5f43d, 0x60000, 0x62697, 0x626a0,
> + 0x64761, 0x1c0200, 0x1c03e1,
> + };
> +
> enum class _Gcb_property {
> _Gcb_Other = 0,
> _Gcb_Control = 1,
> @@ -81,7 +333,7 @@
> _Gcb_Regional_Indicator = 13,
> };
>
> - // Values generated by contrib/unicode/gen_std_format_width.py,
> + // Values generated by contrib/unicode/gen_libstdcxx_unicode_data.py,
> // from GraphemeBreakProperty.txt from the Unicode standard.
> // Entries are (code_point << shift_bits) + property.
> inline constexpr int __gcb_shift_bits = 0x4;
> @@ -381,7 +633,7 @@
>
> enum class _InCB { _Consonant = 1, _Extend = 2 };
>
> - // Values generated by contrib/unicode/gen_std_format_width.py,
> + // Values generated by contrib/unicode/gen_libstdcxx_unicode_data.py,
> // from DerivedCoreProperties.txt from the Unicode standard.
> // Entries are (code_point << 2) + property.
> inline constexpr uint32_t __incb_edges[] = {
> @@ -519,7 +771,7 @@
> 0x380082, 0x380200, 0x380402, 0x3807c0,
> };
>
> - // Table generated by contrib/unicode/gen_std_format_width.py,
> + // Table generated by contrib/unicode/gen_libstdcxx_unicode_data.py,
> // from emoji-data.txt from the Unicode standard.
> inline constexpr char32_t __xpicto_edges[] = {
> 0xa9, 0xaa, 0xae, 0xaf, 0x203c, 0x203d, 0x2049, 0x204a,
> diff --git a/libstdc++-v3/include/bits/unicode.h b/libstdc++-v3/include/bits/unicode.h
> index 24b1ac3d53d..f1b6bf49c54 100644
> --- a/libstdc++-v3/include/bits/unicode.h
> +++ b/libstdc++-v3/include/bits/unicode.h
> @@ -150,6 +150,11 @@ namespace __unicode
> base() const requires forward_iterator<_Iter>
> { return _M_curr(); }
>
> + [[nodiscard]]
> + constexpr iter_difference_t<_Iter>
> + _M_units() const requires forward_iterator<_Iter>
> + { return _M_to_increment; }
> +
> [[nodiscard]]
> constexpr value_type
> operator*() const { return _M_buf[_M_buf_index]; }
> @@ -609,6 +614,18 @@ inline namespace __v16_0_0
> return (__p - __width_edges) % 2 + 1;
> }
>
> + // @pre c <= 0x10FFFF
> + constexpr bool
> + __should_escape_category(char32_t __c) noexcept
> + {
> + constexpr uint32_t __mask = 0x01;
> + auto* __end = std::end(__escape_edges);
> + auto* __p = std::lower_bound(__escape_edges, __end,
> + (__c << 1u) + 2);
> + return __p[-1] & __mask;
> + }
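
To convince myself that the "+ 2" in the lower_bound key is right, I modelled the lookup in Python. This is only a sketch: it uses the first six table entries from the patch, so it is valid only for code points below U+0378 (the next entry, 0x6f1, is U+0378), and the Python function name is mine.

```python
from bisect import bisect_left

# First six entries of __escape_edges, copied from the patch.
ESCAPE_EDGES = [0x1, 0x42, 0xFF, 0x142, 0x15B, 0x15C]


def should_escape_category(c):
    """Model of the C++ lookup: find the last edge whose code point is <= c.

    (c << 1) + 2 is the smallest packed value belonging to code point c + 1,
    so the element just before the lower bound is the edge that covers c.
    """
    p = bisect_left(ESCAPE_EDGES, (c << 1) + 2)
    return bool(ESCAPE_EDGES[p - 1] & 1)
```

For example, U+0020 and U+00AD come out as escaped by category, while U+0041 and U+00AE do not. The space is then exempted separately by __should_escape_unicode.
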
> +
> +
> // @pre c <= 0x10FFFF
> constexpr _Gcb_property
> __grapheme_cluster_break_property(char32_t __c) noexcept
> @@ -1039,6 +1056,8 @@ inline namespace __v16_0_0
> string_view __s(__enc);
> if (__s.ends_with("//"))
> __s.remove_suffix(2);
> + if (__s.ends_with("LE") || __s.ends_with("BE"))
> + __s.remove_suffix(2);
> return __s == "16" || __s == "32";
> }
> }
> diff --git a/libstdc++-v3/include/bits/version.def b/libstdc++-v3/include/bits/version.def
> index 1468c0491b7..d7621431762 100644
> --- a/libstdc++-v3/include/bits/version.def
> +++ b/libstdc++-v3/include/bits/version.def
> @@ -1404,18 +1404,18 @@ ftms = {
> };
> };
>
> -// ftms = {
> - // name = format_ranges;
> +ftms = {
> + name = format_ranges;
> // 202207 P2286R8 Formatting Ranges
> // 202207 P2585R1 Improving default container formatting
> // LWG3750 Too many papers bump __cpp_lib_format
> - // TODO: #define __cpp_lib_format_ranges 202207L
> - // values = {
> - // v = 202207;
> - // cxxmin = 23;
> - // hosted = yes;
> - // };
> -// };
> + stdname = __glibcxx_format_ranges_part; // TODO remove
> + values = {
> + v = 1; // TODO 202207
> + cxxmin = 23;
> + hosted = yes;
> + };
> +};
>
> ftms = {
> name = freestanding_algorithm;
> diff --git a/libstdc++-v3/include/bits/version.h b/libstdc++-v3/include/bits/version.h
> index f7c9849893d..f51bf38418d 100644
> --- a/libstdc++-v3/include/bits/version.h
> +++ b/libstdc++-v3/include/bits/version.h
> @@ -1555,6 +1555,16 @@
> #endif /* !defined(__cpp_lib_expected) && defined(__glibcxx_want_expected) */
> #undef __glibcxx_want_expected
>
> +#if !defined(__cpp_lib_format_ranges)
> +# if (__cplusplus >= 202100L) && _GLIBCXX_HOSTED
> +# define __glibcxx_format_ranges 1L
> +# if defined(__glibcxx_want_all) || defined(__glibcxx_want_format_ranges)
> +# define __glibcxx_format_ranges_part 1L
> +# endif
> +# endif
> +#endif /* !defined(__cpp_lib_format_ranges) && defined(__glibcxx_want_format_ranges) */
> +#undef __glibcxx_want_format_ranges
> +
> #if !defined(__cpp_lib_freestanding_algorithm)
> # if (__cplusplus >= 202100L)
> # define __glibcxx_freestanding_algorithm 202311L
> diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format
> index c3327e1d384..9877386c5fc 100644
> --- a/libstdc++-v3/include/std/format
> +++ b/libstdc++-v3/include/std/format
> @@ -82,8 +82,35 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> /// @cond undocumented
> namespace __format
> {
> - // Type-erased character sink.
> + // STATICALLY-WIDEN, see C++20 [time.general]
> + // It doesn't matter for format strings (which can only be char or wchar_t)
> + // but this returns the narrow string for anything that isn't wchar_t. This
> + // is done because const char* can be inserted into any ostream type, and
> + // will be widened at runtime if necessary.
> + template<typename _CharT>
> + consteval auto
> + _Widen(const char* __narrow, const wchar_t* __wide)
> + {
> + if constexpr (is_same_v<_CharT, wchar_t>)
> + return __wide;
> + else
> + return __narrow;
> + }
> +#define _GLIBCXX_WIDEN_(C, S) ::std::__format::_Widen<C>(S, L##S)
> +#define _GLIBCXX_WIDEN(S) _GLIBCXX_WIDEN_(_CharT, S)
> +
> + // Type-erased character sinks.
> template<typename _CharT> class _Sink;
> + template<typename _CharT> class _Fixedbuf_sink;
> + template<typename _Seq> class _Seq_sink;
> +
> + template<typename _CharT, typename _Alloc = allocator<_CharT>>
> + using _Str_sink
> + = _Seq_sink<basic_string<_CharT, char_traits<_CharT>, _Alloc>>;
> +
> + // template<typename _CharT, typename _Alloc = allocator<_CharT>>
> + // using _Vec_sink = _Seq_sink<vector<_CharT, _Alloc>>;
> +
> // Output iterator that writes to a type-erase character sink.
> template<typename _CharT>
> class _Sink_iter;
> @@ -850,6 +877,273 @@ namespace __format
> __spec._M_fill);
> }
>
> + // Values are indices into _Escapes::_S_all().
> + enum class _Term_char : unsigned char {
> + _Tc_quote = 12,
> + _Tc_apos = 15
> + };
> +
> + template<typename _CharT>
> + struct _Escapes
> + {
> + using _Str_view = basic_string_view<_CharT>;
> +
> + static consteval
> + _Str_view _S_all()
> + { return _GLIBCXX_WIDEN("\t\\t\n\\n\r\\r\\\\\\\"\\\"'\\'\\u\\x"); }
> +
> + static constexpr
> + _CharT _S_term(_Term_char __term)
> + { return _S_all()[static_cast<unsigned char>(__term)]; }
> +
> + static consteval
> + _Str_view _S_tab()
> + { return _S_all().substr(0, 3); }
> +
> + static consteval
> + _Str_view _S_nline()
> + { return _S_all().substr(3, 3); }
> +
> + static consteval
> + _Str_view _S_carret()
> + { return _S_all().substr(6, 3); }
> +
> + static consteval
> + _Str_view _S_bslash()
> + { return _S_all().substr(9, 3); }
> +
> + static consteval
> + _Str_view _S_quote()
> + { return _S_all().substr(12, 3); }
> +
> + static consteval
> + _Str_view _S_apos()
> + { return _S_all().substr(15, 3); }
> +
> + static consteval
> + _Str_view _S_u()
> + { return _S_all().substr(18, 2); }
> +
> + static consteval
> + _Str_view _S_x()
> + { return _S_all().substr(20, 2); }
> + };
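
The offsets into _S_all() are easy to get wrong, so I checked them with a small Python sketch. It relies on the fact that the escape sequences used in this particular literal mean the same thing in Python as in C++; the variable names are mine.

```python
# Same literal as in _Escapes::_S_all(); 22 characters in total.
ALL = "\t\\t\n\\n\r\\r\\\\\\\"\\\"'\\'\\u\\x"

# Each triple is the raw character followed by its two-character escape.
tab, nline, carret = ALL[0:3], ALL[3:6], ALL[6:9]
bslash, quote, apos = ALL[9:12], ALL[12:15], ALL[15:18]

# The last two pairs are the prefixes for hexadecimal escapes.
u_prefix, x_prefix = ALL[18:20], ALL[20:22]

# _Term_char values 12 and 15 index the raw quote characters.
TC_QUOTE, TC_APOS = 12, 15
```

The substr() offsets and the two _Term_char values in the patch agree with this layout.
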
> +
> + template<typename _CharT>
> + struct _Separators
> + {
> + using _Str_view = basic_string_view<_CharT>;
> +
> + static consteval
> + _Str_view _S_all()
> + { return _GLIBCXX_WIDEN("{}"); }
> +
> + static consteval
> + _Str_view _S_braces()
> + { return _S_all().substr(0, 2); }
> + };
> +
> + template<typename _CharT>
> + constexpr bool __should_escape_ascii(_CharT __c, _Term_char __term)
> + {
> + using _Esc = _Escapes<_CharT>;
> + switch (__c)
> + {
> + case _Esc::_S_tab()[0]:
> + case _Esc::_S_nline()[0]:
> + case _Esc::_S_carret()[0]:
> + case _Esc::_S_bslash()[0]:
> + return true;
> + case _Esc::_S_quote()[0]:
> + return __term == _Term_char::_Tc_quote;
> + case _Esc::_S_apos()[0]:
> + return __term == _Term_char::_Tc_apos;
> + default:
> + return (__c >= 0 && __c < 0x20) || __c == 0x7f;
> + }
> + }
> +
> + // @pre __c <= 0x10FFFF
> + constexpr bool __should_escape_unicode(char32_t __c, bool __prev_esc)
> + {
> + using namespace __unicode;
> + if (__should_escape_category(__c))
> + return __c != U' ';
> + if (!__prev_esc)
> + return false;
> + auto __gcp = __grapheme_cluster_break_property(__c);
> + return __gcp == _Gcb_property::_Gcb_Extend;
> + }
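
A rough Python model of the non-ASCII decision, to show how the "previous character was escaped" state interacts with combining characters. It makes several assumptions: it uses Python's unicodedata instead of the generated tables, it approximates Grapheme_Cluster_Break=Extend by the Mn and Me general categories, and it leaves out the ASCII short escapes and the ill-formed-sequence handling. The function names are mine.

```python
import unicodedata

ESCAPED_CATEGORIES = {"Zs", "Zl", "Zp", "Cc", "Cf", "Cs", "Co", "Cn"}


def should_escape_unicode(ch, prev_esc):
    """Escape by general category (except space), or an extending
    character that directly follows an escaped one."""
    if unicodedata.category(ch) in ESCAPED_CATEGORIES:
        return ch != " "
    if not prev_esc:
        return False
    # Approximation of Grapheme_Cluster_Break=Extend (assumption).
    return unicodedata.category(ch) in {"Mn", "Me"}


def escape_unicode(s):
    """Apply the decision above, starting with prev_esc = True as the patch does."""
    out, prev_esc = [], True
    for ch in s:
        if should_escape_unicode(ch, prev_esc):
            out.append("\\u{%x}" % ord(ch))
            prev_esc = True
        else:
            out.append(ch)
            prev_esc = False
    return "".join(out)
```

In this model a combining acute accent (U+0301) is escaped at the start of the string or right after an escaped character, and is left alone after an ordinary letter.
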
> +
> + using uint_least32_t = __UINT_LEAST32_TYPE__;
> + template<typename _Out, typename _CharT>
> + _Out
> + __write_escape_seq(_Out __out, uint_least32_t __val,
> + basic_string_view<_CharT> __prefix)
> + {
> + constexpr size_t __max = 8;
> + char __buf[__max];
> + const string_view __narrow(__buf,
> + to_chars(__buf, __buf + __max, __val, 16).ptr);
> +
> + __out = __write(__out, __prefix);
> + *__out = _Separators<_CharT>::_S_braces()[0];
> + ++__out;
> + if constexpr (is_same_v<char, _CharT>)
> + __out = __write(__out, __narrow);
> +#ifdef _GLIBCXX_USE_WCHAR_T
> + else
> + {
> + _CharT __wbuf[__max];
> + const size_t __n = __narrow.size();
> + std::__to_wstring_numeric(__narrow.data(), __n, __wbuf);
> + __out = __write(__out, basic_string_view<_CharT>(__wbuf, __n));
> + }
> +#endif
> + *__out = _Separators<_CharT>::_S_braces()[1];
> + return ++__out;
> + }
> +
> + template<typename _Out, typename _CharT>
> + _Out
> + __write_escaped_char(_Out __out, _CharT __c)
> + {
> + using _UChar = make_unsigned_t<_CharT>;
> + using _Esc = _Escapes<_CharT>;
> + switch (__c)
> + {
> + case _Esc::_S_tab()[0]:
> + return __write(__out, _Esc::_S_tab().substr(1, 2));
> + case _Esc::_S_nline()[0]:
> + return __write(__out, _Esc::_S_nline().substr(1, 2));
> + case _Esc::_S_carret()[0]:
> + return __write(__out, _Esc::_S_carret().substr(1, 2));
> + case _Esc::_S_bslash()[0]:
> + return __write(__out, _Esc::_S_bslash().substr(1, 2));
> + case _Esc::_S_quote()[0]:
> + return __write(__out, _Esc::_S_quote().substr(1, 2));
> + case _Esc::_S_apos()[0]:
> + return __write(__out, _Esc::_S_apos().substr(1, 2));
> + default:
> + return __write_escape_seq(__out, static_cast<_UChar>(__c),
> + _Esc::_S_u());
> + }
> + }
> +
> + template<typename _CharT, typename _Out>
> + _Out
> + __write_escaped_ascii(_Out __out,
> + basic_string_view<_CharT> __str,
> + _Term_char __term)
> + {
> + auto __first = __str.begin();
> + auto const __last = __str.end();
> + while (__first != __last)
> + {
> + auto __print = __first;
> + // assume anything outside ASCII is printable
> + while (__print != __last && !__should_escape_ascii(*__print, __term))
> + ++__print;
> +
> + if (__print != __first)
> + __out = __write(__out, basic_string_view(__first, __print));
> +
> + if (__print == __last)
> + return __out;
> +
> + __first = __print;
> + __out = __write_escaped_char(__out, *__first);
> + ++__first;
> + }
> + return __out;
> + }
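
For the non-Unicode path, here is how I understand the intended output, as a Python sketch. It is a model of the behaviour described in the commit message, not of the implementation details (the real code writes runs of unescaped characters in one go); the names are mine, and the surrounding terminators are added here although in the patch that is done by __write_escaped.

```python
SHORT_ESCAPES = {
    "\t": "\\t", "\n": "\\n", "\r": "\\r",
    "\\": "\\\\", '"': '\\"', "'": "\\'",
}


def should_escape_ascii(ch, term):
    """Escape \\t, \\n, \\r, backslash, the active terminator,
    and the control characters [0x00, 0x20) and 0x7f."""
    if ch in "\t\n\r\\":
        return True
    if ch in "\"'":
        return ch == term
    return ord(ch) < 0x20 or ord(ch) == 0x7F


def write_escaped_ascii(s, term='"'):
    """Quote s, escaping only what should_escape_ascii selects.
    Anything outside ASCII is assumed to be printable."""
    out = [term]
    for ch in s:
        if should_escape_ascii(ch, term):
            out.append(SHORT_ESCAPES.get(ch) or "\\u{%x}" % ord(ch))
        else:
            out.append(ch)
    out.append(term)
    return "".join(out)
```

Note that an apostrophe inside a string is left as-is, and a double quote inside a character is left as-is, because only the active terminator is escaped.
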
> +
> + template<typename _CharT, typename _Out>
> + _Out
> + __write_escaped_unicode(_Out __out,
> + basic_string_view<_CharT> __str,
> + _Term_char __term)
> + {
> + using _Str_view = basic_string_view<_CharT>;
> + using _UChar = make_unsigned_t<_CharT>;
> + using _Esc = _Escapes<_CharT>;
> +
> + static constexpr char32_t __replace = U'\uFFFD';
> + static constexpr _Str_view __replace_rep = _GLIBCXX_WIDEN("\uFFFD");
> +
> + __unicode::_Utf_view<char32_t, _Str_view> __v(std::move(__str));
> + auto __first = __v.begin();
> + auto const __last = __v.end();
> +
> + bool __prev_esc = true;
> + while (__first != __last)
> + {
> + bool __esc_ascii = false;
> + bool __esc_unicode = false;
> + bool __esc_replace = false;
> + auto __should_escape = [&](auto const& __it)
> + {
> + if (*__it <= 0x7f)
> + return __esc_ascii = __should_escape_ascii(*__it.base(), __term);
> + if (__should_escape_unicode(*__it, __prev_esc))
> + return __esc_unicode = true;
> + if (*__it == __replace)
> + {
> + _Str_view __units(__it.base(), __it._M_units());
> + return __esc_replace = (__units != __replace_rep);
> + }
> + return false;
> + };
> +
> + auto __print = __first;
> + while (__print != __last && !__should_escape(__print))
> + {
> + __prev_esc = false;
> + ++__print;
> + }
> +
> + if (__print != __first)
> + __out = __write(__out, _Str_view(__first.base(), __print.base()));
> +
> + if (__print == __last)
> + return __out;
> +
> + __first = __print;
> + if (__esc_ascii)
> + __out = __write_escaped_char(__out, *__first.base());
> + else if (__esc_unicode)
> + __out = __write_escape_seq(__out, *__first, _Esc::_S_u());
> + else // __esc_replace
> + for (_CharT __c : _Str_view(__first.base(), __first._M_units()))
> + __out = __write_escape_seq(__out, static_cast<_UChar>(__c),
> + _Esc::_S_x());
> + __prev_esc = true;
> + ++__first;
> +
> + }
> + return __out;
> + }
> +
> + template<typename _CharT, typename _Out>
> + _Out
> + __write_escaped(_Out __out, basic_string_view<_CharT> __str, _Term_char __term)
> + {
> + *__out = _Escapes<_CharT>::_S_term(__term);
> + ++__out;
> +
> + if constexpr (__unicode::__literal_encoding_is_unicode<_CharT>())
> + __out = __write_escaped_unicode(__out, __str, __term);
> + else if constexpr (is_same_v<char, _CharT>
> + && __unicode::__literal_encoding_is_extended_ascii())
> + __out = __write_escaped_ascii(__out, __str, __term);
> + else
> + // TODO Handle non-ascii extended encoding
> + __out = __write_escaped_ascii(__out, __str, __term);
> +
> + *__out = _Escapes<_CharT>::_S_term(__term);
> + return ++__out;
> + }
> +
> // A lightweight optional<locale>.
> struct _Optional_locale
> {
> @@ -971,7 +1265,7 @@ namespace __format
>
> if (*__first == 's')
> ++__first;
> -#if __cpp_lib_format_ranges
> +#if __glibcxx_format_ranges
> else if (*__first == '?')
> {
> __spec._M_type = _Pres_esc;
> @@ -990,11 +1284,37 @@ namespace __format
> format(basic_string_view<_CharT> __s,
> basic_format_context<_Out, _CharT>& __fc) const
> {
> - if (_M_spec._M_type == _Pres_esc)
> - {
> - // TODO: C++23 escaped string presentation
> - }
> + if (_M_spec._M_type != _Pres_esc)
> + return _M_format(__s, __fc);
> +
> + constexpr auto __term = __format::_Term_char::_Tc_quote;
> + if (_M_spec._M_get_width(__fc) <= 2
> + && _M_spec._M_prec_kind == _WP_none)
> + return __format::__write_escaped(__fc.out(), __s, __term);
> +
> + __format::_Str_sink<_CharT> __sink;
> + __format::_Sink_iter<_CharT> __out(__sink);
> + __format::__write_escaped(__out, __s, __term);
> + span<_CharT> __written = __sink.view();
> + basic_string_view<_CharT> __escaped(__written.data(), __written.size());
> + // N.B. [tab:format.type.string] defines '?' as
> + // Copies the escaped string ([format.string.escaped]) to the output,
> + // so precision seems to apply to the escaped string.
> + return _M_format(__escaped, __fc);
> + }
>
> +#if __glibcxx_format_ranges
> + constexpr void
> + set_debug_format() noexcept
> + { _M_spec._M_type = _Pres_esc; }
> +#endif
> +
> + private:
> + template<typename _Out>
> + _Out
> + _M_format(basic_string_view<_CharT> __s,
> + basic_format_context<_Out, _CharT>& __fc) const
> + {
> if (_M_spec._M_width_kind == _WP_none
> && _M_spec._M_prec_kind == _WP_none)
> return __format::__write(__fc.out(), __s);
> @@ -1020,13 +1340,7 @@ namespace __format
> __fc, _M_spec);
> }
>
> -#if __cpp_lib_format_ranges
> - constexpr void
> - set_debug_format() noexcept
> - { _M_spec._M_type = _Pres_esc; }
> -#endif
>
> - private:
> _Spec<_CharT> _M_spec{};
> };
>
> @@ -1130,7 +1444,7 @@ namespace __format
> ++__first;
> }
> break;
> -#if __cpp_lib_format_ranges
> +#if __glibcxx_format_ranges
> case '?':
> if (__type == _AsChar)
> {
> @@ -1285,6 +1599,34 @@ namespace __format
> return __format::__write_padded_as_spec({&__c, 1u}, 1, __fc, _M_spec);
> }
>
> + template<typename _Out>
> + typename basic_format_context<_Out, _CharT>::iterator
> + _M_format_character_escaped(_CharT __c,
> + basic_format_context<_Out, _CharT>& __fc) const
> + {
> + constexpr auto __term = __format::_Term_char::_Tc_apos;
> + const basic_string_view<_CharT> __in(&__c, 1u);
> + if (_M_spec._M_get_width(__fc) <= 3u)
> + return __format::__write_escaped(__fc.out(), __in, __term);
> +
> + _CharT __buf[12];
> + __format::_Fixedbuf_sink<_CharT> __sink(__buf);
> + __format::_Sink_iter<_CharT> __out(__sink);
> + __format::__write_escaped(__out, __in, __term);
> +
> + const basic_string_view<_CharT> __escaped = __sink.view();
> + size_t __estimated_width;
> + if (__escaped[1] == '\\') // escape sequence
> + __estimated_width = __escaped.size();
> + else if constexpr (__unicode::__literal_encoding_is_unicode<_CharT>())
> + __estimated_width = __unicode::__field_width(__escaped);
> + else
> + __estimated_width = 3;
> + return __format::__write_padded_as_spec(__escaped,
> + __estimated_width,
> + __fc, _M_spec);
> + }
> +
> template<typename _Int>
> static _CharT
> _S_to_character(_Int __i)
> @@ -1969,15 +2311,12 @@ namespace __format
> || _M_f._M_spec._M_type == __format::_Pres_c)
> return _M_f._M_format_character(__u, __fc);
> else if (_M_f._M_spec._M_type == __format::_Pres_esc)
> - {
> - // TODO
> - return __fc.out();
> - }
> + return _M_f._M_format_character_escaped(__u, __fc);
> else
> return _M_f.format(static_cast<make_unsigned_t<_CharT>>(__u), __fc);
> }
>
> -#if __cpp_lib_format_ranges
> +#if __glibcxx_format_ranges
> constexpr void
> set_debug_format() noexcept
> { _M_f._M_spec._M_type = __format::_Pres_esc; }
> @@ -2008,15 +2347,12 @@ namespace __format
> || _M_f._M_spec._M_type == __format::_Pres_c)
> return _M_f._M_format_character(__u, __fc);
> else if (_M_f._M_spec._M_type == __format::_Pres_esc)
> - {
> - // TODO
> - return __fc.out();
> - }
> + return _M_f._M_format_character_escaped(__u, __fc);
> else
> return _M_f.format(static_cast<unsigned char>(__u), __fc);
> }
>
> -#if __cpp_lib_format_ranges
> +#if __glibcxx_format_ranges
> constexpr void
> set_debug_format() noexcept
> { _M_f._M_spec._M_type = __format::_Pres_esc; }
> @@ -2046,7 +2382,7 @@ namespace __format
> format(_CharT* __u, basic_format_context<_Out, _CharT>& __fc) const
> { return _M_f.format(__u, __fc); }
>
> -#if __cpp_lib_format_ranges
> +#if __glibcxx_format_ranges
> constexpr void set_debug_format() noexcept { _M_f.set_debug_format(); }
> #endif
>
> @@ -2071,7 +2407,7 @@ namespace __format
> basic_format_context<_Out, _CharT>& __fc) const
> { return _M_f.format(__u, __fc); }
>
> -#if __cpp_lib_format_ranges
> +#if __glibcxx_format_ranges
> constexpr void set_debug_format() noexcept { _M_f.set_debug_format(); }
> #endif
>
> @@ -2095,7 +2431,7 @@ namespace __format
> basic_format_context<_Out, _CharT>& __fc) const
> { return _M_f.format({__u, _Nm}, __fc); }
>
> -#if __cpp_lib_format_ranges
> +#if __glibcxx_format_ranges
> constexpr void set_debug_format() noexcept { _M_f.set_debug_format(); }
> #endif
>
> @@ -2119,7 +2455,7 @@ namespace __format
> basic_format_context<_Out, char>& __fc) const
> { return _M_f.format(__u, __fc); }
>
> -#if __cpp_lib_format_ranges
> +#if __glibcxx_format_ranges
> constexpr void set_debug_format() noexcept { _M_f.set_debug_format(); }
> #endif
>
> @@ -2144,7 +2480,7 @@ namespace __format
> basic_format_context<_Out, wchar_t>& __fc) const
> { return _M_f.format(__u, __fc); }
>
> -#if __cpp_lib_format_ranges
> +#if __glibcxx_format_ranges
> constexpr void set_debug_format() noexcept { _M_f.set_debug_format(); }
> #endif
>
> @@ -2169,7 +2505,7 @@ namespace __format
> basic_format_context<_Out, char>& __fc) const
> { return _M_f.format(__u, __fc); }
>
> -#if __cpp_lib_format_ranges
> +#if __glibcxx_format_ranges
> constexpr void set_debug_format() noexcept { _M_f.set_debug_format(); }
> #endif
>
> @@ -2194,7 +2530,7 @@ namespace __format
> basic_format_context<_Out, wchar_t>& __fc) const
> { return _M_f.format(__u, __fc); }
>
> -#if __cpp_lib_format_ranges
> +#if __glibcxx_format_ranges
> constexpr void set_debug_format() noexcept { _M_f.set_debug_format(); }
> #endif
>
> @@ -2855,6 +3191,32 @@ namespace __format
> { return _Sink_iter<_CharT>(*this); }
> };
>
> +
> + template<typename _CharT>
> + class _Fixedbuf_sink final : public _Sink<_CharT>
> + {
> + void
> + _M_overflow() override
> + {
> + __glibcxx_assert(false);
> + this->_M_rewind();
> + }
> +
> + public:
> + [[__gnu__::__always_inline__]]
> + constexpr explicit
> + _Fixedbuf_sink(span<_CharT> __buf)
> + : _Sink<_CharT>(__buf)
> + { }
> +
> + constexpr basic_string_view<_CharT>
> + view() const
> + {
> + auto __s = this->_M_used();
> + return basic_string_view<_CharT>(__s.data(), __s.size());
> + }
> + };
> +
> // A sink with an internal buffer. This is used to implement concrete sinks.
> template<typename _CharT>
> class _Buf_sink : public _Sink<_CharT>
> @@ -2989,13 +3351,6 @@ namespace __format
> }
> };
>
> - template<typename _CharT, typename _Alloc = allocator<_CharT>>
> - using _Str_sink
> - = _Seq_sink<basic_string<_CharT, char_traits<_CharT>, _Alloc>>;
> -
> - // template<typename _CharT, typename _Alloc = allocator<_CharT>>
> - // using _Vec_sink = _Seq_sink<vector<_CharT, _Alloc>>;
> -
> // A sink that writes to an output iterator.
> // Writes to a fixed-size buffer and then flushes to the output iterator
> // when the buffer fills up.
> @@ -3671,17 +4026,17 @@ namespace __format
> return _M_visit([&__vis]<typename _Tp>(_Tp& __val) -> decltype(auto)
> {
> constexpr bool __user_facing = __is_one_of<_Tp,
> - monostate, bool, _CharT,
> - int, unsigned int, long long int, unsigned long long int,
> - float, double, long double,
> - const _CharT*, basic_string_view<_CharT>,
> - const void*, handle>::value;
> + monostate, bool, _CharT,
> + int, unsigned int, long long int, unsigned long long int,
> + float, double, long double,
> + const _CharT*, basic_string_view<_CharT>,
> + const void*, handle>::value;
> if constexpr (__user_facing)
> return std::forward<_Visitor>(__vis)(__val);
> else
> {
> - handle __h(__val);
> - return std::forward<_Visitor>(__vis)(__h);
> + handle __h(__val);
> + return std::forward<_Visitor>(__vis)(__h);
> }
> }, __type);
> }
> diff --git a/libstdc++-v3/testsuite/std/format/debug.cc b/libstdc++-v3/testsuite/std/format/debug.cc
> new file mode 100644
> index 00000000000..8b020778c0d
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/std/format/debug.cc
> @@ -0,0 +1,419 @@
> +// { dg-options "-fexec-charset=UTF-8 -fwide-exec-charset=UTF-32LE -DUNICODE_ENC" }
> +// { dg-do run { target c++23 } }
> +// { dg-add-options no_pch }
> +
> +#include <format>
> +#include <testsuite_hooks.h>
> +
> +std::string
> +fdebug(char t)
> +{ return std::format("{:?}", t); }
> +
> +std::wstring
> +fdebug(wchar_t t)
> +{ return std::format(L"{:?}", t); }
> +
> +std::string
> +fdebug(std::string_view t)
> +{ return std::format("{:?}", t); }
> +
> +std::wstring
> +fdebug(std::wstring_view t)
> +{ return std::format(L"{:?}", t); }
> +
> +
> +template<typename _CharT>
> +void
> +test_basic_escapes()
> +{
> + std::basic_string<_CharT> res;
> +
> + const auto tab = _GLIBCXX_WIDEN("\t");
> + res = fdebug(tab);
> + VERIFY( res == _GLIBCXX_WIDEN(R"("\t")") );
> + res = fdebug(tab[0]);
> + VERIFY( res == _GLIBCXX_WIDEN(R"('\t')") );
> +
> + const auto nline = _GLIBCXX_WIDEN("\n");
> + res = fdebug(nline);
> + VERIFY( res == _GLIBCXX_WIDEN(R"("\n")") );
> + res = fdebug(nline[0]);
> + VERIFY( res == _GLIBCXX_WIDEN(R"('\n')") );
> +
> + const auto carret = _GLIBCXX_WIDEN("\r");
> + res = fdebug(carret);
> + VERIFY( res == _GLIBCXX_WIDEN(R"("\r")") );
> + res = fdebug(carret[0]);
> + VERIFY( res == _GLIBCXX_WIDEN(R"('\r')") );
> +
> + const auto bslash = _GLIBCXX_WIDEN("\\");
> + res = fdebug(bslash);
> + VERIFY( res == _GLIBCXX_WIDEN(R"("\\")") );
> + res = fdebug(bslash[0]);
> + VERIFY( res == _GLIBCXX_WIDEN(R"('\\')") );
> +
> + const auto quote = _GLIBCXX_WIDEN("\"");
> + res = fdebug(quote);
> + VERIFY( res == _GLIBCXX_WIDEN(R"("\"")") );
> + res = fdebug(quote[0]);
> + VERIFY( res == _GLIBCXX_WIDEN(R"('"')") );
> +
> + const auto apos = _GLIBCXX_WIDEN("\'");
> + res = fdebug(apos);
> + VERIFY( res == _GLIBCXX_WIDEN(R"("'")") );
> + res = fdebug(apos[0]);
> + VERIFY( res == _GLIBCXX_WIDEN(R"('\'')") );
> +}
> +
> +template<typename _CharT>
> +void
> +test_ascii_escapes()
> +{
> + std::basic_string<_CharT> res;
> +
> + const auto in = _GLIBCXX_WIDEN("\x10 abcde\x7f\t0123");
> + res = fdebug(in);
> + VERIFY( res == _GLIBCXX_WIDEN(R"("\u{10} abcde\u{7f}\t0123")") );
> + res = fdebug(in[0]);
> + VERIFY( res == _GLIBCXX_WIDEN(R"('\u{10}')") );
> + res = fdebug(in[1]);
> + VERIFY( res == _GLIBCXX_WIDEN(R"(' ')") );
> + res = fdebug(in[2]);
> + VERIFY( res == _GLIBCXX_WIDEN(R"('a')") );
> +}
> +
> +template<typename _CharT>
> +void
> +test_unicode_escapes()
> +{
> + std::basic_string<_CharT> res;
> +
> + const auto in = _GLIBCXX_WIDEN(
> + "\u008a" // Cc, Control, Line Tabulation Set
> + "\u00ad" // Cf, Format, Soft Hyphen
> + "\u1d3d" // Lm, Modifier letter, Modifier Letter Capital Ou
> + "\u00a0" // Zs, Space Separator, No-Break Space (NBSP)
> + "\u2029" // Zp, Paragraph Separator, Paragraph Separator
> + "\U0001f984" // So, Other Symbol, Unicorn Face
> + );
> + const auto out = _GLIBCXX_WIDEN("\""
> + R"(\u{8a})"
> + R"(\u{ad})"
> + "\u1d3d"
> + R"(\u{a0})"
> + R"(\u{2029})"
> + "\U0001f984"
> + "\"");
> +
> + res = fdebug(in);
> + VERIFY( res == out );
> +
> + if constexpr (sizeof(_CharT) >= 2)
> + {
> + res = fdebug(in[0]);
> + VERIFY( res == _GLIBCXX_WIDEN(R"('\u{8a}')") );
> + res = fdebug(in[1]);
> + VERIFY( res == _GLIBCXX_WIDEN(R"('\u{ad}')") );
> + res = fdebug(in[2]);
> + VERIFY( res == _GLIBCXX_WIDEN("'\u1d3d'") );
> + res = fdebug(in[3]);
> + VERIFY( res == _GLIBCXX_WIDEN(R"('\u{a0}')") );
> + res = fdebug(in[4]);
> + VERIFY( res == _GLIBCXX_WIDEN(R"('\u{2029}')") );
> + }
> +
> + if constexpr (sizeof(_CharT) >= 4)
> + {
> + res = fdebug(in[5]);
> + VERIFY( res == _GLIBCXX_WIDEN("'\U0001f984'") );
> + }
> +}
> +
> +template<typename _CharT>
> +void
> +test_grapheme_extend()
> +{
> + std::basic_string<_CharT> res;
> +
> + const auto vin = _GLIBCXX_WIDEN("o\u0302\u0323");
> + res = fdebug(vin);
> + VERIFY( res == _GLIBCXX_WIDEN("\"o\u0302\u0323\"") );
> +
> + std::basic_string_view<_CharT> in = _GLIBCXX_WIDEN("\t\u0302\u0323");
> + res = fdebug(in);
> + VERIFY( res == _GLIBCXX_WIDEN(R"("\t\u{302}\u{323}")") );
> +
> + res = fdebug(in.substr(1));
> + VERIFY( res == _GLIBCXX_WIDEN(R"("\u{302}\u{323}")") );
> +
> + if constexpr (sizeof(_CharT) >= 2)
> + {
> + res = fdebug(in[1]);
> + VERIFY( res == _GLIBCXX_WIDEN(R"('\u{302}')") );
> + }
> +}
> +
> +template<typename _CharT>
> +void
> +test_replacement_char()
> +{
> + std::basic_string<_CharT> repl = _GLIBCXX_WIDEN("\uFFFD");
> + std::basic_string<_CharT> res = fdebug(repl);
> + VERIFY( res == _GLIBCXX_WIDEN("\"\uFFFD\"") );
> +
> + repl = _GLIBCXX_WIDEN("\uFFFD\uFFFD");
> + res = fdebug(repl);
> + VERIFY( res == _GLIBCXX_WIDEN("\"\uFFFD\uFFFD\"") );
> +}
> +
> +void
> +test_ill_formed_utf8_seq()
> +{
> + std::string_view seq = "\xf0\x9f\xa6\x84"; // \U0001F984
> + std::string res;
> +
> + res = fdebug(seq);
> + VERIFY( res == "\"\U0001F984\"" );
> +
> + res = fdebug(seq.substr(1));
> + VERIFY( res == R"("\x{9f}\x{a6}\x{84}")" );
> +
> + res = fdebug(seq.substr(2));
> + VERIFY( res == R"("\x{a6}\x{84}")" );
> +
> + res = fdebug(seq[0]);
> + VERIFY( res == R"('\x{f0}')" );
> + res = fdebug(seq.substr(0, 1));
> + VERIFY( res == R"("\x{f0}")" );
> +
> + res = fdebug(seq[1]);
> + VERIFY( res == R"('\x{9f}')" );
> + res = fdebug(seq.substr(1, 1));
> + VERIFY( res == R"("\x{9f}")" );
> +
> + res = fdebug(seq[2]);
> + VERIFY( res == R"('\x{a6}')" );
> + res = fdebug(seq.substr(2, 1));
> + VERIFY( res == R"("\x{a6}")" );
> +
> + res = fdebug(seq[3]);
> + VERIFY( res == R"('\x{84}')" );
> + res = fdebug(seq.substr(3, 1));
> + VERIFY( res == R"("\x{84}")" );
> +}
> +
> +void
> +test_ill_formed_utf32()
> +{
> + std::wstring res;
> +
> + wchar_t ic1 = static_cast<wchar_t>(0xff'ffff);
> + res = fdebug(ic1);
> + VERIFY( res == LR"('\x{ffffff}')" );
> +
> + std::wstring is1(1, ic1);
> + res = fdebug(is1);
> + VERIFY( res == LR"("\x{ffffff}")" );
> +
> + wchar_t ic2 = static_cast<wchar_t>(0xffff'ffff);
> + res = fdebug(ic2);
> + VERIFY( res == LR"('\x{ffffffff}')" );
> +
> + std::wstring is2(1, ic2);
> + res = fdebug(is2);
> + VERIFY( res == LR"("\x{ffffffff}")" );
> +}
> +
> +template<typename _CharT>
> +void
> +test_fill()
> +{
> + std::basic_string<_CharT> res;
> +
> + std::basic_string_view<_CharT> in = _GLIBCXX_WIDEN("a\t\x10\u00ad");
> + res = std::format(_GLIBCXX_WIDEN("{:10?}"), in.substr(0, 1));
> + VERIFY( res == _GLIBCXX_WIDEN(R"("a" )") );
> +
> + res = std::format(_GLIBCXX_WIDEN("{:->10?}"), in.substr(1, 1));
> + VERIFY( res == _GLIBCXX_WIDEN(R"(------"\t")") );
> +
> + res = std::format(_GLIBCXX_WIDEN("{:+<10?}"), in.substr(2, 1));
> + VERIFY( res == _GLIBCXX_WIDEN(R"("\u{10}"++)") );
> +
> +
> + res = std::format(_GLIBCXX_WIDEN("{:10?}"), in[0]);
> + VERIFY( res == _GLIBCXX_WIDEN(R"('a' )") );
> +
> + res = std::format(_GLIBCXX_WIDEN("{:->10?}"), in[1]);
> + VERIFY( res == _GLIBCXX_WIDEN(R"(------'\t')") );
> +
> + res = std::format(_GLIBCXX_WIDEN("{:+<10?}"), in[2]);
> + VERIFY( res == _GLIBCXX_WIDEN(R"('\u{10}'++)") );
> +
> +#if UNICODE_ENC
> + res = std::format(_GLIBCXX_WIDEN("{:=^10?}"), in.substr(3));
> + VERIFY( res == _GLIBCXX_WIDEN(R"(="\u{ad}"=)") );
> +
> + // width is 2
> + std::basic_string_view<_CharT> in2 = _GLIBCXX_WIDEN("\u1100");
> + res = std::format(_GLIBCXX_WIDEN("{:*^10?}"), in2);
> + VERIFY( res == _GLIBCXX_WIDEN("***\"\u1100\"***") );
> +
> + if constexpr (sizeof(_CharT) >= 2)
> + {
> + res = std::format(_GLIBCXX_WIDEN("{:=^10?}"), in[3]);
> + VERIFY( res == _GLIBCXX_WIDEN(R"(='\u{ad}'=)") );
> +
> + res = std::format(_GLIBCXX_WIDEN("{:*^10?}"), in2[0]);
> + VERIFY( res == _GLIBCXX_WIDEN("***'\u1100'***") );
> + }
> +#endif // UNICODE_ENC
> +}
> +
> +template<typename _CharT>
> +void
> +test_prec()
> +{
> + std::basic_string<_CharT> res;
> + // with '?', the escaped presentation is copied to the output, same as the source
> +
> + std::basic_string_view<_CharT> in = _GLIBCXX_WIDEN("a\t\x10\u00ad");
> + res = std::format(_GLIBCXX_WIDEN("{:.2?}"), in.substr(0, 1));
> + VERIFY( res == _GLIBCXX_WIDEN(R"("a)") );
> +
> + res = std::format(_GLIBCXX_WIDEN("{:.4?}"), in.substr(1, 1));
> + VERIFY( res == _GLIBCXX_WIDEN(R"("\t")") );
> +
> + res = std::format(_GLIBCXX_WIDEN("{:.5?}"), in.substr(2, 1));
> + VERIFY( res == _GLIBCXX_WIDEN(R"("\u{1)") );
> +
> +#if UNICODE_ENC
> + res = std::format(_GLIBCXX_WIDEN("{:.10?}"), in.substr(3));
> + VERIFY( res == _GLIBCXX_WIDEN(R"("\u{ad}")") );
> +
> + std::basic_string_view<_CharT> in2 = _GLIBCXX_WIDEN("\u1100");
> + res = std::format(_GLIBCXX_WIDEN("{:.3?}"), in2);
> + VERIFY( res == _GLIBCXX_WIDEN("\"\u1100") );
> +#endif // UNICODE_ENC
> +}
> +
> +void test_char_as_wchar()
> +{
> + std::wstring res;
> +
> + res = std::format(L"{:?}", 'a');
> + VERIFY( res == LR"('a')" );
> +
> + res = std::format(L"{:?}", '\t');
> + VERIFY( res == LR"('\t')" );
> +
> + res = std::format(L"{:+<10?}", '\x10');
> + VERIFY( res == LR"('\u{10}'++)" );
> +}
> +
> +template<typename T>
> +struct DebugWrapper
> +{
> + T val;
> +};
> +
> +template<typename T, typename CharT>
> +struct std::formatter<DebugWrapper<T>, CharT>
> +{
> + constexpr std::basic_format_parse_context<CharT>::iterator
> + parse(std::basic_format_parse_context<CharT>& pc)
> + {
> + auto out = under.parse(pc);
> + under.set_debug_format();
> + return out;
> + }
> +
> + template<typename Out>
> + Out format(DebugWrapper<T> const& t,
> + std::basic_format_context<Out, CharT>& fc) const
> + { return under.format(t.val, fc); }
> +
> +private:
> + std::formatter<T, CharT> under;
> +};
> +
> +template<typename _CharT, typename StrT>
> +void
> +test_formatter_str()
> +{
> + _CharT buf[]{ 'a', 'b', 'c', 0 };
> + DebugWrapper<StrT> in{ buf };
> + std::basic_string<_CharT> res = std::format(_GLIBCXX_WIDEN("{:?}"), in );
> + VERIFY( res == _GLIBCXX_WIDEN(R"("abc")") );
> +}
> +
> +template<typename _CharT>
> +void
> +test_formatter_arr()
> +{
> + std::basic_string<_CharT> res;
> +
> + DebugWrapper<_CharT[3]> in3{ 'a', 'b', 'c' };
> + res = std::format(_GLIBCXX_WIDEN("{:?}"), in3 );
> + VERIFY( res == _GLIBCXX_WIDEN(R"("abc")") );
> +
> + // We print all characters, including the null terminator
> + DebugWrapper<_CharT[4]> in4{ 'a', 'b', 'c', 0 };
> + res = std::format(_GLIBCXX_WIDEN("{:?}"), in4 );
> + VERIFY( res == _GLIBCXX_WIDEN(R"("abc\u{0}")") );
> +}
> +
> +template<typename _CharT, typename SrcT>
> +void
> +test_formatter_char()
> +{
> + DebugWrapper<SrcT> in{ 'a' };
> + std::basic_string<_CharT> res = std::format(_GLIBCXX_WIDEN("{:?}"), in);
> + VERIFY( res == _GLIBCXX_WIDEN(R"('a')") );
> +}
> +
> +template<typename CharT>
> +void
> +test_formatters()
> +{
> + test_formatter_char<CharT, CharT>();
> + test_formatter_str<CharT, CharT*>();
> + test_formatter_str<CharT, const CharT*>();
> + test_formatter_str<CharT, std::basic_string<CharT>>();
> + test_formatter_str<CharT, std::basic_string_view<CharT>>();
> + test_formatter_arr<CharT>();
> +}
> +
> +void
> +test_formatters_c()
> +{
> + test_formatters<char>();
> + test_formatters<wchar_t>();
> + test_formatter_char<wchar_t, char>();
> +}
> +
> +int main()
> +{
> + test_basic_escapes<char>();
> + test_basic_escapes<wchar_t>();
> + test_ascii_escapes<char>();
> + test_ascii_escapes<wchar_t>();
> +
> +#if UNICODE_ENC
> + test_unicode_escapes<char>();
> + test_unicode_escapes<wchar_t>();
> + test_grapheme_extend<char>();
> + test_grapheme_extend<wchar_t>();
> + test_replacement_char<char>();
> + test_replacement_char<wchar_t>();
> + test_ill_formed_utf8_seq();
> + test_ill_formed_utf32();
> +#endif // UNICODE_ENC
> +
> + test_fill<char>();
> + test_fill<wchar_t>();
> + test_prec<char>();
> + test_prec<wchar_t>();
> +
> + test_formatters_c();
> +}
> diff --git a/libstdc++-v3/testsuite/std/format/parse_ctx.cc b/libstdc++-v3/testsuite/std/format/parse_ctx.cc
> index b5dd7cdba78..b338ac7b762 100644
> --- a/libstdc++-v3/testsuite/std/format/parse_ctx.cc
> +++ b/libstdc++-v3/testsuite/std/format/parse_ctx.cc
> @@ -108,7 +108,7 @@ is_std_format_spec_for(std::string_view spec)
> }
> }
>
> -#if __cpp_lib_format_ranges
> +#if __glibcxx_format_ranges
> constexpr bool escaped_strings_supported = true;
> #else
> constexpr bool escaped_strings_supported = false;
> diff --git a/libstdc++-v3/testsuite/std/format/string.cc b/libstdc++-v3/testsuite/std/format/string.cc
> index ee987a15ec3..76614d4bc3e 100644
> --- a/libstdc++-v3/testsuite/std/format/string.cc
> +++ b/libstdc++-v3/testsuite/std/format/string.cc
> @@ -62,7 +62,7 @@ test_indexing()
> VERIFY( ! is_format_string_for("{} {0}", 1) );
> }
>
> -#if __cpp_lib_format_ranges
> +#if __glibcxx_format_ranges
> constexpr bool escaped_strings_supported = true;
> #else
> constexpr bool escaped_strings_supported = false;
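
For anyone following along, here is how I read the non-Unicode escaping
rule from the commit message, written as a standalone sketch. The name
escape_ascii_debug is made up for illustration; this is not the patch's
__write_escaped_ascii, and it assumes plain char input with every byte
at or above 0x80 treated as printable. The quote-handling rule (only the
quote matching the terminator is escaped) is inferred from the
test_basic_escapes expectations above.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical helper, not part of the patch: escape `s` following the
// non-Unicode rule described in the commit message, and wrap the result
// in `term`, which must be '"' (strings) or '\'' (characters).
std::string escape_ascii_debug(std::string_view s, char term)
{
  assert(term == '"' || term == '\'');
  std::string out(1, term);
  for (unsigned char c : s)
    {
      switch (c)
        {
        case '\t': out += "\\t"; break;
        case '\n': out += "\\n"; break;
        case '\r': out += "\\r"; break;
        case '\\': out += "\\\\"; break;
        case '"':
        case '\'':
          // Only the quote that matches the terminator is escaped.
          if (c == term)
            out += '\\';
          out += static_cast<char>(c);
          break;
        default:
          if (c < 0x20 || c == 0x7f)
            {
              // Remaining control characters become \u{hex}.
              char buf[16];
              std::snprintf(buf, sizeof buf, "\\u{%x}",
                            static_cast<unsigned>(c));
              out += buf;
            }
          else
            // Printable ASCII, plus bytes >= 0x80 (assumed printable).
            out += static_cast<char>(c);
        }
    }
  out += term;
  return out;
}
```

Calling escape_ascii_debug("\x10 abcde\x7f\t0123", '"') should give the
same text that test_ascii_escapes expects from "{:?}" for the narrow
string case.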
--
Michael Welsh Duggan
(md5i@md5i.com)