This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] Revision 3: utf-16 and utf-32 support in C and C++


Hi Kris,

Your patch caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36479

I have a patch at

http://gcc.gnu.org/ml/gcc-patches/2008-06/msg00523.html

Can you take a look?

Thanks.

H.J.

On Tue, Apr 15, 2008 at 1:00 PM, Kris Van Hees <kris.van.hees@oracle.com> wrote:
> Oracle has a full copyright assignment in place with the FSF.
>
> Please refer to the following message in the archives for the original
> posting of this patch:
>
>        http://gcc.gnu.org/ml/gcc-patches/2008-03/msg00827.html
>
> and the previous revisions in:
>
>        http://gcc.gnu.org/ml/gcc-patches/2008-03/msg01474.html
>        http://gcc.gnu.org/ml/gcc-patches/2008-03/msg02025.html
>
> This 3rd revised patch addresses more feedback provided on this list.  This
> patch is not incremental - it replaces the previous posting.  The changelog
> entries mentioned in this message also replace the original entries.  The
> description that follows describes the changes to the previous patch.
>
> This patch does not contain documentation yet (in the extensions section),
> because feedback may still cause changes to be made.  I'll be working on
> the documentation in the next few days.  That way the code and the doc will
> be finalised together.
>
> - #if 0 constructs that were left in the code accidentally have been removed.
> - Mangling no longer depends on C++0x mode.  The compiler will simply not
>  use those types unless C++0x is enabled, so there is no need to make this
>  conditional.
> - The types are created regardless of the compiler mode, yet they are not
>  registered as builtin types unless C++0x mode is enabled.  Disabling the
>  recognition of the char16_t/char32_t keywords in non-C++0x mode was not
>  sufficient to ensure the compiler would not use these types.
> - The parsing of [uU]["']...["'] literals is now controlled with a new flag
>  in lang_flags and cpp_options, and the appropriate flag is set for modes
>  where these literals are legal.
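
For readers following along, here is a minimal sketch of what the new
literals look like from user code.  It is not taken from the patch or its
testsuite; it assumes a compiler in C++0x/C++11 mode, and the names
code_units and check_utf_literals are hypothetical helpers made up for
this illustration.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <type_traits>

// The prefix selects the type of a character literal.
static_assert(std::is_same<decltype(u'a'), char16_t>::value, "u'...' is char16_t");
static_assert(std::is_same<decltype(U'a'), char32_t>::value, "U'...' is char32_t");

// Both types are wide enough for one code unit of their encoding.
static_assert(sizeof(char16_t) * CHAR_BIT >= 16, "char16_t holds a UTF-16 code unit");
static_assert(sizeof(char32_t) * CHAR_BIT >= 32, "char32_t holds a UTF-32 code unit");

// Hypothetical helper: number of code units in a string literal,
// not counting the terminating null.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) { return N - 1; }

// U+00E9 is inside the BMP: one code unit in either encoding.
static_assert(code_units(u"\u00e9") == 1, "one UTF-16 unit");
static_assert(code_units(U"\u00e9") == 1, "one UTF-32 unit");

// U+1D11E is outside the BMP: UTF-16 needs a surrogate pair.
static_assert(code_units(u"\U0001D11E") == 2, "surrogate pair in UTF-16");
static_assert(code_units(U"\U0001D11E") == 1, "single unit in UTF-32");

// Hypothetical runtime check of the actual code unit values.
void check_utf_literals() {
    const char16_t *s16 = u"\U0001D11E";
    const char32_t *s32 = U"\U0001D11E";
    assert(s16[0] == 0xD834 && s16[1] == 0xDD1E && s16[2] == 0);
    assert(s32[0] == 0x1D11E && s32[1] == 0);
}
```

Calling check_utf_literals() from main exercises the runtime part; the
static_assert lines are checked at compile time.
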
>
> ChangeLog entries:
> ------------------
> libcpp/ChangeLog:
> 2008-04-14  Kris Van Hees <kris.van.hees@oracle.com>
>
>        * include/cpp-id-data.h (UC): Was U, conflicts with U"..." literal.
>        * include/cpplib.h (CHAR16, CHAR32, STRING16, STRING32): New tokens.
>        (struct cpp_options): Added uliterals.
>        (cpp_interpret_string): Update prototype.
>        (cpp_interpret_string_notranslate): Idem.
>        * charset.c (init_iconv_desc): New width member in cset_converter.
>        (cpp_init_iconv): Add support for char{16,32}_cset_desc.
>        (convert_ucn): Idem.
>        (emit_numeric_escape): Idem.
>        (convert_hex): Idem.
>        (convert_oct): Idem.
>        (convert_escape): Idem.
>        (converter_for_type): New function.
>        (cpp_interpret_string): Use converter_for_type, support u and U prefix.
>        (cpp_interpret_string_notranslate): Match changed prototype.
>        (wide_str_to_charconst): Use converter_for_type.
>        (cpp_interpret_charconst): Add support for CPP_CHAR{16,32}.
>        * directives.c (linemarker_dir): Macro U changed to UC.
>        (parse_include): Idem.
>        (register_pragma_1): Idem.
>        (restore_registered_pragmas): Idem.
>        (get__Pragma_string): Support CPP_STRING{16,32}.
>        * expr.c (eval_token): Support CPP_CHAR{16,32}.
>        * init.c (struct lang_flags): Added uliterals.
>        (lang_defaults): Idem.
>        * internal.h (struct cset_converter) <width>: New field.
>        (struct cpp_reader) <char16_cset_desc>: Idem.
>        (struct cpp_reader) <char32_cset_desc>: Idem.
>        * lex.c (digraph_spellings): Macro U changed to UC.
>        (OP, TK): Idem.
>        (lex_string): Add support for u'...', U'...', u"..." and U"...".
>        (_cpp_lex_direct): Idem.
>        * macro.c (_cpp_builtin_macro_text): Macro U changed to UC.
>        (stringify_arg): Support CPP_CHAR{16,32} and CPP_STRING{16,32}.
>
> gcc/ChangeLog:
> 2008-04-14  Kris Van Hees <kris.van.hees@oracle.com>
>
>        * c-common.c (CHAR16_TYPE, CHAR32_TYPE): New macros.
>        (fname_as_string): Match updated cpp_interpret_string prototype.
>        (fix_string_type): Support char16_t* and char32_t*.
>        (c_common_nodes_and_builtins): Add char16_t and char32_t (and
>        derivative) nodes.  Register as builtin if C++0x.
>        (c_parse_error): Support CPP_CHAR{16,32}.
>        * c-common.h (RID_CHAR16, RID_CHAR32): New elements.
>        (enum c_tree_index) <CTI_CHAR16_TYPE, CTI_SIGNED_CHAR16_TYPE,
>        CTI_UNSIGNED_CHAR16_TYPE, CTI_CHAR32_TYPE, CTI_SIGNED_CHAR32_TYPE,
>        CTI_UNSIGNED_CHAR32_TYPE, CTI_CHAR16_ARRAY_TYPE,
>        CTI_CHAR32_ARRAY_TYPE>: New elements.
>        (char16_type_node, signed_char16_type_node, unsigned_char16_type_node,
>        char32_type_node, signed_char32_type_node, char16_array_type_node,
>        char32_array_type_node): New defines.
>        * c-lex.c (cb_ident): Match updated cpp_interpret_string prototype.
>        (c_lex_with_flags): Support CPP_CHAR{16,32} and CPP_STRING{16,32}.
>        (lex_string): Support CPP_STRING{16,32}, match updated
>        cpp_interpret_string and cpp_interpret_string_notranslate prototypes.
>        (lex_charconst): Support CPP_CHAR{16,32}.
>        * c-parser.c (c_parser_postfix_expression): Support CPP_CHAR{16,32}
>        and CPP_STRING{16,32}.
>
> gcc/cp/ChangeLog:
> 2008-04-14  Kris Van Hees <kris.van.hees@oracle.com>
>
>        * cvt.c (type_promotes_to): Support char16_t and char32_t.
>        * decl.c (grokdeclarator): Disallow signed/unsigned/short/long on
>        char16_t and char32_t.
>        * lex.c (reswords): Add char16_t and char32_t (for c++0x).
>        * mangle.c (write_builtin_type): Mangle char16_t/char32_t as vendor
>        extended builtin type "u8char{16,32}_t".
>        * parser.c (cp_lexer_next_token_is_decl_specifier_keyword): Support
>        RID_CHAR{16,32}.
>        (cp_lexer_print_token): Support CPP_STRING{16,32}.
>        (cp_parser_is_string_literal): Idem.
>        (cp_parser_string_literal): Idem.
>        (cp_parser_primary_expression): Support CPP_CHAR{16,32} and
>        CPP_STRING{16,32}.
>        (cp_parser_simple_type_specifier): Support RID_CHAR{16,32}.
>        * tree.c (char_type_p): Support char16_t and char32_t as char types.
>        * typeck.c (string_conv_p): Support char16_t and char32_t.
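
The mangle.c entry above can be illustrated with a small sketch.  In the
Itanium C++ ABI a vendor extended builtin type is written as "u" followed
by a <source-name>, that is, the identifier's length and then the
identifier, which is where the "8" in "u8char16_t" comes from.  The code
below only builds such strings; it is not how mangle.c is written, and the
function names vendor_extended_type and mangle_unary_function are
hypothetical.

```cpp
#include <cassert>
#include <string>

// <builtin-type> ::= u <source-name>
// <source-name>  ::= <length> <identifier>
std::string vendor_extended_type(const std::string &name) {
    return "u" + std::to_string(name.size()) + name;
}

// Hypothetical helper: mangled name of a global-scope function that
// takes a single parameter of a vendor extended builtin type.
// Layout: _Z <length> <function name> <parameter type>
std::string mangle_unary_function(const std::string &fn,
                                  const std::string &param_type) {
    return "_Z" + std::to_string(fn.size()) + fn
           + vendor_extended_type(param_type);
}
```

Under that scheme, vendor_extended_type("char16_t") yields "u8char16_t",
and a declaration such as "void f(char16_t);" would be expected to mangle
as "_Z1fu8char16_t".
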
>
> gcc/testsuite/ChangeLog:
> 2008-04-14  Kris Van Hees <kris.van.hees@oracle.com>
>
>        Tests for char16_t and char32_t support.
>        * g++.dg/ext/utf-cvt.C: New
>        * g++.dg/ext/utf-cxx0x.C: New
>        * g++.dg/ext/utf-cxx98.C: New
>        * g++.dg/ext/utf-dflt.C: New
>        * g++.dg/ext/utf-gnuxx0x.C: New
>        * g++.dg/ext/utf-gnuxx98.C: New
>        * g++.dg/ext/utf-mangle.C: New
>        * g++.dg/ext/utf-typedef-cxx0x.C: New
>        * g++.dg/ext/utf-typedef-cxx98.C: New
>        * g++.dg/ext/utf-typespec.C: New
>        * g++.dg/ext/utf16-1.C: New
>        * g++.dg/ext/utf16-2.C: New
>        * g++.dg/ext/utf16-3.C: New
>        * g++.dg/ext/utf16-4.C: New
>        * g++.dg/ext/utf32-1.C: New
>        * g++.dg/ext/utf32-2.C: New
>        * g++.dg/ext/utf32-3.C: New
>        * g++.dg/ext/utf32-4.C: New
>        * gcc.dg/utf-cvt.c: New
>        * gcc.dg/utf-dflt.c: New
>        * gcc.dg/utf16-1.c: New
>        * gcc.dg/utf16-2.c: New
>        * gcc.dg/utf16-3.c: New
>        * gcc.dg/utf16-4.c: New
>        * gcc.dg/utf32-1.c: New
>        * gcc.dg/utf32-2.c: New
>        * gcc.dg/utf32-3.c: New
>        * gcc.dg/utf32-4.c: New
>
> libiberty/ChangeLog:
> 2008-04-14  Kris Van Hees <kris.van.hees@oracle.com>
>
>        * testsuite/demangle-expected: Added tests for char16_t and char32_t.
>

