This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 2/9]: C++ P0482R5 char8_t: Core language support


On 12/3/18 4:51 PM, Jason Merrill wrote:
On 11/5/18 2:39 PM, Tom Honermann wrote:
This patch adds support for the P0482R5 core language changes.  This includes:
- The -fchar8_t and -fno_char8_t command line options.
- char8_t as a keyword.
- The char8_t builtin type as a non-aliasing unsigned integral
   character type of size 1.
- Use of char8_t as a simple type specifier.
- u8 character literals with type char8_t.
- u8 string literals with type array of const char8_t.
- User defined literal operators that accept char8_1 and char8_t pointer
   types.
- New __cpp_char8_t predefined feature test macro.
- New __CHAR8_TYPE__ and __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined
   macros .
- Name mangling and demangling for char8_t (using Du).

gcc/ChangeLog:

2018-11-04  Tom Honermann  <tom@honermann.net>

      * defaults.h: Define CHAR8_TYPE.

gcc/c-family/ChangeLog:

2018-11-04  Tom Honermann  <tom@honermann.net>
      * c-family/c-common.c (c_common_reswords): Add char8_t.
      (fix_string_type): Use char8_t for the type of u8 string literals.
      (c_common_get_alias_set): char8_t doesn't alias.
      (c_common_nodes_and_builtins): Define char8_t as a builtin type in
      C++.
      (c_stddef_cpp_builtins): Add __CHAR8_TYPE__.
      (keyword_begins_type_specifier): Add RID_CHAR8.
      * gcc/c-family/c-common.h (rid): Add RID_CHAR8.
      (c_tree_index): Add CTI_CHAR8_TYPE and CTI_CHAR8_ARRAY_TYPE.
      Define D_CXX_CHAR8_T and D_CXX_CHAR8_T_FLAGS.
      Define char8_type_node and char8_array_type_node.
      * c-family/c-cppbuiltin.c (cpp_atomic_builtins): Predefine
      __GCC_ATOMIC_CHAR8_T_LOCK_FREE.
      (c_cpp_builtins): Predefine __cpp_char8_t.
      * c-family/c-lex.c (lex_string): Use char8_array_type_node as the
      type of CPP_UTF8STRING.
      (lex_charconst): Use char8_type_node as the type of CPP_UTF8CHAR.
      * c-family/c.opt: Add the -fchar8_t command line option.

gcc/c/ChangeLog:

2018-11-04  Tom Honermann  <tom@honermann.net>

      * c/c-typeck.c (char_type_p): Add char8_type_node.
      (digest_init): Handle initialization by a u8 string literal of
      char8_t type.

gcc/cp/ChangeLog:

2018-11-04  Tom Honermann  <tom@honermann.net>

      * cp/cvt.c (type_promotes_to): Handle char8_t promotion.
      * cp/decl.c (grokdeclarator): Handle invalid type specifier
      combinations involving char8_t.
      * cp/lex.c (init_reswords): Add char8_t as a reserved word.
      * cp/mangle.c (write_builtin_type): Add name mangling for char8_t
      (Du).
      * cp/parser.c (cp_keyword_starts_decl_specifier_p,
      cp_parser_simple_type_specifier): Recognize char8_t as a simple
      type specifier.
      (cp_parser_string_literal): Use char8_array_type_node for the type
      of CPP_UTF8STRING.
      (cp_parser_set_decl_spec_type): Tolerate char8_t typedefs in system
      headers.
      * cp/rtti.c (emit_support_tinfos): type_info support for char8_t.
      * cp/tree.c (char_type_p): Recognize char8_t as a character type.
      * cp/typeck.c (string_conv_p): Handle conversions of u8 string
      literals of char8_t type.
      (check_literal_operator_args): Handle UDLs with u8 string literals
      of char8_t type.
      * cp/typeck2.c (digest_init_r): Disallow initializing a char array
      with a u8 string literal.

libiberty/ChangeLog:

2018-10-31  Tom Honermann  <tom@honermann.net>
      * cp-demangle.c (cplus_demangle_builtin_types,
      cplus_demangle_type): Add name demangling for char8_t (Du).
      * cp-demangle.h: Increase D_BUILTIN_TYPE_COUNT to accommodate the
      new char8_t type.

@@ -3543,6 +3556,10 @@ c_common_get_alias_set (tree t)
   if (!TYPE_P (t))
     return -1;

+  /* Unlike char, char8_t doesn't alias. */
+  if (flag_char8_t && t == char8_type_node)
+    return -1;

This seems unnecessary; doesn't the existing code have the same effect? I think we could do with just an adjustment to the existing comment.

+  else if (flag_char8_t && TREE_TYPE (value) == char8_array_type_node)
+      || (flag_char8_t && type == char8_type_node)
+      bool char8_array = (flag_char8_t && !!comptypes (typ1, char8_type_node));
+       || (flag_char8_t && type == char8_type_node
In many places you check the flag and then for one of the char8 types. Since the types won't be used without the flag, checking the flag seems redundant?

-      if (TYPE_PRECISION (typ1) == BITS_PER_UNIT)
+      if (TYPE_PRECISION (typ1) == BITS_PER_UNIT
+          && (typ1 == char_type_node || !flag_char8_t))

This looks wrong, or at least incomplete; we want to complain about mismatched types here even with -fchar8-t.  Perhaps we should replace all of this if/else with simply comparing typ1 and char_type, and complaining if they're different.  Talking about wide and non-wide isn't as useful as the actual types would be.

Well, I suppose it isn't quite that simple, since we still need to treat the ordinary character types as interchangeable.

Jason


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]