This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
On 12/5/18 11:16 AM, Jason Merrill wrote:
On 12/5/18 2:09 AM, Tom Honermann wrote:On 12/3/18 5:01 PM, Jason Merrill wrote:I'm not sure. I had concerns about unintended matching due to char8_t having an underlying type of unsigned char.On 12/3/18 4:51 PM, Jason Merrill wrote:On 11/5/18 2:39 PM, Tom Honermann wrote:This patch adds support for the P0482R5 core language changes. This includes:- The -fchar8_t and -fno_char8_t command line options. - char8_t as a keyword. - The char8_t builtin type as a non-aliasing unsigned integral character type of size 1. - Use of char8_t as a simple type specifier. - u8 character literals with type char8_t. - u8 string literals with type array of const char8_t.- User defined literal operators that accept char8_1 and char8_t pointertypes. - New __cpp_char8_t predefined feature test macro. - New __CHAR8_TYPE__ and __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macros . - Name mangling and demangling for char8_t (using Du). gcc/ChangeLog: 2018-11-04 Tom Honermann <tom@honermann.net> * defaults.h: Define CHAR8_TYPE. gcc/c-family/ChangeLog: 2018-11-04 Tom Honermann <tom@honermann.net> * c-family/c-common.c (c_common_reswords): Add char8_t.(fix_string_type): Use char8_t for the type of u8 string literals.(c_common_get_alias_set): char8_t doesn't alias.(c_common_nodes_and_builtins): Define char8_t as a builtin type inC++. (c_stddef_cpp_builtins): Add __CHAR8_TYPE__. (keyword_begins_type_specifier): Add RID_CHAR8. * gcc/c-family/c-common.h (rid): Add RID_CHAR8. (c_tree_index): Add CTI_CHAR8_TYPE and CTI_CHAR8_ARRAY_TYPE. Define D_CXX_CHAR8_T and D_CXX_CHAR8_T_FLAGS. Define char8_type_node and char8_array_type_node. * c-family/c-cppbuiltin.c (cpp_atomic_builtins): Predefine __GCC_ATOMIC_CHAR8_T_LOCK_FREE. (c_cpp_builtins): Predefine __cpp_char8_t.* c-family/c-lex.c (lex_string): Use char8_array_type_node as thetype of CPP_UTF8STRING.(lex_charconst): Use char8_type_node as the type of CPP_UTF8CHAR.* c-family/c.opt: Add the -fchar8_t command line option. gcc/c/ChangeLog: 2018-11-04 Tom Honermann <tom@honermann.net> * c/c-typeck.c (char_type_p): Add char8_type_node. (digest_init): Handle initialization by a u8 string literal of char8_t type. gcc/cp/ChangeLog: 2018-11-04 Tom Honermann <tom@honermann.net> * cp/cvt.c (type_promotes_to): Handle char8_t promotion. * cp/decl.c (grokdeclarator): Handle invalid type specifier combinations involving char8_t. * cp/lex.c (init_reswords): Add char8_t as a reserved word.* cp/mangle.c (write_builtin_type): Add name mangling for char8_t(Du). * cp/parser.c (cp_keyword_starts_decl_specifier_p, cp_parser_simple_type_specifier): Recognize char8_t as a simple type specifier.(cp_parser_string_literal): Use char8_array_type_node for the typeof CPP_UTF8STRING.(cp_parser_set_decl_spec_type): Tolerate char8_t typedefs in systemheaders.* cp/rtti.c (emit_support_tinfos): type_info support for char8_t. * cp/tree.c (char_type_p): Recognize char8_t as a character type.* cp/typeck.c (string_conv_p): Handle conversions of u8 string literals of char8_t type.(check_literal_operator_args): Handle UDLs with u8 string literalsof char8_t type.* cp/typeck2.c (digest_init_r): Disallow initializing a char arraywith a u8 string literal. libiberty/ChangeLog: 2018-10-31 Tom Honermann <tom@honermann.net> * cp-demangle.c (cplus_demangle_builtin_types, cplus_demangle_type): Add name demangling for char8_t (Du).* cp-demangle.h: Increase D_BUILTIN_TYPE_COUNT to accommodate thenew char8_t type.@@ -3543,6 +3556,10 @@ c_common_get_alias_set (tree t) if (!TYPE_P (t)) return -1;+ /* Unlike char, char8_t doesn't alias. */ + if (flag_char8_t && t == char8_type_node) + return -1;This seems unnecessary; doesn't the existing code have the same effect? I think we could do with just an adjustment to the existing comment.That shouldn't be a problem: if char8_t is a distinct type, it won't match unsigned char, and if it's the same as unsigned char, flag_char8_t will be false.+ else if (flag_char8_t && TREE_TYPE (value) == char8_array_type_node)In many places you check the flag and then for one of the char8 types. Since the types won't be used without the flag, checking the flag seems redundant?+ || (flag_char8_t && type == char8_type_node)+ bool char8_array = (flag_char8_t && !!comptypes (typ1, char8_type_node));+ || (flag_char8_t && type == char8_type_nodeThis was again protection against unintended matching of the underlying unsigned char type, particularly when compiling as C. char8_type_node is constructed (in c_common_nodes_and_builtins) following the pattern in place for char16_t and char32_t with the following code:+ char8_type_node = get_identifier (CHAR8_TYPE);+ char8_type_node = TREE_TYPE (identifier_global_value (char8_type_node));+ char8_type_size = TYPE_PRECISION (char8_type_node); + if (c_dialect_cxx ()) + { + char8_type_node = make_unsigned_type (char8_type_size); + + if (flag_char8_t) + record_builtin_type (RID_CHAR8, "char8_t", char8_type_node); + }My knowledge of gcc internals is weak, but I understand this to be, effectively, defining a type alias (of unsigned char) and then, if compiling as C++, re-defining it as a strong unsigned type.I don't recall the details now, but at one point, I was missing a check for flag_char8_t in some location and I encountered some test failures as a result.Since char8_type_node is always a distinct type in C++, we shouldn't need these checks in the C++ front end. And since it's never a distinct type in C, the C front end (c/*) doesn't need to change at all.- if (TYPE_PRECISION (typ1) == BITS_PER_UNIT) + if (TYPE_PRECISION (typ1) == BITS_PER_UNIT + && (typ1 == char_type_node || !flag_char8_t))This looks wrong, or at least incomplete; we want to complain about mismatched types here even with -fchar8-t. Perhaps we should replace all of this if/else with simply comparing typ1 and char_type, and complaining if they're different. Talking about wide and non-wide isn't as useful as the actual types would be.Well, I suppose it isn't quite that simple, since we still need to treat the ordinary character types as interchangeable.We do need to complain about mismatched types and test gcc/testsuite/g++.dg/ext/char8_t-init-2.C was added to ensure that happens.You don't seem to test initializing an array of signed/unsigned char, which I think will be broken by the change only considering char_type_node.I think we want to specifically allow conversion from array of one ordinary character type to another, and otherwise complain about the types being different with a message like "cannot initialize array of %qT from array of %qT" rather than mess with terms like int-array and (non-)wide string.
Ping. Will you have a chance to update the patch soon, or would you like me to make these last changes myself?
Jason
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |