[REVISED PATCH 5/9]: C++ P0482R5 char8_t: Standard library support

Jonathan Wakely jwakely@redhat.com
Fri Feb 8 12:57:00 GMT 2019


On 07/02/19 23:35 -0500, Tom Honermann wrote:
>On 2/7/19 4:44 AM, Jonathan Wakely wrote:
>>On 23/12/18 21:27 -0500, Tom Honermann wrote:
>>>Attached is a revised patch that addresses changes in P0482R6.  
>>>Changes from the prior patch include:
>>>- Updated the value of the __cpp_char8_t feature test macro to 201811.
>>>
>>>Tested on x86_64-linux.
>>
>>Thanks, Tom, this is great work!
>>
>>The front-end changes for char8_t went in recently, and I'm finally
>>ready to commit the library parts.
>Great!
>>There's one big problem I found in
>>this patch, which is that the new numeric_limits<char8_t>
>>specialization uses constexpr unconditionally. That fails if <limits>
>>is compiled using options like -std=c++98 -fno-char8_t because the
>>specialization will be used, but the constexpr keyword isn't allowed.
>>That's easily fixed by replacing the keyword with _GLIBCXX_CONSTEXPR.
>Hmm, the code for the char8_t specialization was copied from the 
>char16_t specialization which also uses constexpr unconditionally (but 
>is guarded by a C++11+ requirement).

That can use it unconditionally, because there's no -fchar16_t switch
to enable char16_t prior to C++11.

>The char8_t specialization must 
>be elided when the compiler is invoked with -std=c++98 -fno-char8_t 
>(since the char8_t type doesn't exist then).  The _GLIBCXX_USE_CHAR8_T 
>guard doesn't suffice for this? _GLIBCXX_USE_CHAR8_T should only be 
>defined if __cpp_char8_t is defined; and that should only be defined 
>if -fchar8_t or -std=c++2a is specified.  Or perhaps you intended 
>-std=c++98 -fchar8_t?  I agree in that case that use of 
>_GLIBCXX_CONSTEXPR is necessary.

Yes, sorry, that was a typo above; I meant -std=c++98 -fchar8_t.

The -std=c++98 -fno-char8_t case works fine, as expected (because
-fno-char8_t is the default for -std=c++98 anyway).
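
To illustrate the fix, here is a minimal sketch of the conditional-constexpr
idiom. SKETCH_CONSTEXPR, sketch_char8 and sketch_limits are hypothetical
stand-ins invented for this example (they are not the library's
_GLIBCXX_CONSTEXPR, char8_t or numeric_limits), so treat it as an
approximation of the approach rather than the code in the patch:

```cpp
#include <climits>

// Hypothetical stand-in for a macro like _GLIBCXX_CONSTEXPR:
// expands to constexpr from C++11 onwards and to nothing in C++98.
#if __cplusplus >= 201103L
# define SKETCH_CONSTEXPR constexpr
#else
# define SKETCH_CONSTEXPR
#endif

// Stand-in for char8_t so the sketch compiles without -fchar8_t.
typedef unsigned char sketch_char8;

// Simplified limits-like class. The member functions stay valid in
// C++98 because the keyword is only present when the dialect allows it.
struct sketch_limits
{
  static SKETCH_CONSTEXPR sketch_char8 min_value() { return 0; }
  static SKETCH_CONSTEXPR sketch_char8 max_value() { return UCHAR_MAX; }
};
```

The same source then compiles with -std=c++98 and with later dialects; only
the later dialects get constant-expression evaluation.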

>>The other way to solve that problem would be for the compiler to give
>>an error if -fchar8_t is used with C++98, but I see no fundamental
>>reason that combination of options shouldn't be allowed. We can
>>support it in the library by using the macro.
>Agreed.
>>
>>As discussed in San Diego, the other change needed is to add the
>>abi_tag attribute to the new versions of path::u8string and
>>path::generic_u8string, so that their mangled names differ when the
>>return type differs:
>>
>>#ifdef _GLIBCXX_USE_CHAR8_T
>>   __attribute__((__abi_tag__("__u8")))
>>   std::u8string  u8string() const;
>>#else
>>   std::string    u8string() const;
>>#endif // _GLIBCXX_USE_CHAR8_T
>>
>>Otherwise we get ODR violations when linking objects compiled
>>with -fchar8_t enabled to objects with it disabled (e.g. linking
>>-std=c++17 objects to -std=c++2a objects, which needs to work).
>
>Are ODR violations bad? :)

Only when they make people send us bug reports ;-)
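
For anyone following along, a rough sketch of the pattern being discussed:
a member function whose return type depends on whether char8_t is enabled,
with an ABI tag applied only in the char8_t case. sketch_path and the
sketch_* typedefs are hypothetical names for illustration, the sketch tests
__cpp_char8_t directly rather than the library's internal macro, and
__attribute__((__abi_tag__)) is a GNU extension, so this assumes GCC or a
compatible compiler:

```cpp
#include <string>

// Pick the character type the same way the return type is picked:
// char8_t when the compiler enables it, plain char otherwise.
#ifdef __cpp_char8_t
typedef char8_t sketch_u8_char;
#else
typedef char sketch_u8_char;
#endif
typedef std::basic_string<sketch_u8_char> sketch_u8_string;

// Hypothetical, heavily simplified path-like class.
struct sketch_path
{
  std::string native;

  explicit sketch_path(const std::string& s) : native(s) { }

  // The tag becomes part of the mangled name, so objects built with and
  // without char8_t refer to different symbols instead of colliding.
#ifdef __cpp_char8_t
  __attribute__((__abi_tag__("__u8")))
#endif
  sketch_u8_string u8string() const
  { return sketch_u8_string(native.begin(), native.end()); }
};
```

Without the tag, both variants of u8string() would share one mangled name,
because the return type of a non-template member function is not part of
its mangling.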

>>
>>I suggest "__u8" as the name of the ABI tag, but I'm open to other
>>suggestions. "__char8_t" is a bit long and verbose. "__cxx20" would be
>>consistent with "__cxx11" used for the new ABI introduced in GCC 5 but
>>it regularly confuses people who think it is coupled to the -std=c++11
>>option (and so don't understand why they still see it for -std=c++14).
>I have no preference or alternative suggestions here.  Had I 
>recognized the issue, I would have asked you what to do about it :)
>>
>>Also, I see that you've made changes to <experimental/string_view> (to
>>add the experimental::u8string_view typedef) and to
>>std::experimental::path (to change the return type of u8string and
>>generic_u8string).
>>
>>The former change is fairly harmless; it only adds a typedef, albeit
>>one which is not a reserved name in C++14/C++17 and so should be
>>available for users to define as a macro. Maybe prior to C++2a we
>>should only define it when GNU extensions are enabled (i.e. when using
>>-std=gnu++14 not -std=c++14):
>>
>>#if defined _GLIBCXX_USE_CHAR8_T \
>> && (__cplusplus > 201703L || !defined __STRICT_ANSI__)
>> using u8string_view = basic_string_view<char8_t>;
>>#endif
>That makes sense.

Actually I was thinking about this further, and if somebody explicitly
uses -fchar8_t then they're asking for a non-standard dialect of C++
anyway, and so they can't complain about some extra non-standard
names. So I think it's fine to declare std::u8string_view whenever
char8_t is enabled.
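
The two guard policies can be compared with a small preprocessor sketch.
The SKETCH_* macros are hypothetical, and the sketch tests __cpp_char8_t
instead of the library's _GLIBCXX_USE_CHAR8_T so that it stands alone:

```cpp
// Policy A (the earlier suggestion): declare the typedef only if char8_t
// is enabled and either the dialect is newer than C++17 or GNU
// extensions are on (__STRICT_ANSI__ is not defined).
#if defined(__cpp_char8_t) \
  && (__cplusplus > 201703L || !defined(__STRICT_ANSI__))
# define SKETCH_DECLARE_POLICY_A 1
#else
# define SKETCH_DECLARE_POLICY_A 0
#endif

// Policy B (the simpler one): declare it whenever char8_t is enabled.
#if defined(__cpp_char8_t)
# define SKETCH_DECLARE_POLICY_B 1
#else
# define SKETCH_DECLARE_POLICY_B 0
#endif
```

The two only disagree for a combination such as -std=c++17 -fchar8_t,
where policy A hides the name and policy B declares it.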

>>Changing the return type of experimental::path members concerns me
>>more. That's a published TS which is not going to be revised, and it's
>>not obvious to me that users would want the change in semantics. If
>>somebody is still using the Filesystem TS in C++2a code, they're
>>probably not expecting it to change. If they need to update their code
>>for C++2a they might as well just use std::filesystem, and so having
>>char8_t support in std::experimental::filesystem isn't clearly useful.
>>
>I agree.  I added the support to the experimental implementations more 
>out of a desire to be complete and to remove any potential barriers to 
>use of -fchar8_t than because I felt the changes were really 
>necessary.  I would be perfectly fine with skipping the updates to the 
>experimental libraries completely.

OK, let's leave them alone.