[REVISED PATCH 5/9]: C++ P0482R5 char8_t: Standard library support
Jonathan Wakely
jwakely@redhat.com
Fri Feb 8 12:57:00 GMT 2019
On 07/02/19 23:35 -0500, Tom Honermann wrote:
>On 2/7/19 4:44 AM, Jonathan Wakely wrote:
>>On 23/12/18 21:27 -0500, Tom Honermann wrote:
>>>Attached is a revised patch that addresses changes in P0482R6.Â
>>>Changes from the prior patch include:
>>>- Updated the value of the __cpp_char8_t feature test macro to 201811.
>>>
>>>Tested on x86_64-linux.
>>
>>Thanks, Tom, this is great work!
>>
>>The front-end changes for char8_t went in recently, and I'm finally
>>ready to commit the library parts.
>Great!
>>There's one big problem I found in
>>this patch, which is that the new numeric_limits<char8_t>
>>specialization uses constexpr unconditionally. That fails if <limits>
>>is compiled using options like -std=c++98 -fno-char8_t because the
>>specialization will be used, but the constexpr keyword isn't allowed.
>>That's easily fixed by replacing the keyword with _GLIBCXX_CONSTEXPR.
>Hmm, the code for the char8_t specialization was copied from the
>char16_t specialization which also uses constexpr unconditionally (but
>is guarded by a C++11+ requirement).
That can use it unconditionally, because there's no -fchar16_t switch
to enable char16_t prior to C++11.
>The char8_t specialization must
>be elided when the compiler is invoked with -std=c++98 -fno-char8_t
>(since the char8_t type doesn't exist then). The _GLIBCXX_USE_CHAR8_T
>guard doesn't suffice for this? _GLIBCXX_USE_CHAR8_T should only be
>defined if __cpp_char8_t is defined; and that should only be defined
>if -fchar8_t or -std=c++2a is specified. Or perhaps you intended
>-std=c++98 -fchar8_t? I agree in that case that use of
>_GLIBCXX_CONSTEXPR is necessary.
Yes sorry, that's a typo above, I meant -std=c++98 -fchar8_t.
The -std=c++98 -fno-char8_t case works fine, as expected (because
-fno-char8_t is the default for -std=c++98 anyway).
>>The other way to solve that problem would be for the compiler to give
>>an error if -fchar8_t is used with C++98, but I see no fundamental
>>reason that combination of options shouldn't be allowed. We can
>>support it in the library by using the macro.
>Agreed.
>>
>>As discussed in San Diego, the other change needed is to add the
>>abi_tag attribute to the new versions of path::u8string and
>>path::generic_u8string, so that the mangling is different when its
>>return type is different:
>>
>>#ifdef _GLIBCXX_USE_CHAR8_T
>>Â Â __attribute__((__abi_tag__("__u8")))
>>  std::u8string u8string() const;
>>#else
>>  std::string   u8string() const;
>>#endif // _GLIBCXX_USE_CHAR8_T
>>
>>Otherwise we get ODR violations when linking objects compiled
>>with -fchar8_t enabled to objects with it disabled (e.g. linking
>>-std=c++17 objects to -std=c++2a objects, which needs to work).
>
>Are ODR violations bad? :)
Only when they make people send us bug reports ;-)
>>
>>I suggest "__u8" as the name of the ABI tag, but I'm open to other
>>suggestions. "__char8_t" is a bit long and verbose. "__cxx20" would be
>>consistent with "__cxx11" used for the new ABI introduced in GCC 5 but
>>it regularly confuses people who think it is coupled to the -std=c++11
>>option (and so don't understand why they still see it for -std=c++14).
>I have no preference or alternative suggestions here. Had I
>recognized the issue, I would have asked you what to do about it :)
>>
>>Also, I see that you've made changes to <experimental/string_view> (to
>>add the experimental::u8string_view typedef) and to
>>std::experimental::path (to change the return type of u8string and
>>generic_u8string).
>>
>>The former change is fairly harmless; it only adds a typedef, albeit
>>one which is not a reserved name in C++14/C++17 and so should be
>>available for users to define as a macro. Maybe prior to C++2a we
>>should only define it when GNU extensions are enabled (i.e. when using
>>-std=gnu++14 not -std=c++14):
>>
>>#if defined _GLIBCXX_USE_CHAR8_T \
>>Â && (__cplusplus > 201703L || !defined __STRICT_ANSI__)
>>Â using u8string_view = basic_string_view<char8_t>;
>>#endif
>That makes sense.
Actually I was thinking about this further, and if somebody explicitly
uses -fchar8_t then they're asking for a non-standard dialect of C++
anyway, and so they can't complain about some extra non-standard
names. So I think it's fine to declare std::u8string_view whenever
char8_t is enabled.
>>Changing the return type of experimental::path members concerns me
>>more. That's a published TS which is not going to be revised, and it's
>>not obvious to me that users would want the change in semantics. If
>>somebody is still using the Filesystem TS in C++2a code, they're
>>probably not expecting it to change. If they need to update their code
>>for C++2a they might as well just use std::filesystem, and so having
>>char8_t support in std::experimental::filesystem isn't clearly useful.
>>
>I agree. I added the support to the experimental implementations more
>out of a desire to be complete and to remove any potential barriers to
>use of -fchar8_t than because I felt the changes were really
>necessary. I would be perfectly fine with skipping the updates to the
>experimental libraries completely.
OK, let's leave them alone.
More information about the Libstdc++
mailing list