[PATCH 1/3]: C N2653 char8_t: Language support

Tom Honermann tom@honermann.net
Fri Jun 11 16:20:48 GMT 2021

On 6/11/21 12:01 PM, Jakub Jelinek wrote:
> On Fri, Jun 11, 2021 at 11:52:41AM -0400, Tom Honermann via Gcc-patches wrote:
>> On 6/7/21 5:11 PM, Joseph Myers wrote:
>>> On Sun, 6 Jun 2021, Tom Honermann via Gcc-patches wrote:
>>>> When -fchar8_t support is enabled for non-C++ modes, the _CHAR8_T_SOURCE macro
>>>> is predefined.  This is the mechanism proposed to glibc to opt-in to
>>>> declarations of the char8_t typedef and c8rtomb and mbrtoc8 functions proposed
>>>> in N2653.  See [2].
>>> I don't think glibc should have such a feature test macro, and I don't
>>> think GCC should define such feature test macros either - _*_SOURCE macros
>>> are generally for the *user* to define to decide what namespace they want
>>> visible, not for the compiler to define.  Without proliferating new
>>> language dialects, __STDC_VERSION__ ought to be sufficient to communicate
>>> from the compiler to the library (including to GCC's own headers such as
>>> stdatomic.h).
>> In general I agree, but I think an exception is warranted in this case for a
>> few reasons:
>> 1. The feature includes both core language changes (the change of type
>>     for u8 string literals) and library changes.  The library changes
>>     are not actually dependent on the core language change, but they are
>>     intended to be used together.
>> 2. Existing use of the char8_t identifier can be found in existing open
>>     source projects and likely exists in some closed source projects as
>>     well.  An opt-in approach avoids conflict and the need to
>>     conditionalize code based on gcc version.
>> 3. An opt-in approach enables evaluation of the feature prior to any
>>     WG14 approval.
> But calling it _CHAR8_T_SOURCE is weird and inconsistent with everything
> else.
> In C++, there is __cpp_char8_t 201811L predefined macro for char8_t.
> Using that in C is not right, sure.
> Often we use __SIZEOF_type__ macros not just for sizeof(), but also for
> presence check of the types, like
> #ifdef __SIZEOF_INT128__
> __int128 i;
> #else
> long long i;
> #endif
> etc., while char8_t has sizeof (char8_t) == 1, perhaps predefining
> __SIZEOF_CHAR8_T__ 1
> instead of _CHAR8_T_SOURCE would be better?

I'm open to whatever signaling mechanism is preferred.  It took me 
a while to settle on _CHAR8_T_SOURCE as the mechanism to propose, as I 
didn't find much in the way of other precedents.

I agree that having _CHAR8_T_SOURCE be implied by the -fchar8_t option 
is unusual with respect to other feature test macros.  Is that what you 
find to be weird and inconsistent?

Predefining __SIZEOF_CHAR8_T__ would be consistent with 
__SIZEOF_WCHAR_T__, but kind of strange too since the size is always 1.

Perhaps a better approach would be to follow the __CHAR16_TYPE__ and 
__CHAR32_TYPE__ precedent and define __CHAR8_TYPE__ as unsigned char.  
That is likewise a bit strange since the type would always be unsigned 
char, but it does provide a bit more symmetry.  That could potentially 
have some use as well; for C++, it could be defined as char8_t and 
thereby reflect the difference between the two languages.  Perhaps it 
could be useful in the future as well if WG14 were to add distinct 
char8_t, char16_t, and char32_t types as C++ did (I'm not offering any 
prediction regarding the likelihood of that happening).
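
As a rough sketch of how a header might consume such a macro (this is 
only an illustration, not part of the patch): the code below assumes 
__CHAR8_TYPE__ may or may not be predefined and falls back to unsigned 
char when it is not.  The names my_char8_t and my_c8len are hypothetical 
and chosen only to avoid clashing with any real char8_t declaration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */

/* Assumption: __CHAR8_TYPE__ is the predefined macro floated above.
   If the compiler does not provide it, fall back to unsigned char. */
#ifdef __CHAR8_TYPE__
typedef __CHAR8_TYPE__ my_char8_t;   /* hypothetical typedef name */
#else
typedef unsigned char my_char8_t;
#endif

static_assert(sizeof(my_char8_t) == 1, "a UTF-8 code unit is one byte");

/* Hypothetical helper: count code units before the terminating zero. */
static size_t my_c8len(const my_char8_t *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}
```

With this shape, the header never needs to test a _*_SOURCE macro; the 
presence of the type macro alone selects the typedef.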


> 	Jakub
