[Bug c/94990] NFC / NFD in identifiers

joseph at codesourcery dot com gcc-bugzilla@gcc.gnu.org
Thu May 7 23:08:58 GMT 2020


--- Comment #1 from joseph at codesourcery dot com <joseph at codesourcery dot com> ---
Note that ISO C references ISO 10646, not Unicode, so normalization forms 
are not part of the C notion of identifier characters and differently 
normalized forms are different identifiers as far as C is concerned.
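
As a rough illustration of why the two forms end up distinct (a sketch 
only, not GCC's actual identifier handling), the precomposed and 
decomposed spellings of "e with acute accent" are different code point 
sequences and so different UTF-8 byte strings.  The names 
e_acute_nfc, e_acute_nfd, same_spelling and check_forms_differ below 
are hypothetical, chosen for this example; the comparison assumes an 
implementation that compares spellings exactly and performs no 
normalization.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* U+00E9 LATIN SMALL LETTER E WITH ACUTE (precomposed, NFC), as UTF-8. */
const char e_acute_nfc[] = "\xC3\xA9";

/* U+0065 LATIN SMALL LETTER E followed by U+0301 COMBINING ACUTE ACCENT
   (decomposed, NFD), as UTF-8. */
const char e_acute_nfd[] = "e\xCC\x81";

/* Hypothetical helper: exact byte-for-byte comparison of two spellings,
   with no normalization applied (an assumption of this sketch). */
bool same_spelling(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}

/* Aborts if the two differently normalized spellings compare equal. */
void check_forms_differ(void)
{
    assert(!same_spelling(e_acute_nfc, e_acute_nfd));
}
```

Calling check_forms_differ() from a test program should return 
normally, since the two spellings differ in both length and content 
even though they render identically.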

The reason the -Wnormalized= options prefer NFC and don't have an option 
-Wnormalized=nfd is that many characters were only valid in C99 in the 
precomposed forms (C11 added more combining characters to the set allowed 
in identifiers).  Any Unicode character sequence can of course be 
converted to NFC if desired; in the result, some characters may use 
precomposed forms and others may still use combining characters.

If you wish to use NFD in your code, you should probably set your editor 
to generate NFD source files and compile with -Wno-normalized.

(A separate issue is that the Unicode data used in GCC for -Wnormalized= 
was last updated in 2013 and needs updating to a newer version of Unicode.  
Since the update I did in 2013 introduced automated generation of the 
relevant code from Unicode data, such an update to use newer Unicode data 
should be straightforward.)
