
3.2.10 Character Set Control

-gnatic
Normally GNAT recognizes the Latin-1 character set in source program identifiers, as described in the Ada Reference Manual. This switch causes GNAT to recognize alternate character sets in identifiers. c is a single character indicating the character set, as follows:
1   ISO 8859-1 (Latin-1) identifiers
2   ISO 8859-2 (Latin-2) letters allowed in identifiers
3   ISO 8859-3 (Latin-3) letters allowed in identifiers
4   ISO 8859-4 (Latin-4) letters allowed in identifiers
5   ISO 8859-5 (Cyrillic) letters allowed in identifiers
9   ISO 8859-15 (Latin-9) letters allowed in identifiers
p   IBM PC letters (code page 437) allowed in identifiers
8   IBM PC letters (code page 850) allowed in identifiers
f   Full upper-half codes allowed in identifiers
n   No upper-half codes allowed in identifiers
w   Wide-character codes (that is, codes greater than 255) allowed in identifiers

See Foreign Language Representation, for full details on the implementation of these character sets.
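
As a rough illustration of how the character-set letter combines with the switch, the following Python sketch builds a -gnati option from a descriptive name. The helper gnat_charset_switch and the names used as dictionary keys are hypothetical and exist only for this example; the letters are the ones listed for this switch.

```python
# Hypothetical helper: maps a descriptive name to the -gnati letter.
# The names on the left are invented here for readability; only the
# single-character codes on the right belong to the switch itself.
CHARSET_LETTERS = {
    "latin-1": "1",
    "latin-2": "2",
    "latin-3": "3",
    "latin-4": "4",
    "cyrillic": "5",
    "latin-9": "9",
    "ibm-pc-437": "p",
    "ibm-pc-850": "8",
    "full-upper-half": "f",
    "no-upper-half": "n",
    "wide": "w",
}


def gnat_charset_switch(name: str) -> str:
    """Return the -gnati switch selecting the named identifier character set."""
    try:
        return "-gnati" + CHARSET_LETTERS[name]
    except KeyError:
        raise ValueError(f"unknown character set: {name!r}") from None


print(gnat_charset_switch("cyrillic"))  # prints -gnati5
```

The resulting string would be passed to the compiler along with the other switches for the unit being compiled.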

-gnatWe
Specify the method of encoding for wide characters. e is one of the following:
h   Hex encoding (brackets encoding also recognized)
u   Upper half encoding (brackets encoding also recognized)
s   Shift/JIS encoding (brackets encoding also recognized)
e   EUC encoding (brackets encoding also recognized)
8   UTF-8 encoding (brackets encoding also recognized)
b   Brackets encoding only (default value)
For full details on these encoding methods, see Wide Character Encodings. Note that brackets encoding is always accepted, even if one of the other options is specified; for example, -gnatW8 specifies that both brackets and UTF-8 encodings will be recognized. The units that are with'ed directly or indirectly are scanned using the specified representation scheme, so if one of the non-brackets schemes is used, it must be used consistently throughout the program. However, since brackets encoding is always recognized, it may conveniently be used in standard libraries, allowing these libraries to be used with any of the available coding schemes.
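
To make the idea of an always-recognized brackets form more concrete, here is a minimal Python sketch that expands brackets sequences into the characters they denote. It assumes the four-hex-digit form ["xxxx"] with uppercase hex letters, as described under Wide Character Encodings; the shorter and longer brackets forms are deliberately left out, and the function name decode_brackets is hypothetical. It is an illustration of the notation, not a description of how the GNAT scanner is implemented.

```python
import re

# Assumed form: [" followed by exactly four uppercase hex digits, then "].
# Example: ["03C0"] denotes the character with code 16#03C0#.
_BRACKETS = re.compile(r'\["([0-9A-F]{4})"\]')


def decode_brackets(text: str) -> str:
    """Replace each four-digit brackets sequence with the character it names."""
    return _BRACKETS.sub(lambda m: chr(int(m.group(1), 16)), text)
```

Because the notation uses only 7-bit characters, text written this way reads the same regardless of which of the other encoding methods is selected, which is what makes it suitable for shared libraries.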

If no -gnatW? parameter is present, then the default representation is normally Brackets encoding only. However, if the first three characters of the file are 16#EF# 16#BB# 16#BF# (the standard byte order mark or BOM for UTF-8), then these three characters are skipped and the default representation for the file is set to UTF-8.
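
The default rule above can be sketched in Python as follows. This models only the case where no -gnatW switch is given; the function name default_wide_encoding is hypothetical, and the returned letters "8" and "b" simply reuse the -gnatW codes for UTF-8 and brackets encoding.

```python
from typing import Tuple

UTF8_BOM = b"\xef\xbb\xbf"  # 16#EF# 16#BB# 16#BF#


def default_wide_encoding(source: bytes) -> Tuple[str, bytes]:
    """Model the default chosen when no -gnatW switch is present.

    Returns the encoding letter and the source with any leading BOM skipped.
    """
    if source.startswith(UTF8_BOM):
        # BOM found: skip it and treat the file as UTF-8.
        return "8", source[len(UTF8_BOM):]
    # No BOM: brackets encoding only.
    return "b", source
```

For example, a file beginning with the three BOM bytes would yield "8" together with the remaining bytes, while any other file would yield "b" and be returned unchanged.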

Note that the wide character representation that is specified (explicitly or by default) for the main program also acts as the default encoding used for Wide_Text_IO files if not specifically overridden by a WCEM form parameter.