This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: gcc/libcpp: non-UTF-8 source or execution encodings?


Hi, David

I don't believe that hardware is easily available.  We could probably
arrange access if it is necessary, but it is not accessible through
the IBM Community Development system for Linux on z Systems because
the system isn't Linux-based.  GCC on the system is not self-hosting
-- I believe that GCC is used only as a cross-compiler.

Thanks, David


On Tue, Jul 19, 2016 at 3:39 PM, David Malcolm <dmalcolm@redhat.com> wrote:
> On Tue, 2016-07-19 at 12:24 -0400, David Edelsohn wrote:
>> On Tue, Jul 19, 2016 at 12:05 PM, David Malcolm <dmalcolm@redhat.com>
>> wrote:
>> > libcpp/charset.c has a helpful introductory comment describing
>> > character sets, including the source and execution character sets.
>> >
>> > libcpp appears to attempt to support both UTF-8 and UTF-EBCDIC
>> > for the source character set, via:
>> >
>> > #if HOST_CHARSET == HOST_CHARSET_ASCII
>> > #define SOURCE_CHARSET "UTF-8"
>> > #define LAST_POSSIBLY_BASIC_SOURCE_CHAR 0x7e
>> > #elif HOST_CHARSET == HOST_CHARSET_EBCDIC
>> > #define SOURCE_CHARSET "UTF-EBCDIC"
>> > #define LAST_POSSIBLY_BASIC_SOURCE_CHAR 0xFF
>> > #else
>> > #error "Unrecognized basic host character set"
>> > #endif
>> >
>> > though libiberty's safe-ctype.c has:
>> >
>> > # if HOST_CHARSET == HOST_CHARSET_EBCDIC
>> >   #error "FIXME: write tables for EBCDIC"
>> >
>> > so presumably we effectively support only UTF-8 as the source
>> > character set.
>> >
>> > Do we support any hosts for which the source character set is
>> > *not* UTF-8?
>> >
>> > Similarly, do we support any targets for which the execution
>> > character set is *not* UTF-8?
>> >
>> > This relates to the locations-within-string-literals patch I
>> > posted here:
>> > https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00441.html
>> > ("[PATCH] RFC: On-demand locations within string-literals"); that
>> > patch currently assumes that the source encoding == execution
>> > encoding, and I'd appreciate knowing a configuration for which
>> > this isn't the case so I can test accordingly.
>>
>> I believe that the GCC z/TPF configuration uses EBCDIC.  There is
>> also the on-again off-again i370 port.
>>
>> Thanks, David
>
> Thanks.  Looks like the triple for the former is "s390x-ibm-tpf"; I'm
> experimenting with that as the target.
>
> Is there any accessible hardware for these?  I don't see them in the
> gcc compile farm.
>
> Dave
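
To illustrate why the "source encoding == execution encoding" assumption
in the quoted message matters, here is a minimal sketch in C.  It is not
code from libcpp or from the patch: the function name
utf8_to_latin1_with_offsets is hypothetical, and ISO-8859-1 stands in for
an arbitrary non-UTF-8 execution character set (an EBCDIC target would
need a full translation table instead).  The sketch converts UTF-8 source
bytes to the execution encoding and records, for each output byte, the
source offset at which that character started.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of libcpp: convert UTF-8 source bytes
   to ISO-8859-1 and record, for each output byte, the offset of the
   source byte where that character began.  Only code points below
   U+0100 are handled.  Returns the number of output bytes, or
   (size_t) -1 if the input contains anything else.  DST and SRC_OFFSET
   must each have room for SRC_LEN elements.  */
static size_t
utf8_to_latin1_with_offsets (const unsigned char *src, size_t src_len,
                             unsigned char *dst, size_t *src_offset)
{
  size_t in = 0, out = 0;

  assert (src && dst && src_offset);

  while (in < src_len)
    {
      unsigned char c = src[in];

      if (c < 0x80)
        {
          /* ASCII: one source byte maps to one execution byte.  */
          dst[out] = c;
          src_offset[out++] = in;
          in += 1;
        }
      else if ((c == 0xC2 || c == 0xC3) && in + 1 < src_len
               && (src[in + 1] & 0xC0) == 0x80)
        {
          /* U+0080..U+00FF: two source bytes collapse into one
             execution byte, so later offsets no longer line up.  */
          dst[out] = (unsigned char) (((c & 0x03) << 6)
                                      | (src[in + 1] & 0x3F));
          src_offset[out++] = in;
          in += 2;
        }
      else
        return (size_t) -1;  /* Not representable in ISO-8859-1.  */
    }

  return out;
}
```

For the six source bytes of "café!" in UTF-8 (63 61 66 C3 A9 21), the
function produces five execution bytes (63 61 66 E9 21) with source
offsets 0, 1, 2, 3, 5.  The final '!' sits at index 4 in the execution
string but at byte 5 in the source, so a location computed from the
execution-side index alone would point one byte too early.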

