This is the mail archive of the
fortran@gcc.gnu.org
mailing list for the GNU Fortran project.
[fortran,patch] Towards wide character strings, step 2: handle character constants as wide strings
- From: FX <fxcoudert at gmail dot com>
- To: Fortran List <fortran at gcc dot gnu dot org>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 6 May 2008 13:10:13 +0100
- Subject: [fortran,patch] Towards wide character strings, step 2: handle character constants as wide strings
Hi all,
This patch goes on top of the previous "wide strings" patch, and is
the second step in the direction of supporting non-default character
kinds: after the first one, which made us handle all source as wide
characters, this one makes us handle all constant character values
internally as wide strings. Same restrictions as before apply, i.e.
we still don't handle them throughout but instead lower them (with
adequate checks and gcc_assert's) to usual (narrow) strings when
emitting code for the middle-end.
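To make the lowering step concrete, here is a minimal sketch in plain C of turning a wide constant into a narrow string behind a range check. It is not taken from the patch: the names wide_char_t, wide_fits_narrow and wide_to_narrow are hypothetical, and plain assert stands in for gcc_assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the front-end's wide character type.  */
typedef uint32_t wide_char_t;

/* Return nonzero if every character of W (LEN characters) fits in a
   narrow, one-byte character.  */
int
wide_fits_narrow (const wide_char_t *w, size_t len)
{
  size_t i;

  for (i = 0; i < len; i++)
    if (w[i] > 0xFF)
      return 0;
  return 1;
}

/* Lower W (LEN characters) to a freshly allocated, NUL-terminated
   narrow string.  Asserts that no character is out of range; returns
   NULL if allocation fails.  The caller frees the result.  */
char *
wide_to_narrow (const wide_char_t *w, size_t len)
{
  char *s;
  size_t i;

  assert (wide_fits_narrow (w, len));

  s = malloc (len + 1);
  if (s == NULL)
    return NULL;

  for (i = 0; i < len; i++)
    s[i] = (char) (unsigned char) w[i];
  s[len] = '\0';
  return s;
}
```

Keeping the range check in its own function lets a caller issue a proper diagnostic first, and leaves the assertion as a last line of defence against internal errors.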
This patch is rather boring. It bootstraps and regtests on
x86_64-linux (both -m32 and -m64); no testcases are attached, as
behaviour should not change in any case. I provide, in addition to the
ChangeLog, two diffs: one is the "incremental" one that goes on top
of the previous patch, and the second one is a full diff from
mainline, which includes both patches.
OK to commit?
I think I now have a very clear view of how to go all the way to wide
characters; we'll see if I have time to work on it, depending on
whether I visit as many airports, planes, stations, trains and buses
as I've done in the last 2 weeks :) Some comments on the remaining
issues, however, in decreasing order of difficulty (but there's
nothing major I can see):
-- misc assumptions in the front-end that there is only one
character kind (all uses of char_type_node and pchar_type_node will
have to be audited)
-- the problem of storing wide character strings in module files:
do we want to use encoding (UTF-8? UTF-32, and which endianness?) or
escaping? (using \x, \u and \U) I now favour escaping, which I think
will really make our life easier.
-- determining what should be done with Holleriths; I have to say
I know virtually nothing about that, and I'd rather not learn, but I
guess I'll have to :)
-- handling of UTF-8 in the library; probably not too hard, as
UTF-8 <--> UTF-32 conversion is rather simple
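For illustration only, here is a rough C sketch of two of the options listed above: escaping a wide string with \x, \u and \U, and encoding a single UTF-32 code point as UTF-8. Nothing here comes from the patch. The function names escape_wide and utf8_encode, the fixed escape widths (2, 4 and 8 hex digits) and the choice to escape the backslash itself are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Write an escaped, NUL-terminated form of W (LEN characters) into BUF
   (BUFSIZE bytes).  Printable ASCII other than backslash is copied;
   everything else becomes \xNN, \uNNNN or \UNNNNNNNN.  Returns the
   length written, or (size_t) -1 if BUF is too small.  */
size_t
escape_wide (const uint32_t *w, size_t len, char *buf, size_t bufsize)
{
  size_t i, pos = 0;

  if (bufsize == 0)
    return (size_t) -1;

  for (i = 0; i < len; i++)
    {
      char tmp[11];		/* "\UNNNNNNNN" plus NUL.  */
      uint32_t c = w[i];
      int n;

      if (c >= 0x20 && c < 0x7F && c != '\\')
	n = snprintf (tmp, sizeof tmp, "%c", (int) c);
      else if (c <= 0xFF)
	n = snprintf (tmp, sizeof tmp, "\\x%02X", (unsigned int) c);
      else if (c <= 0xFFFF)
	n = snprintf (tmp, sizeof tmp, "\\u%04X", (unsigned int) c);
      else
	n = snprintf (tmp, sizeof tmp, "\\U%08lX", (unsigned long) c);

      if (pos + (size_t) n + 1 > bufsize)
	return (size_t) -1;
      memcpy (buf + pos, tmp, (size_t) n);
      pos += (size_t) n;
    }
  buf[pos] = '\0';
  return pos;
}

/* Encode code point C as UTF-8 into OUT (room for 4 bytes).  Returns
   the number of bytes written, or 0 for surrogates and values above
   0x10FFFF.  */
int
utf8_encode (uint32_t c, unsigned char *out)
{
  if (c < 0x80)
    {
      out[0] = (unsigned char) c;
      return 1;
    }
  if (c < 0x800)
    {
      out[0] = (unsigned char) (0xC0 | (c >> 6));
      out[1] = (unsigned char) (0x80 | (c & 0x3F));
      return 2;
    }
  if (c < 0x10000)
    {
      if (c >= 0xD800 && c <= 0xDFFF)
	return 0;
      out[0] = (unsigned char) (0xE0 | (c >> 12));
      out[1] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
      out[2] = (unsigned char) (0x80 | (c & 0x3F));
      return 3;
    }
  if (c <= 0x10FFFF)
    {
      out[0] = (unsigned char) (0xF0 | (c >> 18));
      out[1] = (unsigned char) (0x80 | ((c >> 12) & 0x3F));
      out[2] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
      out[3] = (unsigned char) (0x80 | (c & 0x3F));
      return 4;
    }
  return 0;
}
```

With fixed-width escapes the output stays plain ASCII and has no endianness question, and a reader can consume exactly 2, 4 or 8 hex digits after the escape letter without ambiguity.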
--
François-Xavier Coudert
http://www.homepages.ucl.ac.uk/~uccafco/
Attachments:
wide_char_constants_incrementalpatch_1.ChangeLog
wide_char_constants_incrementalpatch_1.diff
wide_char_constants_1.diff