This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.


[fortran,patch] Towards wide character strings, step 2: handle character constants as wide strings

From: FX <fxcoudert at gmail dot com>
To: Fortran List <fortran at gcc dot gnu dot org>, GCC Patches <gcc-patches at gcc dot gnu dot org>
Date: Tue, 6 May 2008 13:10:13 +0100
Subject: [fortran,patch] Towards wide character strings, step 2: handle character constants as wide strings

Hi all,

This patch goes on top of the previous "wide strings" patch, and is the second step in the direction of supporting non-default character kinds: after the first one, which made us handle all source as wide characters, this one makes us handle all constant character values internally as wide strings. Same restrictions as before apply, i.e. we still don't handle them throughout but instead lower them (with adequate checks and gcc_assert's) to usual (narrow) strings when emitting code for the middle-end.
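As a rough illustration of the lowering step described above, here is a minimal sketch in C. It is not taken from the patch: the type name wide_char_t and the functions wide_string_fits_narrow and wide_to_narrow are hypothetical, and plain assert stands in for gcc_assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the front end's wide character type.  */
typedef uint32_t wide_char_t;

/* Return 1 if every character of SRC[0..LEN-1] fits in a single byte,
   0 otherwise.  This is the "adequate check" done before lowering.  */
int
wide_string_fits_narrow (const wide_char_t *src, size_t len)
{
  for (size_t i = 0; i < len; i++)
    if (src[i] > 0xFF)
      return 0;
  return 1;
}

/* Lower a wide string to a freshly allocated, NUL-terminated narrow
   string.  The caller owns the result and must free it.  */
char *
wide_to_narrow (const wide_char_t *src, size_t len)
{
  char *dst = malloc (len + 1);
  assert (dst != NULL);

  for (size_t i = 0; i < len; i++)
    {
      /* Stand-in for gcc_assert: a wide value must never reach the
         narrow code path.  */
      assert (src[i] <= 0xFF);
      dst[i] = (char) (unsigned char) src[i];
    }
  dst[len] = '\0';
  return dst;
}
```

A caller would run the check first wherever a user-visible error is possible, and rely on the assertion only for cases that earlier stages are supposed to have ruled out.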

This patch is rather boring. It bootstraps and regtests on x86_64-linux (both -m32 and -m64); no testcases are attached, as behaviour should not change in any case. I provide, in addition to the ChangeLog, two diffs: one is the "incremental" one that goes on top of the previous patch, and the second one is a full diff from mainline, that includes both patches.

OK to commit?

I think I now have a very clear view on how to go all the way to wide characters; we'll see if I have time to work on it, depending on whether I visit as many airports, planes, stations, trains and buses as I've done in the last 2 weeks :)

Some comments on the remaining issues, however, in decreasing order of difficulty (but there's nothing major I can see):

-- misc assumptions in the front-end that there is only one character kind (all uses of char_type_node and pchar_type_node will have to be audited)
-- the problem of storing wide character strings in module files: do we want to use encoding (UTF-8? UTF-32, and which endianness?) or escaping (using \x, \u and \U)? I now favour escaping, which I think will really make our life easier.
-- determining what should be done with Holleriths; I have to say I know virtually nothing about that, and I'd rather not learn, but I guess I'll have to :)
-- handling of UTF-8 in the library; probably not too hard, as UTF-8 <--> UTF-32 conversion is rather simple
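To make the escaping option for module files concrete, here is a minimal sketch in C of one possible scheme. It is an assumption for illustration, not the format chosen by the patch: the function name escape_wide_string is hypothetical, as is the rule that printable ASCII is written literally, a backslash is doubled, and everything else becomes \xHH, \uHHHH or \UHHHHHHHH depending on its magnitude.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Escape the wide string SRC[0..LEN-1] into DST, which has room for
   CAP bytes including the terminating NUL.  Returns the number of
   bytes written (not counting the NUL), or -1 if DST is too small.
   The escaping rules here are an illustrative assumption.  */
int
escape_wide_string (const uint32_t *src, size_t len, char *dst, size_t cap)
{
  size_t pos = 0;

  if (cap == 0)
    return -1;
  dst[0] = '\0';

  for (size_t i = 0; i < len; i++)
    {
      uint32_t c = src[i];
      size_t room = cap - pos;
      int n;

      if (c == '\\')
        n = snprintf (dst + pos, room, "\\\\");          /* doubled backslash */
      else if (c >= 0x20 && c <= 0x7E)
        n = snprintf (dst + pos, room, "%c", (int) c);   /* printable ASCII */
      else if (c <= 0xFF)
        n = snprintf (dst + pos, room, "\\x%02X", (unsigned int) c);
      else if (c <= 0xFFFF)
        n = snprintf (dst + pos, room, "\\u%04X", (unsigned int) c);
      else
        n = snprintf (dst + pos, room, "\\U%08lX", (unsigned long) c);

      /* snprintf reports the length it wanted; anything that does not
         leave room for the NUL means the output was truncated.  */
      if (n < 0 || (size_t) n >= room)
        return -1;
      pos += (size_t) n;
    }
  return (int) pos;
}
```

The output of such a scheme is plain ASCII, so the module file stays readable in any editor and the questions of encoding and endianness do not arise.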


--
François-Xavier Coudert
http://www.homepages.ucl.ac.uk/~uccafco/

Attachment: wide_char_constants_incrementalpatch_1.ChangeLog
Description: Binary data

Attachment: wide_char_constants_incrementalpatch_1.diff
Description: Binary data

Attachment: wide_char_constants_1.diff
Description: Binary data

Follow-Ups:
- Re: [fortran,patch] Towards wide character strings, step 2: handle character constants as wide strings
  - From: Tobias Burnus
