This is the mail archive of the
fortran@gcc.gnu.org
mailing list for the GNU Fortran project.
[fortran,patch] Wide characters, main patch
- From: FX <fxcoudert at gmail dot com>
- To: Fortran List <fortran at gcc dot gnu dot org>, gcc-patches Patches <gcc-patches at gcc dot gnu dot org>
- Date: Sat, 17 May 2008 23:08:55 +0100
- Subject: [fortran,patch] Wide characters, main patch
Well, I said only the final patch was missing, but it is taking a
long time to finish cleanly, because there are so many intrinsics
to check fully. So, to ease review and let you look at the patch
while I work on the last missing intrinsics, here is the main part of
the patch enabling full wide string support. What it does:
-- allow construction of wide char literals: they're strings with
the correct memory representation stored in them (code lifted from
target-memory.h to do that correctly), and the correct wide char
basetype
-- modify front-end routines so that allocating and copying
strings takes into account the existence of wide characters
-- remove various restrictions on standard intrinsics, adapting
code when needed and adding library functions to deal with wide chars
-- check that all standard intrinsics that should only accept
default character kinds do so; also, I decided that all GNU
extensions would only deal with default character kinds, implemented
these restrictions and documented them
-- add conversion functions for the assignment of one character
kind to a variable of another kind; this currently does not error out
at runtime (it wraps around with the semantics of unsigned integers),
but we could make it do so later if deemed useful; the front-end
framework for this is inspired by the other conversions, but not
integrated with them because character conversion is allowed in
different places (namely, only in assignment) (as a side note: I tried
at first to integrate it with the other conversions, but it was rather
a mess)
-- fix a few odds and ends here and there; the one-line change
to get_array_ctor_var_strlen() fixes a bug that could manifest
itself with long strings in array constructors (longer than 255).
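To make the copy and conversion points above concrete, here is a minimal
C sketch of the semantics described. It is not the code in the patch:
the type names char1_t/char4_t and the functions widen_1_to_4(),
narrow_4_to_1() and copy_pad_4() are hypothetical, and it assumes that
the wide kind is stored as 32-bit unsigned integers and the default kind
as 8-bit bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a default (1-byte) and a wide (4-byte)
   character kind; the real representation is an assumption here.  */
typedef unsigned char char1_t;
typedef uint32_t char4_t;

/* Widening: every 1-byte code is representable in 4 bytes.  */
static void
widen_1_to_4 (char4_t *dst, const char1_t *src, size_t len)
{
  for (size_t i = 0; i < len; i++)
    dst[i] = (char4_t) src[i];
}

/* Narrowing: no runtime error; a code above 255 wraps around with
   unsigned-integer semantics (reduced modulo 256 for 8-bit bytes).  */
static void
narrow_4_to_1 (char1_t *dst, const char4_t *src, size_t len)
{
  for (size_t i = 0; i < len; i++)
    dst[i] = (char1_t) src[i];
}

/* Fortran-style assignment between wide strings: copy what fits, then
   pad the destination with blanks.  Lengths are counted in characters,
   not bytes, so a byte-wise memset of ' ' would be wrong here.  */
static void
copy_pad_4 (char4_t *dst, size_t dlen, const char4_t *src, size_t slen)
{
  size_t n = slen < dlen ? slen : dlen;

  for (size_t i = 0; i < n; i++)
    dst[i] = src[i];
  for (size_t i = n; i < dlen; i++)
    dst[i] = (char4_t) ' ';
}
```

With these definitions, narrowing the code 0x20AC gives 0xAC rather than
an error, which is the wrap-around behaviour mentioned above; turning it
into a runtime error later would only need a range check in the
narrowing loop.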
What it doesn't do, and what I will be working on from now on:
-- some more intrinsics: CSHIFT, EOSHIFT, INDEX, {MIN,MAX}{LOC,VAL},
MERGE, MIN, MAX, MVBITS, NEW_LINE, REPEAT, SCAN, SIZEOF,
TRANSFER, VERIFY (writing testcases, manually inspecting the
generated code, fixing the front-end and adding library functions where
necessary; some notes: REPEAT only needs testcases, because I already
fixed the front-end and tested it manually; INDEX, MERGE, SCAN, VERIFY
and TRANSFER should be straightforward)
-- write testcases for a few things that could potentially behave
unexpectedly: I'm currently thinking of allocatables, functions
returning widechar strings, user-defined operators, one-character
assignment (I already tested it manually and it works as expected),
playing with modules
The patch comes as a full version in file wide_char_part6.diff. I've
also broken it down into pieces for ease of reading, each with its
own ChangeLog entries: front-end, library and testsuite. I also
include a patch to the middle-end tree pretty-printer, which makes
literal widechar strings easier to read; I will not commit it with
the rest, and will seek approval from the middle-end maintainers later.
Patch bootstraps and regtests fine on x86_64-linux (both -m32 and
-m64), except for gfortran.dg/char_cast_1.f90 and
gfortran.dg/char_cast_2.f90; it's not a fundamental problem, but these
testcases scan the tree dump, which has changed with my modifications
to gfc_trans_string_copy(); I'll adjust them accordingly before
committing. Comments welcome. OK to commit?
FX
--
François-Xavier Coudert
http://www.homepages.ucl.ac.uk/~uccafco/
Attachments:
wide_char_part6.diff
wide_char_part6_frontend.ChangeLog
wide_char_part6_frontend.diff
wide_char_part6_library.ChangeLog
wide_char_part6_library.diff
wide_char_part6_testsuite.ChangeLog
wide_char_part6_testsuite.diff
wide_char_part6_gcc.diff