This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[patch, fortran] Wide character I/O Part 1.1 Round 2
- From: Jerry DeLisle <jvdelisle at verizon dot net>
- To: Fortran List <fortran at gcc dot gnu dot org>
- Cc: gcc-patches <gcc-patches at gcc dot gnu dot org>
- Date: Fri, 30 May 2008 23:15:46 -0700
- Subject: [patch, fortran] Wide character I/O Part 1.1 Round 2
- References: <483E0FBE.8090301@verizon.net> <483EFC3E.2070102@net-b.de>
Attached is round 2 of this patch, with FX's and Tobias's comments incorporated.
I have also added support for unformatted wide character I/O.
ISO-8859-1 characters now work correctly. I have cleaned up the code quite a
bit, updated the previously submitted test cases, and added two more.
Please give it a spin.
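
For anyone reading along who wants to see what the default encoding does with a kind=4 string on formatted output, here is a minimal sketch in C. It is not code from the patch: narrow_char4 is a hypothetical helper, and it works on whole 32-bit values instead of the byte offsets the library uses. Code points up to 255 are written as a single byte, which is why ISO-8859-1 text survives, and anything larger is replaced with '?'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: narrow LEN 4-byte
   characters to single bytes.  Code points above 255 have no
   one-byte representation and are replaced with '?'.  */
static void
narrow_char4 (const uint32_t *src, size_t len, unsigned char *dst)
{
  size_t i;

  assert (src != NULL && dst != NULL);

  for (i = 0; i < len; i++)
    dst[i] = src[i] > 255 ? '?' : (unsigned char) src[i];
}
```

Working on 32-bit values sidesteps endianness entirely; the library instead indexes individual bytes, so it needs a host-dependent byte offset.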
I have not finished looking into byte swapping with the CONVERT= extension. I plan to
deal with swapping in a follow-up patch.
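
To illustrate what swapping has to do for wide characters: the bytes must be reversed within each character, not across the whole string. The following is only a sketch under that assumption. swap_chars is a hypothetical name, and nothing here is taken from the library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy NCHARS characters of KIND bytes each from
   SRC to DST, reversing the byte order inside every character while
   keeping the characters themselves in their original order.  */
static void
swap_chars (unsigned char *dst, const unsigned char *src,
            size_t nchars, size_t kind)
{
  size_t i, b;

  assert (kind > 0);

  for (i = 0; i < nchars; i++)
    for (b = 0; b < kind; b++)
      dst[i * kind + b] = src[i * kind + (kind - 1 - b)];
}
```

With kind equal to 1 this degenerates to a plain copy, which matches the fact that single-byte characters never need swapping.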
Regression tested on x86-64-*-linux and currently testing on ppc64-*. The test
cases pass on ppc64.
OK to commit if regression testing is OK on ppc64?
Regards,
Jerry
2008-05-30 Jerry DeLisle <jvdelisle@gcc.gnu.org>
PR fortran/35863
* trans-io.c (gfc_build_io_library_fndecls): Build declaration for
transfer_character_wide which includes passing in the character kind to
support wide character IO. (transfer_expr): If the kind == 4, create the
argument and build the call.
libgfortran/
* libgfortran.h: Change l8_to_l4_offset to big_endian and add endian_off.
* runtime/main.c: Fix error in comment. Change l8_to_l4_offset to
big_endian. (determine_endianness): Add endian_off and set its value
according to big_endian.
* gfortran.map: Add symbol for new _gfortran_transfer_character_wide.
* io/io.h: Add prototype declarations for new functions.
* io/list_read.c (list_formatted_read_scalar): Modify to handle kind=4.
(list_formatted_read): Calculate stride based on kind for character type
and use it when calling list_formatted_read_scalar.
* io/inquire.c (inquire_via_unit): Change l8_to_l4_offset to big_endian.
* io/open.c (st_open): Change l8_to_l4_offset to big_endian.
* io/read.c (read_a_char4): New function to handle formatted read.
* io/write.c: Define GFC_CHAR4(x) to improve readability of code.
(write_a_char4): New function to handle formatted write.
(write_character): Modify to accept the kind parameter and adjust for
endianness of the machine. (list_formatted_write): Calculate the stride
resulting from the kind and adjust the list_formatted_write_scalar call
accordingly. (nml_write_obj): Adjust calls to write_character.
(namelist_write): Likewise.
* io/transfer.c (formatted_transfer_scalar): Rename 'len' argument to
'kind' argument to better describe what it is. Add calls to new
functions for kind == 4. (formatted_transfer): Modify to handle the case
of type character and kind equals 4 to pass in the kind to the transfer
routines. (transfer_character_wide): Add this new function.
(transfer_array): Don't set kind to the character string length. Adjust
strides based on character kind.
(unformatted_read): Adjust size based on kind for character types.
(unformatted_write): Likewise. (data_transfer_init): Change
l8_to_l4_offset to big_endian.
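
Two ideas recur throughout the entries above: the byte offset of the low-order byte inside a 4-byte character (0 on little-endian hosts, 3 on big-endian ones), and the array stride computed from the character kind. The sketch below shows both in isolation. The function names are hypothetical, and it assumes the element size in bytes equals the kind value, which holds for kinds 1 and 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 on a big-endian host and 0 on a little-endian one, by
   checking which half of a 64-bit value holds the low-order bits.  */
static int
host_is_big_endian (void)
{
  union { uint64_t l8; uint32_t l4[2]; } u;

  u.l8 = 1;
  return u.l4[0] == 0;
}

/* Byte offset of the least significant byte of a 4-byte character:
   0 on little-endian hosts, 3 on big-endian hosts.  */
static size_t
low_byte_offset (void)
{
  return 3 * (size_t) host_is_big_endian ();
}

/* Distance in bytes between consecutive elements of an array of
   character(len=LEN, kind=KIND), assuming KIND bytes per character.  */
static size_t
char_stride (size_t len, int kind)
{
  assert (kind == 1 || kind == 4);
  return len * (size_t) kind;
}
```

For example, an array of character(len=10, kind=4) elements would be stepped through 40 bytes at a time, while the length passed to the scalar routines stays at 10 characters.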
Index: gcc/fortran/ChangeLog
===================================================================
--- gcc/fortran/ChangeLog (revision 136136)
+++ gcc/fortran/ChangeLog (working copy)
@@ -1,3 +1,42 @@
+2008-05-30 Jerry DeLisle <jvdelisle@gcc.gnu.org>
+
+ PR fortran/35863
+ * trans-io.c (gfc_build_io_library_fndecls): Build declaration for
+ transfer_character_wide which includes passing in the character kind to
+ support wide character IO. (transfer_expr): If the kind == 4, create the
+ argument and build the call.
+
+ libgfortran/
+ * libgfortran.h: Change l8_to_l4_offset to big_endian and add endian_off.
+ * runtime/main.c: Fix error in comment. Change l8_to_l4_offset to
+ big_endian. (determine_endianness): Add endian_off and set its value
+ according to big_endian.
+ * gfortran.map: Add symbol for new _gfortran_transfer_character_wide.
+ * io/io.h: Add prototype declarations for new functions.
+ * io/list_read.c (list_formatted_read_scalar): Modify to handle kind=4.
+ (list_formatted_read): Calculate stride based on kind for character type
+ and use it when calling list_formatted_read_scalar.
+ * io/inquire.c (inquire_via_unit): Change l8_to_l4_offset to big_endian.
+ * io/open.c (st_open): Change l8_to_l4_offset to big_endian.
+ * io/read.c (read_a_char4): New function to handle formatted read.
+ * io/write.c: Define GFC_CHAR4(x) to improve readability of code.
+ (write_a_char4): New function to handle formatted write.
+ (write_character): Modify to accept the kind parameter and adjust for
+ endianness of the machine. (list_formatted_write): Calculate the stride
+ resulting from the kind and adjust the list_formatted_write_scalar call
+ accordingly. (nml_write_obj): Adjust calls to write_character.
+ (namelist_write): Likewise.
+ * io/transfer.c (formatted_transfer_scalar): Rename 'len' argument to
+ 'kind' argument to better describe what it is. Add calls to new
+ functions for kind == 4. (formatted_transfer): Modify to handle the case
+ of type character and kind equals 4 to pass in the kind to the transfer
+ routines. (transfer_character_wide): Add this new function.
+ (transfer_array): Don't set kind to the character string length. Adjust
+ strides based on character kind.
+ (unformatted_read): Adjust size based on kind for character types.
+ (unformatted_write): Likewise. (data_transfer_init): Change
+ l8_to_l4_offset to big_endian.
+
2008-05-28 Janus Weil <janus@gcc.gnu.org>
PR fortran/36325
Index: gcc/fortran/trans-io.c
===================================================================
--- gcc/fortran/trans-io.c (revision 136136)
+++ gcc/fortran/trans-io.c (working copy)
@@ -121,6 +121,7 @@ enum iocall
IOCALL_X_INTEGER,
IOCALL_X_LOGICAL,
IOCALL_X_CHARACTER,
+ IOCALL_X_CHARACTER_WIDE,
IOCALL_X_REAL,
IOCALL_X_COMPLEX,
IOCALL_X_ARRAY,
@@ -327,6 +328,13 @@ gfc_build_io_library_fndecls (void)
void_type_node, 3, dt_parm_type,
pvoid_type_node, gfc_int4_type_node);
+ iocall[IOCALL_X_CHARACTER_WIDE] =
+ gfc_build_library_function_decl (get_identifier
+ (PREFIX("transfer_character_wide")),
+ void_type_node, 4, dt_parm_type,
+ pvoid_type_node, gfc_charlen_type_node,
+ gfc_int4_type_node);
+
iocall[IOCALL_X_REAL] =
gfc_build_library_function_decl (get_identifier (PREFIX("transfer_real")),
void_type_node, 3, dt_parm_type,
@@ -1977,7 +1985,7 @@ transfer_array_component (tree expr, gfc
static void
transfer_expr (gfc_se * se, gfc_typespec * ts, tree addr_expr, gfc_code * code)
{
- tree tmp, function, arg2, field, expr;
+ tree tmp, function, arg2, arg3, field, expr;
gfc_component *c;
int kind;
@@ -2009,6 +2017,7 @@ transfer_expr (gfc_se * se, gfc_typespec
kind = ts->kind;
function = NULL;
arg2 = NULL;
+ arg3 = NULL;
switch (ts->type)
{
@@ -2033,6 +2042,26 @@ transfer_expr (gfc_se * se, gfc_typespec
break;
case BT_CHARACTER:
+ if (kind == 4)
+ {
+ if (se->string_length)
+ arg2 = se->string_length;
+ else
+ {
+ tmp = build_fold_indirect_ref (addr_expr);
+ gcc_assert (TREE_CODE (TREE_TYPE (tmp)) == ARRAY_TYPE);
+ arg2 = TYPE_MAX_VALUE (TYPE_DOMAIN (TREE_TYPE (tmp)));
+ arg2 = fold_convert (gfc_charlen_type_node, arg2);
+ }
+ arg3 = build_int_cst (NULL_TREE, kind);
+ function = iocall[IOCALL_X_CHARACTER_WIDE];
+ tmp = build_fold_addr_expr (dt_parm);
+ tmp = build_call_expr (function, 4, tmp, addr_expr, arg2, arg3);
+ gfc_add_expr_to_block (&se->pre, tmp);
+ gfc_add_block_to_block (&se->pre, &se->post);
+ return;
+ }
+ /* Fall through. */
case BT_HOLLERITH:
if (se->string_length)
arg2 = se->string_length;
Index: libgfortran/runtime/main.c
===================================================================
--- libgfortran/runtime/main.c (revision 136136)
+++ libgfortran/runtime/main.c (working copy)
@@ -45,11 +45,13 @@ stupid_function_name_for_static_linking
return;
}
-/* This is the offset (in bytes) required to cast from logical(8)* to
- logical(4)*. and still get the same result. Will be 0 for little-endian
- machines and 4 for big-endian machines. */
-int l8_to_l4_offset = 0;
-
+/* This will be 0 for little-endian
+ machines and 1 for big-endian machines. */
+int big_endian = 0;
+
+/* This will be 0 for little-endian
+ machines and 3 for big-endian machines. */
+int endian_off = 0;
/* Figure out endianness for this machine. */
@@ -64,11 +66,15 @@ determine_endianness (void)
u.l8 = 1;
if (u.l4[0])
- l8_to_l4_offset = 0;
+ big_endian = 0;
else if (u.l4[1])
- l8_to_l4_offset = 1;
+ big_endian = 1;
else
runtime_error ("Unable to determine machine endianness");
+
+ /* This is the byte offset to use to reach the other end of a 4 byte char if
+ on a big endian machine. It's zero for little endian. */
+ endian_off = 3 * big_endian;
}
Index: libgfortran/gfortran.map
===================================================================
--- libgfortran/gfortran.map (revision 136136)
+++ libgfortran/gfortran.map (working copy)
@@ -1083,6 +1083,7 @@ GFORTRAN_1.1 {
_gfortran_string_trim_char4;
_gfortran_string_verify_char4;
_gfortran_st_wait;
+ _gfortran_transfer_character_wide;
_gfortran_transpose_char4;
_gfortran_unpack0_char4;
_gfortran_unpack1_char4;
Index: libgfortran/libgfortran.h
===================================================================
--- libgfortran/libgfortran.h (revision 136136)
+++ libgfortran/libgfortran.h (working copy)
@@ -274,11 +274,15 @@ typedef GFC_UINTEGER_4 gfc_char4_t;
/* This will be 0 on little-endian machines and one on big-endian machines. */
-extern int l8_to_l4_offset;
-internal_proto(l8_to_l4_offset);
+extern int big_endian;
+internal_proto(big_endian);
+
+/* This will be 0 on little-endian machines and 3 on big-endian machines. */
+extern int endian_off;
+internal_proto(endian_off);
#define GFOR_POINTER_TO_L1(p, kind) \
- (l8_to_l4_offset * (kind - 1) + (GFC_LOGICAL_1 *)(p))
+ (big_endian * (kind - 1) + (GFC_LOGICAL_1 *)(p))
#define GFC_INTEGER_1_HUGE \
(GFC_INTEGER_1)((((GFC_UINTEGER_1)1) << 7) - 1)
Index: libgfortran/io/open.c
===================================================================
--- libgfortran/io/open.c (revision 136136)
+++ libgfortran/io/open.c (working copy)
@@ -795,7 +795,7 @@ st_open (st_parameter_open *opp)
conv = compile_options.convert;
}
- /* We use l8_to_l4_offset, which is 0 on little-endian machines
+ /* We use big_endian, which is 0 on little-endian machines
and 1 on big-endian machines. */
switch (conv)
{
@@ -804,11 +804,11 @@ st_open (st_parameter_open *opp)
break;
case GFC_CONVERT_BIG:
- conv = l8_to_l4_offset ? GFC_CONVERT_NATIVE : GFC_CONVERT_SWAP;
+ conv = big_endian ? GFC_CONVERT_NATIVE : GFC_CONVERT_SWAP;
break;
case GFC_CONVERT_LITTLE:
- conv = l8_to_l4_offset ? GFC_CONVERT_SWAP : GFC_CONVERT_NATIVE;
+ conv = big_endian ? GFC_CONVERT_SWAP : GFC_CONVERT_NATIVE;
break;
default:
Index: libgfortran/io/list_read.c
===================================================================
--- libgfortran/io/list_read.c (revision 136136)
+++ libgfortran/io/list_read.c (working copy)
@@ -1728,7 +1728,8 @@ list_formatted_read_scalar (st_parameter
int kind, size_t size)
{
char c;
- int m;
+ char *q;
+ int i, j, m;
jmp_buf eof_jump;
dtp->u.p.namelist_mode = 0;
@@ -1830,18 +1831,29 @@ list_formatted_read_scalar (st_parameter
break;
case BT_CHARACTER:
+ q = (char *) p;
if (dtp->u.p.saved_string)
- {
+ {
m = ((int) size < dtp->u.p.saved_used)
? (int) size : dtp->u.p.saved_used;
- memcpy (p, dtp->u.p.saved_string, m);
- }
+ if (kind == 1)
+ memcpy (p, dtp->u.p.saved_string, m);
+ else
+ for (i = j = 0; i < m; i++, j += 4)
+ q[j + endian_off] = dtp->u.p.saved_string[i];
+ }
else
/* Just delimiters encountered, nothing to copy but SPACE. */
m = 0;
- if (m < (int) size)
- memset (((char *) p) + m, ' ', size - m);
+ if (m < (int) size)
+ {
+ if (kind == 1)
+ memset (((char *) p) + m, ' ', size - m);
+ else
+ for (i = m, j = m * 4; i < (int) size; i++, j += 4)
+ q[j + endian_off] = ' ';
+ }
break;
case BT_NULL:
@@ -1862,6 +1874,8 @@ list_formatted_read (st_parameter_dt *dt
{
size_t elem;
char *tmp;
+ size_t stride = type == BT_CHARACTER ?
+ size * GFC_SIZE_OF_CHAR_KIND(kind) : size;
tmp = (char *) p;
@@ -1869,7 +1883,7 @@ list_formatted_read (st_parameter_dt *dt
for (elem = 0; elem < nelems; elem++)
{
dtp->u.p.item_count++;
- list_formatted_read_scalar (dtp, type, tmp + size*elem, kind, size);
+ list_formatted_read_scalar (dtp, type, tmp + stride*elem, kind, size);
}
}
Index: libgfortran/io/read.c
===================================================================
--- libgfortran/io/read.c (revision 136136)
+++ libgfortran/io/read.c (working copy)
@@ -270,6 +270,42 @@ read_a (st_parameter_dt *dtp, const fnod
memset (p + m, ' ', n);
}
+void
+read_a_char4 (st_parameter_dt *dtp, const fnode *f, char *p, int length)
+{
+ char *s, *dest;
+ int m, n, wi, status;
+ size_t w;
+
+ wi = f->u.w;
+ if (wi == -1) /* '(A)' edit descriptor */
+ wi = length;
+
+ w = wi;
+
+ s = gfc_alloca (w);
+
+ /* Read in w bytes, treating comma as not a separator. */
+ dtp->u.p.sf_read_comma = 0;
+ status = read_block_form (dtp, s, &w);
+ dtp->u.p.sf_read_comma =
+ dtp->u.p.decimal_status == DECIMAL_COMMA ? 0 : 1;
+
+ if (status == FAILURE)
+ return;
+ if (w > (size_t) length)
+ s += (w - length);
+
+ m = ((int) w > length) ? length : (int) w;
+
+ dest = p;
+
+ for (n = 0; n < m; n++, dest += 4, s++)
+ dest[endian_off] = *s;
+
+ for (n = 0; n < length - (int) w; n++, dest += 4)
+ dest[endian_off] = ' ';
+}
/* eat_leading_spaces()-- Given a character pointer and a width,
* ignore the leading spaces. */
Index: libgfortran/io/io.h
===================================================================
--- libgfortran/io/io.h (revision 136136)
+++ libgfortran/io/io.h (working copy)
@@ -869,6 +869,9 @@ internal_proto(convert_real);
extern void read_a (st_parameter_dt *, const fnode *, char *, int);
internal_proto(read_a);
+extern void read_a_char4 (st_parameter_dt *, const fnode *, char *, int);
+internal_proto(read_a_char4);
+
extern void read_f (st_parameter_dt *, const fnode *, char *, int);
internal_proto(read_f);
@@ -904,6 +907,9 @@ internal_proto(namelist_write);
extern void write_a (st_parameter_dt *, const fnode *, const char *, int);
internal_proto(write_a);
+extern void write_a_char4 (st_parameter_dt *, const fnode *, const char *, int);
+internal_proto(write_a_char4);
+
extern void write_b (st_parameter_dt *, const fnode *, const char *, int);
internal_proto(write_b);
Index: libgfortran/io/inquire.c
===================================================================
--- libgfortran/io/inquire.c (revision 136136)
+++ libgfortran/io/inquire.c (working copy)
@@ -497,13 +497,13 @@ inquire_via_unit (st_parameter_inquire *
else
switch (u->flags.convert)
{
- /* l8_to_l4_offset is 0 for little-endian, 1 for big-endian. */
+ /* big_endian is 0 for little-endian, 1 for big-endian. */
case GFC_CONVERT_NATIVE:
- p = l8_to_l4_offset ? "BIG_ENDIAN" : "LITTLE_ENDIAN";
+ p = big_endian ? "BIG_ENDIAN" : "LITTLE_ENDIAN";
break;
case GFC_CONVERT_SWAP:
- p = l8_to_l4_offset ? "LITTLE_ENDIAN" : "BIG_ENDIAN";
+ p = big_endian ? "LITTLE_ENDIAN" : "BIG_ENDIAN";
break;
default:
Index: libgfortran/io/transfer.c
===================================================================
--- libgfortran/io/transfer.c (revision 136136)
+++ libgfortran/io/transfer.c (working copy)
@@ -54,6 +54,7 @@ Boston, MA 02110-1301, USA. */
transfer_integer
transfer_logical
transfer_character
+ transfer_character_wide
transfer_real
transfer_complex
@@ -76,6 +77,9 @@ export_proto(transfer_logical);
extern void transfer_character (st_parameter_dt *, void *, int);
export_proto(transfer_character);
+extern void transfer_character_wide (st_parameter_dt *, void *, int, int);
+export_proto(transfer_character_wide);
+
extern void transfer_complex (st_parameter_dt *, void *, int);
export_proto(transfer_complex);
@@ -730,35 +734,49 @@ write_buf (st_parameter_dt *dtp, void *b
static void
unformatted_read (st_parameter_dt *dtp, bt type,
- void *dest, int kind __attribute__((unused)),
- size_t size, size_t nelems)
+ void *dest, int kind, size_t size, size_t nelems)
{
size_t i, sz;
- /* Currently, character implies size=1. */
if (dtp->u.p.current_unit->flags.convert == GFC_CONVERT_NATIVE
- || size == 1 || type == BT_CHARACTER)
+ || size == 1)
{
sz = size * nelems;
+ if (type == BT_CHARACTER)
+ sz *= GFC_SIZE_OF_CHAR_KIND(kind);
read_block_direct (dtp, dest, &sz);
}
else
{
char buffer[16];
char *p;
-
+
+ p = dest;
+
+ /* Handle wide characters. */
+ if (type == BT_CHARACTER && kind == 4)
+ {
+ sz = (size_t) kind;
+ for (i = 0; i < size * nelems; i++)
+ {
+ read_block_direct (dtp, buffer, &sz);
+ reverse_memcpy (p, buffer, kind);
+ p += kind;
+ }
+ return;
+ }
+
/* Break up complex into its constituent reals. */
if (type == BT_COMPLEX)
{
nelems *= 2;
size /= 2;
}
- p = dest;
/* By now, all complex variables have been split into their
constituent reals. */
- for (i=0; i<nelems; i++)
+ for (i = 0; i < nelems; i++)
{
read_block_direct (dtp, buffer, &size);
reverse_memcpy (p, buffer, size);
@@ -775,20 +793,36 @@ unformatted_read (st_parameter_dt *dtp,
static void
unformatted_write (st_parameter_dt *dtp, bt type,
- void *source, int kind __attribute__((unused)),
- size_t size, size_t nelems)
+ void *source, int kind, size_t size, size_t nelems)
{
if (dtp->u.p.current_unit->flags.convert == GFC_CONVERT_NATIVE ||
- size == 1 || type == BT_CHARACTER)
+ size == 1)
{
- size *= nelems;
- write_buf (dtp, source, size);
+ size_t stride = type == BT_CHARACTER ?
+ size * GFC_SIZE_OF_CHAR_KIND(kind) : size;
+
+ write_buf (dtp, source, stride * nelems);
}
else
{
char buffer[16];
char *p;
- size_t i;
+ size_t i, sz;
+
+ p = source;
+
+ /* Handle wide characters. */
+ if (type == BT_CHARACTER && kind == 4)
+ {
+ sz = (size_t) kind;
+ for (i = 0; i < size * nelems; i++)
+ {
+ reverse_memcpy(buffer, p, sz);
+ p += sz;
+ write_buf (dtp, buffer, sz);
+ }
+ return;
+ }
/* Break up complex into its constituent reals. */
if (type == BT_COMPLEX)
@@ -797,16 +831,13 @@ unformatted_write (st_parameter_dt *dtp,
size /= 2;
}
- p = source;
-
/* By now, all complex variables have been split into their
constituent reals. */
-
- for (i=0; i<nelems; i++)
+ for (i = 0; i < nelems; i++)
{
reverse_memcpy(buffer, p, size);
- p+= size;
+ p += size;
write_buf (dtp, buffer, size);
}
}
@@ -904,7 +935,7 @@ require_type (st_parameter_dt *dtp, bt e
of the next element, then comes back here to process it. */
static void
-formatted_transfer_scalar (st_parameter_dt *dtp, bt type, void *p, int len,
+formatted_transfer_scalar (st_parameter_dt *dtp, bt type, void *p, int kind,
size_t size)
{
char scratch[SCRATCH_SIZE];
@@ -1004,9 +1035,9 @@ formatted_transfer_scalar (st_parameter_
return;
if (dtp->u.p.mode == READING)
- read_decimal (dtp, f, p, len);
+ read_decimal (dtp, f, p, kind);
else
- write_i (dtp, f, p, len);
+ write_i (dtp, f, p, kind);
break;
@@ -1019,9 +1050,9 @@ formatted_transfer_scalar (st_parameter_
return;
if (dtp->u.p.mode == READING)
- read_radix (dtp, f, p, len, 2);
+ read_radix (dtp, f, p, kind, 2);
else
- write_b (dtp, f, p, len);
+ write_b (dtp, f, p, kind);
break;
@@ -1034,9 +1065,9 @@ formatted_transfer_scalar (st_parameter_
return;
if (dtp->u.p.mode == READING)
- read_radix (dtp, f, p, len, 8);
+ read_radix (dtp, f, p, kind, 8);
else
- write_o (dtp, f, p, len);
+ write_o (dtp, f, p, kind);
break;
@@ -1049,9 +1080,9 @@ formatted_transfer_scalar (st_parameter_
return;
if (dtp->u.p.mode == READING)
- read_radix (dtp, f, p, len, 16);
+ read_radix (dtp, f, p, kind, 16);
else
- write_z (dtp, f, p, len);
+ write_z (dtp, f, p, kind);
break;
@@ -1059,11 +1090,23 @@ formatted_transfer_scalar (st_parameter_
if (n == 0)
goto need_data;
- if (dtp->u.p.mode == READING)
- read_a (dtp, f, p, len);
+ /* It is possible to have FMT_A with something not BT_CHARACTER such
+ as when writing out hollerith strings, so check both type
+ and kind before calling wide character routines. */
+ if (type == BT_CHARACTER && kind == 4)
+ {
+ if (dtp->u.p.mode == READING)
+ read_a_char4 (dtp, f, p, size);
+ else
+ write_a_char4 (dtp, f, p, size);
+ }
else
- write_a (dtp, f, p, len);
-
+ {
+ if (dtp->u.p.mode == READING)
+ read_a (dtp, f, p, size);
+ else
+ write_a (dtp, f, p, size);
+ }
break;
case FMT_L:
@@ -1071,9 +1114,9 @@ formatted_transfer_scalar (st_parameter_
goto need_data;
if (dtp->u.p.mode == READING)
- read_l (dtp, f, p, len);
+ read_l (dtp, f, p, kind);
else
- write_l (dtp, f, p, len);
+ write_l (dtp, f, p, kind);
break;
@@ -1084,9 +1127,9 @@ formatted_transfer_scalar (st_parameter_
return;
if (dtp->u.p.mode == READING)
- read_f (dtp, f, p, len);
+ read_f (dtp, f, p, kind);
else
- write_d (dtp, f, p, len);
+ write_d (dtp, f, p, kind);
break;
@@ -1097,9 +1140,9 @@ formatted_transfer_scalar (st_parameter_
return;
if (dtp->u.p.mode == READING)
- read_f (dtp, f, p, len);
+ read_f (dtp, f, p, kind);
else
- write_e (dtp, f, p, len);
+ write_e (dtp, f, p, kind);
break;
case FMT_EN:
@@ -1109,9 +1152,9 @@ formatted_transfer_scalar (st_parameter_
return;
if (dtp->u.p.mode == READING)
- read_f (dtp, f, p, len);
+ read_f (dtp, f, p, kind);
else
- write_en (dtp, f, p, len);
+ write_en (dtp, f, p, kind);
break;
@@ -1122,9 +1165,9 @@ formatted_transfer_scalar (st_parameter_
return;
if (dtp->u.p.mode == READING)
- read_f (dtp, f, p, len);
+ read_f (dtp, f, p, kind);
else
- write_es (dtp, f, p, len);
+ write_es (dtp, f, p, kind);
break;
@@ -1135,9 +1178,9 @@ formatted_transfer_scalar (st_parameter_
return;
if (dtp->u.p.mode == READING)
- read_f (dtp, f, p, len);
+ read_f (dtp, f, p, kind);
else
- write_f (dtp, f, p, len);
+ write_f (dtp, f, p, kind);
break;
@@ -1148,16 +1191,16 @@ formatted_transfer_scalar (st_parameter_
switch (type)
{
case BT_INTEGER:
- read_decimal (dtp, f, p, len);
+ read_decimal (dtp, f, p, kind);
break;
case BT_LOGICAL:
- read_l (dtp, f, p, len);
+ read_l (dtp, f, p, kind);
break;
case BT_CHARACTER:
- read_a (dtp, f, p, len);
+ read_a (dtp, f, p, kind);
break;
case BT_REAL:
- read_f (dtp, f, p, len);
+ read_f (dtp, f, p, kind);
break;
default:
goto bad_type;
@@ -1166,16 +1209,16 @@ formatted_transfer_scalar (st_parameter_
switch (type)
{
case BT_INTEGER:
- write_i (dtp, f, p, len);
+ write_i (dtp, f, p, kind);
break;
case BT_LOGICAL:
- write_l (dtp, f, p, len);
+ write_l (dtp, f, p, kind);
break;
case BT_CHARACTER:
- write_a (dtp, f, p, len);
+ write_a (dtp, f, p, kind);
break;
case BT_REAL:
- write_d (dtp, f, p, len);
+ write_d (dtp, f, p, kind);
break;
default:
bad_type:
@@ -1404,12 +1447,13 @@ formatted_transfer (st_parameter_dt *dtp
char *tmp;
tmp = (char *) p;
-
+ size_t stride = type == BT_CHARACTER ?
+ size * GFC_SIZE_OF_CHAR_KIND(kind) : size;
/* Big loop over all the elements. */
for (elem = 0; elem < nelems; elem++)
{
dtp->u.p.item_count++;
- formatted_transfer_scalar (dtp, type, tmp + size*elem, kind, size);
+ formatted_transfer_scalar (dtp, type, tmp + stride*elem, kind, size);
}
}
@@ -1462,10 +1506,26 @@ transfer_character (st_parameter_dt *dtp
if (len == 0 && p == NULL)
p = empty_string;
- /* Currently we support only 1 byte chars, and the library is a bit
- confused of character kind vs. length, so we kludge it by setting
- kind = length. */
- dtp->u.p.transfer (dtp, BT_CHARACTER, p, len, len, 1);
+ /* Set kind here to 1. */
+ dtp->u.p.transfer (dtp, BT_CHARACTER, p, 1, len, 1);
+}
+
+void
+transfer_character_wide (st_parameter_dt *dtp, void *p, int len, int kind)
+{
+ static char *empty_string[0];
+
+ if ((dtp->common.flags & IOPARM_LIBRETURN_MASK) != IOPARM_LIBRETURN_OK)
+ return;
+
+ /* Strings of zero length can have p == NULL, which confuses the
+ transfer routines into thinking we need more data elements. To avoid
+ this, we give them a nice pointer. */
+ if (len == 0 && p == NULL)
+ p = empty_string;
+
+ /* Here we pass the actual kind value. */
+ dtp->u.p.transfer (dtp, BT_CHARACTER, p, kind, len, 1);
}
@@ -1519,13 +1579,7 @@ transfer_array (st_parameter_dt *dtp, gf
break;
case GFC_DTYPE_CHARACTER:
iotype = BT_CHARACTER;
- /* FIXME: Currently dtype contains the charlen, which is
- clobbered if charlen > 2**24. That's why we use a separate
- argument for the charlen. However, if we want to support
- non-8-bit charsets we need to fix dtype to contain
- sizeof(chartype) and fix the code below. */
size = charlen;
- kind = charlen;
break;
case GFC_DTYPE_DERIVED:
internal_error (&dtp->common,
@@ -1539,7 +1593,9 @@ transfer_array (st_parameter_dt *dtp, gf
for (n = 0; n < rank; n++)
{
count[n] = 0;
- stride[n] = desc->dim[n].stride;
+ stride[n] = iotype == BT_CHARACTER ?
+ desc->dim[n].stride * GFC_SIZE_OF_CHAR_KIND(kind) :
+ desc->dim[n].stride;
extent[n] = desc->dim[n].ubound + 1 - desc->dim[n].lbound;
/* If the extent of even one dimension is zero, then the entire
@@ -1812,7 +1868,7 @@ data_transfer_init (st_parameter_dt *dtp
if (conv == GFC_CONVERT_NONE)
conv = compile_options.convert;
- /* We use l8_to_l4_offset, which is 0 on little-endian machines
+ /* We use big_endian, which is 0 on little-endian machines
and 1 on big-endian machines. */
switch (conv)
{
@@ -1821,11 +1877,11 @@ data_transfer_init (st_parameter_dt *dtp
break;
case GFC_CONVERT_BIG:
- conv = l8_to_l4_offset ? GFC_CONVERT_NATIVE : GFC_CONVERT_SWAP;
+ conv = big_endian ? GFC_CONVERT_NATIVE : GFC_CONVERT_SWAP;
break;
case GFC_CONVERT_LITTLE:
- conv = l8_to_l4_offset ? GFC_CONVERT_SWAP : GFC_CONVERT_NATIVE;
+ conv = big_endian ? GFC_CONVERT_SWAP : GFC_CONVERT_NATIVE;
break;
default:
Index: libgfortran/io/write.c
===================================================================
--- libgfortran/io/write.c (revision 136136)
+++ libgfortran/io/write.c (working copy)
@@ -122,6 +122,106 @@ write_a (st_parameter_dt *dtp, const fno
#endif
}
+
+/* The primary difference between write_a_char4 and write_a is that we have to
+ deal with writing from the first byte of the 4-byte character and take care
+ of endianness. This currently implements encoding="default", which means we
+ write the least significant byte. If the 3 most significant bytes are nonzero,
+ the character is not representable and we emit a '?'. TODO: Implement encoding="UTF-8"
+ which will process all 4 bytes and translate to the encoded output. */
+
+/* This macro reads the 4-byte (kind=4) character stored starting at byte x. */
+#define GFC_CHAR4(x) (*(gfc_char4_t *)&x)
+
+void
+write_a_char4 (st_parameter_dt *dtp, const fnode *f, const char *source, int len)
+{
+ int k, wlen;
+ char *p;
+
+ wlen = f->u.string.length < 0 ? len : f->u.string.length;
+
+#ifdef HAVE_CRLF
+ /* If this is formatted STREAM IO convert any embedded line feed characters
+ to CR_LF on systems that use that sequence for newlines. See F2003
+ Standard sections 10.6.3 and 9.9 for further information. */
+ if (is_stream_io (dtp))
+ {
+ const char crlf[] = "\r\n";
+ int i, j, kk, bytes;
+ bytes = 0;
+
+ /* Write out any padding if needed. */
+ if (len < wlen)
+ {
+ p = write_block (dtp, wlen - len);
+ if (p == NULL)
+ return;
+ memset (p, ' ', wlen - len);
+ }
+
+ /* Scan the source string looking for '\n' and convert it if found. */
+ for (i = kk = 0; i < wlen; i++, kk += 4)
+ {
+ if (GFC_CHAR4(source[kk]) == '\n')
+ {
+ /* Write out the previously scanned characters in the string. */
+ if (bytes > 0)
+ {
+ p = write_block (dtp, bytes);
+ if (p == NULL)
+ return;
+ for (j = k = 0; j < bytes; j++, k += 4)
+ p[j] = GFC_CHAR4(source[k]) > 255 ?
+ '?' : source[k + endian_off];
+ bytes = 0;
+ }
+
+ /* Write out the CR_LF sequence. */
+ p = write_block (dtp, 2);
+ if (p == NULL)
+ return;
+ memcpy (p, crlf, 2);
+ }
+ else
+ bytes++;
+ }
+
+ /* Write out any remaining bytes if no LF was found. */
+ if (bytes > 0)
+ {
+ p = write_block (dtp, bytes);
+ if (p == NULL)
+ return;
+ for (j = k = 0; j < bytes; j++, k += 4)
+ p[j] = GFC_CHAR4(source[k]) > 255 ? '?' : source[k + endian_off];
+ }
+ }
+ else
+ {
+#endif
+ int j;
+ p = write_block (dtp, wlen);
+ if (p == NULL)
+ return;
+
+ if (wlen < len)
+ for (j = k = 0; j < wlen; j++, k += 4)
+ p[j] = GFC_CHAR4(source[k]) > 255 ?
+ '?' : source[k + endian_off];
+ else
+ {
+ memset (p, ' ', wlen - len);
+ for (j = wlen - len, k = 0; j < wlen; j++, k += 4)
+ p[j] = GFC_CHAR4(source[k]) > 255 ?
+ '?' : source[k + endian_off];
+ }
+#ifdef HAVE_CRLF
+ }
+#endif
+}
+
+
static GFC_INTEGER_LARGEST
extract_int (const void *p, int len)
{
@@ -635,9 +735,9 @@ write_integer (st_parameter_dt *dtp, con
the strings if the file has been opened in that mode. */
static void
-write_character (st_parameter_dt *dtp, const char *source, int length)
+write_character (st_parameter_dt *dtp, const char *source, int kind, int length)
{
- int i, extra;
+ int i, j, extra;
char *p, d;
switch (dtp->u.p.delim_status)
@@ -653,35 +753,75 @@ write_character (st_parameter_dt *dtp, c
break;
}
- if (d == ' ')
- extra = 0;
- else
+ if (kind == 1)
{
- extra = 2;
+ if (d == ' ')
+ extra = 0;
+ else
+ {
+ extra = 2;
- for (i = 0; i < length; i++)
- if (source[i] == d)
- extra++;
- }
+ for (i = 0; i < length; i++)
+ if (source[i] == d)
+ extra++;
+ }
- p = write_block (dtp, length + extra);
- if (p == NULL)
- return;
+ p = write_block (dtp, length + extra);
+ if (p == NULL)
+ return;
+
+ if (d == ' ')
+ memcpy (p, source, length);
+ else
+ {
+ *p++ = d;
- if (d == ' ')
- memcpy (p, source, length);
+ for (i = 0; i < length; i++)
+ {
+ *p++ = source[i];
+ if (source[i] == d)
+ *p++ = d;
+ }
+
+ *p = d;
+ }
+ }
else
{
- *p++ = d;
-
- for (i = 0; i < length; i++)
+ /* We have to scan the source string looking for delimiters to determine
+ how large the write block needs to be. */
+ if (d == ' ')
+ extra = 0;
+ else
{
- *p++ = source[i];
- if (source[i] == d)
- *p++ = d;
+ extra = 2;
+
+ for (i = j = 0; i < length; i++, j += 4)
+ if (GFC_CHAR4(source[j]) == (gfc_char4_t) d)
+ extra++;
}
- *p = d;
+ p = write_block (dtp, length + extra);
+ if (p == NULL)
+ return;
+
+ if (d == ' ')
+ for (i = j = 0; i < length; i++, j += 4)
+ p[i] = GFC_CHAR4(source[j]) > 255 ?
+ '?' : source[j + endian_off];
+ else
+ {
+ *p++ = d;
+
+ for (i = j = 0; i < length; i++, j += 4)
+ {
+ *p++ = GFC_CHAR4(source[j]) > 255 ?
+ '?' : source[i * kind + endian_off];
+ if (GFC_CHAR4(source[j]) == (gfc_char4_t) d)
+ *p++ = d;
+ }
+ *p = d;
+ }
}
}
@@ -792,7 +932,7 @@ list_formatted_write_scalar (st_paramete
write_logical (dtp, p, kind);
break;
case BT_CHARACTER:
- write_character (dtp, p, kind);
+ write_character (dtp, p, kind, size);
break;
case BT_REAL:
write_real (dtp, p, kind);
@@ -814,6 +954,8 @@ list_formatted_write (st_parameter_dt *d
{
size_t elem;
char *tmp;
+ size_t stride = type == BT_CHARACTER ?
+ size * GFC_SIZE_OF_CHAR_KIND(kind) : size;
tmp = (char *) p;
@@ -821,7 +963,7 @@ list_formatted_write (st_parameter_dt *d
for (elem = 0; elem < nelems; elem++)
{
dtp->u.p.item_count++;
- list_formatted_write_scalar (dtp, type, tmp + size*elem, kind, size);
+ list_formatted_write_scalar (dtp, type, tmp + elem * stride, kind, size);
}
}
@@ -885,9 +1027,9 @@ nml_write_obj (st_parameter_dt *dtp, nam
if (obj->type != GFC_DTYPE_DERIVED)
{
#ifdef HAVE_CRLF
- write_character (dtp, "\r\n ", 3);
+ write_character (dtp, "\r\n ", 1, 3);
#else
- write_character (dtp, "\n ", 2);
+ write_character (dtp, "\n ", 1, 2);
#endif
len = 0;
if (base)
@@ -896,15 +1038,15 @@ nml_write_obj (st_parameter_dt *dtp, nam
for (dim_i = 0; dim_i < (index_type) strlen (base_name); dim_i++)
{
cup = toupper (base_name[dim_i]);
- write_character (dtp, &cup, 1);
+ write_character (dtp, &cup, 1, 1);
}
}
for (dim_i =len; dim_i < (index_type) strlen (obj->var_name); dim_i++)
{
cup = toupper (obj->var_name[dim_i]);
- write_character (dtp, &cup, 1);
+ write_character (dtp, &cup, 1, 1);
}
- write_character (dtp, "=", 1);
+ write_character (dtp, "=", 1, 1);
}
/* Counts the number of data output on a line, including names. */
@@ -974,7 +1116,7 @@ nml_write_obj (st_parameter_dt *dtp, nam
if (rep_ctr > 1)
{
sprintf(rep_buff, " %d*", rep_ctr);
- write_character (dtp, rep_buff, strlen (rep_buff));
+ write_character (dtp, rep_buff, 1, strlen (rep_buff));
dtp->u.p.no_leading_blank = 1;
}
num++;
@@ -999,7 +1141,7 @@ nml_write_obj (st_parameter_dt *dtp, nam
dtp->u.p.delim_status = DELIM_QUOTE;
if (dtp->u.p.nml_delim == '\'')
dtp->u.p.delim_status = DELIM_APOSTROPHE;
- write_character (dtp, p, obj->string_length);
+ write_character (dtp, p, 1, obj->string_length);
dtp->u.p.delim_status = tmp_delim;
break;
@@ -1089,14 +1231,14 @@ nml_write_obj (st_parameter_dt *dtp, nam
to column 2. Reset the repeat counter. */
dtp->u.p.no_leading_blank = 0;
- write_character (dtp, &semi_comma, 1);
+ write_character (dtp, &semi_comma, 1, 1);
if (num > 5)
{
num = 0;
#ifdef HAVE_CRLF
- write_character (dtp, "\r\n ", 3);
+ write_character (dtp, "\r\n ", 1, 3);
#else
- write_character (dtp, "\n ", 2);
+ write_character (dtp, "\n ", 1, 2);
#endif
}
rep_ctr = 1;
@@ -1160,13 +1302,13 @@ namelist_write (st_parameter_dt *dtp)
/* Temporarily disable namelist delimters. */
dtp->u.p.delim_status = DELIM_NONE;
- write_character (dtp, "&", 1);
+ write_character (dtp, "&", 1, 1);
/* Write namelist name in upper case - f95 std. */
for (i = 0 ;i < dtp->namelist_name_len ;i++ )
{
c = toupper (dtp->namelist_name[i]);
- write_character (dtp, &c ,1);
+ write_character (dtp, &c, 1 ,1);
}
if (dtp->u.p.ionml != NULL)
@@ -1180,9 +1322,9 @@ namelist_write (st_parameter_dt *dtp)
}
#ifdef HAVE_CRLF
- write_character (dtp, " /\r\n", 5);
+ write_character (dtp, " /\r\n", 1, 5);
#else
- write_character (dtp, " /\n", 4);
+ write_character (dtp, " /\n", 1, 4);
#endif
/* Restore the original delimiter. */
! { dg-do run }
! Wide character I/O test 1, formatted and mixed kind
! Test case developed by Jerry DeLisle <jvdelisle@gcc.gnu.org>
program test1
integer, parameter :: k4 = 4
character(len=10,kind=4) :: wide
character(len=10,kind=1) :: thin
character(kind=1,len=25) :: buffer
wide=k4_"Goodbye!"
thin="Hello!"
write(buffer, '(a)') wide
if (buffer /= "Goodbye!") call abort
open(10, form="formatted", access="stream", status="scratch")
write(10, '(a)') thin
rewind(10)
read(10, '(a)') wide
if (wide /= k4_"Hello!") call abort
write(buffer,*) thin, ">",wide,"<"
if (buffer /= " Hello! >Hello! <") call abort
end program test1
! { dg-do run }
! Wide character I/O test 2, formatted array write and read
! Test case developed by Jerry DeLisle <jvdelisle@gcc.gnu.org>
program chkdata
integer, parameter :: k4=4
character(len=7, kind=k4), dimension(3) :: mychar
character(50) :: buffer
mychar(1) = k4_"abc1234"
mychar(2) = k4_"def5678"
mychar(3) = k4_"ghi9012"
buffer = ""
write(buffer,'(3(a))') mychar(2:3), mychar(1)
if (buffer /= "def5678ghi9012abc1234") call abort
write(buffer,'(3(a))') mychar
if (buffer /= "abc1234def5678ghi9012") call abort
mychar = ""
read(buffer,'(3(a))') mychar
if (any(mychar.ne.[ k4_"abc1234",k4_"def5678",k4_"ghi9012" ])) call abort
end program chkdata
! { dg-do run }
! Wide character I/O test 3, unformatted arrays
! Test case developed by Jerry DeLisle <jvdelisle@gcc.gnu.org>
program test1
integer, parameter :: k4 = 4
character(len=10,kind=4) :: wide
character(len=10,kind=4), dimension(5,7) :: widearray
wide = k4_"abcdefg"
widearray = k4_"1234abcd"
open(10, form="unformatted", status="scratch")
write(10) wide
rewind(10)
wide = "wrong"
read(10) wide
if (wide /= k4_"abcdefg") call abort
rewind(10)
write(10) widearray(2:4,3:7)
widearray(2:4,3:7)=""
rewind(10)
read(10) widearray(2:4,3:7)
close(10)
if (any(widearray.ne.k4_"1234abcd")) call abort
end program test1
! { dg-do run }
! { dg-options -fbackslash }
! Wide character I/O test 4, formatted ISO-8859-1 characters in string
! Test case developed by Jerry DeLisle <jvdelisle@gcc.gnu.org>
! Compile with -fbackslash
integer, parameter :: k4 = 4
character(kind=1,len=15) :: buffer
character(kind=1, len=1) :: c1, c2
character(kind=4,len=20) :: str = k4_'X\xF8öABC' ! ISO-8859-1 encoded string
buffer = ""
write(buffer,'(3a)')':',trim(str),':'
if (buffer.ne.':X\xF8öABC: ') call abort
str = ""
read(buffer,'(3a)') c1,str(1:6),c2
if (c1.ne.':') call abort
if (str.ne.k4_'X\xF8öAB') call abort
if (c2.ne.'C') call abort
end