This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [patch, fortran] Wide character I/O Part 1.1 Round 2


Hi Jerry,

Sorry I took so long to review your patch. Still not fully OK for me, but we're definitely very close to it. We discussed it on IRC already, but for the record: I suggest you simplify things by dumping endian_off completely (it's not needed, and adds loads from memory and additions) and access things by casting them to the right type. Here's the detailed review:


+	    for (i = j = 0; i < m; i++, j += 4)
+	      q[j + endian_off] = dtp->u.p.saved_string[i];

If you do that, you need to be absolutely sure that all bytes in the saved_string has prevously been zero'ed. Is it so? But I don't understand why you don't want to do it that way:


for (i = j = 0; i < m; i++, j++)
((gfc_char4_t *) q)[j] = (unsigned char) dtp->u.p.saved_string [i];


+	      for (i = m, j = m * 4; i < (int) size - m; i++, j += 4)
+		q[j + endian_off] = ' ';

Same here, if you do it that way, you need to be sure that q was previously zero'ed out. (And it may break in the future is someone decides to not zero this buffer after all.) I think it'd be best rewritten as:


    for (i = j = m; i < (int) size - m; i++, j++)
      ((gfc_char4_t *) q)[j] = (unsigned char) ' ';

+  for (n = 0; n < m; n++, dest += 4, s++)
+      dest[endian_off] = *s;
+
+  for (n = 0; n < length - (int) w; n++, dest += 4)
+      dest[endian_off] = ' ';

Same again.


+      /* Handle wide chracters.  */
+      if (type == BT_CHARACTER && kind == 4)
+	{
+	  sz = (size_t) kind;
+	  for (i = 0; i < size * nelems; i++)
+	    {
+ 	      read_block_direct (dtp, buffer, &sz);
+ 	      reverse_memcpy (p, buffer, kind);
+ 	      p += kind;
+ 	    }
+	  return;
+	}

Couldn't you just do the same thing as for complex, and have it use the same loop as all others with tweaking the values of nelems and size? Something like:


  if (type == BT_CHARACTER && kind != 1)
    {
      nelems *= size;
      size = kind;
    }

+      /* Handle wide chracters.  */
+      if (type == BT_CHARACTER && kind == 4)
+	{
+	  sz = (size_t) kind;
+	  for (i = 0; i < size * nelems; i++)
+	    {
+	      reverse_memcpy(buffer, p, sz);
+ 	      p += sz;
+	      write_buf (dtp, buffer, sz);
+ 	    }
+	  return;
+	}

Same thing here: do "nelems *= size" and "size = kind", and I believe the "standard" a few lines down will do fine (and avoid code duplication).



+/* This macro casts a kind=1 character to a kind=4.  */
+#define GFC_CHAR4(x) (*(gfc_char4_t *)&x)

The comment is wrong: it takes an element of an array, and returns the 4 bytes that start with this element, as an character(kind=4). But that's a bit convoluted...


+ /* Scan the source string looking for '\n' and convert it if found. */
+ for (i = kk = 0; i < wlen; i++, kk += 4)
+ {
+ if (GFC_CHAR4(source[kk]) == '\n')

It's weird to do it that way; I think the code gives the good results in all cases (so I'm not objecting to it strongly), but I think it's certainly hard to read. I'd be tempted to have kk go by increments of 1 instead of 4, and do:


if (((gfc_char4_t *)source)[kk] == (unsigned char) '\n')

(I'm not even sure if the "unsigned char" cast is needed, but it can't hurt.)

+ {
+ /* Write out the previously scanned characters in the string. */
+ if (bytes > 0)
+ {
+ p = write_block (dtp, bytes);
+ if (p == NULL)
+ return;
+ for (j = k = 0; j < bytes; j++, k += 4)

The you'd also make k go by increments of 1.


+		    p[j] = GFC_CHAR4(source[k]) > 255 ?
+			   '?' : source[k + endian_off];

And that would be:


p[j] = (((gfc_char4_t *) source)[k] > 255 ? '?' : (unsigned char) ((gfc_char4_t *) source)[k]);

Once again, remove the need for endian_off and let unsigned arithmetics wrapping do its job.

+ for (j = k = 0; j < bytes; j++, k += 4)
+ p[j] = GFC_CHAR4(source[k]) > 255 ? '?' : source[k + endian_off];

And again, same thing here.


+	for (j = k = 0; j < wlen; j++, k += 4)
+	  p[j] = GFC_CHAR4(source[k]) > 255 ?
+		 '?' : source[k + endian_off];

And there. (Macro?)


+	    p[j] = GFC_CHAR4(source[k]) > 255 ?
+		   '?' : source[k + endian_off];

And there. (Macro, for sure!)


+	  for (i = j = 0; i < length; i++, j += 4)
+	    if (GFC_CHAR4(source[j]) == (gfc_char4_t) d)
+	      extra++;

Same here. Actually, I believe you could have a macro that looks up for a character in a char4 string:


#define CHAR4_AT_POS(string,pos) (((gfc_char4_t *) (string))[pos])

That way, you could use directly CHAR4_AT_POS(source, j)

and same thing everywhere you used GFC_CHAR4. That'd be way clearer (at least to me, but other opinions are welcome!).


+	for (i = j = 0; i < length; i++, j += 4)
+	  p[i] = GFC_CHAR4(source[j]) > 255 ?
+		 '?' : source[j + endian_off];

Again.


+	  for (i = j = 0; i < length; i++, j += 4)
+	    {
+	      *p++ = GFC_CHAR4(source[j]) > 255 ?
+		     '?' : source[i * kind + endian_off];
+	      if (GFC_CHAR4(source[j]) == (gfc_char4_t) d)
+		*p++ = d;
+	    }

And again.



FX


--
François-Xavier Coudert
http://www.homepages.ucl.ac.uk/~uccafco/


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]