This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.



Re: [gfortran,patch] Don't call library function for string copying, take 2


Francois_Xavier,

OK to commit.

I would suggest a follow up.....

> There is a huge (positive) performance impact of this on at
> least one of the polyhedron benchmarks, aermod.

This being the case, it would be as well to get rid of the call to memmove.
If removing one level of function call improves matters so much, I guess
that getting rid of the second level will achieve at least the same again.
Not only is memmove purported to be slow, but there is potentially a call
to memset to be eliminated too.

I wrote the following as part of the TR15581 patch:

I needed to do an inline memcpy and so hijacked some of the scalarizer
functions. var is an array and we enter with tmp containing the number of
elements to transfer, or whatever.

+      nelems = gfc_evaluate_now (tmp, &fnblock);
+      index = gfc_create_var (gfc_array_index_type, "S");
+
+      /* Build the body of the loop.  */
+      gfc_init_block (&loopbody);
+      tmp = gfc_build_array_ref (var, index);

    tmp = var[index]
+
+      if (purpose == COPY_ALLOC_COMP)
+        tmp = structure_alloc_comps (der_type, tmp,
+				     gfc_build_array_ref (dest, index),
+				     0, purpose);
+      else
+        tmp = structure_alloc_comps (der_type, tmp, NULL_TREE, 0, purpose);

structure_alloc_comps is a rather heavy bit of code for manipulating derived
types with allocatable components.  You, however, could add another array ref
and do an assignment.

I believe that this would work:
       gfc_add_modify_expr (&loopbody, gfc_build_array_ref (var1, index),
                                       gfc_build_array_ref (var2, index));

+
+      gfc_add_expr_to_block (&loopbody, tmp);

.... dropping the previous line.
+
+      /* Build the loop and return. */
+      gfc_init_loopinfo (&loop);
+      loop.dimen = 1;
+      loop.from[0] = gfc_index_zero_node;
+      loop.loopvar[0] = index;
+      loop.to[0] = nelems;
+      gfc_trans_scalarizing_loops (&loop, &loopbody);
+      gfc_add_block_to_block (&fnblock, &loop.pre);
+      return gfc_finish_block (&fnblock);

Thus an inline memcpy is very easily accomplished, as long as the cast is
performed on var1 and var2 to turn them into char[0:nelems-1].... or maybe
not; just make sure that var1 is cast to the type of var2.

An inline memmove could then be accomplished as

  if (&var1[0] == &var2[0])
    do nothing
  else if (&var1[0] < &var2[0])
    for (loopvar = from; loopvar < to; loopvar++)
      var1[loopvar] = var2[loopvar];
  else
    for (loopvar = from; loopvar < to; loopvar++)
      {
        /* Walk backwards; assumes from == 0 and to == nelems.  */
        idx = nelems - 1 - loopvar;
        var1[idx] = var2[idx];
      }
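
For reference, the same overlap handling can be written as a plain C
sketch. This is only an illustration of the logic, not what the patch
would contain: the real thing would build the equivalent trees with the
scalarizer calls shown earlier. The name inline_memmove and its signature
are hypothetical, and the pointers are compared as integers because a
relational comparison of pointers into unrelated objects is not defined
in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of gfortran): copy NELEMS bytes from SRC
   to DEST, giving the right result even when the two regions overlap.  */
static void
inline_memmove (char *dest, const char *src, size_t nelems)
{
  size_t i;

  assert (dest != NULL && src != NULL);

  if (dest == src)
    return;                     /* Same address: nothing to do.  */

  if ((uintptr_t) dest < (uintptr_t) src)
    {
      /* Destination starts below the source: copy forwards.  */
      for (i = 0; i < nelems; i++)
        dest[i] = src[i];
    }
  else
    {
      /* Destination starts above the source: copy backwards so that
         source bytes are read before they are overwritten.  */
      for (i = nelems; i > 0; i--)
        dest[i - 1] = src[i - 1];
    }
}
```

The backward loop counts down from nelems and indexes with i - 1, which
avoids the unsigned wrap-around that a "i >= 0" test would run into.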

Hmmm! I wonder if this would not improve gfortran's performance radically
when handling array constructors? Both calls to MEMCPY in trans-array.c
are associated with them.  It would be very noticeable for small arrays and,
if I recall correctly, they are found in several of the Polyhedron tests.

Best regards

Paul 

