This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Patch, fortran] PR25746 - elemental subroutine dependency checking - redux


Dear All,

I should explain that Outlook normally seems to emit non-mime characters that gnu-central does not like. Changing to plain text drops all references to wrap around. I thought that I had it fixed but apparently not. Sorry about that.

If this is screwed up, I will resubmit tonight.... well, here it is again.

This patch fixes PR25746 by adding dependency checking to the translation of elemental subroutine calls. Quite aside from the immediate need of the PR to ensure that user defined assignments work correctly, the standard requires that elemental subroutines produce the same result, independent of the loop order and this mandates the dependency checking.

The checking is accomplished by calling a new function gfc_conv_elemental_dependencies before starting the scalarizing body and calling the elemental subroutine itself. This has the advantage that the loopinfo and the ss's for each actual argument are available and the information that they contain is used as much as possible.

The new function proceeds as follows:

It loops over the actual arguments and, if the INTENT is OUT or INOUT, it checks for dependency with each of the arguments that are not INTENT(OUT). I have modified dependency.c(gfc_check_fncall_dependency) to prevent the argument expression from being checked against itself.

If there is a dependency, a temporary loopinfo is made, ready for the call to gfc_trans_create_temp_array. This is done to prevent this function from resetting the loop and thereby necessitating the renormalisation of the ss->data.info's for each of the arguments. Once the temporary is obtained, the descriptor is obtained afresh, so that the result can be correctly unpacked after the subroutine call.

In the case of an INTENT(INOUT) argument, the original data has to be packed and copied to the temporary. I have signalled a TODO here; the potential creation of two temporaries could be eliminated and this is something that I intend to contribute after I have fixed the array TRANSFER intrinsic. This will require a new library routine that can be employed here. In the mean time, this patch works correctly (I hope!), even if it is a bit inefficient in a small number of cases.

As well as copying the packed data to the temporary used for the subroutine call, this part of the code frees the temporary potentially produced by internal_pack.

Following this, the scalarizer is satisfied by adding the offset for the temporary to the ss->data.info, using the loop index ranges. Finally, the result is copied to the destination array and the temporary freed.

As well as testing the fix for the original PR, the testcase checks various kinds of assignment, including >1D arrays, and tests that INTENT(INOUT) arguments are correctly handled. It has been corrected to be conformant (thanks, Dominique).

Regtested on FC5/Athlon. OK for trunk and 4.1?

Paul

2006-04-21 Paul Thomas <pault@gcc.gnu.org>

   PR fortran/25746
   * dependency.c (gfc_check_fncall_dependency): Don't check other
   against itself.
   * trans-stmt.c (gfc_conv_elemental_dependencies): New function.
   (gfc_trans_call): Call it.

2006-04-21 Paul Thomas <pault@gcc.gnu.org>

   PR fortran/25746
   * gfortran.dg/elemental_subroutine_3.f90: New test.



Index: gcc/fortran/trans-stmt.c
===================================================================
--- gcc/fortran/trans-stmt.c	(révision 113111)
+++ gcc/fortran/trans-stmt.c	(copie de travail)
@@ -199,6 +199,155 @@
 }
 
 
+/* Check for dependencies between INTENT(IN) and INTENT([IN]OUT) arguments of
+   elemental subroutines.  Make temporaries for input arguments if any such
+   dependencies are found.  */
+static void
+gfc_conv_elemental_dependencies (gfc_se * se, gfc_se * loopse,
+				 gfc_symbol * sym, gfc_actual_arglist * arg)
+{
+  gfc_actual_arglist *arg0;
+  gfc_expr *e;
+  gfc_formal_arglist *formal;
+  gfc_loopinfo tmp_loop;
+  gfc_se parmse;
+  gfc_ss *ss;
+  gfc_ss_info *info;
+  gfc_symbol *fsym;
+  int n;
+  stmtblock_t block;
+  tree data;
+  tree offset;
+  tree ptr;
+  tree size;
+  tree stmt;
+  tree tmp;
+
+  if (loopse->ss == NULL)
+    return;
+
+  ss = loopse->ss;
+  arg0 = arg;
+  formal = sym->formal;
+
+  /* Loop over all the arguments testing for dependencies.  */
+  for (; arg != NULL; arg = arg->next, formal = formal ? formal->next : NULL)
+    {
+      e = arg->expr;
+      if (e == NULL)
+	continue;
+
+      /* Obtain the info structure for the current argument.  */ 
+      info = NULL;
+      for (ss = loopse->ss; ss && ss != gfc_ss_terminator; ss = ss->next)
+	{
+	  if (ss->expr != e)
+	    continue;
+	  info = &ss->data.info;
+	  break;
+	}
+
+      /* If there is a dependency, create a temporary and use it
+	 instead of the variable. */
+      fsym = formal ? formal->sym : NULL;
+      if (e->expr_type == EXPR_VARIABLE
+	    && e->rank && fsym
+	    && fsym->attr.intent != INTENT_IN
+	    && gfc_check_fncall_dependency (e, INTENT_OUT, sym, arg0))
+	{
+	  /* Make a local loopinfo for the temporary creation, so that
+	     none of the other ss->info's have to be renormalized.  */
+	  gfc_init_loopinfo (&tmp_loop);
+	  for (n = 0; n < info->dimen; n++)
+	    {
+	      tmp_loop.to[n] = loopse->loop->to[n];
+	      tmp_loop.from[n] = loopse->loop->from[n];
+	      tmp_loop.order[n] = loopse->loop->order[n];
+	    }
+
+	  /* Generate the temporary.  Merge the block so that the
+	     declarations are put at the right binding level.  */
+	  size = gfc_create_var (gfc_array_index_type, NULL);
+	  data = gfc_create_var (pvoid_type_node, NULL);
+	  gfc_start_block (&block);
+	  tmp = gfc_typenode_for_spec (&e->ts);
+	  tmp = gfc_trans_create_temp_array (&se->pre, &se->post,
+					      &tmp_loop, info, tmp,
+					      false, true, false);
+	  gfc_add_modify_expr (&se->pre, size, tmp);
+	  tmp = fold_convert (pvoid_type_node, info->data);
+	  gfc_add_modify_expr (&se->pre, data, tmp);
+	  gfc_merge_block_scope (&block);
+
+	  /* Obtain the argument descriptor for packing/unpacking.  */
+	  gfc_init_se (&parmse, NULL);
+	  parmse.want_pointer = 1;
+	  gfc_conv_expr_descriptor (&parmse, e, gfc_walk_expr (e));
+	  gfc_add_block_to_block (&se->pre, &parmse.pre);
+
+	  /* Copy the data to the temporary for INTENT(INOUT).  
+	     TODO: Suppress the temporary memory allocation above,
+	     if the internal_pack allocates memory. */
+	  if (fsym->attr.intent == INTENT_INOUT)
+	    {
+	      /* Repack the array.  */
+	      tmp = gfc_chainon_list (NULL_TREE, parmse.expr);
+	      ptr = build_function_call_expr (gfor_fndecl_in_pack, tmp);
+	      ptr = gfc_evaluate_now (ptr, &se->pre);
+
+	      /* Copy the data to the temporary.  */
+	      tmp = gfc_chainon_list (NULL_TREE, data);
+	      tmp = gfc_chainon_list (tmp, ptr);
+	      tmp = gfc_chainon_list (tmp, size);
+	      tmp = build_function_call_expr
+			(built_in_decls[BUILT_IN_MEMCPY], tmp);
+	      tmp = gfc_evaluate_now (tmp, &se->pre);
+
+	      /* If necessary, free the temporary afterwards.  */
+	      gfc_start_block (&block);
+	      tmp = convert (pvoid_type_node, ptr);
+	      tmp = gfc_chainon_list (NULL_TREE, tmp);
+	      tmp = build_function_call_expr
+			(gfor_fndecl_internal_free, tmp);
+	      gfc_add_expr_to_block (&block, tmp);
+	      stmt = gfc_finish_block (&block);
+
+	      gfc_init_block (&block);
+	      tmp = build_fold_indirect_ref (parmse.expr);
+	      tmp = gfc_conv_array_data (tmp);
+	      tmp = build2 (NE_EXPR, boolean_type_node, ptr, tmp);
+	      tmp = build3_v (COND_EXPR, tmp, stmt, build_empty_stmt ());
+
+	      gfc_add_expr_to_block (&block, tmp);
+	      gfc_add_block_to_block (&se->pre, &block);
+	    }
+
+	  /* Calculate the offset for the temporary.  */
+	  offset = gfc_index_zero_node;
+	  for (n = 0; n < info->dimen; n++)
+	    {
+	      tmp = gfc_conv_descriptor_stride (info->descriptor,
+						gfc_rank_cst[n]);
+	      tmp = fold_build2 (MULT_EXPR, gfc_array_index_type,
+				 loopse->loop->from[n], tmp);
+	      offset = fold_build2 (MINUS_EXPR, gfc_array_index_type,
+					  offset, tmp);
+	    }
+	  info->offset = gfc_create_var (gfc_array_index_type, NULL);	  
+	  gfc_add_modify_expr (&se->pre, info->offset, offset);
+
+	  /* Copy the result back using unpack.  */
+	  tmp = gfc_chainon_list (NULL_TREE, parmse.expr);
+	  tmp = gfc_chainon_list (tmp, data);
+	  tmp = build_function_call_expr (gfor_fndecl_in_unpack, tmp);
+	  gfc_add_expr_to_block (&se->post, tmp);
+
+	  gfc_add_block_to_block (&se->post, &parmse.post);
+	}
+    }
+}
+
+
 /* Translate the CALL statement.  Builds a call to an F95 subroutine.  */
 
 tree
@@ -269,11 +418,15 @@
       gfc_conv_loop_setup (&loop);
       gfc_mark_ss_chain_used (ss, 1);
 
+      /* Convert the arguments, checking for dependencies.  */
+      gfc_copy_loopinfo_to_se (&loopse, &loop);
+      loopse.ss = ss;
+      gfc_conv_elemental_dependencies (&se, &loopse, code->resolved_sym,
+				       code->ext.actual);
+
       /* Generate the loop body.  */
       gfc_start_scalarized_body (&loop, &body);
       gfc_init_block (&block);
-      gfc_copy_loopinfo_to_se (&loopse, &loop);
-      loopse.ss = ss;
 
       /* Add the subroutine call to the block.  */
       gfc_conv_function_call (&loopse, code->resolved_sym, code->ext.actual);
@@ -287,6 +440,7 @@
       gfc_trans_scalarizing_loops (&loop, &body);
       gfc_add_block_to_block (&se.pre, &loop.pre);
       gfc_add_block_to_block (&se.pre, &loop.post);
+      gfc_add_block_to_block (&se.pre, &se.post);
       gfc_cleanup_loop (&loop);
     }
 
Index: gcc/fortran/dependency.c
===================================================================
--- gcc/fortran/dependency.c	(révision 113111)
+++ gcc/fortran/dependency.c	(copie de travail)
@@ -513,6 +513,10 @@
       if (!expr)
 	continue;
 
+      /* Skip other itself.  */
+      if (expr == other)
+	continue;
+
       /* Skip intent(in) arguments if OTHER itself is intent(in).  */
       if (formal
 	  && intent == INTENT_IN
Index: gcc/testsuite/gfortran.dg/elemental_subroutine_3.f90
===================================================================
--- gcc/testsuite/gfortran.dg/elemental_subroutine_3.f90	(révision 0)
+++ gcc/testsuite/gfortran.dg/elemental_subroutine_3.f90	(révision 0)
@@ -0,0 +1,70 @@
+! { dg-do run }
+! Test the fix for PR25746, in which dependency checking was not being
+! done for elemental subroutines and therefore for interface assignments.
+!
+! This test is based on
+! http://home.comcast.net/~kmbtib/Fortran_stuff/elem_assign.f90
+! as reported by Harald Anlauf <anlauf@gmx.de> in the PR.
+! 
+module elem_assign
+   implicit none
+   type mytype
+      integer x
+   end type mytype
+   interface assignment(=)
+      module procedure myassign
+   end interface assignment(=)
+   contains
+      elemental subroutine myassign(x,y)
+         type(mytype), intent(out) :: x
+         type(mytype), intent(in) :: y
+! Multiply the components by 2 to verify that this is being called.
+         x%x = y%x*2
+      end subroutine myassign
+end module elem_assign
+
+program test
+   use elem_assign
+   implicit none
+   type(mytype) :: y(6), x(6) = (/mytype(1),mytype(20),mytype(300),&
+                                  mytype(4000),mytype(50000),&
+				  mytype(1000000)/)
+   type(mytype) :: z(2, 3)
+! The original case - dependency between lhs and rhs. 
+   x = x((/2,3,1,4,5,6/))
+   if (any(x%x .ne. (/40, 600, 2, 8000, 100000, 2000000/))) call abort ()
+! Slightly more elborate case with non-trivial array ref on lhs.
+   x(4:1:-1) = x((/1,3,2,4/))
+   if (any(x%x .ne. (/16000, 1200, 4, 80, 100000, 2000000/))) call abort ()
+! Check that no-dependence case works....
+   y = x
+   if (any(y%x .ne. (/32000, 2400, 8, 160, 200000, 4000000/))) call abort ()
+! ...and now a case that caused headaches during the preparation of the patch
+   x(2:5) = x(1:4)
+   if (any(x%x .ne. (/16000, 32000, 2400, 8, 160, 2000000/))) call abort ()
+! Check offsets are done correctly in multi-dimensional cases
+   z = reshape (x, (/2,3/))
+   z(:, 3:2:-1) = z(:, 1:2)
+   y = reshape (z, (/6/))
+   if (any(y%x .ne. (/ 64000, 128000, 19200, 64, 128000, 256000/))) call abort ()
+! Check an INTENT(INOUT) case
+   call nonassign (x, x((/2,3,1,4,5,6/)))
+   if (any(x%x .ne. (/-96000, 7200, -48000, -24, -480, 6000000/))) call abort ()
+! ...and again
+   call nonassign (x(2:3), x(1:2))
+   if (any(x%x .ne. (/-96000, 288000, -21600, -24, -480, 6000000/))) call abort ()
+contains
+  elemental subroutine nonassign(x,y)
+         type(mytype), intent(INout) :: x
+         type(mytype), intent(in) :: y
+! Change the sign according to the input, to verify that INOUT is working.
+	 if (x%x > 20000) then
+           x%x = y%x*3
+	 else
+	   x%x = -y%x*3
+         end if
+  end subroutine nonassign
+end program test
+
+! { dg-final { cleanup-modules "elem_assign" } }
+

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]