This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[Patch, fortran] PR25746 - elemental subroutine dependency checking - redux
- From: Paul Thomas <paulthomas2 at wanadoo dot fr>
- To: patch <gcc-patches at gcc dot gnu dot org>, "'fortran at gcc dot gnu dot org'" <fortran at gcc dot gnu dot org>, Steve Kargl <sgk at troutmask dot apl dot washington dot edu>
- Date: Fri, 21 Apr 2006 20:29:26 +0200
- Subject: [Patch, fortran] PR25746 - elemental subroutine dependency checking - redux
Dear All,
I should explain that Outlook normally seems to emit non-mime characters
that gnu-central does not like. Changing to plain text drops all
references to wrap around. I thought that I had it fixed but apparently
not. Sorry about that.
If this is screwed up, I will resubmit tonight.... well, here it is again.
This patch fixes PR25746 by adding dependency checking to the
translation of elemental subroutine calls. Quite aside from the
immediate need of the PR to ensure that user defined assignments work
correctly, the standard requires that elemental subroutines produce the
same result, independent of the loop order and this mandates the
dependency checking.
The checking is accomplished by calling a new function
gfc_conv_elemental_dependencies before starting the scalarizing body and
calling the elemental subroutine itself. This has the advantage that the
loopinfo and the ss's for each actual argument are available and the
information that they contain is used as much as possible.
The new function proceeds as follows:
It loops over the actual arguments and, if the INTENT is OUT or INOUT,
it checks for dependency with each of the arguments that are not
INTENT(OUT). I have modified dependency.c(gfc_check_fncall_dependency)
to prevent the argument expression from being checked against itself.
If there is a dependency, a temporary loopinfo is made, ready for the
call to gfc_trans_create_temp_array. This is done to prevent this
function from resetting the loop and thereby necessitating the
renormalisation of the ss->data.info's for each of the arguments. Once
the temporary is obtained, the descriptor is obtained afresh, so that
the result can be correctly unpacked after the subroutine call.
In the case of an INTENT(INOUT) argument, the original data has to be
packed and copied to the temporary. I have signalled a TODO here; the
potential creation of two temporaries could be eliminated and this is
something that I intend to contribute after I have fixed the array
TRANSFER intrinsic. This will require a new library routine that can be
employed here. In the mean time, this patch works correctly (I hope!),
even if it is a bit inefficient in a small number of cases.
As well as copying the packed data to the temporary used for the
subroutine call, this part of the code frees the temporary potentially
produced by internal_pack.
Following this, the scalarizer is satisfied by adding the offset for the
temporary to the ss->data.info, using the loop index ranges. Finally,
the result is copied to the destination array and the temporary freed.
As well as testing the fix for the original PR, the testcase checks
various kinds of assignment, including >1D arrays, and tests that
INTENT(INOUT) arguments are correctly handled. It has been corrected to
be conformant (thanks, Dominique).
Regtested on FC5/Athlon. OK for trunk and 4.1?
Paul
2006-04-21 Paul Thomas <pault@gcc.gnu.org>
PR fortran/25746
* dependency.c (gfc_check_fncall_dependency): Don't check other
against itself.
* trans-stmt.c (gfc_conv_elemental_dependencies): New function.
(gfc_trans_call): Call it.
2006-04-21 Paul Thomas <pault@gcc.gnu.org>
PR fortran/25746
* gfortran.dg/elemental_subroutine_3.f90: New test.
Index: gcc/fortran/trans-stmt.c
===================================================================
--- gcc/fortran/trans-stmt.c (révision 113111)
+++ gcc/fortran/trans-stmt.c (copie de travail)
@@ -199,6 +199,155 @@
}
+/* Check for dependencies between INTENT(IN) and INTENT([IN]OUT) arguments of
+ elemental subroutines. Make temporaries for input arguments if any such
+ dependencies are found. */
+static void
+gfc_conv_elemental_dependencies (gfc_se * se, gfc_se * loopse,
+ gfc_symbol * sym, gfc_actual_arglist * arg)
+{
+ gfc_actual_arglist *arg0;
+ gfc_expr *e;
+ gfc_formal_arglist *formal;
+ gfc_loopinfo tmp_loop;
+ gfc_se parmse;
+ gfc_ss *ss;
+ gfc_ss_info *info;
+ gfc_symbol *fsym;
+ int n;
+ stmtblock_t block;
+ tree data;
+ tree offset;
+ tree ptr;
+ tree size;
+ tree stmt;
+ tree tmp;
+
+ if (loopse->ss == NULL)
+ return;
+
+ ss = loopse->ss;
+ arg0 = arg;
+ formal = sym->formal;
+
+ /* Loop over all the arguments testing for dependencies. */
+ for (; arg != NULL; arg = arg->next, formal = formal ? formal->next : NULL)
+ {
+ e = arg->expr;
+ if (e == NULL)
+ continue;
+
+ /* Obtain the info structure for the current argument. */
+ info = NULL;
+ for (ss = loopse->ss; ss && ss != gfc_ss_terminator; ss = ss->next)
+ {
+ if (ss->expr != e)
+ continue;
+ info = &ss->data.info;
+ break;
+ }
+
+ /* If there is a dependency, create a temporary and use it
+ instead of the variable. */
+ fsym = formal ? formal->sym : NULL;
+ if (e->expr_type == EXPR_VARIABLE
+ && e->rank && fsym
+ && fsym->attr.intent != INTENT_IN
+ && gfc_check_fncall_dependency (e, INTENT_OUT, sym, arg0))
+ {
+ /* Make a local loopinfo for the temporary creation, so that
+ none of the other ss->info's have to be renormalized. */
+ gfc_init_loopinfo (&tmp_loop);
+ for (n = 0; n < info->dimen; n++)
+ {
+ tmp_loop.to[n] = loopse->loop->to[n];
+ tmp_loop.from[n] = loopse->loop->from[n];
+ tmp_loop.order[n] = loopse->loop->order[n];
+ }
+
+ /* Generate the temporary. Merge the block so that the
+ declarations are put at the right binding level. */
+ size = gfc_create_var (gfc_array_index_type, NULL);
+ data = gfc_create_var (pvoid_type_node, NULL);
+ gfc_start_block (&block);
+ tmp = gfc_typenode_for_spec (&e->ts);
+ tmp = gfc_trans_create_temp_array (&se->pre, &se->post,
+ &tmp_loop, info, tmp,
+ false, true, false);
+ gfc_add_modify_expr (&se->pre, size, tmp);
+ tmp = fold_convert (pvoid_type_node, info->data);
+ gfc_add_modify_expr (&se->pre, data, tmp);
+ gfc_merge_block_scope (&block);
+
+ /* Obtain the argument descriptor for packing/unpacking. */
+ gfc_init_se (&parmse, NULL);
+ parmse.want_pointer = 1;
+ gfc_conv_expr_descriptor (&parmse, e, gfc_walk_expr (e));
+ gfc_add_block_to_block (&se->pre, &parmse.pre);
+
+ /* Copy the data to the temporary for INTENT(INOUT).
+ TODO: Suppress the temporary memory allocation above,
+ if the internal_pack allocates memory. */
+ if (fsym->attr.intent == INTENT_INOUT)
+ {
+ /* Repack the array. */
+ tmp = gfc_chainon_list (NULL_TREE, parmse.expr);
+ ptr = build_function_call_expr (gfor_fndecl_in_pack, tmp);
+ ptr = gfc_evaluate_now (ptr, &se->pre);
+
+ /* Copy the data to the temporary. */
+ tmp = gfc_chainon_list (NULL_TREE, data);
+ tmp = gfc_chainon_list (tmp, ptr);
+ tmp = gfc_chainon_list (tmp, size);
+ tmp = build_function_call_expr
+ (built_in_decls[BUILT_IN_MEMCPY], tmp);
+ tmp = gfc_evaluate_now (tmp, &se->pre);
+
+ /* If necessary, free the temporary afterwards. */
+ gfc_start_block (&block);
+ tmp = convert (pvoid_type_node, ptr);
+ tmp = gfc_chainon_list (NULL_TREE, tmp);
+ tmp = build_function_call_expr
+ (gfor_fndecl_internal_free, tmp);
+ gfc_add_expr_to_block (&block, tmp);
+ stmt = gfc_finish_block (&block);
+
+ gfc_init_block (&block);
+ tmp = build_fold_indirect_ref (parmse.expr);
+ tmp = gfc_conv_array_data (tmp);
+ tmp = build2 (NE_EXPR, boolean_type_node, ptr, tmp);
+ tmp = build3_v (COND_EXPR, tmp, stmt, build_empty_stmt ());
+
+ gfc_add_expr_to_block (&block, tmp);
+ gfc_add_block_to_block (&se->pre, &block);
+ }
+
+ /* Calculate the offset for the temporary. */
+ offset = gfc_index_zero_node;
+ for (n = 0; n < info->dimen; n++)
+ {
+ tmp = gfc_conv_descriptor_stride (info->descriptor,
+ gfc_rank_cst[n]);
+ tmp = fold_build2 (MULT_EXPR, gfc_array_index_type,
+ loopse->loop->from[n], tmp);
+ offset = fold_build2 (MINUS_EXPR, gfc_array_index_type,
+ offset, tmp);
+ }
+ info->offset = gfc_create_var (gfc_array_index_type, NULL);
+ gfc_add_modify_expr (&se->pre, info->offset, offset);
+
+ /* Copy the result back using unpack. */
+ tmp = gfc_chainon_list (NULL_TREE, parmse.expr);
+ tmp = gfc_chainon_list (tmp, data);
+ tmp = build_function_call_expr (gfor_fndecl_in_unpack, tmp);
+ gfc_add_expr_to_block (&se->post, tmp);
+
+ gfc_add_block_to_block (&se->post, &parmse.post);
+ }
+ }
+}
+
+
/* Translate the CALL statement. Builds a call to an F95 subroutine. */
tree
@@ -269,11 +418,15 @@
gfc_conv_loop_setup (&loop);
gfc_mark_ss_chain_used (ss, 1);
+ /* Convert the arguments, checking for dependencies. */
+ gfc_copy_loopinfo_to_se (&loopse, &loop);
+ loopse.ss = ss;
+ gfc_conv_elemental_dependencies (&se, &loopse, code->resolved_sym,
+ code->ext.actual);
+
/* Generate the loop body. */
gfc_start_scalarized_body (&loop, &body);
gfc_init_block (&block);
- gfc_copy_loopinfo_to_se (&loopse, &loop);
- loopse.ss = ss;
/* Add the subroutine call to the block. */
gfc_conv_function_call (&loopse, code->resolved_sym, code->ext.actual);
@@ -287,6 +440,7 @@
gfc_trans_scalarizing_loops (&loop, &body);
gfc_add_block_to_block (&se.pre, &loop.pre);
gfc_add_block_to_block (&se.pre, &loop.post);
+ gfc_add_block_to_block (&se.pre, &se.post);
gfc_cleanup_loop (&loop);
}
Index: gcc/fortran/dependency.c
===================================================================
--- gcc/fortran/dependency.c (révision 113111)
+++ gcc/fortran/dependency.c (copie de travail)
@@ -513,6 +513,10 @@
if (!expr)
continue;
+ /* Skip other itself. */
+ if (expr == other)
+ continue;
+
/* Skip intent(in) arguments if OTHER itself is intent(in). */
if (formal
&& intent == INTENT_IN
Index: gcc/testsuite/gfortran.dg/elemental_subroutine_3.f90
===================================================================
--- gcc/testsuite/gfortran.dg/elemental_subroutine_3.f90 (révision 0)
+++ gcc/testsuite/gfortran.dg/elemental_subroutine_3.f90 (révision 0)
@@ -0,0 +1,70 @@
+! { dg-do run }
+! Test the fix for PR25746, in which dependency checking was not being
+! done for elemental subroutines and therefore for interface assignments.
+!
+! This test is based on
+! http://home.comcast.net/~kmbtib/Fortran_stuff/elem_assign.f90
+! as reported by Harald Anlauf <anlauf@gmx.de> in the PR.
+!
+module elem_assign
+ implicit none
+ type mytype
+ integer x
+ end type mytype
+ interface assignment(=)
+ module procedure myassign
+ end interface assignment(=)
+ contains
+ elemental subroutine myassign(x,y)
+ type(mytype), intent(out) :: x
+ type(mytype), intent(in) :: y
+! Multiply the components by 2 to verify that this is being called.
+ x%x = y%x*2
+ end subroutine myassign
+end module elem_assign
+
+program test
+ use elem_assign
+ implicit none
+ type(mytype) :: y(6), x(6) = (/mytype(1),mytype(20),mytype(300),&
+ mytype(4000),mytype(50000),&
+ mytype(1000000)/)
+ type(mytype) :: z(2, 3)
+! The original case - dependency between lhs and rhs.
+ x = x((/2,3,1,4,5,6/))
+ if (any(x%x .ne. (/40, 600, 2, 8000, 100000, 2000000/))) call abort ()
+! Slightly more elborate case with non-trivial array ref on lhs.
+ x(4:1:-1) = x((/1,3,2,4/))
+ if (any(x%x .ne. (/16000, 1200, 4, 80, 100000, 2000000/))) call abort ()
+! Check that no-dependence case works....
+ y = x
+ if (any(y%x .ne. (/32000, 2400, 8, 160, 200000, 4000000/))) call abort ()
+! ...and now a case that caused headaches during the preparation of the patch
+ x(2:5) = x(1:4)
+ if (any(x%x .ne. (/16000, 32000, 2400, 8, 160, 2000000/))) call abort ()
+! Check offsets are done correctly in multi-dimensional cases
+ z = reshape (x, (/2,3/))
+ z(:, 3:2:-1) = z(:, 1:2)
+ y = reshape (z, (/6/))
+ if (any(y%x .ne. (/ 64000, 128000, 19200, 64, 128000, 256000/))) call abort ()
+! Check an INTENT(INOUT) case
+ call nonassign (x, x((/2,3,1,4,5,6/)))
+ if (any(x%x .ne. (/-96000, 7200, -48000, -24, -480, 6000000/))) call abort ()
+! ...and again
+ call nonassign (x(2:3), x(1:2))
+ if (any(x%x .ne. (/-96000, 288000, -21600, -24, -480, 6000000/))) call abort ()
+contains
+ elemental subroutine nonassign(x,y)
+ type(mytype), intent(INout) :: x
+ type(mytype), intent(in) :: y
+! Change the sign according to the input, to verify that INOUT is working.
+ if (x%x > 20000) then
+ x%x = y%x*3
+ else
+ x%x = -y%x*3
+ end if
+ end subroutine nonassign
+end program test
+
+! { dg-final { cleanup-modules "elem_assign" } }
+