This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[patch] Update support for Fortran arrays in OpenACC


The attached patch includes various bug fixes and performance
improvements involving the use of Fortran arrays in OpenACC data
clauses. More specifically,

  * Transfers Fortran arrays using GOMP_MAP_FIRSTPRIVATE_POINTERs.
  * Privatizes array descriptors in the Fortran FE.
  * Corrects a couple of bugs involving the offsets of subarray data.

The privatization of array descriptors results in a significant speedup
when programs utilize a lot of arrays. That patch was introduced back in
gomp-4_0-branch, so I lost state on it. However, I believer that I
privatized the array descriptors directly in the Fortran FE instead of
during gimplification because the FE had more knowledge on the array
descriptor types.

For reference, this patch contains the following patches from og8:

cecd29 OpenACC subarray data alignment in fortran
be4fec Privatize internal array variables introduced by the fortran FE
629dfb [OpenACC] firstprivate subarray changes
19cfe1 Fix PR70828s "broken array-type subarrays inside acc data in
       openacc"
00c258 [libgomp, OpenACC] Adjust offsets for present data clauses
924e50 [OpenACC, Fortran] fix an ICE involving assumed-size arrays
a5736d Correct the reported line number in fortran combined OpenACC
       directives

Is this patch OK for trunk? It bootstrapped / regression tested cleanly
for x86_64 with nvptx offloading.

Thanks,
Cesar
2018-06-29  Cesar Philippidis  <cesar@codesourcery.com>

	gcc/fortran/
	* trans-array.c (gfc_trans_array_bounds): Add an INIT_VLA argument
	to control whether VLAs should be initialized.  Don't mark this
	function as static.
	(gfc_trans_auto_array_allocation): Update call to
	gfc_trans_array_bounds.
	(gfc_trans_g77_array): Likewise.
	* trans-array.h: Declare gfc_trans_array_bounds.
	* trans-openmp.c (gfc_omp_finish_clause): Don't cast ptr into a
	character pointer.  Remove "implicit mapping of assumed size array"
	error.
	(gfc_trans_omp_clauses): Likewise.
	(gfc_scan_nodesc_arrays): New.
	(gfc_privatize_nodesc_arrays_1): New.
	(gfc_privatize_nodesc_arrays): New.
	(gfc_init_nodesc_arrays): New.
	(gfc_trans_oacc_construct): Initialize any internal variables for
	arrays without array descriptors inside the offloaded parallel and
	kernels region.
	(gfc_trans_oacc_combined_directive): Likewise.  Set the	location of
	combined acc loops.

	gcc/
	* gimplify.c (struct gimplify_omp_ctx): Add tree clauses member.
	(new_omp_context): Initialize clauses to NULL_TREE.
	(gimplify_scan_omp_clauses): Prune firstprivate clause associated with
	OACC_DATA, OACC_ENTER_DATA and OACC_EXITdata regions.  Set clauses in
	the gimplify_omp_ctx.
	(omp_clause_matching_array_ref): New.
	(gomp_needs_data_present): New.
	(gimplify_adjust_omp_clauses_1): Use preset or pointer omp clause map
	kinds when creating implicit data clauses for OpenACC offloaded
	variables defined used an acc data region as necessary. Link ACC new
	clauses with the old ones.
	* omp-low.c (lower_omp_target): Handle NULL-sized types for
	assumed-sized arrays.

	gcc/testsuite/
	* c-c++-common/goacc/acc-data-chain.c: New test.
	* gfortran.dg/goacc/mod-array.f90: New test.
	* gfortran.dg/gomp/pr78866-2.f90: Update

	libgomp/
	* oacc-parallel.c (GOACC_parallel_keyed): Add offset to devaddrs.
	* testsuite/libgomp.oacc-c-c++-common/data_offset.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c: Update.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c: Update.
	* testsuite/libgomp.oacc-c-c++-common/pr70828.c: New test.
	* testsuite/libgomp.oacc-fortran/assumed-size.f90: New test.
	* testsuite/libgomp.oacc-fortran/data-alignment.f90: New test.
	* testsuite/libgomp.oacc-fortran/data_offset.f90: New test.
	* testsuite/libgomp.oacc-fortran/lib-13.f90: Update.
	* testsuite/libgomp.oacc-fortran/pr70828.f90: New test.


>From 6723544609c3ae0fd9daa01c6585060625fe5454 Mon Sep 17 00:00:00 2001
From: Cesar Philippidis <cesar@codesourcery.com>
Date: Fri, 22 Jun 2018 09:58:43 -0700
Subject: [PATCH] Fortran array support

cecd29 OpenACC subarray data alignment in fortran
be4fec Privatize internal array variables introduced by the fortran FE
629dfb [OpenACC] firstprivate subarray changes
19cfe1 Fix PR70828s "broken array-type subarrays inside acc data in openacc"
00c258 [libgomp, OpenACC] Adjust offsets for present data clauses
924e50 [OpenACC, Fortran] fix an ICE involving assumed-size arrays
a5736d Correct the reported line number in fortran combined OpenACC directives

---
 gcc/fortran/trans-array.c                     |  12 +-
 gcc/fortran/trans-array.h                     |   2 +
 gcc/fortran/trans-openmp.c                    | 197 +++++++++++++++++-
 gcc/gimplify.c                                | 112 +++++++++-
 gcc/omp-low.c                                 |   6 +
 .../c-c++-common/goacc/acc-data-chain.c       |  24 +++
 gcc/testsuite/gfortran.dg/goacc/mod-array.f90 |  23 ++
 gcc/testsuite/gfortran.dg/gomp/pr78866-2.f90  |   3 +-
 libgomp/oacc-parallel.c                       |   3 +-
 .../libgomp.oacc-c-c++-common/data_offset.c   |  41 ++++
 .../kernels-loop-and-seq-3.c                  |   4 +
 .../kernels-loop-and-seq-4.c                  |   4 +
 .../libgomp.oacc-c-c++-common/pr70828.c       |  25 +++
 .../libgomp.oacc-fortran/assumed-size.f90     |  31 +++
 .../libgomp.oacc-fortran/data-alignment.f90   |  35 ++++
 .../libgomp.oacc-fortran/data_offset.f90      |  43 ++++
 .../testsuite/libgomp.oacc-fortran/lib-13.f90 |   1 -
 .../libgomp.oacc-fortran/pr70828.f90          |  24 +++
 18 files changed, 569 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/goacc/acc-data-chain.c
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/mod-array.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/data_offset.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/pr70828.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/assumed-size.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/data-alignment.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/data_offset.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/pr70828.f90

>From 99df59fac7ee5605deaca011e59fc7b3308a00fc Mon Sep 17 00:00:00 2001
From: Cesar Philippidis <cesar@codesourcery.com>
Date: Fri, 22 Jun 2018 09:58:43 -0700
Subject: [PATCH] Fortran array support

data-clauses patch 65
data-clauses patch 67
data-clauses patch 81
data-clauses patch 90
data-clauses patch 91
data-clauses patch 96
data-clauses patch 112
---
 gcc/fortran/trans-array.c                     |  12 +-
 gcc/fortran/trans-array.h                     |   2 +
 gcc/fortran/trans-openmp.c                    | 206 +++++++++++++++---
 gcc/gimplify.c                                | 109 ++++++++-
 gcc/omp-low.c                                 |   6 +
 .../c-c++-common/goacc/acc-data-chain.c       |  24 ++
 gcc/testsuite/gfortran.dg/goacc/mod-array.f90 |  23 ++
 gcc/testsuite/gfortran.dg/gomp/pr78866-2.f90  |   3 +-
 libgomp/oacc-parallel.c                       |   3 +-
 .../libgomp.oacc-c-c++-common/data_offset.c   |  41 ++++
 .../kernels-loop-and-seq-3.c                  |   4 +
 .../kernels-loop-and-seq-4.c                  |   4 +
 .../libgomp.oacc-c-c++-common/pr70828.c       |  25 +++
 .../libgomp.oacc-fortran/assumed-size.f90     |  31 +++
 .../libgomp.oacc-fortran/data-alignment.f90   |  35 +++
 .../libgomp.oacc-fortran/data_offset.f90      |  43 ++++
 .../testsuite/libgomp.oacc-fortran/lib-13.f90 |   1 -
 .../libgomp.oacc-fortran/pr70828.f90          |  24 ++
 18 files changed, 560 insertions(+), 36 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/goacc/acc-data-chain.c
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/mod-array.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/data_offset.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/pr70828.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/assumed-size.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/data-alignment.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/data_offset.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/pr70828.f90

diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index f0f5c1b709e..92dd67f1383 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -6043,9 +6043,9 @@ gfc_trans_array_cobounds (tree type, stmtblock_t * pblock,
 /* Generate code to evaluate non-constant array bounds.  Sets *poffset and
    returns the size (in elements) of the array.  */
 
-static tree
+tree
 gfc_trans_array_bounds (tree type, gfc_symbol * sym, tree * poffset,
-                        stmtblock_t * pblock)
+                        stmtblock_t * pblock, bool init_vla)
 {
   gfc_array_spec *as;
   tree size;
@@ -6122,7 +6122,9 @@ gfc_trans_array_bounds (tree type, gfc_symbol * sym, tree * poffset,
     }
 
   gfc_trans_array_cobounds (type, pblock, sym);
-  gfc_trans_vla_type_sizes (sym, pblock);
+
+  if (init_vla)
+    gfc_trans_vla_type_sizes (sym, pblock);
 
   *poffset = offset;
   return size;
@@ -6186,7 +6188,7 @@ gfc_trans_auto_array_allocation (tree decl, gfc_symbol * sym,
       && !INTEGER_CST_P (sym->ts.u.cl->backend_decl))
     gfc_conv_string_length (sym->ts.u.cl, NULL, &init);
 
-  size = gfc_trans_array_bounds (type, sym, &offset, &init);
+  size = gfc_trans_array_bounds (type, sym, &offset, &init, true);
 
   /* Don't actually allocate space for Cray Pointees.  */
   if (sym->attr.cray_pointee)
@@ -6281,7 +6283,7 @@ gfc_trans_g77_array (gfc_symbol * sym, gfc_wrapped_block * block)
     gfc_conv_string_length (sym->ts.u.cl, NULL, &init);
 
   /* Evaluate the bounds of the array.  */
-  gfc_trans_array_bounds (type, sym, &offset, &init);
+  gfc_trans_array_bounds (type, sym, &offset, &init, true);
 
   /* Set the offset.  */
   if (VAR_P (GFC_TYPE_ARRAY_OFFSET (type)))
diff --git a/gcc/fortran/trans-array.h b/gcc/fortran/trans-array.h
index 5ef86565d8d..394aaab7253 100644
--- a/gcc/fortran/trans-array.h
+++ b/gcc/fortran/trans-array.h
@@ -39,6 +39,8 @@ void gfc_trans_dummy_array_bias (gfc_symbol *, tree, gfc_wrapped_block *);
 /* Generate entry and exit code for g77 calling convention arrays.  */
 void gfc_trans_g77_array (gfc_symbol *, gfc_wrapped_block *);
 
+tree gfc_trans_array_bounds (tree, gfc_symbol *, tree *, stmtblock_t *, bool);
+
 tree gfc_full_array_size (stmtblock_t *, tree, int);
 
 tree gfc_duplicate_allocatable (tree, tree, tree, int, tree);
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index f038f4c5bf8..4638e0f0efd 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -38,6 +38,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "gomp-constants.h"
 #include "omp-general.h"
 #include "omp-low.h"
+#include "hash-set.h"
+#include "tree-iterator.h"
 #undef GCC_DIAG_STYLE
 #define GCC_DIAG_STYLE __gcc_tdiag__
 #include "diagnostic-core.h"
@@ -1044,21 +1046,6 @@ gfc_omp_finish_clause (tree c, gimple_seq *pre_p)
     return;
 
   tree decl = OMP_CLAUSE_DECL (c);
-
-  /* Assumed-size arrays can't be mapped implicitly, they have to be
-     mapped explicitly using array sections.  */
-  if (TREE_CODE (decl) == PARM_DECL
-      && GFC_ARRAY_TYPE_P (TREE_TYPE (decl))
-      && GFC_TYPE_ARRAY_AKIND (TREE_TYPE (decl)) == GFC_ARRAY_UNKNOWN
-      && GFC_TYPE_ARRAY_UBOUND (TREE_TYPE (decl),
-				GFC_TYPE_ARRAY_RANK (TREE_TYPE (decl)) - 1)
-	 == NULL)
-    {
-      error_at (OMP_CLAUSE_LOCATION (c),
-		"implicit mapping of assumed size array %qD", decl);
-      return;
-    }
-
   tree c2 = NULL_TREE, c3 = NULL_TREE, c4 = NULL_TREE;
   if (POINTER_TYPE_P (TREE_TYPE (decl)))
     {
@@ -1094,7 +1081,6 @@ gfc_omp_finish_clause (tree c, gimple_seq *pre_p)
       gfc_start_block (&block);
       tree type = TREE_TYPE (decl);
       tree ptr = gfc_conv_descriptor_data_get (decl);
-      ptr = fold_convert (build_pointer_type (char_type_node), ptr);
       ptr = build_fold_indirect_ref (ptr);
       OMP_CLAUSE_DECL (c) = ptr;
       c2 = build_omp_clause (input_location, OMP_CLAUSE_MAP);
@@ -2141,8 +2127,6 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		    {
 		      tree type = TREE_TYPE (decl);
 		      tree ptr = gfc_conv_descriptor_data_get (decl);
-		      ptr = fold_convert (build_pointer_type (char_type_node),
-					  ptr);
 		      ptr = build_fold_indirect_ref (ptr);
 		      OMP_CLAUSE_DECL (node) = ptr;
 		      node2 = build_omp_clause (input_location,
@@ -2235,8 +2219,6 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 				       OMP_CLAUSE_SIZE (node), elemsz);
 		    }
 		  gfc_add_block_to_block (block, &se.post);
-		  ptr = fold_convert (build_pointer_type (char_type_node),
-				      ptr);
 		  OMP_CLAUSE_DECL (node) = build_fold_indirect_ref (ptr);
 
 		  if (POINTER_TYPE_P (TREE_TYPE (decl))
@@ -3039,22 +3021,157 @@ gfc_trans_omp_code (gfc_code *code, bool force_empty)
   return stmt;
 }
 
+void gfc_debug_expr (gfc_expr *);
+
+/* Add any array that does not have an array descriptor to the hash_set
+   pointed to by DATA.  */
+
+static int
+gfc_scan_nodesc_arrays (gfc_expr **e, int *walk_subtrees ATTRIBUTE_UNUSED,
+		void *data)
+{
+  hash_set<gfc_symbol *> *arrays = (hash_set<gfc_symbol *> *)data;
+
+  if ((*e)->expr_type == EXPR_VARIABLE)
+    {
+      gfc_symbol *sym = (*e)->symtree->n.sym;
+
+      if (sym->attr.dimension && gfc_is_nodesc_array (sym))
+	arrays->add (sym);
+    }
+
+  return 0;
+}
+
+/* Build a set of internal array variables (lbound, ubound, stride, etc.)
+   that need privatization.  */
+
+static tree
+gfc_privatize_nodesc_arrays_1 (tree *tp, int *walk_subtrees, void *data)
+{
+  hash_set<tree> *decls = (hash_set<tree> *)data;
+
+  if (TREE_CODE (*tp) == MODIFY_EXPR)
+    {
+      tree lhs = TREE_OPERAND (*tp, 0);
+      if (DECL_P (lhs))
+	decls->add (lhs);
+    }
+
+  if (IS_TYPE_OR_DECL_P (*tp))
+    *walk_subtrees = false;
+
+  return NULL;
+}
+
+/* Reinitialize all of the arrays inside ARRAY_SET in BLOCK.  Append private
+   clauses for those arrays in CLAUSES.  */
+
+static tree
+gfc_privatize_nodesc_arrays (hash_set<gfc_symbol *> *array_set,
+			     stmtblock_t *block, tree clauses)
+{
+  hash_set<gfc_symbol *>::iterator its = array_set->begin ();
+  hash_set<tree> *private_decls = new hash_set<tree>;
+
+  for (; its != array_set->end (); ++its)
+    {
+      gfc_symbol *sym = *its;
+      tree parm = gfc_get_symbol_decl (sym);
+      tree type = TREE_TYPE (parm);
+      tree offset, tmp;
+
+      /* Evaluate the bounds of the array.  */
+      gfc_trans_array_bounds (type, sym, &offset, block, false);
+
+      /* Set the offset.  */
+      if (TREE_CODE (GFC_TYPE_ARRAY_OFFSET (type)) == VAR_DECL)
+	gfc_add_modify (block, GFC_TYPE_ARRAY_OFFSET (type), offset);
+
+      /* Set the pointer itself if we aren't using the parameter
+	 directly.  */
+      if (TREE_CODE (parm) != PARM_DECL && DECL_LANG_SPECIFIC (parm)
+	  && GFC_DECL_SAVED_DESCRIPTOR (parm))
+	{
+	  tmp = convert (TREE_TYPE (parm),
+			 GFC_DECL_SAVED_DESCRIPTOR (parm));
+	  gfc_add_modify (block, parm, tmp);
+	}
+    }
+
+  /* Add private clauses for any variables that are used by
+     gfc_trans_array_bounds.  */
+  walk_tree_without_duplicates (&block->head, gfc_privatize_nodesc_arrays_1,
+				private_decls);
+
+  hash_set<tree>::iterator itt = private_decls->begin ();
+
+  for (; itt != private_decls->end (); ++itt)
+    {
+      tree nc = build_omp_clause (input_location, OMP_CLAUSE_PRIVATE);
+      OMP_CLAUSE_DECL (nc) = *itt;
+      OMP_CLAUSE_CHAIN (nc) = clauses;
+      clauses = nc;
+    }
+
+  delete private_decls;
+
+  return clauses;
+}
+
+/* Reinitialize any arrays in CLAUSES used inside CODE which do not contain
+   array descriptors if SCAN_NODESC_ARRAYS is TRUE.  Place the initialization
+   sequences in CODE.  Update CLAUSES to contain OMP_CLAUSE_PRIVATE for any
+   arrays which were initialized.  */
+
+static hash_set<gfc_symbol *> *
+gfc_init_nodesc_arrays (stmtblock_t *inner, tree *clauses, gfc_code *code,
+			bool scan_nodesc_arrays)
+{
+  hash_set<gfc_symbol *> *array_set = NULL;
+
+  if (!scan_nodesc_arrays)
+    return NULL;
+
+  array_set = new hash_set<gfc_symbol *>;
+  gfc_code_walker (&code, gfc_dummy_code_callback, gfc_scan_nodesc_arrays,
+		   array_set);
+
+  if (array_set->elements ())
+    {
+      gfc_start_block (inner);
+      pushlevel ();
+      *clauses = gfc_privatize_nodesc_arrays (array_set, inner, *clauses);
+    }
+  else
+    {
+      delete array_set;
+      array_set = NULL;
+    }
+
+  return array_set;
+}
+
 /* Trans OpenACC directives. */
 /* parallel, kernels, data and host_data. */
 static tree
 gfc_trans_oacc_construct (gfc_code *code)
 {
-  stmtblock_t block;
+  stmtblock_t block, inner;
   tree stmt, oacc_clauses;
   enum tree_code construct_code;
+  bool scan_nodesc_arrays = false;
+  hash_set<gfc_symbol *> *array_set = NULL;
 
   switch (code->op)
     {
       case EXEC_OACC_PARALLEL:
 	construct_code = OACC_PARALLEL;
+	scan_nodesc_arrays = true;
 	break;
       case EXEC_OACC_KERNELS:
 	construct_code = OACC_KERNELS;
+	scan_nodesc_arrays = true;
 	break;
       case EXEC_OACC_DATA:
 	construct_code = OACC_DATA;
@@ -3069,10 +3186,25 @@ gfc_trans_oacc_construct (gfc_code *code)
   gfc_start_block (&block);
   oacc_clauses = gfc_trans_omp_clauses (&block, code->ext.omp_clauses,
 					code->loc);
+
+  array_set = gfc_init_nodesc_arrays (&inner, &oacc_clauses, code,
+				      scan_nodesc_arrays);
+
   stmt = gfc_trans_omp_code (code->block->next, true);
+
+  if (array_set && array_set->elements ())
+    {
+      gfc_add_expr_to_block (&inner, stmt);
+      stmt = gfc_finish_block (&inner);
+      stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0));
+      delete array_set;
+    }
+
   stmt = build2_loc (input_location, construct_code, void_type_node, stmt,
 		     oacc_clauses);
+
   gfc_add_expr_to_block (&block, stmt);
+
   return gfc_finish_block (&block);
 }
 
@@ -3865,18 +3997,23 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
 static tree
 gfc_trans_oacc_combined_directive (gfc_code *code)
 {
-  stmtblock_t block, *pblock = NULL;
+  stmtblock_t block, inner, *pblock = NULL;
   gfc_omp_clauses construct_clauses, loop_clauses;
   tree stmt, oacc_clauses = NULL_TREE;
   enum tree_code construct_code;
+  bool scan_nodesc_arrays = false;
+  hash_set<gfc_symbol *> *array_set = NULL;
+  location_t loc = input_location;
 
   switch (code->op)
     {
       case EXEC_OACC_PARALLEL_LOOP:
 	construct_code = OACC_PARALLEL;
+	scan_nodesc_arrays = true;
 	break;
       case EXEC_OACC_KERNELS_LOOP:
 	construct_code = OACC_KERNELS;
+	scan_nodesc_arrays = true;
 	break;
       default:
 	gcc_unreachable ();
@@ -3925,18 +4062,37 @@ gfc_trans_oacc_combined_directive (gfc_code *code)
       oacc_clauses = gfc_trans_omp_clauses (&block, &construct_clauses,
 					    code->loc);
     }
+
+  array_set = gfc_init_nodesc_arrays (&inner, &oacc_clauses, code,
+				      scan_nodesc_arrays);
+
   if (!loop_clauses.seq)
-    pblock = &block;
+    pblock = (array_set && array_set->elements ()) ? &inner : &block;
   else
     pushlevel ();
   stmt = gfc_trans_omp_do (code, EXEC_OACC_LOOP, pblock, &loop_clauses, NULL);
+
+  if (CAN_HAVE_LOCATION_P (stmt))
+    SET_EXPR_LOCATION (stmt, loc);
+
+  if (array_set && array_set->elements ())
+    gfc_add_expr_to_block (&inner, stmt);
+
   if (TREE_CODE (stmt) != BIND_EXPR)
     stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0));
   else
     poplevel (0, 0);
-  stmt = build2_loc (input_location, construct_code, void_type_node, stmt,
-		     oacc_clauses);
+
+  if (array_set && array_set->elements ())
+    {
+      stmt = gfc_finish_block (&inner);
+      stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0));
+      delete array_set;
+    }
+
+  stmt = build2_loc (loc, construct_code, void_type_node, stmt, oacc_clauses);
   gfc_add_expr_to_block (&block, stmt);
+
   return gfc_finish_block (&block);
 }
 
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 48ac92e2b16..5dfc0d6805b 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -191,6 +191,7 @@ struct gimplify_omp_ctx
   bool target_map_scalars_firstprivate;
   bool target_map_pointers_as_0len_arrays;
   bool target_firstprivatize_array_bases;
+  tree clauses;
 };
 
 static struct gimplify_ctx *gimplify_ctxp;
@@ -409,6 +410,7 @@ new_omp_context (enum omp_region_type region_type)
   c->privatized_types = new hash_set<tree>;
   c->location = input_location;
   c->region_type = region_type;
+  c->clauses = NULL_TREE;
   if ((region_type & ORT_TASK) == 0)
     c->default_kind = OMP_CLAUSE_DEFAULT_SHARED;
   else
@@ -7501,6 +7503,7 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
   tree *prev_list_p = NULL;
 
   ctx = new_omp_context (region_type);
+  ctx->clauses = *list_p;
   outer_ctx = ctx->outer_context;
   if (code == OMP_TARGET)
     {
@@ -7806,6 +7809,7 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 	    case OACC_ENTER_DATA:
 	    case OACC_EXIT_DATA:
 	    case OACC_HOST_DATA:
+	    case OACC_UPDATE:
 	      if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_POINTER
 		  || (OMP_CLAUSE_MAP_KIND (c)
 		      == GOMP_MAP_FIRSTPRIVATE_REFERENCE))
@@ -8560,7 +8564,8 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 
       if (code == OACC_DATA
 	  && OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
-	  && OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_POINTER)
+	  && (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_POINTER
+	      || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_REFERENCE))
 	remove = true;
       if (remove)
 	*list_p = OMP_CLAUSE_CHAIN (c);
@@ -8696,6 +8701,58 @@ struct gimplify_adjust_omp_clauses_data
   gimple_seq *pre_p;
 };
 
+/* Return true if clause contains an array_ref of DECL.  */
+
+static bool
+omp_clause_matching_array_ref (tree clause, tree decl)
+{
+  tree cdecl = OMP_CLAUSE_DECL (clause);
+
+  if (TREE_CODE (cdecl) != ARRAY_REF)
+    return false;
+
+  return TREE_OPERAND (cdecl, 0) == decl;
+}
+
+/* Inside OpenACC parallel and kernels regions, the implicit data
+   clauses for arrays must respect the explicit data clauses set by a
+   containing acc data region.  Specifically, care must be taken
+   pointers or if an subarray of a local array is specified in an acc
+   data region, so that the referenced array inside the offloaded
+   region has a present data clasue for that array with an
+   approporiate subarray argument.  This function returns the tree
+   node of the acc data clause that utilizes DECL as an argument.  */
+
+static tree
+gomp_needs_data_present (tree decl)
+{
+  gimplify_omp_ctx *ctx = NULL;
+  bool found_match = false;
+  tree c = NULL_TREE;
+
+  if (TREE_CODE (TREE_TYPE (decl)) != ARRAY_TYPE)
+    return NULL_TREE;
+
+  if (gimplify_omp_ctxp->region_type != ORT_ACC_PARALLEL
+      && gimplify_omp_ctxp->region_type != ORT_ACC_KERNELS)
+    return NULL_TREE;
+
+  for (ctx = gimplify_omp_ctxp->outer_context; !found_match && ctx;
+       ctx = ctx->outer_context)
+    {
+      if (ctx->region_type != ORT_ACC_DATA)
+	break;
+
+      for (c = ctx->clauses; c; c = OMP_CLAUSE_CHAIN (c))
+	if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+	    && (omp_clause_matching_array_ref (c, decl)
+		|| OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_POINTER))
+	  return c;
+    }
+
+  return NULL_TREE;
+}
+
 /* For all variables that were not actually used within the context,
    remove PRIVATE, SHARED, and FIRSTPRIVATE clauses.  */
 
@@ -8849,7 +8906,51 @@ gimplify_adjust_omp_clauses_1 (splay_tree_node n, void *data)
 	  gcc_unreachable ();
 	}
       OMP_CLAUSE_SET_MAP_KIND (clause, kind);
-      if (DECL_SIZE (decl)
+      tree c2 = gomp_needs_data_present (decl);
+      /* Handle OpenACC pointers that were declared inside acc data
+	 regions.  */
+      if (c2 != NULL && OMP_CLAUSE_MAP_KIND (c2) == GOMP_MAP_POINTER)
+	{
+	  OMP_CLAUSE_SET_MAP_KIND (clause, GOMP_MAP_POINTER);
+	  OMP_CLAUSE_SIZE (clause) = unshare_expr (OMP_CLAUSE_SIZE (c2));
+	}
+      /* Handle OpenACC subarrays that were declared inside acc data
+	 regions.  */
+      else if (c2 != NULL)
+	{
+	  tree first = OMP_CLAUSE_DECL (c2);
+
+	  /* Adjust the existing clause to make it a present data
+	     clause with the proper subarray attributes.  */
+	  OMP_CLAUSE_DECL (clause) = unshare_expr (first);
+	  OMP_CLAUSE_SET_MAP_KIND (clause, GOMP_MAP_FORCE_PRESENT);
+	  OMP_CLAUSE_SIZE (clause) = unshare_expr (OMP_CLAUSE_SIZE (c2));
+
+	  /* Create a new data clause for the firstprivate pointer.  */
+	  tree nc = build_omp_clause (OMP_CLAUSE_LOCATION (clause),
+				      OMP_CLAUSE_MAP);
+	  OMP_CLAUSE_DECL (nc) = decl;
+	  OMP_CLAUSE_SET_MAP_KIND (nc, GOMP_MAP_FIRSTPRIVATE_POINTER);
+
+	  tree t = build_fold_addr_expr (first);
+	  t = fold_convert_loc (OMP_CLAUSE_LOCATION (clause),
+				ptrdiff_type_node, t);
+	  tree ptr = build_fold_addr_expr (decl);
+	  t = fold_build2_loc (OMP_CLAUSE_LOCATION (clause), MINUS_EXPR,
+			       ptrdiff_type_node, t,
+			       fold_convert_loc (OMP_CLAUSE_LOCATION (clause),
+						 ptrdiff_type_node, ptr));
+	  OMP_CLAUSE_SIZE (nc) = t;
+
+	  struct gimplify_omp_ctx *ctx = gimplify_omp_ctxp;
+	  gimplify_omp_ctxp = ctx->outer_context;
+	  gimplify_expr (&OMP_CLAUSE_SIZE (nc),
+			 pre_p, NULL, is_gimple_val, fb_rvalue);
+	  gimplify_omp_ctxp = ctx;
+	  OMP_CLAUSE_CHAIN (nc) = OMP_CLAUSE_CHAIN (clause);
+	  OMP_CLAUSE_CHAIN (clause) = nc;
+	}
+      else if (DECL_SIZE (decl)
 	  && TREE_CODE (DECL_SIZE (decl)) != INTEGER_CST)
 	{
 	  tree decl2 = DECL_VALUE_EXPR (decl);
@@ -8876,7 +8977,9 @@ gimplify_adjust_omp_clauses_1 (splay_tree_node n, void *data)
 	  OMP_CLAUSE_CHAIN (nc) = OMP_CLAUSE_CHAIN (clause);
 	  OMP_CLAUSE_CHAIN (clause) = nc;
 	}
-      else if (gimplify_omp_ctxp->target_firstprivatize_array_bases
+      else if ((((gimplify_omp_ctxp->region_type & ORT_ACC)
+		 && lang_GNU_CXX ())
+		|| gimplify_omp_ctxp->target_firstprivatize_array_bases)
 	       && lang_hooks.decls.omp_privatize_by_reference (decl))
 	{
 	  OMP_CLAUSE_DECL (clause) = build_simple_mem_ref (decl);
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index c591231d8f1..30ce16bc54b 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -7924,6 +7924,12 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 	      s = OMP_CLAUSE_SIZE (c);
 	    if (s == NULL_TREE)
 	      s = TYPE_SIZE_UNIT (TREE_TYPE (ovar));
+	    /* Fortran assumed-size arrays have zero size because the
+	       type is incomplete.  Set the size to one to allow the
+	       runtime to remap any existing data that is already
+	       present on the accelerator.  */
+	    if (s == NULL_TREE)
+	      s = integer_one_node;
 	    s = fold_convert (size_type_node, s);
 	    purpose = size_int (map_idx++);
 	    CONSTRUCTOR_APPEND_ELT (vsize, purpose, s);
diff --git a/gcc/testsuite/c-c++-common/goacc/acc-data-chain.c b/gcc/testsuite/c-c++-common/goacc/acc-data-chain.c
new file mode 100644
index 00000000000..30482214990
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/acc-data-chain.c
@@ -0,0 +1,24 @@
+/* Ensure that the gimplifier does not remove any existing clauses as
+   it inserts new implicit data clauses.  */
+
+/* { dg-additional-options "-fdump-tree-gimple" }  */
+
+#define N 100
+static int a[N], b[N];
+
+int main(int argc, char *argv[])
+{
+  int i;
+
+#pragma acc data copyin(a[0:N]) copyout (b[0:N])
+  {
+#pragma acc parallel loop
+    for (i = 0; i < N; i++)
+      b[i] = a[i];
+  }
+
+ return 0;
+}
+
+// { dg-final { scan-tree-dump-times "omp target oacc_data map.from:b.0. .len: 400.. map.to:a.0. .len: 400.." 1 "gimple" } }
+// { dg-final { scan-tree-dump-times "omp target oacc_parallel map.force_present:b.0. .len: 400.. map.firstprivate:b .pointer assign, bias: 0.. map.force_present:a.0. .len: 400.. map.firstprivate:a .pointer assign, bias: 0.." 1 "gimple" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/mod-array.f90 b/gcc/testsuite/gfortran.dg/goacc/mod-array.f90
new file mode 100644
index 00000000000..a0b17dbfec1
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/mod-array.f90
@@ -0,0 +1,23 @@
+module array_mod
+  real :: array(5) = (/ 1.0, 2.0, 3.0, 4.0, 5.0 /)
+end module array_mod
+
+module test_mod
+  use array_mod
+
+contains
+
+  subroutine test_acc
+
+    implicit none
+
+    integer :: i
+    real :: sum = 0.0
+
+    !$acc parallel loop gang copy (sum) reduction (+:sum)
+    do i = 1, 5
+       sum = sum + array(i)
+    end do
+    !$acc end parallel loop
+  end subroutine test_acc
+end module test_mod
diff --git a/gcc/testsuite/gfortran.dg/gomp/pr78866-2.f90 b/gcc/testsuite/gfortran.dg/gomp/pr78866-2.f90
index 033479e5801..6e1d3ac811f 100644
--- a/gcc/testsuite/gfortran.dg/gomp/pr78866-2.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/pr78866-2.f90
@@ -3,7 +3,8 @@
 
 subroutine pr78866(x)
   integer :: x(*)
-!$omp target		! { dg-error "implicit mapping of assumed size array" }
+! Regarding the XFAIL, see gcc/fortran/trans-openmp.c:gfc_omp_finish_clause.
+!$omp target		! { dg-error "implicit mapping of assumed size array" "" { xfail *-*-* } }
   x(1) = 1
 !$omp end target
 end
diff --git a/libgomp/oacc-parallel.c b/libgomp/oacc-parallel.c
index b80ace58590..1e08af70b4d 100644
--- a/libgomp/oacc-parallel.c
+++ b/libgomp/oacc-parallel.c
@@ -232,7 +232,8 @@ GOACC_parallel_keyed (int device, void (*fn) (void *),
   devaddrs = gomp_alloca (sizeof (void *) * mapnum);
   for (i = 0; i < mapnum; i++)
     devaddrs[i] = (void *) (tgt->list[i].key->tgt->tgt_start
-			    + tgt->list[i].key->tgt_offset);
+			    + tgt->list[i].key->tgt_offset
+			    + tgt->list[i].offset);
 
   acc_dev->openacc.exec_func (tgt_fn, mapnum, hostaddrs, devaddrs,
 			      async, dims, tgt);
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/data_offset.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/data_offset.c
new file mode 100644
index 00000000000..ccbbfcab87b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/data_offset.c
@@ -0,0 +1,41 @@
+/* Test present data clauses in acc offloaded regions when the
+   subarray inside the present clause does not have the same base
+   offset value as the subarray in the enclosing acc data or acc enter
+   data variable.  */
+
+#include <assert.h>
+
+void
+offset (int *data, int n)
+{
+  int i;
+
+#pragma acc parallel loop present (data[0:n])
+  for (i = 0; i < n; i++)
+    data[i] = n;
+}
+
+int
+main ()
+{
+  const int n = 30;
+  int data[n], i;
+
+  for (i = 0; i < n; i++)
+    data[i] = -1;
+
+#pragma acc data copy(data[0:n])
+  {
+    offset (data+10, 10);
+  }
+
+  for (i = 0; i < n; i++)
+    {
+      if (i < 10 || i >= 20)
+	assert (data[i] == -1);
+      else
+	assert (data[i] == 10);
+    }
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c
index e62297129fd..d0ea230a805 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c
@@ -1,3 +1,7 @@
+/* FIXME: OpenACC kernels stopped working with the firstprivate subarray
+   changes.  */
+/* { dg-prune-output "OpenACC kernels construct will be executed sequentially" } */
+
 #include <stdlib.h>
 
 #define N 32
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c
index c73127897a9..4017560d0a1 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c
@@ -1,3 +1,7 @@
+/* FIXME: OpenACC kernels stopped working with the firstprivate subarray
+   changes.  */
+/* { dg-prune-output "OpenACC kernels construct will be executed sequentially" } */
+
 #include <stdlib.h>
 
 #define N 32
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr70828.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr70828.c
new file mode 100644
index 00000000000..c7dce2f42eb
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr70828.c
@@ -0,0 +1,25 @@
+#include <assert.h>
+
+int
+main ()
+{
+  int a[100], i;
+
+  for (i = 0; i < 100; i++)
+    a[i] = 0;
+
+#pragma acc data copy(a[10:80])
+  {
+    #pragma acc parallel loop
+    for (i = 10; i < 90; i++)
+      a[i] = i;
+  }
+
+  for (i = 0; i < 100; i++)
+    if (i >= 10 && i < 90)
+      assert (a[i] == i);
+    else
+      assert (a[i] == 0);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/assumed-size.f90 b/libgomp/testsuite/libgomp.oacc-fortran/assumed-size.f90
new file mode 100644
index 00000000000..79de675d039
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/assumed-size.f90
@@ -0,0 +1,31 @@
+! Test if implicitly determined data clauses work with an
+! assumed-sized array variable.  Note that the array variable, 'a',
+! has been explicitly copied in and out via acc enter data and acc
+! exit data, respectively.
+
+program test
+  implicit none
+
+  integer, parameter :: n = 100
+  integer a(n), i
+
+  call dtest (a, n)
+
+  do i = 1, n
+     if (a(i) /= i) call abort
+  end do
+end program test
+
+subroutine dtest (a, n)
+  integer i, n
+  integer a(*)
+
+  !$acc enter data copyin(a(1:n))
+
+  !$acc parallel loop
+  do i = 1, n
+     a(i) = i
+  end do
+
+  !$acc exit data copyout(a(1:n))
+end subroutine dtest
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/data-alignment.f90 b/libgomp/testsuite/libgomp.oacc-fortran/data-alignment.f90
new file mode 100644
index 00000000000..38c90050b2d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/data-alignment.f90
@@ -0,0 +1,35 @@
+! Test if the array data associated with c is properly aligned
+! on the accelerator.  If it is not, this program will crash.
+
+! { dg-do run }
+
+integer function routine_align()
+  implicit none
+  integer, parameter :: n = 10000
+  real*8, dimension(:), allocatable :: c
+  integer :: i, idx
+
+  allocate (c(n))
+  routine_align = 0
+  c = 0.0
+
+  !$acc data copyin(idx) copy(c(1:n))
+
+  !$acc parallel vector_length(32)
+  !$acc loop vector
+  do i=1, n
+     c(i) = i
+  enddo
+  !$acc end parallel
+
+  !$acc end data
+end function routine_align
+
+
+! main driver
+program routine_align_main
+  implicit none
+  integer :: success
+  integer routine_align
+  success = routine_align()
+end program routine_align_main
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/data_offset.f90 b/libgomp/testsuite/libgomp.oacc-fortran/data_offset.f90
new file mode 100644
index 00000000000..ff8ee39f964
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/data_offset.f90
@@ -0,0 +1,43 @@
+! Test present data clauses in acc offloaded regions when the subarray
+! inside the present clause does not have the same base offset value
+! as the subarray in the enclosing acc data or acc enter data variable.
+
+program test
+  implicit none
+
+  integer, parameter :: n = 30, m = 10
+  integer :: i
+  integer, allocatable :: data(:)
+  logical bounded
+
+  allocate (data(n))
+
+  data(:) = -1
+
+  !$acc data copy (data(5:20))
+  call test_data (data, n, m)
+  !$acc end data
+
+  do i = 1, n
+     bounded = i < m .or. i >= m+m
+     if (bounded .and. (data(i) /= -1)) then
+        call abort
+     else if (.not. bounded .and. data(i) /= 10) then
+        call abort
+     end if
+  end do
+
+  deallocate (data)
+end program test
+
+subroutine test_data (data, n, m)
+  implicit none
+
+  integer :: n, m, data(n), i
+
+  !$acc parallel loop present (data(m:m))
+  do i = m, m+m-1
+     data(i) = m
+  end do
+  !$acc end parallel loop
+end subroutine test_data
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/lib-13.f90 b/libgomp/testsuite/libgomp.oacc-fortran/lib-13.f90
index 6d713b1cd95..6eef7e907a3 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/lib-13.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/lib-13.f90
@@ -1,5 +1,4 @@
 ! { dg-do run }
-! { dg-xfail-run-if "TODO" { openacc_nvidia_accel_selected } { "-O0" "-O1" } { "" } }
 
 program main
   use openacc
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pr70828.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pr70828.f90
new file mode 100644
index 00000000000..d1eba162c6a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/pr70828.f90
@@ -0,0 +1,24 @@
+! Runtime data mapping error.
+
+program test
+  integer, parameter :: n = 100
+  integer i, data(n)
+
+  data(:) = 0
+
+  !$acc data copy(data(5:n-10))
+  !$acc parallel loop
+  do i = 10, n - 10
+     data(i) = i
+  end do
+  !$acc end parallel loop
+  !$acc end data
+
+  do i = 1, n
+     if ((i < 10 .or. i > n-10)) then
+        if ((data(i) .ne. 0)) call abort
+     else if (data(i) .ne. i) then
+        call abort
+     end if
+  end do
+end program test
-- 
2.17.1


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]