[gomp3] Collapsed loops support

Jakub Jelinek jakub@redhat.com
Fri Feb 1 14:02:00 GMT 2008


Hi!

The patch below adds support for collapsed loops.  The only FE converted
so far is Fortran, where the parsing changes are easiest; C/C++ continue
to emit collapsed loops as non-collapsed, just with OMP_CLAUSE_COLLAPSE
attached.  For the time being only expand_omp_for_generic handles collapsed
loops, so collapsed loops go through the library loop *_start, *_next and
*_end routines even for static schedules.
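
As a minimal illustration (just a sketch, not one of the new testcases),
the Fortran FE now parses a nest like the following as a single collapsed
work-sharing loop, and with this patch even its static schedule is
implemented through the library entry points:

  subroutine scale2 (a, b, n, m)
    integer :: n, m, i, j
    real :: a(n,m), b(n,m)
    !$omp parallel do collapse(2) schedule(static)
    do i = 1, n
      do j = 1, m
        a(i,j) = 2.0 * b(i,j)
      end do
    end do
    !$omp end parallel do
  end subroutine scale2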

OMP_FOR_{INIT,COND,INCR} arguments of OMP_FOR are changed into vectors
unconditionally (even for collapse(1) loops or loops without any collapse
clause - they are then one-element vectors).
The comment above expand_omp_for_generic has pseudo-code showing how
collapsed loops are expanded.
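
To illustrate what that pseudo-code computes, here is a stand-alone sketch
(illustration only, not generated code) of the flat-index mapping for a
collapse(2) nest do v1 = 1, 4 / do v2 = 5, 1, -2; the trip counts use the
(adj + N2 - N1) / STEP formula after <=/>= have been normalized to </> the
way extract_omp_for_data does:

  program collapse_mapping
    integer, parameter :: n11 = 1, n12 = 4, step1 = 1    ! outer: do v1 = 1, 4
    integer, parameter :: n21 = 5, n22 = 1, step2 = -2   ! inner: do v2 = 5, 1, -2
    integer :: count1, count2, t, v1, v2
    ! normalized bounds: v1 < n12 + 1, v2 > n22 - 1
    count1 = (step1 - 1 + (n12 + 1) - n11) / step1   ! 4 iterations
    count2 = (step2 + 1 + (n22 - 1) - n21) / step2   ! 3 iterations
    do t = 0, count1 * count2 - 1
      ! recover the original iterators from the single logical index
      v2 = n21 + mod (t, count2) * step2
      v1 = n11 + (t / count2) * step1
      print *, v1, v2
    end do
  end program collapse_mapping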

While testing this I've noticed a problem with gfortran - in f951
boolean_type_node is logical(kind=4), which differs from the C bool type
on many targets.  Several omp-builtins.def routines return the C bool
type, so there was a mismatch between what the routines actually return
and what f951 expects them to return (e.g. on x86_64/i?86 only the low
8 bits of a C bool are significant, while all 32 bits of a
logical(kind=4) are).  Thus this patch changes BT_BOOL for the builtins
to an integer type with BOOL_TYPE_SIZE.

Also, there is another preexisting bug with lastprivate iteration variables.
For Fortran complex do loops (non-1/-1 step), or if the gimplifier decides to
use a temporary (e.g. for parallel for lastprivate iterators), or newly for
collapsed loop iterators and soon for C++ random access iterators, the value
copied by lastprivate is what the iterator contained during the last
execution of the loop body, but the iterator should have the step added to
it once more.  ATM I'm leaning towards adding an (optional) statement list
to OMP_CLAUSE_LASTPRIVATE, which would contain code that needs to be
performed on the iterator before copying it.  For lastprivate vars other
than omp do/omp for iterators this would always be NULL, as nothing is
needed; otherwise for integral iterators it would be iter += step and for
C++ random access iterators iter operator+ (step).  Comments?
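
For instance, in the following sketch (assuming the temporary-based
expansion that a non-1/-1 step forces), the last execution of the body
sees i == 10, but sequential semantics require the original i to end up
as 13, i.e. the last value plus the step:

  subroutine last_iter (i, a)
    integer :: i, a(10)
    !$omp parallel do lastprivate(i)
    do i = 1, 10, 3        ! body executes for i = 1, 4, 7, 10
      a(i) = i
    end do
    !$omp end parallel do
    ! with the extra "iter += step", i == 13 here, as after a sequential loop
  end subroutine last_iter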

This has been regtested on x86_64-linux; I'll commit it tonight to
gomp-3_0-branch unless one of you objects.

2008-02-01  Jakub Jelinek  <jakub@redhat.com>

	* tree.h (OMP_CLAUSE_COLLAPSE_ITERVAR,
	OMP_CLAUSE_COLLAPSE_COUNT): Define.
	* tree.c (omp_clause_num_ops): Change OMP_CLAUSE_COLLAPSE
	to 3 ops from 1 op.
	(walk_tree_1): Handle the 2 extra ops in OMP_CLAUSE_COLLAPSE.
	* tree-pretty-print.c (dump_generic_node): Handle collapsed
	OMP_FOR loops.
	* tree-parloops.c (create_parallel_loop): Create 1 entry
	vectors for OMP_FOR_{INIT,COND,INCR}.
	* gimplify.c (gimplify_omp_for): Handle collapsed OMP_FOR
	loops, adjust for OMP_FOR_{INIT,COND,INCR} changes.
	* tree-ssa-operands.c (get_expr_operands): Likewise.
	* c-omp.c (c_finish_omp_for): Create 1 entry vectors for
	OMP_FOR_{INIT,COND,INCR}.
	* tree-nested.c (walk_omp_for): Adjust for OMP_FOR_{INIT,COND,INCR}
	changes.
	(convert_nonlocal_omp_clauses, convert_local_omp_clauses): Handle
	OMP_CLAUSE_COLLAPSE and OMP_CLAUSE_UNTIED.
	* c-parser.c (c_parser_omp_clause_collapse): Clear
	OMP_CLAUSE_COLLAPSE_ITERVAR and OMP_CLAUSE_COLLAPSE_COUNT.
	* omp-low.c (struct omp_for_data_loop): New type.
	(struct omp_for_data): Remove v, n1, n2, step, cond_code fields.
	Add loop, loops, collapse fields.
	(extract_omp_for_data): Add loops argument.  Extract data for
	collapsed OMP_FOR loops.
	(workshare_safe_to_combine_p): Don't combine collapse > 1 loops
	unless all bounds and steps are constant.  Adjust extract_omp_for_data
	caller.
	(get_ws_args_for): Adjust extract_omp_for_data caller.
	(scan_omp_for): Handle collapsed OMP_FOR
	loops, adjust for OMP_FOR_{INIT,COND,INCR} changes.
	(expand_omp_for_generic): Handle collapsed OMP_FOR loops.  Adjust
	for struct omp_for_data changes.  If libgomp function doesn't return
	boolean_type_node, add comparison of the return value with 0.
	(expand_omp_for_static_nochunk, expand_omp_for_static_chunk): Adjust
	for struct omp_for_data changes.
	(expand_omp_for): Allocate loops array, pass it to
	extract_omp_for_data.  For collapse > 1 loops always use
	expand_omp_for_generic.
	(lower_omp_single_simple): If libgomp function doesn't return
	boolean_type_node, add comparison of the return value with 0.
	(lower_omp_for_lastprivate): Adjust for struct omp_for_data changes.
	(lower_omp_for): Handle collapsed OMP_FOR loops, adjust for
	OMP_FOR_{INIT,COND,INCR} changes, adjust extract_omp_for_data
	caller.
	(diagnose_sb_1, diagnose_sb_2): Handle collapsed OMP_FOR
	loops, adjust for OMP_FOR_{INIT,COND,INCR} changes.
gcc/fortran/
	* openmp.c (omp_current_do_code): Made static.
	(omp_current_do_collapse): New variable.
	(gfc_resolve_omp_do_blocks): Compute omp_current_do_collapse,
	clear omp_current_do_code and omp_current_do_collapse on return.
	(gfc_resolve_do_iterator): Handle collapsed do loops.
	(resolve_omp_do): Likewise, diagnose erroneous collapsed do loops.
	* trans-openmp.c (gfc_trans_omp_clauses): Clear
	OMP_CLAUSE_COLLAPSE_ITERVAR and OMP_CLAUSE_COLLAPSE_COUNT.
	(gfc_trans_omp_do): Handle collapsed do loops.
	* gfortran.h (gfc_find_sym_in_expr): New prototype.
	* resolve.c (find_sym_in_expr): Rename to...
	(gfc_find_sym_in_expr): ... this.  No longer static.
	(resolve_allocate_expr, resolve_ordinary_assign): Adjust caller.
	* types.def (BT_BOOL): Use integer type with BOOL_TYPE_SIZE rather
	than boolean_type_node.
gcc/cp/
	* pt.c (tsubst_expr): Adjust for OMP_FOR_{INIT,COND,INCR} changes.
	* semantics.c (finish_omp_for): Likewise.
	* parser.c (cp_parser_omp_clause_collapse): Clear
	OMP_CLAUSE_COLLAPSE_ITERVAR and OMP_CLAUSE_COLLAPSE_COUNT.
gcc/testsuite/
	* gfortran.dg/gomp/collapse1.f90: New test.
libgomp/
	* testsuite/libgomp.fortran/collapse2.f90: New test.
	* testsuite/libgomp.fortran/collapse3.f90: New test.

--- gcc/tree.h	(revision 131902)
+++ gcc/tree.h	(working copy)
@@ -1824,8 +1824,13 @@ struct tree_constructor GTY(())
   OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_NUM_THREADS),0)
 #define OMP_CLAUSE_SCHEDULE_CHUNK_EXPR(NODE) \
   OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_SCHEDULE), 0)
+
 #define OMP_CLAUSE_COLLAPSE_EXPR(NODE) \
-  OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_COLLAPSE),0)
+  OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_COLLAPSE), 0)
+#define OMP_CLAUSE_COLLAPSE_ITERVAR(NODE) \
+  OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_COLLAPSE), 1)
+#define OMP_CLAUSE_COLLAPSE_COUNT(NODE) \
+  OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_COLLAPSE), 2)
 
 #define OMP_CLAUSE_REDUCTION_CODE(NODE)	\
   (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_REDUCTION)->omp_clause.subcode.reduction_code)
--- gcc/tree.c	(revision 131902)
+++ gcc/tree.c	(working copy)
@@ -187,7 +187,7 @@ unsigned const char omp_clause_num_ops[]
   0, /* OMP_CLAUSE_NOWAIT  */
   0, /* OMP_CLAUSE_ORDERED  */
   0, /* OMP_CLAUSE_DEFAULT  */
-  1, /* OMP_CLAUSE_COLLAPSE  */
+  3, /* OMP_CLAUSE_COLLAPSE  */
   0  /* OMP_CLAUSE_UNTIED   */
 };
 
@@ -8519,10 +8519,17 @@ walk_tree_1 (tree *tp, walk_tree_fn func
 	case OMP_CLAUSE_NOWAIT:
 	case OMP_CLAUSE_ORDERED:
 	case OMP_CLAUSE_DEFAULT:
-	case OMP_CLAUSE_COLLAPSE:
 	case OMP_CLAUSE_UNTIED:
 	  WALK_SUBTREE_TAIL (OMP_CLAUSE_CHAIN (*tp));
 
+	case OMP_CLAUSE_COLLAPSE:
+	  {
+	    int i;
+	    for (i = 0; i < 3; i++)
+	      WALK_SUBTREE (OMP_CLAUSE_OPERAND (*tp, i));
+	    WALK_SUBTREE_TAIL (OMP_CLAUSE_CHAIN (*tp));
+	  }
+
 	case OMP_CLAUSE_REDUCTION:
 	  {
 	    int i;
--- gcc/tree-pretty-print.c	(revision 131902)
+++ gcc/tree-pretty-print.c	(working copy)
@@ -1891,6 +1891,8 @@ dump_generic_node (pretty_printer *buffe
 
       if (!(flags & TDF_SLIM))
 	{
+	  int i;
+
 	  if (OMP_FOR_PRE_BODY (node))
 	    {
 	      newline_and_indent (buffer, spc + 2);
@@ -1900,14 +1902,22 @@ dump_generic_node (pretty_printer *buffe
 	      dump_generic_node (buffer, OMP_FOR_PRE_BODY (node),
 		  spc, flags, false);
 	    }
-	  newline_and_indent (buffer, spc);
-	  pp_string (buffer, "for (");
-	  dump_generic_node (buffer, OMP_FOR_INIT (node), spc, flags, false);
-	  pp_string (buffer, "; ");
-	  dump_generic_node (buffer, OMP_FOR_COND (node), spc, flags, false);
-	  pp_string (buffer, "; ");
-	  dump_generic_node (buffer, OMP_FOR_INCR (node), spc, flags, false);
-	  pp_string (buffer, ")");
+	  spc -= 2;
+	  for (i = 0; i < TREE_VEC_LENGTH (OMP_FOR_INIT (node)); i++)
+	    {
+	      spc += 2;
+	      newline_and_indent (buffer, spc);
+	      pp_string (buffer, "for (");
+	      dump_generic_node (buffer, TREE_VEC_ELT (OMP_FOR_INIT (node), i),
+				 spc, flags, false);
+	      pp_string (buffer, "; ");
+	      dump_generic_node (buffer, TREE_VEC_ELT (OMP_FOR_COND (node), i),
+				 spc, flags, false);
+	      pp_string (buffer, "; ");
+	      dump_generic_node (buffer, TREE_VEC_ELT (OMP_FOR_INCR (node), i),
+				 spc, flags, false);
+	      pp_string (buffer, ")");
+	    }
 	  if (OMP_FOR_BODY (node))
 	    {
 	      newline_and_indent (buffer, spc + 2);
@@ -1918,6 +1928,7 @@ dump_generic_node (pretty_printer *buffe
 	      newline_and_indent (buffer, spc + 2);
 	      pp_character (buffer, '}');
 	    }
+	  spc -= 2 * TREE_VEC_LENGTH (OMP_FOR_INIT (node)) - 2;
 	  if (OMP_FOR_PRE_BODY (node))
 	    {
 	      spc -= 4;
--- gcc/omp-low.c	(revision 131902)
+++ gcc/omp-low.c	(working copy)
@@ -95,15 +95,23 @@ typedef struct omp_context
 } omp_context;
 
 
+struct omp_for_data_loop
+{
+  tree v, n1, n2, step;
+  enum tree_code cond_code;
+};
+
 /* A structure describing the main elements of a parallel loop.  */
 
 struct omp_for_data
 {
-  tree v, n1, n2, step, chunk_size, for_stmt;
-  enum tree_code cond_code;
+  struct omp_for_data_loop loop;
+  tree chunk_size, for_stmt;
   tree pre;
+  int collapse;
   bool have_nowait, have_ordered;
   enum omp_clause_schedule_kind sched_kind;
+  struct omp_for_data_loop *loops;
 };
 
 
@@ -160,65 +168,28 @@ is_combined_parallel (struct omp_region 
    them into *FD.  */
 
 static void
-extract_omp_for_data (tree for_stmt, struct omp_for_data *fd)
+extract_omp_for_data (tree for_stmt, struct omp_for_data *fd,
+		      struct omp_for_data_loop *loops)
 {
-  tree t, var;
+  tree t, var, *collapse_iter, *collapse_count;
+  tree count = NULL_TREE, iter_type = NULL_TREE;
+  struct omp_for_data_loop *loop;
+  int i;
+  struct omp_for_data_loop dummy_loop;
 
   fd->for_stmt = for_stmt;
   fd->pre = NULL;
-
-  t = OMP_FOR_INIT (for_stmt);
-  gcc_assert (TREE_CODE (t) == GIMPLE_MODIFY_STMT);
-  fd->v = GIMPLE_STMT_OPERAND (t, 0);
-  gcc_assert (SSA_VAR_P (fd->v));
-  gcc_assert (TREE_CODE (TREE_TYPE (fd->v)) == INTEGER_TYPE);
-  var = TREE_CODE (fd->v) == SSA_NAME ? SSA_NAME_VAR (fd->v) : fd->v;
-  fd->n1 = GIMPLE_STMT_OPERAND (t, 1);
-
-  t = OMP_FOR_COND (for_stmt);
-  fd->cond_code = TREE_CODE (t);
-  gcc_assert (TREE_OPERAND (t, 0) == var);
-  fd->n2 = TREE_OPERAND (t, 1);
-  switch (fd->cond_code)
-    {
-    case LT_EXPR:
-    case GT_EXPR:
-      break;
-    case LE_EXPR:
-      fd->n2 = fold_build2 (PLUS_EXPR, TREE_TYPE (fd->n2), fd->n2,
-			   build_int_cst (TREE_TYPE (fd->n2), 1));
-      fd->cond_code = LT_EXPR;
-      break;
-    case GE_EXPR:
-      fd->n2 = fold_build2 (MINUS_EXPR, TREE_TYPE (fd->n2), fd->n2,
-			   build_int_cst (TREE_TYPE (fd->n2), 1));
-      fd->cond_code = GT_EXPR;
-      break;
-    default:
-      gcc_unreachable ();
-    }
-
-  t = OMP_FOR_INCR (fd->for_stmt);
-  gcc_assert (TREE_CODE (t) == GIMPLE_MODIFY_STMT);
-  gcc_assert (GIMPLE_STMT_OPERAND (t, 0) == var);
-  t = GIMPLE_STMT_OPERAND (t, 1);
-  gcc_assert (TREE_OPERAND (t, 0) == var);
-  switch (TREE_CODE (t))
-    {
-    case PLUS_EXPR:
-      fd->step = TREE_OPERAND (t, 1);
-      break;
-    case MINUS_EXPR:
-      fd->step = TREE_OPERAND (t, 1);
-      fd->step = fold_build1 (NEGATE_EXPR, TREE_TYPE (fd->step), fd->step);
-      break;
-    default:
-      gcc_unreachable ();
-    }
+  fd->collapse = TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt));
+  if (fd->collapse > 1)
+    fd->loops = loops;
+  else
+    fd->loops = &fd->loop;
 
   fd->have_nowait = fd->have_ordered = false;
   fd->sched_kind = OMP_CLAUSE_SCHEDULE_STATIC;
   fd->chunk_size = NULL_TREE;
+  collapse_iter = NULL;
+  collapse_count = NULL;
 
   for (t = OMP_FOR_CLAUSES (for_stmt); t ; t = OMP_CLAUSE_CHAIN (t))
     switch (OMP_CLAUSE_CODE (t))
@@ -233,20 +204,139 @@ extract_omp_for_data (tree for_stmt, str
 	fd->sched_kind = OMP_CLAUSE_SCHEDULE_KIND (t);
 	fd->chunk_size = OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (t);
 	break;
+      case OMP_CLAUSE_COLLAPSE:
+	if (fd->collapse > 1)
+	  {
+	    collapse_iter = &OMP_CLAUSE_COLLAPSE_ITERVAR (t);
+	    collapse_count = &OMP_CLAUSE_COLLAPSE_COUNT (t);
+	  }
       default:
 	break;
       }
 
-  if (fd->sched_kind == OMP_CLAUSE_SCHEDULE_RUNTIME)
+  gcc_assert (fd->collapse == 1 || collapse_iter != NULL);
+  if (fd->sched_kind == OMP_CLAUSE_SCHEDULE_RUNTIME
+      || fd->sched_kind == OMP_CLAUSE_SCHEDULE_AUTO)
     gcc_assert (fd->chunk_size == NULL);
   else if (fd->chunk_size == NULL)
     {
       /* We only need to compute a default chunk size for ordered
 	 static loops and dynamic loops.  */
-      if (fd->sched_kind != OMP_CLAUSE_SCHEDULE_STATIC || fd->have_ordered)
+      if (fd->sched_kind != OMP_CLAUSE_SCHEDULE_STATIC
+	  || fd->have_ordered
+	  || fd->collapse > 1)
 	fd->chunk_size = (fd->sched_kind == OMP_CLAUSE_SCHEDULE_STATIC)
 			 ? integer_zero_node : integer_one_node;
     }
+
+  for (i = 0; i < fd->collapse; i++)
+    {
+      if (fd->collapse == 1)
+	loop = &fd->loop;
+      else if (loops != NULL)
+	loop = loops + i;
+      else
+	loop = &dummy_loop;
+
+      t = TREE_VEC_ELT (OMP_FOR_INIT (for_stmt), i);
+      gcc_assert (TREE_CODE (t) == GIMPLE_MODIFY_STMT);
+      loop->v = GIMPLE_STMT_OPERAND (t, 0);
+      gcc_assert (SSA_VAR_P (loop->v));
+      gcc_assert (TREE_CODE (TREE_TYPE (loop->v)) == INTEGER_TYPE);
+      var = TREE_CODE (loop->v) == SSA_NAME ? SSA_NAME_VAR (loop->v) : loop->v;
+      loop->n1 = GIMPLE_STMT_OPERAND (t, 1);
+
+      t = TREE_VEC_ELT (OMP_FOR_COND (for_stmt), i);
+      loop->cond_code = TREE_CODE (t);
+      gcc_assert (TREE_OPERAND (t, 0) == var);
+      loop->n2 = TREE_OPERAND (t, 1);
+      switch (loop->cond_code)
+	{
+	case LT_EXPR:
+	case GT_EXPR:
+	  break;
+	case LE_EXPR:
+	  loop->n2 = fold_build2 (PLUS_EXPR, TREE_TYPE (loop->n2), loop->n2,
+				  build_int_cst (TREE_TYPE (loop->n2), 1));
+	  loop->cond_code = LT_EXPR;
+	  break;
+	case GE_EXPR:
+	  loop->n2 = fold_build2 (MINUS_EXPR, TREE_TYPE (loop->n2), loop->n2,
+				  build_int_cst (TREE_TYPE (loop->n2), 1));
+	  loop->cond_code = GT_EXPR;
+	  break;
+	default:
+	  gcc_unreachable ();
+	}
+
+      t = TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i);
+      gcc_assert (TREE_CODE (t) == GIMPLE_MODIFY_STMT);
+      gcc_assert (GIMPLE_STMT_OPERAND (t, 0) == var);
+      t = GIMPLE_STMT_OPERAND (t, 1);
+      gcc_assert (TREE_OPERAND (t, 0) == var);
+      switch (TREE_CODE (t))
+	{
+	case PLUS_EXPR:
+	  loop->step = TREE_OPERAND (t, 1);
+	  break;
+	case MINUS_EXPR:
+	  loop->step = TREE_OPERAND (t, 1);
+	  loop->step = fold_build1 (NEGATE_EXPR, TREE_TYPE (loop->step),
+				    loop->step);
+	  break;
+	default:
+	  gcc_unreachable ();
+	}
+
+      if (collapse_count && *collapse_count == NULL)
+	{
+	  tree type;
+
+	  /* FIXME: wait for final OpenMP 3.0 standard to find out
+	     which type should be used for the collapsed count
+	     computation.  */
+	  if (i == 0)
+	    iter_type = TREE_TYPE (loop->v);
+	  if ((i == 0 || count != NULL_TREE)
+	      && TREE_CODE (loop->n1) == INTEGER_CST
+	      && TREE_CODE (loop->n2) == INTEGER_CST
+	      && TREE_CODE (loop->step) == INTEGER_CST)
+	    {
+	      type = TREE_TYPE (loop->v);
+	      t = build_int_cst (type, (loop->cond_code == LT_EXPR ? -1 : 1));
+	      t = fold_build2 (PLUS_EXPR, type, loop->step, t);
+	      t = fold_build2 (PLUS_EXPR, type, t, loop->n2);
+	      t = fold_build2 (MINUS_EXPR, type, t, loop->n1);
+	      t = fold_build2 (TRUNC_DIV_EXPR, type, t, loop->step);
+	      t = fold_convert (iter_type, t);
+	      if (count != NULL_TREE)
+		count = fold_build2 (MULT_EXPR, iter_type, count, t);
+	      else
+		count = t;
+	    }
+	  else
+	    count = NULL_TREE;
+	}
+    }
+
+  if (collapse_iter && *collapse_iter == NULL)
+    *collapse_iter = create_tmp_var (iter_type, ".iter");
+  if (collapse_count && *collapse_count == NULL)
+    {
+      if (count)
+	*collapse_count = count;
+      else
+	*collapse_count = create_tmp_var (iter_type, ".count");
+    }
+
+  if (fd->collapse > 1)
+    {
+      fd->loop.v = *collapse_iter;
+      fd->loop.n1 = build_int_cst (TREE_TYPE (fd->loop.v), 0);
+      fd->loop.n2 = *collapse_count;
+      fd->loop.step = build_int_cst (TREE_TYPE (fd->loop.v), 1);
+      fd->loop.cond_code = LT_EXPR;
+    }
 }
 
 
@@ -306,16 +396,19 @@ workshare_safe_to_combine_p (basic_block
 
   gcc_assert (TREE_CODE (ws_stmt) == OMP_FOR);
 
-  extract_omp_for_data (ws_stmt, &fd);
+  extract_omp_for_data (ws_stmt, &fd, NULL);
+
+  if (fd.collapse > 1 && TREE_CODE (fd.loop.n2) != INTEGER_CST)
+    return false;
 
   /* FIXME.  We give up too easily here.  If any of these arguments
      are not constants, they will likely involve variables that have
      been mapped into fields of .omp_data_s for sharing with the child
      function.  With appropriate data flow, it would be possible to
      see through this.  */
-  if (!is_gimple_min_invariant (fd.n1)
-      || !is_gimple_min_invariant (fd.n2)
-      || !is_gimple_min_invariant (fd.step)
+  if (!is_gimple_min_invariant (fd.loop.n1)
+      || !is_gimple_min_invariant (fd.loop.n2)
+      || !is_gimple_min_invariant (fd.loop.step)
       || (fd.chunk_size && !is_gimple_min_invariant (fd.chunk_size)))
     return false;
 
@@ -337,7 +430,7 @@ get_ws_args_for (tree ws_stmt)
       struct omp_for_data fd;
       tree ws_args;
 
-      extract_omp_for_data (ws_stmt, &fd);
+      extract_omp_for_data (ws_stmt, &fd, NULL);
 
       ws_args = NULL_TREE;
       if (fd.chunk_size)
@@ -346,13 +439,13 @@ get_ws_args_for (tree ws_stmt)
 	  ws_args = tree_cons (NULL, t, ws_args);
 	}
 
-      t = fold_convert (long_integer_type_node, fd.step);
+      t = fold_convert (long_integer_type_node, fd.loop.step);
       ws_args = tree_cons (NULL, t, ws_args);
 
-      t = fold_convert (long_integer_type_node, fd.n2);
+      t = fold_convert (long_integer_type_node, fd.loop.n2);
       ws_args = tree_cons (NULL, t, ws_args);
 
-      t = fold_convert (long_integer_type_node, fd.n1);
+      t = fold_convert (long_integer_type_node, fd.loop.n1);
       ws_args = tree_cons (NULL, t, ws_args);
 
       return ws_args;
@@ -1270,6 +1363,7 @@ scan_omp_for (tree *stmt_p, omp_context 
 {
   omp_context *ctx;
   tree stmt;
+  int i;
 
   stmt = *stmt_p;
   ctx = new_omp_context (stmt, outer_ctx);
@@ -1277,9 +1371,12 @@ scan_omp_for (tree *stmt_p, omp_context 
   scan_sharing_clauses (OMP_FOR_CLAUSES (stmt), ctx);
 
   scan_omp (&OMP_FOR_PRE_BODY (stmt), ctx);
-  scan_omp (&OMP_FOR_INIT (stmt), ctx);
-  scan_omp (&OMP_FOR_COND (stmt), ctx);
-  scan_omp (&OMP_FOR_INCR (stmt), ctx);
+  for (i = 0; i < TREE_VEC_LENGTH (OMP_FOR_INIT (stmt)); i++)
+    {
+      scan_omp (&TREE_VEC_ELT (OMP_FOR_INIT (stmt), i), ctx);
+      scan_omp (&TREE_VEC_ELT (OMP_FOR_COND (stmt), i), ctx);
+      scan_omp (&TREE_VEC_ELT (OMP_FOR_INCR (stmt), i), ctx);
+    }
   scan_omp (&OMP_FOR_BODY (stmt), ctx);
 }
 
@@ -2810,7 +2907,64 @@ expand_omp_taskreg (struct omp_region *r
     L3:
 
     If this is a combined omp parallel loop, instead of the call to
-    GOMP_loop_foo_start, we call GOMP_loop_foo_next.  */
+    GOMP_loop_foo_start, we call GOMP_loop_foo_next.
+
+    For collapsed loops, given parameters:
+      collapse(3)
+      for (V1 = N11; V1 cond1 N12; V1 += STEP1)
+	for (V2 = N21; V2 cond2 N22; V2 += STEP2)
+	  for (V3 = N31; V3 cond3 N32; V3 += STEP3)
+	    BODY;
+
+    we generate pseudocode
+
+	if (cond3 is <)
+	  adj = STEP3 - 1;
+	else
+	  adj = STEP3 + 1;
+	count3 = (adj + N32 - N31) / STEP3;
+	if (cond2 is <)
+	  adj = STEP2 - 1;
+	else
+	  adj = STEP2 + 1;
+	count2 = (adj + N22 - N21) / STEP2;
+	if (cond1 is <)
+	  adj = STEP1 - 1;
+	else
+	  adj = STEP1 + 1;
+	count1 = (adj + N12 - N11) / STEP1;
+	count = count1 * count2 * count3;
+	more = GOMP_loop_foo_start (0, count, 1, CHUNK, &istart0, &iend0);
+	if (more) goto L0; else goto L3;
+    L0:
+	V = istart0;
+	T = V;
+	V3 = N31 + (T % count3) * STEP3;
+	T = T / count3;
+	V2 = N21 + (T % count2) * STEP2;
+	T = T / count2;
+	V1 = N11 + T * STEP1;
+	iend = iend0;
+    L1:
+	BODY;
+	V += 1;
+	if (V < iend) goto L10; else goto L2;
+    L10:
+	V3 += STEP3;
+	if (V3 cond3 N32) goto L1; else goto L11;
+    L11:
+	V3 = N31;
+	V2 += STEP2;
+	if (V2 cond2 N22) goto L1; else goto L12;
+    L12:
+	V2 = N21;
+	V1 += STEP1;
+	goto L1;
+    L2:
+	if (GOMP_loop_foo_next (&istart0, &iend0)) goto L0; else goto L3;
+    L3:
+
+      */
 
 static void
 expand_omp_for_generic (struct omp_region *region,
@@ -2820,16 +2974,18 @@ expand_omp_for_generic (struct omp_regio
 {
   tree type, istart0, iend0, iend, phi;
   tree t, vmain, vback;
-  basic_block entry_bb, cont_bb, exit_bb, l0_bb, l1_bb;
+  basic_block entry_bb, cont_bb, exit_bb, l0_bb, l1_bb, collapse_bb;
   basic_block l2_bb = NULL, l3_bb = NULL;
   block_stmt_iterator si;
   bool in_combined_parallel = is_combined_parallel (region);
   bool broken_loop = region->cont == NULL;
   edge e, ne;
+  tree *counts = NULL;
+  int i;
 
   gcc_assert (!broken_loop || !in_combined_parallel);
 
-  type = TREE_TYPE (fd->v);
+  type = TREE_TYPE (fd->loop.v);
 
   istart0 = create_tmp_var (long_integer_type_node, ".istart0");
   iend0 = create_tmp_var (long_integer_type_node, ".iend0");
@@ -2843,6 +2999,7 @@ expand_omp_for_generic (struct omp_regio
 
   entry_bb = region->entry;
   cont_bb = region->cont;
+  collapse_bb = NULL;
   gcc_assert (EDGE_COUNT (entry_bb->succs) == 2);
   gcc_assert (broken_loop
 	      || BRANCH_EDGE (entry_bb)->dest == FALLTHRU_EDGE (cont_bb)->dest);
@@ -2860,7 +3017,46 @@ expand_omp_for_generic (struct omp_regio
   exit_bb = region->exit;
 
   si = bsi_last (entry_bb);
+
   gcc_assert (TREE_CODE (bsi_stmt (si)) == OMP_FOR);
+  if (fd->collapse > 1)
+    {
+      /* collapsed loops need work for expansion in SSA form.  */
+      gcc_assert (!gimple_in_ssa_p (cfun));
+      counts = (tree *) alloca (fd->collapse * sizeof (tree));
+      for (i = 0; i < fd->collapse; i++)
+	{
+	  tree itype = TREE_TYPE (fd->loops[i].v);
+	  t = build_int_cst (itype, (fd->loops[i].cond_code == LT_EXPR
+				     ? -1 : 1));
+	  t = fold_build2 (PLUS_EXPR, itype, fd->loops[i].step, t);
+	  t = fold_build2 (PLUS_EXPR, itype, t, fd->loops[i].n2);
+	  t = fold_build2 (MINUS_EXPR, itype, t, fd->loops[i].n1);
+	  t = fold_build2 (TRUNC_DIV_EXPR, itype, t, fd->loops[i].step);
+	  t = fold_convert (type, t);
+	  if (TREE_CODE (t) == INTEGER_CST)
+	    counts[i] = t;
+	  else
+	    {
+	      counts[i] = create_tmp_var (type, ".count");
+	      t = build_gimple_modify_stmt (counts[i], t);
+	      force_gimple_operand_bsi (&si, t, true, NULL_TREE,
+					true, BSI_SAME_STMT);
+	    }
+	  if (SSA_VAR_P (fd->loop.n2))
+	    {
+	      if (i == 0)
+		t = build_gimple_modify_stmt (fd->loop.n2, counts[0]);
+	      else
+		{
+		  t = fold_build2 (MULT_EXPR, type, fd->loop.n2, counts[i]);
+		  t = build_gimple_modify_stmt (fd->loop.n2, t);
+		}
+	      force_gimple_operand_bsi (&si, t, true, NULL_TREE,
+					true, BSI_SAME_STMT);
+	    }
+	}
+    }
   if (in_combined_parallel)
     {
       /* In a combined parallel loop, emit a call to
@@ -2876,9 +3072,9 @@ expand_omp_for_generic (struct omp_regio
 	 GOMP_loop_foo_start in ENTRY_BB.  */
       t4 = build_fold_addr_expr (iend0);
       t3 = build_fold_addr_expr (istart0);
-      t2 = fold_convert (long_integer_type_node, fd->step);
-      t1 = fold_convert (long_integer_type_node, fd->n2);
-      t0 = fold_convert (long_integer_type_node, fd->n1);
+      t2 = fold_convert (long_integer_type_node, fd->loop.step);
+      t1 = fold_convert (long_integer_type_node, fd->loop.n2);
+      t0 = fold_convert (long_integer_type_node, fd->loop.n1);
       if (fd->chunk_size)
 	{
 	  t = fold_convert (long_integer_type_node, fd->chunk_size);
@@ -2889,6 +3085,9 @@ expand_omp_for_generic (struct omp_regio
 	t = build_call_expr (built_in_decls[start_fn], 5,
 			     t0, t1, t2, t3, t4);
     }
+  if (TREE_TYPE (t) != boolean_type_node)
+    t = fold_build2 (NE_EXPR, boolean_type_node,
+		     t, build_int_cst (TREE_TYPE (t), 0));
   t = force_gimple_operand_bsi (&si, t, true, NULL_TREE,
 			       	true, BSI_SAME_STMT);
   t = build3 (COND_EXPR, void_type_node, t, NULL_TREE, NULL_TREE);
@@ -2901,12 +3100,12 @@ expand_omp_for_generic (struct omp_regio
     {
       e = find_edge (entry_bb, l3_bb);
       for (phi = phi_nodes (l3_bb); phi; phi = PHI_CHAIN (phi))
-	if (PHI_ARG_DEF_FROM_EDGE (phi, e) == fd->v)
-	  SET_USE (PHI_ARG_DEF_PTR_FROM_EDGE (phi, e), fd->n1);
+	if (PHI_ARG_DEF_FROM_EDGE (phi, e) == fd->loop.v)
+	  SET_USE (PHI_ARG_DEF_PTR_FROM_EDGE (phi, e), fd->loop.n1);
     }
   else
     {
-      t = build_gimple_modify_stmt (fd->v, fd->n1);
+      t = build_gimple_modify_stmt (fd->loop.v, fd->loop.n1);
       bsi_insert_before (&si, t, BSI_SAME_STMT);
     }
 
@@ -2918,14 +3117,39 @@ expand_omp_for_generic (struct omp_regio
   t = fold_convert (type, istart0);
   t = force_gimple_operand_bsi (&si, t, false, NULL_TREE,
 				false, BSI_CONTINUE_LINKING);
-  t = build_gimple_modify_stmt (fd->v, t);
+  t = build_gimple_modify_stmt (fd->loop.v, t);
   bsi_insert_after (&si, t, BSI_CONTINUE_LINKING);
   if (gimple_in_ssa_p (cfun))
-    SSA_NAME_DEF_STMT (fd->v) = t;
+    SSA_NAME_DEF_STMT (fd->loop.v) = t;
 
   t = fold_convert (type, iend0);
   iend = force_gimple_operand_bsi (&si, t, true, NULL_TREE,
 				   false, BSI_CONTINUE_LINKING);
+  if (fd->collapse > 1)
+    {
+      tree tem = create_tmp_var (type, ".tem");
+
+      t = build_gimple_modify_stmt (tem, fd->loop.v);
+      bsi_insert_after (&si, t, BSI_CONTINUE_LINKING);
+      for (i = fd->collapse - 1; i >= 0; i--)
+	{
+	  tree itype = TREE_TYPE (fd->loops[i].v);
+	  t = fold_build2 (TRUNC_MOD_EXPR, type, tem, counts[i]);
+	  t = fold_convert (itype, t);
+	  t = fold_build2 (MULT_EXPR, itype, t, fd->loops[i].step);
+	  t = fold_build2 (PLUS_EXPR, itype, fd->loops[i].n1, t);
+	  t = build_gimple_modify_stmt (fd->loops[i].v, t);
+	  force_gimple_operand_bsi (&si, t, true, NULL_TREE,
+				    false, BSI_CONTINUE_LINKING);
+	  if (i != 0)
+	    {
+	      t = fold_build2 (TRUNC_DIV_EXPR, type, tem, counts[i]);
+	      t = build_gimple_modify_stmt (tem, t);
+	      force_gimple_operand_bsi (&si, t, true, NULL_TREE,
+					false, BSI_CONTINUE_LINKING);
+	    }
+	}
+    }
 
   if (!broken_loop)
     {
@@ -2937,7 +3161,7 @@ expand_omp_for_generic (struct omp_regio
       vmain = TREE_OPERAND (t, 1);
       vback = TREE_OPERAND (t, 0);
 
-      t = fold_build2 (PLUS_EXPR, type, vmain, fd->step);
+      t = fold_build2 (PLUS_EXPR, type, vmain, fd->loop.step);
       t = force_gimple_operand_bsi (&si, t, false, NULL_TREE,
 				    true, BSI_SAME_STMT);
       t = build_gimple_modify_stmt (vback, t);
@@ -2945,19 +3169,73 @@ expand_omp_for_generic (struct omp_regio
       if (gimple_in_ssa_p (cfun))
 	SSA_NAME_DEF_STMT (vback) = t;
   
-      t = build2 (fd->cond_code, boolean_type_node, vback, iend);
+      t = build2 (fd->loop.cond_code, boolean_type_node, vback, iend);
       t = build3 (COND_EXPR, void_type_node, t, NULL_TREE, NULL_TREE);
       bsi_insert_before (&si, t, BSI_SAME_STMT);
 
       /* Remove OMP_CONTINUE.  */
       bsi_remove (&si, true);
 
+      if (fd->collapse > 1)
+	{
+	  basic_block last_bb, bb;
+
+	  last_bb = cont_bb;
+	  for (i = fd->collapse - 1; i >= 0; i--)
+	    {
+	      tree itype = TREE_TYPE (fd->loops[i].v);
+
+	      bb = create_empty_bb (last_bb);
+	      si = bsi_start (bb);
+
+	      if (i < fd->collapse - 1)
+		{
+		  e = make_edge (last_bb, bb, EDGE_FALSE_VALUE);
+		  e->probability = REG_BR_PROB_BASE / 8;
+
+		  t = build_gimple_modify_stmt (fd->loops[i + 1].v,
+						fd->loops[i + 1].n1);
+		  force_gimple_operand_bsi (&si, t, true, NULL_TREE,
+					    false, BSI_CONTINUE_LINKING);
+		}
+	      else
+		collapse_bb = bb;
+
+	      set_immediate_dominator (CDI_DOMINATORS, bb, last_bb);
+
+	      t = fold_build2 (PLUS_EXPR, itype, fd->loops[i].v,
+			       fd->loops[i].step);
+	      t = build_gimple_modify_stmt (fd->loops[i].v, t);
+	      force_gimple_operand_bsi (&si, t, true, NULL_TREE,
+					false, BSI_CONTINUE_LINKING);
+
+	      if (i > 0)
+		{
+		  t = fold_build2 (fd->loops[i].cond_code, boolean_type_node,
+				   fd->loops[i].v, fd->loops[i].n2);
+		  t = force_gimple_operand_bsi (&si, t, false, NULL_TREE,
+						false, BSI_CONTINUE_LINKING);
+		  t = build3 (COND_EXPR, void_type_node, t,
+			      NULL_TREE, NULL_TREE);
+		  bsi_insert_after (&si, t, BSI_CONTINUE_LINKING);
+		  e = make_edge (bb, l1_bb, EDGE_TRUE_VALUE);
+		  e->probability = REG_BR_PROB_BASE * 7 / 8;
+		}
+	      else
+		make_edge (bb, l1_bb, EDGE_FALLTHRU);
+	      last_bb = bb;
+	    }
+	}
+
       /* Emit code to get the next parallel iteration in L2_BB.  */
       si = bsi_start (l2_bb);
 
       t = build_call_expr (built_in_decls[next_fn], 2,
 			   build_fold_addr_expr (istart0),
 			   build_fold_addr_expr (iend0));
+      if (TREE_TYPE (t) != boolean_type_node)
+	t = fold_build2 (NE_EXPR, boolean_type_node,
+			 t, build_int_cst (TREE_TYPE (t), 0));
       t = force_gimple_operand_bsi (&si, t, true, NULL_TREE,
 				    false, BSI_CONTINUE_LINKING);
       t = build3 (COND_EXPR, void_type_node, t, NULL_TREE, NULL_TREE);
@@ -2988,8 +3266,20 @@ expand_omp_for_generic (struct omp_regio
 		 PHI_ARG_DEF_FROM_EDGE (phi, e));
       remove_edge (e);
 
-      find_edge (cont_bb, l1_bb)->flags = EDGE_TRUE_VALUE;
       make_edge (cont_bb, l2_bb, EDGE_FALSE_VALUE);
+      if (fd->collapse > 1)
+	{
+	  e = find_edge (cont_bb, l1_bb);
+	  remove_edge (e);
+	  e = make_edge (cont_bb, collapse_bb, EDGE_TRUE_VALUE);
+	}
+      else
+	{
+	  e = find_edge (cont_bb, l1_bb);
+	  e->flags = EDGE_TRUE_VALUE;
+	}
+      e->probability = REG_BR_PROB_BASE * 7 / 8;
+      find_edge (cont_bb, l2_bb)->probability = REG_BR_PROB_BASE / 8;
       make_edge (l2_bb, l0_bb, EDGE_TRUE_VALUE);
 
       set_immediate_dominator (CDI_DOMINATORS, l2_bb,
@@ -3042,7 +3332,7 @@ expand_omp_for_static_nochunk (struct om
   basic_block fin_bb;
   block_stmt_iterator si;
 
-  type = TREE_TYPE (fd->v);
+  type = TREE_TYPE (fd->loop.v);
 
   entry_bb = region->entry;
   cont_bb = region->cont;
@@ -3069,26 +3359,21 @@ expand_omp_for_static_nochunk (struct om
   threadid = force_gimple_operand_bsi (&si, t, true, NULL_TREE,
 				       true, BSI_SAME_STMT);
 
-  fd->n1 = force_gimple_operand_bsi (&si,
-				     fold_convert (type, fd->n1),
-				     true, NULL_TREE,
-				     true, BSI_SAME_STMT);
-
-  fd->n2 = force_gimple_operand_bsi (&si,
-				    fold_convert (type, fd->n2),
-				    true, NULL_TREE,
-				    true, BSI_SAME_STMT);
-
-  fd->step = force_gimple_operand_bsi (&si,
-				       fold_convert (type, fd->step),
-				       true, NULL_TREE,
-				       true, BSI_SAME_STMT);
-
-  t = build_int_cst (type, (fd->cond_code == LT_EXPR ? -1 : 1));
-  t = fold_build2 (PLUS_EXPR, type, fd->step, t);
-  t = fold_build2 (PLUS_EXPR, type, t, fd->n2);
-  t = fold_build2 (MINUS_EXPR, type, t, fd->n1);
-  t = fold_build2 (TRUNC_DIV_EXPR, type, t, fd->step);
+  fd->loop.n1
+    = force_gimple_operand_bsi (&si, fold_convert (type, fd->loop.n1),
+				true, NULL_TREE, true, BSI_SAME_STMT);
+  fd->loop.n2
+    = force_gimple_operand_bsi (&si, fold_convert (type, fd->loop.n2),
+				true, NULL_TREE, true, BSI_SAME_STMT);
+  fd->loop.step
+    = force_gimple_operand_bsi (&si, fold_convert (type, fd->loop.step),
+				true, NULL_TREE, true, BSI_SAME_STMT);
+
+  t = build_int_cst (type, (fd->loop.cond_code == LT_EXPR ? -1 : 1));
+  t = fold_build2 (PLUS_EXPR, type, fd->loop.step, t);
+  t = fold_build2 (PLUS_EXPR, type, t, fd->loop.n2);
+  t = fold_build2 (MINUS_EXPR, type, t, fd->loop.n1);
+  t = fold_build2 (TRUNC_DIV_EXPR, type, t, fd->loop.step);
   t = fold_convert (type, t);
   n = force_gimple_operand_bsi (&si, t, true, NULL_TREE, true, BSI_SAME_STMT);
 
@@ -3108,14 +3393,14 @@ expand_omp_for_static_nochunk (struct om
   e0 = force_gimple_operand_bsi (&si, t, true, NULL_TREE, true, BSI_SAME_STMT);
 
   t = fold_convert (type, s0);
-  t = fold_build2 (MULT_EXPR, type, t, fd->step);
-  t = fold_build2 (PLUS_EXPR, type, t, fd->n1);
+  t = fold_build2 (MULT_EXPR, type, t, fd->loop.step);
+  t = fold_build2 (PLUS_EXPR, type, t, fd->loop.n1);
   t = force_gimple_operand_bsi (&si, t, false, NULL_TREE,
 				true, BSI_SAME_STMT);
-  t = build_gimple_modify_stmt (fd->v, t);
+  t = build_gimple_modify_stmt (fd->loop.v, t);
   bsi_insert_before (&si, t, BSI_SAME_STMT);
   if (gimple_in_ssa_p (cfun))
-    SSA_NAME_DEF_STMT (fd->v) = t;
+    SSA_NAME_DEF_STMT (fd->loop.v) = t;
 
   t = build2 (GE_EXPR, boolean_type_node, s0, e0);
   t = build3 (COND_EXPR, void_type_node, t, NULL_TREE, NULL_TREE);
@@ -3128,8 +3413,8 @@ expand_omp_for_static_nochunk (struct om
   si = bsi_start (seq_start_bb);
 
   t = fold_convert (type, e0);
-  t = fold_build2 (MULT_EXPR, type, t, fd->step);
-  t = fold_build2 (PLUS_EXPR, type, t, fd->n1);
+  t = fold_build2 (MULT_EXPR, type, t, fd->loop.step);
+  t = fold_build2 (PLUS_EXPR, type, t, fd->loop.n1);
   e = force_gimple_operand_bsi (&si, t, true, NULL_TREE,
 				false, BSI_CONTINUE_LINKING);
 
@@ -3140,7 +3425,7 @@ expand_omp_for_static_nochunk (struct om
   vmain = TREE_OPERAND (t, 1);
   vback = TREE_OPERAND (t, 0);
 
-  t = fold_build2 (PLUS_EXPR, type, vmain, fd->step);
+  t = fold_build2 (PLUS_EXPR, type, vmain, fd->loop.step);
   t = force_gimple_operand_bsi (&si, t, false, NULL_TREE,
 				true, BSI_SAME_STMT);
   t = build_gimple_modify_stmt (vback, t);
@@ -3148,7 +3433,7 @@ expand_omp_for_static_nochunk (struct om
   if (gimple_in_ssa_p (cfun))
     SSA_NAME_DEF_STMT (vback) = t;
 
-  t = build2 (fd->cond_code, boolean_type_node, vback, e);
+  t = build2 (fd->loop.cond_code, boolean_type_node, vback, e);
   t = build3 (COND_EXPR, void_type_node, t, NULL_TREE, NULL_TREE);
   bsi_insert_before (&si, t, BSI_SAME_STMT);
 
@@ -3212,7 +3497,8 @@ expand_omp_for_static_nochunk (struct om
 */
 
 static void
-expand_omp_for_static_chunk (struct omp_region *region, struct omp_for_data *fd)
+expand_omp_for_static_chunk (struct omp_region *region,
+			     struct omp_for_data *fd)
 {
   tree n, s0, e0, e, t, phi, nphi, args;
   tree trip_var, trip_init, trip_main, trip_back, nthreads, threadid;
@@ -3222,7 +3508,7 @@ expand_omp_for_static_chunk (struct omp_
   block_stmt_iterator si;
   edge se, re, ene;
 
-  type = TREE_TYPE (fd->v);
+  type = TREE_TYPE (fd->loop.v);
 
   entry_bb = region->entry;
   se = split_block (entry_bb, last_stmt (entry_bb));
@@ -3254,26 +3540,24 @@ expand_omp_for_static_chunk (struct omp_
   threadid = force_gimple_operand_bsi (&si, t, true, NULL_TREE,
 				       true, BSI_SAME_STMT);
 
-  fd->n1 = force_gimple_operand_bsi (&si, fold_convert (type, fd->n1),
-				     true, NULL_TREE,
-				     true, BSI_SAME_STMT);
-  fd->n2 = force_gimple_operand_bsi (&si, fold_convert (type, fd->n2),
-				     true, NULL_TREE,
-				     true, BSI_SAME_STMT);
-  fd->step = force_gimple_operand_bsi (&si, fold_convert (type, fd->step),
-				       true, NULL_TREE,
-				       true, BSI_SAME_STMT);
+  fd->loop.n1
+    = force_gimple_operand_bsi (&si, fold_convert (type, fd->loop.n1),
+				true, NULL_TREE, true, BSI_SAME_STMT);
+  fd->loop.n2
+    = force_gimple_operand_bsi (&si, fold_convert (type, fd->loop.n2),
+				true, NULL_TREE, true, BSI_SAME_STMT);
+  fd->loop.step
+    = force_gimple_operand_bsi (&si, fold_convert (type, fd->loop.step),
+				true, NULL_TREE, true, BSI_SAME_STMT);
   fd->chunk_size
-	  = force_gimple_operand_bsi (&si, fold_convert (type,
-							 fd->chunk_size),
-				      true, NULL_TREE,
-				      true, BSI_SAME_STMT);
+    = force_gimple_operand_bsi (&si, fold_convert (type, fd->chunk_size),
+				true, NULL_TREE, true, BSI_SAME_STMT);
 
-  t = build_int_cst (type, (fd->cond_code == LT_EXPR ? -1 : 1));
-  t = fold_build2 (PLUS_EXPR, type, fd->step, t);
-  t = fold_build2 (PLUS_EXPR, type, t, fd->n2);
-  t = fold_build2 (MINUS_EXPR, type, t, fd->n1);
-  t = fold_build2 (TRUNC_DIV_EXPR, type, t, fd->step);
+  t = build_int_cst (type, (fd->loop.cond_code == LT_EXPR ? -1 : 1));
+  t = fold_build2 (PLUS_EXPR, type, fd->loop.step, t);
+  t = fold_build2 (PLUS_EXPR, type, t, fd->loop.n2);
+  t = fold_build2 (MINUS_EXPR, type, t, fd->loop.n1);
+  t = fold_build2 (TRUNC_DIV_EXPR, type, t, fd->loop.step);
   t = fold_convert (type, t);
   n = force_gimple_operand_bsi (&si, t, true, NULL_TREE,
 				true, BSI_SAME_STMT);
@@ -3299,8 +3583,8 @@ expand_omp_for_static_chunk (struct omp_
     SSA_NAME_DEF_STMT (trip_init) = t;
 
   t = fold_build2 (MULT_EXPR, type, threadid, fd->chunk_size);
-  t = fold_build2 (MULT_EXPR, type, t, fd->step);
-  t = fold_build2 (PLUS_EXPR, type, t, fd->n1);
+  t = fold_build2 (MULT_EXPR, type, t, fd->loop.step);
+  t = fold_build2 (PLUS_EXPR, type, t, fd->loop.n1);
   v_extra = force_gimple_operand_bsi (&si, t, true, NULL_TREE,
 				      true, BSI_SAME_STMT);
 
@@ -3329,18 +3613,18 @@ expand_omp_for_static_chunk (struct omp_
   si = bsi_start (seq_start_bb);
 
   t = fold_convert (type, s0);
-  t = fold_build2 (MULT_EXPR, type, t, fd->step);
-  t = fold_build2 (PLUS_EXPR, type, t, fd->n1);
+  t = fold_build2 (MULT_EXPR, type, t, fd->loop.step);
+  t = fold_build2 (PLUS_EXPR, type, t, fd->loop.n1);
   t = force_gimple_operand_bsi (&si, t, false, NULL_TREE,
 				false, BSI_CONTINUE_LINKING);
-  t = build_gimple_modify_stmt (fd->v, t);
+  t = build_gimple_modify_stmt (fd->loop.v, t);
   bsi_insert_after (&si, t, BSI_CONTINUE_LINKING);
   if (gimple_in_ssa_p (cfun))
-    SSA_NAME_DEF_STMT (fd->v) = t;
+    SSA_NAME_DEF_STMT (fd->loop.v) = t;
 
   t = fold_convert (type, e0);
-  t = fold_build2 (MULT_EXPR, type, t, fd->step);
-  t = fold_build2 (PLUS_EXPR, type, t, fd->n1);
+  t = fold_build2 (MULT_EXPR, type, t, fd->loop.step);
+  t = fold_build2 (PLUS_EXPR, type, t, fd->loop.n1);
   e = force_gimple_operand_bsi (&si, t, true, NULL_TREE,
 				false, BSI_CONTINUE_LINKING);
 
@@ -3352,13 +3636,13 @@ expand_omp_for_static_chunk (struct omp_
   v_main = TREE_OPERAND (cont, 1);
   v_back = TREE_OPERAND (cont, 0);
 
-  t = build2 (PLUS_EXPR, type, v_main, fd->step);
+  t = build2 (PLUS_EXPR, type, v_main, fd->loop.step);
   t = build_gimple_modify_stmt (v_back, t);
   bsi_insert_before (&si, t, BSI_SAME_STMT);
   if (gimple_in_ssa_p (cfun))
     SSA_NAME_DEF_STMT (v_back) = t;
 
-  t = build2 (fd->cond_code, boolean_type_node, v_back, e);
+  t = build2 (fd->loop.cond_code, boolean_type_node, v_back, e);
   t = build3 (COND_EXPR, void_type_node, t, NULL_TREE, NULL_TREE);
   bsi_insert_before (&si, t, BSI_SAME_STMT);
   
@@ -3412,9 +3696,9 @@ expand_omp_for_static_chunk (struct omp_
 	  SSA_NAME_DEF_STMT (t) = nphi;
 
 	  t = PHI_ARG_DEF_FROM_EDGE (phi, se);
-	  /* A special case -- fd->v is not yet computed in iter_part_bb, we
-	     need to use v_extra instead.  */
-	  if (t == fd->v)
+	  /* A special case -- fd->loop.v is not yet computed in
+	     iter_part_bb, we need to use v_extra instead.  */
+	  if (t == fd->loop.v)
 	    t = v_extra;
 	  add_phi_arg (nphi, t, ene);
 	  add_phi_arg (nphi, TREE_VALUE (args), re);
@@ -3448,8 +3732,14 @@ static void
 expand_omp_for (struct omp_region *region)
 {
   struct omp_for_data fd;
+  struct omp_for_data_loop *loops;
+
+  loops
+    = (struct omp_for_data_loop *)
+      alloca (TREE_VEC_LENGTH (OMP_FOR_INIT (last_stmt (region->entry)))
+	      * sizeof (struct omp_for_data_loop));
 
-  extract_omp_for_data (last_stmt (region->entry), &fd);
+  extract_omp_for_data (last_stmt (region->entry), &fd, loops);
   region->sched_kind = fd.sched_kind;
 
   gcc_assert (EDGE_COUNT (region->entry->succs) == 2);
@@ -3464,6 +3754,7 @@ expand_omp_for (struct omp_region *regio
 
   if (fd.sched_kind == OMP_CLAUSE_SCHEDULE_STATIC
       && !fd.have_ordered
+      && fd.collapse == 1
       && region->cont != NULL)
     {
       if (fd.chunk_size == NULL)
@@ -4453,6 +4744,9 @@ lower_omp_single_simple (tree single_stm
   tree t;
 
   t = build_call_expr (built_in_decls[BUILT_IN_GOMP_SINGLE_START], 0);
+  if (TREE_TYPE (t) != boolean_type_node)
+    t = fold_build2 (NE_EXPR, boolean_type_node,
+		     t, build_int_cst (TREE_TYPE (t), 0));
   t = build3 (COND_EXPR, void_type_node, t,
 	      OMP_SINGLE_BODY (single_stmt), NULL);
   gimplify_and_add (t, pre_p);
@@ -4759,19 +5053,19 @@ lower_omp_for_lastprivate (struct omp_fo
   tree clauses, cond, stmts, vinit, t;
   enum tree_code cond_code;
   
-  cond_code = fd->cond_code;
+  cond_code = fd->loop.cond_code;
   cond_code = cond_code == LT_EXPR ? GE_EXPR : LE_EXPR;
 
   /* When possible, use a strict equality expression.  This can let VRP
      type optimizations deduce the value and remove a copy.  */
-  if (host_integerp (fd->step, 0))
+  if (host_integerp (fd->loop.step, 0))
     {
-      HOST_WIDE_INT step = TREE_INT_CST_LOW (fd->step);
+      HOST_WIDE_INT step = TREE_INT_CST_LOW (fd->loop.step);
       if (step == 1 || step == -1)
 	cond_code = EQ_EXPR;
     }
 
-  cond = build2 (cond_code, boolean_type_node, fd->v, fd->n2);
+  cond = build2 (cond_code, boolean_type_node, fd->loop.v, fd->loop.n2);
 
   clauses = OMP_FOR_CLAUSES (fd->for_stmt);
   stmts = NULL;
@@ -4781,15 +5075,15 @@ lower_omp_for_lastprivate (struct omp_fo
       append_to_statement_list (stmts, dlist);
 
       /* Optimize: v = 0; is usually cheaper than v = some_other_constant.  */
-      vinit = fd->n1;
+      vinit = fd->loop.n1;
       if (cond_code == EQ_EXPR
-	  && host_integerp (fd->n2, 0)
-	  && ! integer_zerop (fd->n2))
-	vinit = build_int_cst (TREE_TYPE (fd->v), 0);
+	  && host_integerp (fd->loop.n2, 0)
+	  && ! integer_zerop (fd->loop.n2))
+	vinit = build_int_cst (TREE_TYPE (fd->loop.v), 0);
 
       /* Initialize the iterator variable, so that threads that don't execute
 	 any iterations don't execute the lastprivate clauses by accident.  */
-      t = build_gimple_modify_stmt (fd->v, vinit);
+      t = build_gimple_modify_stmt (fd->loop.v, vinit);
       gimplify_and_add (t, body_p);
     }
 }
@@ -4802,6 +5096,7 @@ lower_omp_for (tree *stmt_p, omp_context
 {
   tree t, stmt, ilist, dlist, new_stmt, *body_p, *rhs_p;
   struct omp_for_data fd;
+  int i;
 
   stmt = *stmt_p;
 
@@ -4832,20 +5127,24 @@ lower_omp_for (tree *stmt_p, omp_context
 
      We just need to make sure that VAL1, VAL2 and VAL3 are lowered
      using the .omp_data_s mapping, if needed.  */
-  rhs_p = &GIMPLE_STMT_OPERAND (OMP_FOR_INIT (stmt), 1);
-  if (!is_gimple_min_invariant (*rhs_p))
-    *rhs_p = get_formal_tmp_var (*rhs_p, body_p);
-
-  rhs_p = &TREE_OPERAND (OMP_FOR_COND (stmt), 1);
-  if (!is_gimple_min_invariant (*rhs_p))
-    *rhs_p = get_formal_tmp_var (*rhs_p, body_p);
-
-  rhs_p = &TREE_OPERAND (GIMPLE_STMT_OPERAND (OMP_FOR_INCR (stmt), 1), 1);
-  if (!is_gimple_min_invariant (*rhs_p))
-    *rhs_p = get_formal_tmp_var (*rhs_p, body_p);
+  for (i = 0; i < TREE_VEC_LENGTH (OMP_FOR_INIT (stmt)); i++)
+    {
+      rhs_p = &GIMPLE_STMT_OPERAND (TREE_VEC_ELT (OMP_FOR_INIT (stmt), i), 1);
+      if (!is_gimple_min_invariant (*rhs_p))
+	*rhs_p = get_formal_tmp_var (*rhs_p, body_p);
+
+      rhs_p = &TREE_OPERAND (TREE_VEC_ELT (OMP_FOR_COND (stmt), i), 1);
+      if (!is_gimple_min_invariant (*rhs_p))
+	*rhs_p = get_formal_tmp_var (*rhs_p, body_p);
+
+      rhs_p = &TREE_OPERAND (GIMPLE_STMT_OPERAND
+			       (TREE_VEC_ELT (OMP_FOR_INCR (stmt), i), 1), 1);
+      if (!is_gimple_min_invariant (*rhs_p))
+	*rhs_p = get_formal_tmp_var (*rhs_p, body_p);
+    }
 
   /* Once lowered, extract the bounds and clauses.  */
-  extract_omp_for_data (stmt, &fd);
+  extract_omp_for_data (stmt, &fd, NULL);
 
   lower_omp_for_lastprivate (&fd, body_p, &dlist, ctx);
 
@@ -4853,7 +5152,7 @@ lower_omp_for (tree *stmt_p, omp_context
 
   append_to_statement_list (OMP_FOR_BODY (stmt), body_p);
 
-  t = build2 (OMP_CONTINUE, void_type_node, fd.v, fd.v);
+  t = build2 (OMP_CONTINUE, void_type_node, fd.loop.v, fd.loop.v);
   append_to_statement_list (t, body_p);
 
   /* After the loop, add exit clauses.  */
@@ -5265,6 +5564,7 @@ diagnose_sb_1 (tree *tp, int *walk_subtr
   tree context = (tree) wi->info;
   tree inner_context;
   tree t = *tp;
+  int i;
 
   *walk_subtrees = 0;
   switch (TREE_CODE (t))
@@ -5290,9 +5590,15 @@ diagnose_sb_1 (tree *tp, int *walk_subtr
       walk_tree (&OMP_FOR_CLAUSES (t), diagnose_sb_1, wi, NULL);
       inner_context = tree_cons (NULL, t, context);
       wi->info = inner_context;
-      walk_tree (&OMP_FOR_INIT (t), diagnose_sb_1, wi, NULL);
-      walk_tree (&OMP_FOR_COND (t), diagnose_sb_1, wi, NULL);
-      walk_tree (&OMP_FOR_INCR (t), diagnose_sb_1, wi, NULL);
+      for (i = 0; i < TREE_VEC_LENGTH (OMP_FOR_INIT (t)); i++)
+	{
+	  walk_tree (&TREE_VEC_ELT (OMP_FOR_INIT (t), i), diagnose_sb_1,
+		     wi, NULL);
+	  walk_tree (&TREE_VEC_ELT (OMP_FOR_COND (t), i), diagnose_sb_1,
+		     wi, NULL);
+	  walk_tree (&TREE_VEC_ELT (OMP_FOR_INCR (t), i), diagnose_sb_1,
+		     wi, NULL);
+	}
       walk_stmts (wi, &OMP_FOR_PRE_BODY (t));
       walk_stmts (wi, &OMP_FOR_BODY (t));
       wi->info = context;
@@ -5320,6 +5626,7 @@ diagnose_sb_2 (tree *tp, int *walk_subtr
   tree context = (tree) wi->info;
   splay_tree_node n;
   tree t = *tp;
+  int i;
 
   *walk_subtrees = 0;
   switch (TREE_CODE (t))
@@ -5342,9 +5649,15 @@ diagnose_sb_2 (tree *tp, int *walk_subtr
     case OMP_FOR:
       walk_tree (&OMP_FOR_CLAUSES (t), diagnose_sb_2, wi, NULL);
       wi->info = t;
-      walk_tree (&OMP_FOR_INIT (t), diagnose_sb_2, wi, NULL);
-      walk_tree (&OMP_FOR_COND (t), diagnose_sb_2, wi, NULL);
-      walk_tree (&OMP_FOR_INCR (t), diagnose_sb_2, wi, NULL);
+      for (i = 0; i < TREE_VEC_LENGTH (OMP_FOR_INIT (t)); i++)
+	{
+	  walk_tree (&TREE_VEC_ELT (OMP_FOR_INIT (t), i), diagnose_sb_2,
+		     wi, NULL);
+	  walk_tree (&TREE_VEC_ELT (OMP_FOR_COND (t), i), diagnose_sb_2,
+		     wi, NULL);
+	  walk_tree (&TREE_VEC_ELT (OMP_FOR_INCR (t), i), diagnose_sb_2,
+		     wi, NULL);
+	}
       walk_stmts (wi, &OMP_FOR_PRE_BODY (t));
       walk_stmts (wi, &OMP_FOR_BODY (t));
       wi->info = context;
--- gcc/cp/pt.c	(revision 131902)
+++ gcc/cp/pt.c	(working copy)
@@ -10525,12 +10525,12 @@ tsubst_expr (tree t, tree args, tsubst_f
 
 	clauses = tsubst_omp_clauses (OMP_FOR_CLAUSES (t),
 				      args, complain, in_decl);
-	init = OMP_FOR_INIT (t);
+	init = TREE_VEC_ELT (OMP_FOR_INIT (t), 0);
 	gcc_assert (TREE_CODE (init) == MODIFY_EXPR);
 	decl = RECUR (TREE_OPERAND (init, 0));
 	init = RECUR (TREE_OPERAND (init, 1));
-	cond = RECUR (OMP_FOR_COND (t));
-	incr = RECUR (OMP_FOR_INCR (t));
+	cond = RECUR (TREE_VEC_ELT (OMP_FOR_COND (t), 0));
+	incr = RECUR (TREE_VEC_ELT (OMP_FOR_INCR (t), 0));
 
 	stmt = begin_omp_structured_block ();
 
--- gcc/cp/semantics.c	(revision 131902)
+++ gcc/cp/semantics.c	(working copy)
@@ -3907,9 +3907,12 @@ finish_omp_for (location_t locus, tree d
       init = build2 (MODIFY_EXPR, void_type_node, decl, init);
 
       TREE_TYPE (stmt) = void_type_node;
-      OMP_FOR_INIT (stmt) = init;
-      OMP_FOR_COND (stmt) = cond;
-      OMP_FOR_INCR (stmt) = incr;
+      OMP_FOR_INIT (stmt) = make_tree_vec (1);
+      TREE_VEC_ELT (OMP_FOR_INIT (stmt), 0) = init;
+      OMP_FOR_COND (stmt) = make_tree_vec (1);
+      TREE_VEC_ELT (OMP_FOR_COND (stmt), 0) = cond;
+      OMP_FOR_INCR (stmt) = make_tree_vec (1);
+      TREE_VEC_ELT (OMP_FOR_INCR (stmt), 0) = incr;
       OMP_FOR_BODY (stmt) = body;
       OMP_FOR_PRE_BODY (stmt) = pre_body;
 
@@ -3946,11 +3949,11 @@ finish_omp_for (location_t locus, tree d
   if (decl != error_mark_node && init != error_mark_node)
     omp_for = c_finish_omp_for (locus, decl, init, cond, incr, body, pre_body);
   if (omp_for != NULL
-      && TREE_CODE (OMP_FOR_INCR (omp_for)) == MODIFY_EXPR
-      && TREE_SIDE_EFFECTS (TREE_OPERAND (OMP_FOR_INCR (omp_for), 1))
-      && BINARY_CLASS_P (TREE_OPERAND (OMP_FOR_INCR (omp_for), 1)))
+      && TREE_CODE (TREE_VEC_ELT (OMP_FOR_INCR (omp_for), 0)) == MODIFY_EXPR
+      && TREE_SIDE_EFFECTS (TREE_OPERAND (TREE_VEC_ELT (OMP_FOR_INCR (omp_for), 0), 1))
+      && BINARY_CLASS_P (TREE_OPERAND (TREE_VEC_ELT (OMP_FOR_INCR (omp_for), 0), 1)))
     {
-      tree t = TREE_OPERAND (OMP_FOR_INCR (omp_for), 1);
+      tree t = TREE_OPERAND (TREE_VEC_ELT (OMP_FOR_INCR (omp_for), 0), 1);
       int n = TREE_SIDE_EFFECTS (TREE_OPERAND (t, 1)) != 0;
 
       if (!processing_template_decl)
--- gcc/cp/parser.c	(revision 131902)
+++ gcc/cp/parser.c	(working copy)
@@ -19479,6 +19479,8 @@ cp_parser_omp_clause_collapse (cp_parser
   c = build_omp_clause (OMP_CLAUSE_COLLAPSE);
   OMP_CLAUSE_CHAIN (c) = list;
   OMP_CLAUSE_COLLAPSE_EXPR (c) = num;
+  OMP_CLAUSE_COLLAPSE_ITERVAR (c) = NULL;
+  OMP_CLAUSE_COLLAPSE_COUNT (c) = NULL;
 
   return c;
 }
--- gcc/tree-parloops.c	(revision 131902)
+++ gcc/tree-parloops.c	(working copy)
@@ -1562,13 +1562,16 @@ create_parallel_loop (struct loop *loop,
   for_stmt = make_node (OMP_FOR);
   TREE_TYPE (for_stmt) = void_type_node;
   OMP_FOR_CLAUSES (for_stmt) = t;
-  OMP_FOR_INIT (for_stmt) = build_gimple_modify_stmt (initvar, cvar_init);
-  OMP_FOR_COND (for_stmt) = cond;
-  OMP_FOR_INCR (for_stmt) = build_gimple_modify_stmt (cvar_base,
-						      build2 (PLUS_EXPR, type,
-							      cvar_base,
-							      build_int_cst
-							      (type, 1)));
+  OMP_FOR_INIT (for_stmt) = make_tree_vec (1);
+  TREE_VEC_ELT (OMP_FOR_INIT (for_stmt), 0)
+    = build_gimple_modify_stmt (initvar, cvar_init);
+  OMP_FOR_COND (for_stmt) = make_tree_vec (1);
+  TREE_VEC_ELT (OMP_FOR_COND (for_stmt), 0) = cond;
+  OMP_FOR_INCR (for_stmt) = make_tree_vec (2);
+  TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), 0)
+    = build_gimple_modify_stmt (cvar_base,
+				build2 (PLUS_EXPR, type, cvar_base,
+					build_int_cst (type, 1)));
   OMP_FOR_BODY (for_stmt) = NULL_TREE;
   OMP_FOR_PRE_BODY (for_stmt) = NULL_TREE;
 
--- gcc/fortran/openmp.c	(revision 131899)
+++ gcc/fortran/openmp.c	(working copy)
@@ -1275,15 +1275,34 @@ struct omp_context
   struct pointer_set_t *private_iterators;
   struct omp_context *previous;
 } *omp_current_ctx;
-gfc_code *omp_current_do_code;
-
+static gfc_code *omp_current_do_code;
+static int omp_current_do_collapse;
 
 void
 gfc_resolve_omp_do_blocks (gfc_code *code, gfc_namespace *ns)
 {
   if (code->block->next && code->block->next->op == EXEC_DO)
-    omp_current_do_code = code->block->next;
+    {
+      int i;
+      gfc_code *c;
+
+      omp_current_do_code = code->block->next;
+      omp_current_do_collapse = code->ext.omp_clauses->collapse;
+      for (i = 1, c = omp_current_do_code; i < omp_current_do_collapse; i++)
+	{
+	  c = c->block;
+	  if (c->op != EXEC_DO || c->next == NULL)
+	    break;
+	  c = c->next;
+	  if (c->op != EXEC_DO)
+	    break;
+	}
+      if (i < omp_current_do_collapse || omp_current_do_collapse <= 0)
+	omp_current_do_collapse = 1;
+    }
   gfc_resolve_blocks (code->block, ns);
+  omp_current_do_collapse = 0;
+  omp_current_do_code = NULL;
 }
 
 
@@ -1323,6 +1342,8 @@ void
 gfc_resolve_do_iterator (gfc_code *code, gfc_symbol *sym)
 {
   struct omp_context *ctx;
+  int i = omp_current_do_collapse;
+  gfc_code *c = omp_current_do_code;
 
   if (sym->attr.threadprivate)
     return;
@@ -1330,8 +1351,14 @@ gfc_resolve_do_iterator (gfc_code *code,
   /* !$omp do and !$omp parallel do iteration variable is predetermined
      private just in the !$omp do resp. !$omp parallel do construct,
      with no implications for the outer parallel constructs.  */
-  if (code == omp_current_do_code)
-    return;
+
+  while (i-- >= 1)
+    {
+      if (code == c)
+	return;
+
+      c = c->block->next;
+    }
 
   for (ctx = omp_current_ctx; ctx; ctx = ctx->previous)
     {
@@ -1355,8 +1382,8 @@ gfc_resolve_do_iterator (gfc_code *code,
 static void
 resolve_omp_do (gfc_code *code)
 {
-  gfc_code *do_code;
-  int list;
+  gfc_code *do_code, *c;
+  int list, i, collapse;
   gfc_namelist *n;
   gfc_symbol *dovar;
 
@@ -1364,11 +1391,17 @@ resolve_omp_do (gfc_code *code)
     resolve_omp_clauses (code);
 
   do_code = code->block->next;
-  if (do_code->op == EXEC_DO_WHILE)
-    gfc_error ("!$OMP DO cannot be a DO WHILE or DO without loop control "
-	       "at %L", &do_code->loc);
-  else
+  collapse = code->ext.omp_clauses->collapse;
+  if (collapse <= 0)
+    collapse = 1;
+  for (i = 1; i <= collapse; i++)
     {
+      if (do_code->op == EXEC_DO_WHILE)
+	{
+	  gfc_error ("!$OMP DO cannot be a DO WHILE or DO without loop control "
+		     "at %L", &do_code->loc);
+	  break;
+	}
       gcc_assert (do_code->op == EXEC_DO);
       if (do_code->ext.iterator->var->ts.type != BT_INTEGER)
 	gfc_error ("!$OMP DO iteration variable must be of type integer at %L",
@@ -1388,6 +1421,53 @@ resolve_omp_do (gfc_code *code)
 			     &do_code->loc);
 		  break;
 		}
+      if (i > 1)
+	{
+	  gfc_code *do_code2 = code->block->next;
+	  int j;
+
+	  for (j = 1; j < i; j++)
+	    {
+	      gfc_symbol *ivar = do_code2->ext.iterator->var->symtree->n.sym;
+	      if (dovar == ivar
+		  || gfc_find_sym_in_expr (ivar, do_code->ext.iterator->start)
+		  || gfc_find_sym_in_expr (ivar, do_code->ext.iterator->end)
+		  || gfc_find_sym_in_expr (ivar, do_code->ext.iterator->step))
+		{
+		  gfc_error ("!$OMP DO collapsed loops don't form rectangular iteration space at %L",
+			     &do_code->loc);
+		  break;
+		}
+	      if (j < i)
+		break;
+	      do_code2 = do_code2->block->next;
+	    }
+	}
+      if (i == collapse)
+	break;
+      for (c = do_code->next; c; c = c->next)
+	if (c->op != EXEC_NOP && c->op != EXEC_CONTINUE)
+	  {
+	    gfc_error ("collapsed !$OMP DO loops not perfectly nested at %L",
+		       &c->loc);
+	    break;
+	  }
+      if (c)
+	break;
+      do_code = do_code->block;
+      if (do_code->op != EXEC_DO && do_code->op != EXEC_DO_WHILE)
+	{
+	  gfc_error ("not enough DO loops for collapsed !$OMP DO at %L",
+		     &code->loc);
+	  break;
+	}
+      do_code = do_code->next;
+      if (do_code->op != EXEC_DO && do_code->op != EXEC_DO_WHILE)
+	{
+	  gfc_error ("not enough DO loops for collapsed !$OMP DO at %L",
+		     &code->loc);
+	  break;
+	}
     }
 }
 
--- gcc/fortran/trans-openmp.c	(revision 131899)
+++ gcc/fortran/trans-openmp.c	(working copy)
@@ -693,6 +693,8 @@ gfc_trans_omp_clauses (stmtblock_t *bloc
     {
       c = build_omp_clause (OMP_CLAUSE_COLLAPSE);
       OMP_CLAUSE_COLLAPSE_EXPR (c) = build_int_cst (NULL, clauses->collapse);
+      OMP_CLAUSE_COLLAPSE_ITERVAR (c) = NULL;
+      OMP_CLAUSE_COLLAPSE_COUNT (c) = NULL;
       omp_clauses = gfc_trans_add_clause (c, omp_clauses);
     }
 
@@ -919,13 +921,21 @@ gfc_trans_omp_do (gfc_code *code, stmtbl
   tree count = NULL_TREE, cycle_label, tmp, omp_clauses;
   stmtblock_t block;
   stmtblock_t body;
-  int simple = 0;
-  bool dovar_found = false;
   gfc_omp_clauses *clauses = code->ext.omp_clauses;
+  gfc_code *outermost;
+  int i, collapse = clauses->collapse;
+  tree dovar_init = NULL_TREE;
 
-  code = code->block->next;
+  if (collapse <= 0)
+    collapse = 1;
+
+  outermost = code = code->block->next;
   gcc_assert (code->op == EXEC_DO);
 
+  init = make_tree_vec (collapse);
+  cond = make_tree_vec (collapse);
+  incr = make_tree_vec (collapse);
+
   if (pblock == NULL)
     {
       gfc_start_block (&block);
@@ -933,107 +943,126 @@ gfc_trans_omp_do (gfc_code *code, stmtbl
     }
 
   omp_clauses = gfc_trans_omp_clauses (pblock, do_clauses, code->loc);
-  if (clauses)
+
+  for (i = 0; i < collapse; i++)
     {
-      gfc_namelist *n;
-      for (n = clauses->lists[OMP_LIST_LASTPRIVATE]; n != NULL; n = n->next)
-	if (code->ext.iterator->var->symtree->n.sym == n->sym)
-	  break;
-      if (n == NULL)
-	for (n = clauses->lists[OMP_LIST_PRIVATE]; n != NULL; n = n->next)
-	  if (code->ext.iterator->var->symtree->n.sym == n->sym)
-	    break;
-      if (n != NULL)
-	dovar_found = true;
-    }
-
-  /* Evaluate all the expressions in the iterator.  */
-  gfc_init_se (&se, NULL);
-  gfc_conv_expr_lhs (&se, code->ext.iterator->var);
-  gfc_add_block_to_block (pblock, &se.pre);
-  dovar = se.expr;
-  type = TREE_TYPE (dovar);
-  gcc_assert (TREE_CODE (type) == INTEGER_TYPE);
-
-  gfc_init_se (&se, NULL);
-  gfc_conv_expr_val (&se, code->ext.iterator->start);
-  gfc_add_block_to_block (pblock, &se.pre);
-  from = gfc_evaluate_now (se.expr, pblock);
-
-  gfc_init_se (&se, NULL);
-  gfc_conv_expr_val (&se, code->ext.iterator->end);
-  gfc_add_block_to_block (pblock, &se.pre);
-  to = gfc_evaluate_now (se.expr, pblock);
-
-  gfc_init_se (&se, NULL);
-  gfc_conv_expr_val (&se, code->ext.iterator->step);
-  gfc_add_block_to_block (pblock, &se.pre);
-  step = gfc_evaluate_now (se.expr, pblock);
-
-  /* Special case simple loops.  */
-  if (integer_onep (step))
-    simple = 1;
-  else if (tree_int_cst_equal (step, integer_minus_one_node))
-    simple = -1;
-
-  /* Loop body.  */
-  if (simple)
-    {
-      init = build2_v (GIMPLE_MODIFY_STMT, dovar, from);
-      cond = build2 (simple > 0 ? LE_EXPR : GE_EXPR, boolean_type_node,
-		     dovar, to);
-      incr = fold_build2 (PLUS_EXPR, type, dovar, step);
-      incr = fold_build2 (GIMPLE_MODIFY_STMT, type, dovar, incr);
-      if (pblock != &block)
+      int simple = 0;
+      bool dovar_found = false;
+
+      if (clauses)
 	{
-	  pushlevel (0);
-	  gfc_start_block (&block);
+	  gfc_namelist *n;
+	  for (n = clauses->lists[OMP_LIST_LASTPRIVATE]; n != NULL;
+	       n = n->next)
+	    if (code->ext.iterator->var->symtree->n.sym == n->sym)
+	      break;
+	  if (n == NULL)
+	    for (n = clauses->lists[OMP_LIST_PRIVATE]; n != NULL; n = n->next)
+	      if (code->ext.iterator->var->symtree->n.sym == n->sym)
+		break;
+	  if (n != NULL)
+	    dovar_found = true;
 	}
-      gfc_start_block (&body);
+
+      /* Evaluate all the expressions in the iterator.  */
+      gfc_init_se (&se, NULL);
+      gfc_conv_expr_lhs (&se, code->ext.iterator->var);
+      gfc_add_block_to_block (pblock, &se.pre);
+      dovar = se.expr;
+      type = TREE_TYPE (dovar);
+      gcc_assert (TREE_CODE (type) == INTEGER_TYPE);
+
+      gfc_init_se (&se, NULL);
+      gfc_conv_expr_val (&se, code->ext.iterator->start);
+      gfc_add_block_to_block (pblock, &se.pre);
+      from = gfc_evaluate_now (se.expr, pblock);
+
+      gfc_init_se (&se, NULL);
+      gfc_conv_expr_val (&se, code->ext.iterator->end);
+      gfc_add_block_to_block (pblock, &se.pre);
+      to = gfc_evaluate_now (se.expr, pblock);
+
+      gfc_init_se (&se, NULL);
+      gfc_conv_expr_val (&se, code->ext.iterator->step);
+      gfc_add_block_to_block (pblock, &se.pre);
+      step = gfc_evaluate_now (se.expr, pblock);
+
+      /* Special case simple loops.  */
+      if (integer_onep (step))
+	simple = 1;
+      else if (tree_int_cst_equal (step, integer_minus_one_node))
+	simple = -1;
+
+      /* Loop body.  */
+      if (simple)
+	{
+	  TREE_VEC_ELT (init, i) = build2_v (GIMPLE_MODIFY_STMT, dovar, from);
+	  TREE_VEC_ELT (cond, i) = build2 (simple > 0 ? LE_EXPR : GE_EXPR,
+					   boolean_type_node, dovar, to);
+	  TREE_VEC_ELT (incr, i) = fold_build2 (PLUS_EXPR, type, dovar, step);
+	  TREE_VEC_ELT (incr, i) = fold_build2 (GIMPLE_MODIFY_STMT, type, dovar,
+						TREE_VEC_ELT (incr, i));
+	}
+      else
+	{
+	  /* STEP is not 1 or -1.  Use:
+	     for (count = 0; count < (to + step - from) / step; count++)
+	       {
+		 dovar = from + count * step;
+		 body;
+	       cycle_label:;
+	       }  */
+	  tmp = fold_build2 (MINUS_EXPR, type, step, from);
+	  tmp = fold_build2 (PLUS_EXPR, type, to, tmp);
+	  tmp = fold_build2 (TRUNC_DIV_EXPR, type, tmp, step);
+	  tmp = gfc_evaluate_now (tmp, pblock);
+	  count = gfc_create_var (type, "count");
+	  TREE_VEC_ELT (init, i) = build2_v (GIMPLE_MODIFY_STMT, count,
+					     build_int_cst (type, 0));
+	  TREE_VEC_ELT (cond, i) = build2 (LT_EXPR, boolean_type_node,
+					   count, tmp);
+	  TREE_VEC_ELT (incr, i) = fold_build2 (PLUS_EXPR, type, count,
+						build_int_cst (type, 1));
+	  TREE_VEC_ELT (incr, i) = fold_build2 (GIMPLE_MODIFY_STMT, type,
+						count, TREE_VEC_ELT (incr, i));
+
+	  /* Initialize DOVAR.  */
+	  tmp = fold_build2 (MULT_EXPR, type, count, step);
+	  tmp = build2 (PLUS_EXPR, type, from, tmp);
+	  dovar_init = tree_cons (dovar, tmp, dovar_init);
+	}
+
+      if (!dovar_found)
+	{
+	  tmp = build_omp_clause (OMP_CLAUSE_PRIVATE);
+	  OMP_CLAUSE_DECL (tmp) = dovar;
+	  omp_clauses = gfc_trans_add_clause (tmp, omp_clauses);
+	}
+      if (!simple)
+	{
+	  tmp = build_omp_clause (OMP_CLAUSE_PRIVATE);
+	  OMP_CLAUSE_DECL (tmp) = count;
+	  omp_clauses = gfc_trans_add_clause (tmp, omp_clauses);
+	}
+
+      if (i + 1 < collapse)
+	code = code->block->next;
     }
-  else
+
+  if (pblock != &block)
+    {
+      pushlevel (0);
+      gfc_start_block (&block);
+    }
+
+  gfc_start_block (&body);
+
+  dovar_init = nreverse (dovar_init);
+  while (dovar_init)
     {
-      /* STEP is not 1 or -1.  Use:
-	 for (count = 0; count < (to + step - from) / step; count++)
-	   {
-	     dovar = from + count * step;
-	     body;
-	   cycle_label:;
-	   }  */
-      tmp = fold_build2 (MINUS_EXPR, type, step, from);
-      tmp = fold_build2 (PLUS_EXPR, type, to, tmp);
-      tmp = fold_build2 (TRUNC_DIV_EXPR, type, tmp, step);
-      tmp = gfc_evaluate_now (tmp, pblock);
-      count = gfc_create_var (type, "count");
-      init = build2_v (GIMPLE_MODIFY_STMT, count, build_int_cst (type, 0));
-      cond = build2 (LT_EXPR, boolean_type_node, count, tmp);
-      incr = fold_build2 (PLUS_EXPR, type, count, build_int_cst (type, 1));
-      incr = fold_build2 (GIMPLE_MODIFY_STMT, type, count, incr);
-
-      if (pblock != &block)
-	{
-	  pushlevel (0);
-	  gfc_start_block (&block);
-	}
-      gfc_start_block (&body);
-
-      /* Initialize DOVAR.  */
-      tmp = fold_build2 (MULT_EXPR, type, count, step);
-      tmp = build2 (PLUS_EXPR, type, from, tmp);
-      gfc_add_modify_stmt (&body, dovar, tmp);
-    }
-
-  if (!dovar_found)
-    {
-      tmp = build_omp_clause (OMP_CLAUSE_PRIVATE);
-      OMP_CLAUSE_DECL (tmp) = dovar;
-      omp_clauses = gfc_trans_add_clause (tmp, omp_clauses);
-    }
-  if (!simple)
-    {
-      tmp = build_omp_clause (OMP_CLAUSE_PRIVATE);
-      OMP_CLAUSE_DECL (tmp) = count;
-      omp_clauses = gfc_trans_add_clause (tmp, omp_clauses);
+      gfc_add_modify_stmt (&body, TREE_PURPOSE (dovar_init),
+			   TREE_VALUE (dovar_init));
+      dovar_init = TREE_CHAIN (dovar_init);
     }
 
   /* Cycle statement is implemented with a goto.  Exit statement must not be
--- gcc/fortran/gfortran.h	(revision 131902)
+++ gcc/fortran/gfortran.h	(working copy)
@@ -1972,6 +1972,7 @@ bool gfc_post_options (const char **);
 
 /* iresolve.c */
 const char * gfc_get_string (const char *, ...) ATTRIBUTE_PRINTF_1;
+bool gfc_find_sym_in_expr (gfc_symbol *, gfc_expr *);
 
 /* error.c */
 
--- gcc/fortran/resolve.c	(revision 131902)
+++ gcc/fortran/resolve.c	(working copy)
@@ -4652,8 +4652,8 @@ sym_in_expr (gfc_expr *e, gfc_symbol *sy
   return false;
 }
 
-static bool
-find_sym_in_expr (gfc_symbol *sym, gfc_expr *e)
+bool
+gfc_find_sym_in_expr (gfc_symbol *sym, gfc_expr *e)
 {
   return gfc_traverse_expr (e, sym, sym_in_expr, 0);
 }
@@ -4850,8 +4850,10 @@ check_symbols:
 	  if (sym->ts.type == BT_DERIVED)
 	    continue;
 
-	  if ((ar->start[i] != NULL && find_sym_in_expr (sym, ar->start[i]))
-		 || (ar->end[i] != NULL && find_sym_in_expr (sym, ar->end[i])))
+	  if ((ar->start[i] != NULL
+	       && gfc_find_sym_in_expr (sym, ar->start[i]))
+	      || (ar->end[i] != NULL
+		  && gfc_find_sym_in_expr (sym, ar->end[i])))
 	    {
 	      gfc_error ("'%s' must not appear an the array specification at "
 			 "%L in the same ALLOCATE statement where it is "
@@ -6048,8 +6050,8 @@ resolve_ordinary_assign (gfc_code *code,
 	  {
 	    for (n = 0; n < ref->u.ar.dimen; n++)
 	      if (ref->u.ar.dimen_type[n] == DIMEN_VECTOR
-		    && find_sym_in_expr (lhs->symtree->n.sym,
-					 ref->u.ar.start[n]))
+		  && gfc_find_sym_in_expr (lhs->symtree->n.sym,
+					   ref->u.ar.start[n]))
 		ref->u.ar.start[n]
 			= gfc_get_parentheses (ref->u.ar.start[n]);
 	  }
--- gcc/fortran/types.def	(revision 131899)
+++ gcc/fortran/types.def	(working copy)
@@ -1,4 +1,4 @@
-/* Copyright (C) 2001, 2002, 2003, 2004, 2005, 2006, 2007
+/* Copyright (C) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008
    Free Software Foundation, Inc.
 
 This file is part of GCC.
@@ -50,7 +50,8 @@ along with GCC; see the file COPYING3.  
     the type pointed to.  */
 
 DEF_PRIMITIVE_TYPE (BT_VOID, void_type_node)
-DEF_PRIMITIVE_TYPE (BT_BOOL, boolean_type_node)
+DEF_PRIMITIVE_TYPE (BT_BOOL,
+		    (*lang_hooks.types.type_for_size) (BOOL_TYPE_SIZE, 1))
 DEF_PRIMITIVE_TYPE (BT_INT, integer_type_node)
 DEF_PRIMITIVE_TYPE (BT_UINT, unsigned_type_node)
 DEF_PRIMITIVE_TYPE (BT_LONG, long_integer_type_node)
--- gcc/gimplify.c	(revision 131902)
+++ gcc/gimplify.c	(working copy)
@@ -5370,120 +5370,131 @@ gimplify_omp_task (tree *expr_p, tree *p
 static enum gimplify_status
 gimplify_omp_for (tree *expr_p, tree *pre_p)
 {
-  tree for_stmt, decl, var, t;
+  tree for_stmt, decl, var, t, bodylist;
   enum gimplify_status ret = GS_OK;
   tree body, init_decl = NULL_TREE;
+  int i;
 
   for_stmt = *expr_p;
 
   gimplify_scan_omp_clauses (&OMP_FOR_CLAUSES (for_stmt), pre_p,
 			     ORT_WORKSHARE);
 
-  t = OMP_FOR_INIT (for_stmt);
-  gcc_assert (TREE_CODE (t) == MODIFY_EXPR
-	      || TREE_CODE (t) == GIMPLE_MODIFY_STMT);
-  decl = GENERIC_TREE_OPERAND (t, 0);
-  gcc_assert (DECL_P (decl));
-  gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (decl)));
-
-  /* Make sure the iteration variable is private.  */
-  if (omp_is_private (gimplify_omp_ctxp, decl))
-    omp_notice_variable (gimplify_omp_ctxp, decl, true);
-  else
-    omp_add_variable (gimplify_omp_ctxp, decl, GOVD_PRIVATE | GOVD_SEEN);
-
-  /* If DECL is not a gimple register, create a temporary variable to act as an
-     iteration counter.  This is valid, since DECL cannot be modified in the
-     body of the loop.  */
-  if (!is_gimple_reg (decl))
-    {
-      var = create_tmp_var (TREE_TYPE (decl), get_name (decl));
-      GENERIC_TREE_OPERAND (t, 0) = var;
-
-      init_decl = build_gimple_modify_stmt (decl, var);
-      omp_add_variable (gimplify_omp_ctxp, var, GOVD_PRIVATE | GOVD_SEEN);
-    }
-  else
-    var = decl;
-
   /* If OMP_FOR is re-gimplified, ensure all variables in pre-body
      are noticed.  */
   gimplify_stmt (&OMP_FOR_PRE_BODY (for_stmt));
 
-  ret |= gimplify_expr (&GENERIC_TREE_OPERAND (t, 1),
-			&OMP_FOR_PRE_BODY (for_stmt),
-			NULL, is_gimple_val, fb_rvalue);
-
-  tree_to_gimple_tuple (&OMP_FOR_INIT (for_stmt));
-
-  t = OMP_FOR_COND (for_stmt);
-  gcc_assert (COMPARISON_CLASS_P (t));
-  gcc_assert (GENERIC_TREE_OPERAND (t, 0) == decl);
-  TREE_OPERAND (t, 0) = var;
-
-  ret |= gimplify_expr (&GENERIC_TREE_OPERAND (t, 1),
-			&OMP_FOR_PRE_BODY (for_stmt),
-			NULL, is_gimple_val, fb_rvalue);
-
-  tree_to_gimple_tuple (&OMP_FOR_INCR (for_stmt));
-  t = OMP_FOR_INCR (for_stmt);
-  switch (TREE_CODE (t))
-    {
-    case PREINCREMENT_EXPR:
-    case POSTINCREMENT_EXPR:
-      t = build_int_cst (TREE_TYPE (decl), 1);
-      t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
-      t = build_gimple_modify_stmt (var, t);
-      OMP_FOR_INCR (for_stmt) = t;
-      break;
+  bodylist = alloc_stmt_list ();
 
-    case PREDECREMENT_EXPR:
-    case POSTDECREMENT_EXPR:
-      t = build_int_cst (TREE_TYPE (decl), -1);
-      t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
-      t = build_gimple_modify_stmt (var, t);
-      OMP_FOR_INCR (for_stmt) = t;
-      break;
-      
-    case GIMPLE_MODIFY_STMT:
-      gcc_assert (GIMPLE_STMT_OPERAND (t, 0) == decl);
-      GIMPLE_STMT_OPERAND (t, 0) = var;
+  gcc_assert (TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt))
+	      == TREE_VEC_LENGTH (OMP_FOR_COND (for_stmt)));
+  gcc_assert (TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt))
+	      == TREE_VEC_LENGTH (OMP_FOR_INCR (for_stmt)));
+  for (i = 0; i < TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt)); i++)
+    {
+      t = TREE_VEC_ELT (OMP_FOR_INIT (for_stmt), i);
+      gcc_assert (TREE_CODE (t) == MODIFY_EXPR
+		  || TREE_CODE (t) == GIMPLE_MODIFY_STMT);
+      decl = GENERIC_TREE_OPERAND (t, 0);
+      gcc_assert (DECL_P (decl));
+      gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (decl)));
+
+      /* Make sure the iteration variable is private.  */
+      if (omp_is_private (gimplify_omp_ctxp, decl))
+	omp_notice_variable (gimplify_omp_ctxp, decl, true);
+      else
+	omp_add_variable (gimplify_omp_ctxp, decl, GOVD_PRIVATE | GOVD_SEEN);
+
+      /* If DECL is not a gimple register, create a temporary variable to act
+	 as an iteration counter.  This is valid, since DECL cannot be
+	 modified in the body of the loop.  */
+      if (!is_gimple_reg (decl))
+	{
+	  var = create_tmp_var (TREE_TYPE (decl), get_name (decl));
+	  GENERIC_TREE_OPERAND (t, 0) = var;
+
+	  init_decl = build_gimple_modify_stmt (decl, var);
+	  omp_add_variable (gimplify_omp_ctxp, var, GOVD_PRIVATE | GOVD_SEEN);
+	}
+      else
+	var = decl;
+
+      ret |= gimplify_expr (&GENERIC_TREE_OPERAND (t, 1),
+			    &OMP_FOR_PRE_BODY (for_stmt),
+			    NULL, is_gimple_val, fb_rvalue);
+
+      tree_to_gimple_tuple (&TREE_VEC_ELT (OMP_FOR_INIT (for_stmt), i));
+
+      t = TREE_VEC_ELT (OMP_FOR_COND (for_stmt), i);
+      gcc_assert (COMPARISON_CLASS_P (t));
+      gcc_assert (GENERIC_TREE_OPERAND (t, 0) == decl);
+      TREE_OPERAND (t, 0) = var;
+
+      ret |= gimplify_expr (&GENERIC_TREE_OPERAND (t, 1),
+			    &OMP_FOR_PRE_BODY (for_stmt),
+			    NULL, is_gimple_val, fb_rvalue);
 
-      t = GIMPLE_STMT_OPERAND (t, 1);
+      tree_to_gimple_tuple (&TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i));
+      t = TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i);
       switch (TREE_CODE (t))
 	{
-	case PLUS_EXPR:
-	  if (TREE_OPERAND (t, 1) == decl)
+	case PREINCREMENT_EXPR:
+	case POSTINCREMENT_EXPR:
+	  t = build_int_cst (TREE_TYPE (decl), 1);
+	  t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
+	  t = build_gimple_modify_stmt (var, t);
+	  TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t;
+	  break;
+
+	case PREDECREMENT_EXPR:
+	case POSTDECREMENT_EXPR:
+	  t = build_int_cst (TREE_TYPE (decl), -1);
+	  t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t);
+	  t = build_gimple_modify_stmt (var, t);
+	  TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t;
+	  break;
+
+	case GIMPLE_MODIFY_STMT:
+	  gcc_assert (GIMPLE_STMT_OPERAND (t, 0) == decl);
+	  GIMPLE_STMT_OPERAND (t, 0) = var;
+
+	  t = GIMPLE_STMT_OPERAND (t, 1);
+	  switch (TREE_CODE (t))
 	    {
-	      TREE_OPERAND (t, 1) = TREE_OPERAND (t, 0);
+	    case PLUS_EXPR:
+	      if (TREE_OPERAND (t, 1) == decl)
+		{
+		  TREE_OPERAND (t, 1) = TREE_OPERAND (t, 0);
+		  TREE_OPERAND (t, 0) = var;
+		  break;
+		}
+
+	      /* Fallthru.  */
+	    case MINUS_EXPR:
+	      gcc_assert (TREE_OPERAND (t, 0) == decl);
 	      TREE_OPERAND (t, 0) = var;
 	      break;
+	    default:
+	      gcc_unreachable ();
 	    }
 
-	  /* Fallthru.  */
-	case MINUS_EXPR:
-	  gcc_assert (TREE_OPERAND (t, 0) == decl);
-	  TREE_OPERAND (t, 0) = var;
+	  ret |= gimplify_expr (&TREE_OPERAND (t, 1),
+				&OMP_FOR_PRE_BODY (for_stmt),
+				NULL, is_gimple_val, fb_rvalue);
 	  break;
+
 	default:
 	  gcc_unreachable ();
 	}
 
-      ret |= gimplify_expr (&TREE_OPERAND (t, 1), &OMP_FOR_PRE_BODY (for_stmt),
-			    NULL, is_gimple_val, fb_rvalue);
-      break;
-
-    default:
-      gcc_unreachable ();
+      if (init_decl)
+	append_to_statement_list (init_decl, &bodylist);
     }
 
   body = OMP_FOR_BODY (for_stmt);
   gimplify_to_stmt_list (&body);
-  t = alloc_stmt_list ();
-  if (init_decl)
-    append_to_statement_list (init_decl, &t);
-  append_to_statement_list (body, &t);
-  OMP_FOR_BODY (for_stmt) = t;
+  append_to_statement_list (body, &bodylist);
+  OMP_FOR_BODY (for_stmt) = bodylist;
   gimplify_adjust_omp_clauses (&OMP_FOR_CLAUSES (for_stmt));
 
   return ret == GS_ALL_DONE ? GS_ALL_DONE : GS_ERROR;
--- gcc/c-omp.c	(revision 131902)
+++ gcc/c-omp.c	(working copy)
@@ -384,9 +384,12 @@ c_finish_omp_for (location_t locus, tree
       tree t = make_node (OMP_FOR);
 
       TREE_TYPE (t) = void_type_node;
-      OMP_FOR_INIT (t) = init;
-      OMP_FOR_COND (t) = cond;
-      OMP_FOR_INCR (t) = incr;
+      OMP_FOR_INIT (t) = make_tree_vec (1);
+      TREE_VEC_ELT (OMP_FOR_INIT (t), 0) = init;
+      OMP_FOR_COND (t) = make_tree_vec (1);
+      TREE_VEC_ELT (OMP_FOR_COND (t), 0) = cond;
+      OMP_FOR_INCR (t) = make_tree_vec (1);
+      TREE_VEC_ELT (OMP_FOR_INCR (t), 0) = incr;
       OMP_FOR_BODY (t) = body;
       OMP_FOR_PRE_BODY (t) = pre_body;
 
--- gcc/tree-nested.c	(revision 131902)
+++ gcc/tree-nested.c	(working copy)
@@ -672,6 +672,7 @@ walk_omp_for (walk_tree_fn callback, str
 {
   struct walk_stmt_info wi;
   tree t, list = NULL, empty;
+  int i;
 
   walk_body (callback, info, &OMP_FOR_PRE_BODY (for_stmt));
 
@@ -682,36 +683,39 @@ walk_omp_for (walk_tree_fn callback, str
   wi.info = info;
   wi.tsi = tsi_last (list);
 
-  t = OMP_FOR_INIT (for_stmt);
-  gcc_assert (TREE_CODE (t) == GIMPLE_MODIFY_STMT);
-  SET_EXPR_LOCUS (empty, EXPR_LOCUS (t));
-  wi.val_only = false;
-  walk_tree (&GIMPLE_STMT_OPERAND (t, 0), callback, &wi, NULL);
-  wi.val_only = true;
-  wi.is_lhs = false;
-  walk_tree (&GIMPLE_STMT_OPERAND (t, 1), callback, &wi, NULL);
-
-  t = OMP_FOR_COND (for_stmt);
-  gcc_assert (COMPARISON_CLASS_P (t));
-  SET_EXPR_LOCUS (empty, EXPR_LOCUS (t));
-  wi.val_only = false;
-  walk_tree (&TREE_OPERAND (t, 0), callback, &wi, NULL);
-  wi.val_only = true;
-  wi.is_lhs = false;
-  walk_tree (&TREE_OPERAND (t, 1), callback, &wi, NULL);
-
-  t = OMP_FOR_INCR (for_stmt);
-  gcc_assert (TREE_CODE (t) == GIMPLE_MODIFY_STMT);
-  SET_EXPR_LOCUS (empty, EXPR_LOCUS (t));
-  wi.val_only = false;
-  walk_tree (&GIMPLE_STMT_OPERAND (t, 0), callback, &wi, NULL);
-  t = GIMPLE_STMT_OPERAND (t, 1);
-  gcc_assert (BINARY_CLASS_P (t));
-  wi.val_only = false;
-  walk_tree (&TREE_OPERAND (t, 0), callback, &wi, NULL);
-  wi.val_only = true;
-  wi.is_lhs = false;
-  walk_tree (&TREE_OPERAND (t, 1), callback, &wi, NULL);
+  for (i = 0; i < TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt)); i++)
+    {
+      t = TREE_VEC_ELT (OMP_FOR_INIT (for_stmt), i);
+      gcc_assert (TREE_CODE (t) == GIMPLE_MODIFY_STMT);
+      SET_EXPR_LOCUS (empty, EXPR_LOCUS (t));
+      wi.val_only = false;
+      walk_tree (&GIMPLE_STMT_OPERAND (t, 0), callback, &wi, NULL);
+      wi.val_only = true;
+      wi.is_lhs = false;
+      walk_tree (&GIMPLE_STMT_OPERAND (t, 1), callback, &wi, NULL);
+
+      t = TREE_VEC_ELT (OMP_FOR_COND (for_stmt), i);
+      gcc_assert (COMPARISON_CLASS_P (t));
+      SET_EXPR_LOCUS (empty, EXPR_LOCUS (t));
+      wi.val_only = false;
+      walk_tree (&TREE_OPERAND (t, 0), callback, &wi, NULL);
+      wi.val_only = true;
+      wi.is_lhs = false;
+      walk_tree (&TREE_OPERAND (t, 1), callback, &wi, NULL);
+
+      t = TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i);
+      gcc_assert (TREE_CODE (t) == GIMPLE_MODIFY_STMT);
+      SET_EXPR_LOCUS (empty, EXPR_LOCUS (t));
+      wi.val_only = false;
+      walk_tree (&GIMPLE_STMT_OPERAND (t, 0), callback, &wi, NULL);
+      t = GIMPLE_STMT_OPERAND (t, 1);
+      gcc_assert (BINARY_CLASS_P (t));
+      wi.val_only = false;
+      walk_tree (&TREE_OPERAND (t, 0), callback, &wi, NULL);
+      wi.val_only = true;
+      wi.is_lhs = false;
+      walk_tree (&TREE_OPERAND (t, 1), callback, &wi, NULL);
+    }
 
   /* Remove empty statement added above from the end of statement list.  */
   tsi_delink (&wi.tsi);
@@ -1199,6 +1203,8 @@ convert_nonlocal_omp_clauses (tree *pcla
 	case OMP_CLAUSE_ORDERED:
 	case OMP_CLAUSE_DEFAULT:
 	case OMP_CLAUSE_COPYIN:
+	case OMP_CLAUSE_COLLAPSE:
+	case OMP_CLAUSE_UNTIED:
 	  break;
 
 	default:
@@ -1496,6 +1502,8 @@ convert_local_omp_clauses (tree *pclause
 	case OMP_CLAUSE_ORDERED:
 	case OMP_CLAUSE_DEFAULT:
 	case OMP_CLAUSE_COPYIN:
+	case OMP_CLAUSE_COLLAPSE:
+	case OMP_CLAUSE_UNTIED:
 	  break;
 
 	default:
--- gcc/c-parser.c	(revision 131902)
+++ gcc/c-parser.c	(working copy)
@@ -6955,6 +6955,8 @@ c_parser_omp_clause_collapse (c_parser *
     }
   c = build_omp_clause (OMP_CLAUSE_COLLAPSE);
   OMP_CLAUSE_COLLAPSE_EXPR (c) = num;
+  OMP_CLAUSE_COLLAPSE_ITERVAR (c) = NULL;
+  OMP_CLAUSE_COLLAPSE_COUNT (c) = NULL;
   OMP_CLAUSE_CHAIN (c) = list;
   return c;
 }
--- gcc/tree-ssa-operands.c	(revision 131902)
+++ gcc/tree-ssa-operands.c	(working copy)
@@ -2292,17 +2292,22 @@ get_expr_operands (tree stmt, tree *expr
 
     case OMP_FOR:
       {
-	tree init = OMP_FOR_INIT (expr);
-	tree cond = OMP_FOR_COND (expr);
-	tree incr = OMP_FOR_INCR (expr);
 	tree c, clauses = OMP_FOR_CLAUSES (stmt);
+	int i;
 
-	get_expr_operands (stmt, &GIMPLE_STMT_OPERAND (init, 0), opf_def);
-	get_expr_operands (stmt, &GIMPLE_STMT_OPERAND (init, 1), opf_use);
-	get_expr_operands (stmt, &TREE_OPERAND (cond, 1), opf_use);
-	get_expr_operands (stmt,
-	                   &TREE_OPERAND (GIMPLE_STMT_OPERAND (incr, 1), 1),
-			   opf_use);
+	for (i = 0; i < TREE_VEC_LENGTH (OMP_FOR_INIT (expr)); i++)
+	  {
+	    tree init = TREE_VEC_ELT (OMP_FOR_INIT (expr), i);
+	    tree cond = TREE_VEC_ELT (OMP_FOR_COND (expr), i);
+	    tree incr = TREE_VEC_ELT (OMP_FOR_INCR (expr), i);
+
+	    get_expr_operands (stmt, &GIMPLE_STMT_OPERAND (init, 0), opf_def);
+	    get_expr_operands (stmt, &GIMPLE_STMT_OPERAND (init, 1), opf_use);
+	    get_expr_operands (stmt, &TREE_OPERAND (cond, 1), opf_use);
+	    get_expr_operands (stmt,
+			       &TREE_OPERAND (GIMPLE_STMT_OPERAND (incr, 1),
+					      1), opf_use);
+	  }
 
 	c = find_omp_clause (clauses, OMP_CLAUSE_SCHEDULE);
 	if (c)
--- gcc/testsuite/gfortran.dg/gomp/collapse1.f90	(revision 0)
+++ gcc/testsuite/gfortran.dg/gomp/collapse1.f90	(revision 0)
@@ -0,0 +1,57 @@
+! { dg-do compile }
+! { dg-options "-fopenmp" }
+
+subroutine collapse1
+  integer :: i, j, k, a(1:3, 4:6, 5:7)
+  real :: r
+  logical :: l
+  integer, save :: thr
+  !$omp threadprivate (thr)
+  l = .false.
+  a(:, :, :) = 0
+  !$omp parallel do collapse(4) schedule(static, 4) ! { dg-error "not enough DO loops for collapsed" }
+    do i = 1, 3
+      do j = 4, 6
+        do k = 5, 7
+          a(i, j, k) = i + j + k
+        end do
+      end do
+    end do
+  !$omp parallel do collapse(2)
+    do i = 1, 5, 2
+      do j = i + 1, 7, i	! { dg-error "collapsed loops don.t form rectangular iteration space" }
+      end do
+    end do
+  !$omp parallel do collapse(2) shared(j)
+    do i = 1, 3
+      do j = 4, 6		! { dg-error "iteration variable present on clause other than PRIVATE or LASTPRIVATE" }
+      end do
+    end do
+  !$omp parallel do collapse(2)
+    do i = 1, 3
+      do j = 4, 6
+      end do
+      k = 4
+    end do
+  !$omp parallel do collapse(2)
+    do i = 1, 3
+      do			! { dg-error "cannot be a DO WHILE or DO without loop control" }
+      end do
+    end do
+  !$omp parallel do collapse(2)
+    do i = 1, 3
+      do r = 4, 6		! { dg-warning "must be integer" }
+      end do
+    end do
+end subroutine collapse1
+
+subroutine collapse1_2
+  integer :: i
+  !$omp parallel do collapse(2)
+    do i = -6, 6		! { dg-error "cannot be redefined inside loop beginning" }
+      do i = 4, 6		! { dg-error "collapsed loops don.t form rectangular iteration space|cannot be redefined" }
+      end do
+    end do
+end subroutine collapse1_2
+
+! { dg-error "iteration variable must be of type integer" "integer" { target *-*-* } 43 }
--- libgomp/testsuite/libgomp.fortran/collapse2.f90	(revision 0)
+++ libgomp/testsuite/libgomp.fortran/collapse2.f90	(revision 0)
@@ -0,0 +1,53 @@
+! { dg-do run }
+
+program collapse2
+  call test1
+  call test2
+contains
+  subroutine test1
+    integer :: i, j, k, a(1:3, 4:6, 5:7)
+    logical :: l
+    l = .false.
+    a(:, :, :) = 0
+    !$omp parallel do collapse(4 - 1) schedule(static, 4)
+      do 164 i = 1, 3
+        do 164 j = 4, 6
+          do 164 k = 5, 7
+            a(i, j, k) = i + j + k
+164      end do
+    !$omp parallel do collapse(2) reduction(.or.:l)
+firstdo: do i = 1, 3
+        do j = 4, 6
+          do k = 5, 7
+            if (a(i, j, k) .ne. (i + j + k)) l = .true.
+          end do
+        end do
+      end do firstdo
+    !$omp end parallel do
+    if (l) call abort
+  end subroutine test1
+
+  subroutine test2
+    integer :: a(3,3,3), k, kk, kkk, l, ll, lll
+    !$omp do collapse(3)
+      do 115 k=1,3
+  dokk: do kk=1,3
+          do kkk=1,3
+            a(k,kk,kkk) = 1
+          enddo
+        enddo dokk
+115   continue
+    if (any(a(1:3,1:3,1:3).ne.1)) call abort
+
+    !$omp do collapse(3)
+ dol: do 120 l=1,3
+  doll: do ll=1,3
+          do lll=1,3
+            a(l,ll,lll) = 2
+          enddo
+        enddo doll
+120   end do dol
+    if (any(a(1:3,1:3,1:3).ne.2)) call abort
+  end subroutine test2
+
+end program collapse2
--- libgomp/testsuite/libgomp.fortran/collapse3.f90	(revision 0)
+++ libgomp/testsuite/libgomp.fortran/collapse3.f90	(revision 0)
@@ -0,0 +1,204 @@
+! { dg-do run }
+
+program collapse3
+  call test1
+  call test2 (2, 6, -2, 4, 13, 18)
+  call test3 (2, 6, -2, 4, 13, 18, 1, 1, 1)
+  call test4
+  call test5 (2, 6, -2, 4, 13, 18)
+  call test6 (2, 6, -2, 4, 13, 18, 1, 1, 1)
+contains
+  subroutine test1
+    integer :: i, j, k, a(1:7, -3:5, 12:19), m
+    logical :: l
+    l = .false.
+    a(:, :, :) = 0
+    !$omp parallel do collapse (3) lastprivate (i, j, k, m) reduction (.or.:l)
+      do i = 2, 6
+        do j = -2, 4
+          do k = 13, 18
+            l = l.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4
+            l = l.or.k.lt.13.or.k.gt.18
+            if (.not.l) a(i, j, k) = a(i, j, k) + 1
+            m = i * 100 + j * 10 + k
+          end do
+        end do
+      end do
+!   if (i.ne.7.or.j.ne.5.or.k.ne.19) call abort
+    if (m.ne.(600+40+18)) call abort
+    do i = 1, 7
+      do j = -3, 5
+        do k = 12, 19
+          if (i.eq.1.or.i.eq.7.or.j.eq.-3.or.j.eq.5.or.k.eq.12.or.k.eq.19) then
+            if (a(i, j, k).ne.0) print *, i, j, k
+          else
+            if (a(i, j, k).ne.1) print *, 'kk', i, j, k, a(i, j, k)
+          end if
+        end do
+      end do
+    end do
+  end subroutine test1
+
+  subroutine test2(v1, v2, v3, v4, v5, v6)
+    integer :: i, j, k, a(1:7, -3:5, 12:19), m
+    integer :: v1, v2, v3, v4, v5, v6
+    logical :: l
+    l = .false.
+    a(:, :, :) = 0
+    !$omp parallel do collapse (3) lastprivate (m) reduction (.or.:l)
+      do i = v1, v2
+        do j = v3, v4
+          do k = v5, v6
+            l = l.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4
+            l = l.or.k.lt.13.or.k.gt.18
+            if (.not.l) a(i, j, k) = a(i, j, k) + 1
+            m = i * 100 + j * 10 + k
+          end do
+        end do
+      end do
+!   if (i.ne.7.or.j.ne.5.or.k.ne.19) call abort
+    if (m.ne.(600+40+18)) call abort
+    do i = 1, 7
+      do j = -3, 5
+        do k = 12, 19
+          if (i.eq.1.or.i.eq.7.or.j.eq.-3.or.j.eq.5.or.k.eq.12.or.k.eq.19) then
+            if (a(i, j, k).ne.0) print *, i, j, k
+          else
+            if (a(i, j, k).ne.1) print *, 'kk', i, j, k, a(i, j, k)
+          end if
+        end do
+      end do
+    end do
+  end subroutine test2
+
+  subroutine test3(v1, v2, v3, v4, v5, v6, v7, v8, v9)
+    integer :: i, j, k, a(1:7, -3:5, 12:19), m
+    integer :: v1, v2, v3, v4, v5, v6, v7, v8, v9
+    logical :: l
+    l = .false.
+    a(:, :, :) = 0
+    !$omp parallel do collapse (3) lastprivate (m) reduction (.or.:l)
+      do i = v1, v2, v7
+        do j = v3, v4, v8
+          do k = v5, v6, v9
+            l = l.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4
+            l = l.or.k.lt.13.or.k.gt.18
+            if (.not.l) a(i, j, k) = a(i, j, k) + 1
+            m = i * 100 + j * 10 + k
+          end do
+        end do
+      end do
+!   if (i.ne.7.or.j.ne.5.or.k.ne.19) call abort
+    if (m.ne.(600+40+18)) call abort
+    do i = 1, 7
+      do j = -3, 5
+        do k = 12, 19
+          if (i.eq.1.or.i.eq.7.or.j.eq.-3.or.j.eq.5.or.k.eq.12.or.k.eq.19) then
+            if (a(i, j, k).ne.0) print *, i, j, k
+          else
+            if (a(i, j, k).ne.1) print *, 'kk', i, j, k, a(i, j, k)
+          end if
+        end do
+      end do
+    end do
+  end subroutine test3
+
+  subroutine test4
+    integer :: i, j, k, a(1:7, -3:5, 12:19), m
+    logical :: l
+    l = .false.
+    a(:, :, :) = 0
+    !$omp parallel do collapse (3) lastprivate (i, j, k, m) reduction (.or.:l) &
+    !$omp& schedule (dynamic, 5)
+      do i = 2, 6
+        do j = -2, 4
+          do k = 13, 18
+            l = l.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4
+            l = l.or.k.lt.13.or.k.gt.18
+            if (.not.l) a(i, j, k) = a(i, j, k) + 1
+            m = i * 100 + j * 10 + k
+          end do
+        end do
+      end do
+!   if (i.ne.7.or.j.ne.5.or.k.ne.19) call abort
+    if (m.ne.(600+40+18)) call abort
+    do i = 1, 7
+      do j = -3, 5
+        do k = 12, 19
+          if (i.eq.1.or.i.eq.7.or.j.eq.-3.or.j.eq.5.or.k.eq.12.or.k.eq.19) then
+            if (a(i, j, k).ne.0) print *, i, j, k
+          else
+            if (a(i, j, k).ne.1) print *, 'kk', i, j, k, a(i, j, k)
+          end if
+        end do
+      end do
+    end do
+  end subroutine test4
+
+  subroutine test5(v1, v2, v3, v4, v5, v6)
+    integer :: i, j, k, a(1:7, -3:5, 12:19), m
+    integer :: v1, v2, v3, v4, v5, v6
+    logical :: l
+    l = .false.
+    a(:, :, :) = 0
+    !$omp parallel do collapse (3) lastprivate (m) reduction (.or.:l) &
+    !$omp & schedule (guided)
+      do i = v1, v2
+        do j = v3, v4
+          do k = v5, v6
+            l = l.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4
+            l = l.or.k.lt.13.or.k.gt.18
+            if (.not.l) a(i, j, k) = a(i, j, k) + 1
+            m = i * 100 + j * 10 + k
+          end do
+        end do
+      end do
+!   if (i.ne.7.or.j.ne.5.or.k.ne.19) call abort
+    if (m.ne.(600+40+18)) call abort
+    do i = 1, 7
+      do j = -3, 5
+        do k = 12, 19
+          if (i.eq.1.or.i.eq.7.or.j.eq.-3.or.j.eq.5.or.k.eq.12.or.k.eq.19) then
+            if (a(i, j, k).ne.0) print *, i, j, k
+          else
+            if (a(i, j, k).ne.1) print *, 'kk', i, j, k, a(i, j, k)
+          end if
+        end do
+      end do
+    end do
+  end subroutine test5
+
+  subroutine test6(v1, v2, v3, v4, v5, v6, v7, v8, v9)
+    integer :: i, j, k, a(1:7, -3:5, 12:19), m
+    integer :: v1, v2, v3, v4, v5, v6, v7, v8, v9
+    logical :: l
+    l = .false.
+    a(:, :, :) = 0
+    !$omp parallel do collapse (3) lastprivate (m) reduction (.or.:l) &
+    !$omp & schedule (dynamic)
+      do i = v1, v2, v7
+        do j = v3, v4, v8
+          do k = v5, v6, v9
+            l = l.or.i.lt.2.or.i.gt.6.or.j.lt.-2.or.j.gt.4
+            l = l.or.k.lt.13.or.k.gt.18
+            if (.not.l) a(i, j, k) = a(i, j, k) + 1
+            m = i * 100 + j * 10 + k
+          end do
+        end do
+      end do
+!   if (i.ne.7.or.j.ne.5.or.k.ne.19) call abort
+    if (m.ne.(600+40+18)) call abort
+    do i = 1, 7
+      do j = -3, 5
+        do k = 12, 19
+          if (i.eq.1.or.i.eq.7.or.j.eq.-3.or.j.eq.5.or.k.eq.12.or.k.eq.19) then
+            if (a(i, j, k).ne.0) print *, i, j, k
+          else
+            if (a(i, j, k).ne.1) print *, 'kk', i, j, k, a(i, j, k)
+          end if
+        end do
+      end do
+    end do
+  end subroutine test6
+
+end program collapse3
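
As an illustration of the count-based expansion used in gfc_trans_omp_do above (per-loop trip count (to + step - from) / step), here is a minimal standalone C sketch of how two collapsed loops with arbitrary steps flatten into a single logical iteration count, and how the original iterators are recovered from the flat index.  This is only a sketch of the idea, not code from the patch; the bounds and variable names are invented for the example.

/* Standalone illustration only: flatten two loops with arbitrary steps
   into one logical iteration count, then map the flat index back to the
   original iterators.  The trip count uses the same
   (to + step - from) / step formula as the expansion above.  */

#include <stdio.h>

int
main (void)
{
  int from1 = 2, to1 = 6, step1 = 1;	/* outer loop bounds */
  int from2 = 13, to2 = 18, step2 = 2;	/* inner loop bounds */

  long n1 = (to1 + step1 - from1) / step1;	/* outer trip count */
  long n2 = (to2 + step2 - from2) / step2;	/* inner trip count */
  long total = n1 * n2;			/* collapsed iteration space */
  long logical;

  for (logical = 0; logical < total; logical++)
    {
      /* Recover the original iterators from the flat index.  */
      int i = from1 + (int) (logical / n2) * step1;
      int j = from2 + (int) (logical % n2) * step2;
      printf ("i=%d j=%d\n", i, j);
    }
  return 0;
}

With these bounds the collapsed space has 5 * 3 = 15 logical iterations, which can be handed out to threads as a single flat range instead of scheduling only the outer loop.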

	Jakub


