This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

XARK Backend with OpenMP


Dear GCC developers.

My name is Josà Manuel AndiÃn FernÃndez and I have just started to work at University of A CoruÃa on porting XARK (eXtensible Compiler for Automatic Recognition of Computational Kernels, http://doi.acm.org/10.1145/1391956.1391959) to GCC. The main goal of this project is to be able to generate parallel code from a sequential code automatically.

One of the firsts tasks to accomplish is generating an OpenMP version from a loop which has previously been marked as parallelizable. So I have implemented a new pass named pass_xark_backend_omp (at tree-xark-backend-omp.c, enabled by -fxark-backend-omp) strongly based on pass_parallelize_loops (at tree-parloops.c). The main differences between them are:
- tree-xark-backend-omp is executed in pass_early_local_passes group, while pass_parallelize_loops is executed in pass_tree_loop, in pass_all_optimizations group. We have chosen this location because SSA form has just been calculated and because we will carry out interprocedural anaysis of several functions at same time.
- I don't create two versions of the loop. I manipulate the old loop, inserting OpenMP directives and another statements to achieve only one parallel loop.
- The CFG which reaches my pass is different, so I make different modifications. But the intermediate representantion which reaches omp_expand_local is almost the same.
- In create_loop_fn, I had to add a call to add_referenced_var(t); after t = build_decl (PARM_DECL, get_identifier (".paral_data_param"), ptr_type_node); (tree-xark-backend-omp.c:920)

But when I apply my pass to the simple example code asig_reg.c, I obtain the following ICE:

$ ../build/bin/gcc -v -Wall -fdump-tree-all-raw -fdump-ipa-all-raw -L../build/lib -Wl,-rpath -Wl,../build/lib -fxark-debug -fxark-backend-omp -fverify-ssa-debug -o asig_reg.bin asig_reg.c
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: ../src/configure --prefix=/home/jmandion/Programacion/gcc-gnu/build/ --disable-bootstrap --enable-languages=c,fortran
Thread model: posix
gcc version 4.5.0 20090521 (experimental) (GCC)
COLLECT_GCC_OPTIONS='-v' '-Wall' '-fdump-tree-all-raw' '-fdump-ipa-all-raw' '-L../build/lib' '-fxark-debug' '-fxark-backend-omp' '-fverify-ssa-debug' '-o' 'asig_reg.bin' '-mtune=generic'
 /home/jmandion/Programacion/gcc-gnu/build/bin/../libexec/gcc/i686-pc-linux-gnu/4.5.0/cc1 -quiet -v -iprefix /home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/i686-pc-linux-gnu/4.5.0/ asig_reg.c -quiet -dumpbase asig_reg.c -mtune=generic -auxbase asig_reg -Wall -version -fdump-tree-all-raw -fdump-ipa-all-raw -fxark-debug -fxark-backend-omp -fverify-ssa-debug -o /tmp/ccgETP9F.s
GNU C (GCC) version 4.5.0 20090521 (experimental) (i686-pc-linux-gnu)
        compiled by GNU C version 4.3.3, GMP version 4.2.4, MPFR version 2.4.0
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
ignoring nonexistent directory "/home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/i686-pc-linux-gnu/4.5.0/../../../../i686-pc-linux-gnu/include"
ignoring duplicate directory "/home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/../../lib/gcc/i686-pc-linux-gnu/4.5.0/include"
ignoring duplicate directory "/home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/../../lib/gcc/i686-pc-linux-gnu/4.5.0/include-fixed"
ignoring nonexistent directory "/home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/../../lib/gcc/i686-pc-linux-gnu/4.5.0/../../../../i686-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/i686-pc-linux-gnu/4.5.0/include
 /home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/i686-pc-linux-gnu/4.5.0/include-fixed
 /usr/local/include
 /home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/../../include
 /usr/include
End of search list.
GNU C (GCC) version 4.5.0 20090521 (experimental) (i686-pc-linux-gnu)
        compiled by GNU C version 4.3.3, GMP version 4.2.4, MPFR version 2.4.0
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: ebc62f7555f88405c8bde4c5a1af27a7
asig_reg.c: In function âmain._loopfn.0â:
asig_reg.c:53: error: missing definition
for SSA_NAME: .paral_data_param_2 in statement:
.paral_data_load.5_3 = (struct .paral_data.3 *) .paral_data_param_2;
asig_reg.c:53: internal compiler error: verify_ssa failed
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.

I have tried to solve this problem fixing the Call Graph (omp_expand_local don't do it) and printing all the functions across this graph (enabled with -fxark-debug), but it was unsuccesfull. So I have instrumented verify_ssa (tree-ssa.c:542 and 733, enabled by -fverify-ssa-debug) by dumping in file verify_ssa.dump the pass which invokes the function, the function associated to current_function_decl in that moment, the referenced_vars and the immediate uses. Furthermore, I write a line each time a basic block is processed. With this information and some dummy passes (pass_xark_print and pass_xark_ipa_print, also enabled by -fxark-debug), I have seen the compilation process ends before reaching the ipa_passes group because the referenced_vars and the immediate_uses are empty when analyzing main._loopfn.0. But just "a moment" before, verify_ssa checks succesfully this function too. 
I attach to this message a patch tree-xark-backend-omp.diff and the example asig_reg.c

Thanks in advance,

Josà Manuel AndiÃn FernÃndez
Computer Architecture Group
Departement of Electronics and Systems
University of A CoruÃa
Spain
http://gac.des.udc.es/index_en.html
#include <stdio.h>
#include <unistd.h>

#define MAX 1048576

int main() {

	int A[MAX];
	int i;

	for (i = 0; i < MAX; ++i) {
		A[i] = 2;
	}

	printf("A[3] = %d\n", A[3]);

	return 0;

}
Index: gcc/tree-pass.h
===================================================================
--- gcc/tree-pass.h	(revisiÃn: 147760)
+++ gcc/tree-pass.h	(copia de trabajo)
@@ -400,6 +400,11 @@
 extern struct gimple_opt_pass pass_local_pure_const;
 extern struct gimple_opt_pass pass_tracer;
 
+/* XARK Passes */
+extern struct gimple_opt_pass pass_xark_backend_omp;
+extern struct gimple_opt_pass pass_xark_print;
+extern struct simple_ipa_opt_pass pass_xark_ipa_print;
+
 /* IPA Passes */
 extern struct ipa_opt_pass_d pass_ipa_inline;
 extern struct ipa_opt_pass_d pass_ipa_cp;
Index: gcc/tree-xark-backend-omp.c
===================================================================
--- gcc/tree-xark-backend-omp.c	(revisiÃn: 0)
+++ gcc/tree-xark-backend-omp.c	(revisiÃn: 0)
@@ -0,0 +1,1485 @@
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "rtl.h"
+#include "tree-flow.h"
+#include "cfgloop.h"
+#include "ggc.h"
+#include "tree-data-ref.h"
+#include "diagnostic.h"
+#include "tree-pass.h"
+#include "tree-scalar-evolution.h"
+#include "hashtab.h"
+#include "langhooks.h"
+#include "tree-vectorizer.h"
+#include "cgraph.h"
+#include "tree-dump.h"
+
+#define MAX_CHAR_LOOPFN_NAME    100
+
+void
+dump_dom_info_state(enum cdi_direction, FILE *);
+void
+dump_cgraph_state(FILE *);
+void
+dump_functions_across_cgraph(FILE*);
+void
+hello_from(const char *, FILE *);
+void
+bye_from(const char *, FILE *);
+void
+print_bbs(FILE *);
+void
+print_loops_info(FILE *);
+void
+my_print_loops(FILE *);
+bool
+is_parallelizable_loop(struct loop *);
+void
+parallelize_loop(struct loop * loop);
+
+void
+dump_dom_info_state(enum cdi_direction dir, FILE * f)
+{
+  enum dom_state state = dom_info_state(dir);
+  fprintf(f, "dom_info_state(");
+  if (dir == CDI_DOMINATORS)
+    {
+      fprintf(f, "CDI_DOMINATORS");
+    }
+  else if (dir == CDI_POST_DOMINATORS)
+    {
+      fprintf(f, "CDI_POST_DOMINATORS");
+    }
+  else
+    {
+      fprintf(f, "Impossible!!!");
+    }
+  fprintf(f, "): ");
+  if (state == DOM_NONE)
+    {
+      fprintf(f, "DOM_NONE");
+    }
+  else if (state == DOM_NO_FAST_QUERY)
+    {
+      fprintf(f, "DOM_NO_FAST_QUERY");
+    }
+  else if (state == DOM_OK)
+    {
+      fprintf(f, "DOM_OK");
+    }
+  else
+    {
+      fprintf(f, "Impossible!!!");
+    }
+  fprintf(f, "\n");
+}
+
+void
+dump_cgraph_state(FILE * f)
+{
+  fprintf(f, "cgraph state is ");
+  switch (cgraph_state)
+    {
+  case CGRAPH_STATE_CONSTRUCTION:
+    fprintf(f, "CGRAPH_STATE_CONSTRUCTION");
+    break;
+  case CGRAPH_STATE_IPA:
+    fprintf(f, "CGRAPH_STATE_IPA");
+    break;
+  case CGRAPH_STATE_IPA_SSA:
+    fprintf(f, "CGRAPH_STATE_IPA_SSA");
+    break;
+  case CGRAPH_STATE_EXPANSION:
+    fprintf(f, "CGRAPH_STATE_EXPANSION");
+    break;
+  case CGRAPH_STATE_FINISHED:
+    fprintf(f, "CGRAPH_STATE_FINISHED");
+    break;
+  default:
+    gcc_unreachable();
+    }
+  fprintf(f, "\n");
+  fflush(f);
+}
+
+void
+dump_functions_across_cgraph(FILE* f)
+{
+  struct cgraph_node *node;
+
+  for (node = cgraph_nodes; node; node = node->next)
+    {
+      tree temp_fn;
+
+      temp_fn = current_function_decl;
+      current_function_decl = node->decl;
+      push_cfun(DECL_STRUCT_FUNCTION (node->decl));
+
+      fprintf(f, "* Dump of function %s *\n",
+          lang_hooks.decl_printable_name(current_function_decl, 2));
+
+      dump_function_to_file(current_function_decl, f, 0);
+
+      pop_cfun();
+      current_function_decl = temp_fn;
+    }
+}
+
+void
+hello_from(const char * function_name, FILE * f)
+{
+  fprintf(f, "* Hello from %s!!! *\n", function_name);
+  fflush(f);
+}
+
+void
+bye_from(const char * function_name, FILE * f)
+{
+  fprintf(f, "* Bye from %s!!! *\n", function_name);
+  fflush(f);
+}
+
+/* Adapted from tree-blocking */
+
+void
+print_bbs(FILE * f)
+{
+  basic_block bb, prevbb, nextbb;
+  unsigned int i;
+
+  hello_from("print_bbs", f);
+
+  FOR_EACH_BB (bb)
+    {
+      prevbb = bb->prev_bb;
+      nextbb = bb->next_bb;
+      fprintf(f, "-------- BASIC BLOCK %d ---------\n", bb->index);
+      if (get_immediate_dominator(CDI_DOMINATORS, bb) != NULL)
+        {
+          fprintf(f, "--- Immediate dominator is %d ---\n",
+              get_immediate_dominator(CDI_DOMINATORS, bb)->index);
+        }
+      gimple_dump_bb(bb, f, 2, TDF_RAW);
+      fprintf(f, "> EDGE_COUNT (bb->preds): %d basic blocks:\n", EDGE_COUNT (bb->preds));
+
+      for (i = 0; i < (EDGE_COUNT (bb->preds)); i++)
+        {
+
+          prevbb = EDGE_PRED (bb, i)->src;
+          fprintf(f, "  - Previous basic block number %d is bb %d\n", i,
+              prevbb->index);
+        }
+
+      fprintf(f, "> EDGE_COUNT (bb->succs): %d basic blocks : \n", EDGE_COUNT (bb->succs));
+
+      for (i = 0; i < (EDGE_COUNT (bb->succs)); i++)
+        {
+
+          nextbb = EDGE_SUCC (bb, i)->dest;
+          fprintf(f, "  - Next basic block number %d is bb %d \n", i,
+              nextbb->index);
+        }
+
+      fprintf(f, "\n");
+    }
+
+  fflush(f);
+  return;
+}
+
+/* Adapted from tree-blocking */
+
+void
+print_loops_info(FILE * f)
+{
+  loop_iterator li;
+  struct loop *loop;
+  basic_block *bbs;
+  edge edge;
+  unsigned int j;
+
+  hello_from("print_loops_info", f);
+
+  verify_flow_info();
+
+  fprintf(f, "NUMBER OF LOOPS = %d \n\n", number_of_loops());
+
+  FOR_EACH_LOOP (li, loop, 0)
+    {
+      fprintf(f, "--------");
+      fprintf(f, " LOOP NUMBER %d ", loop->num);
+      fprintf(f, "--------\n");
+      if (loop->next != NULL)
+        {
+          /* Link to the next (sibling) loop.  */
+          fprintf(f, " Next loop: %d \n", loop->next->num);
+        }
+      else
+        {
+          fprintf(f, " There isn't next loop \n");
+        }
+      if (loop->inner != NULL)
+        {
+          /* The first inner (child) loop or NULL if innermost loop.  */
+          fprintf(f, " Inner loop: %d \n", loop->inner->num);
+        }
+      else
+        {
+          fprintf(f, " There isn't inner loop \n");
+        }
+
+      if (loop_outer(loop) != NULL)
+        {
+          /* The outer (parent) loop or NULL if outermost loop.  */
+          fprintf(f, " Outer loop %d \n", loop_outer(loop)->num);
+        }
+      else
+        {
+          fprintf(f, " This is the outermost loop \n");
+        }
+
+      /* The loop nesting depth.  */
+      fprintf(f, " Loop depth %d \n", loop_depth(loop));
+      fprintf(f, "--- Body in dom order --- \n");
+      bbs = get_loop_body_in_dom_order(loop);
+      if (f)
+        {
+          for (j = 0; j < loop->num_nodes; j++)
+            {
+              gimple_dump_bb(bbs[j], f, 2, TDF_RAW);
+            }
+          fprintf(f, "--- Header --- \n");
+          gimple_dump_bb(loop->header, f, 2, TDF_RAW);
+
+          fprintf(f, "--- Latch --- \n");
+          gimple_dump_bb(loop->latch, f, 2, TDF_RAW);
+
+          fprintf(f, "\n");
+          edge = BRANCH_EDGE (loop->header);
+          fprintf(f, " Flag of branch edge: %d \n", edge->flags);
+          fprintf(f, " Loop init: ");
+          print_gimple_stmt(f, last_stmt(edge->dest->prev_bb), 0, TDF_RAW);
+          fprintf(f, " Loop step: ");
+          print_gimple_stmt(f, last_stmt(loop->latch), 0, TDF_RAW);
+          fprintf(f, " Loop limit: ");
+          print_generic_expr(f, gimple_cond_rhs(last_stmt(loop->header)),
+              TDF_RAW);
+          fprintf(f, "\n");
+
+          free(bbs);
+        }
+    }
+  bye_from("print_loops_info", f);
+
+  return;
+}
+
+void
+my_print_loops(FILE * file)
+{
+  hello_from("my_print_loops", file);
+  print_loops(file, 3);
+  bye_from("my_print_loops", file);
+}
+
+/* Copied from tree-parloops.c
+   Element of hashtable of names to copy.  */
+
+struct name_to_copy_elt
+{
+  unsigned version; /* The version of the name to copy.  */
+  tree new_name; /* The new name used in the copy.  */
+  tree field; /* The field of the structure used to pass the
+   value.  */
+};
+
+/* Copied from tree-parloops.c
+   Equality and hash functions for hashtab code.  */
+
+static int
+name_to_copy_elt_eq(const void *aa, const void *bb)
+{
+  const struct name_to_copy_elt *a = (const struct name_to_copy_elt *) aa;
+  const struct name_to_copy_elt *b = (const struct name_to_copy_elt *) bb;
+
+  return a->version == b->version;
+}
+
+/* Copied from tree-parloops.c */
+
+static hashval_t
+name_to_copy_elt_hash(const void *aa)
+{
+  const struct name_to_copy_elt *a = (const struct name_to_copy_elt *) aa;
+
+  return (hashval_t) a->version;
+}
+
+/* Copied from tree-parloops.c
+   Assigns the address of OBJ in TYPE to an ssa name, and returns this name.
+   The assignment statement is placed on edge ENTRY.  DECL_ADDRESS maps decls
+   to their addresses that can be reused.  The address of OBJ is known to
+   be invariant in the whole function.  */
+
+static tree
+take_address_of(tree obj, tree type, edge entry, htab_t decl_address)
+{
+  int uid;
+  void **dslot;
+  struct int_tree_map ielt, *nielt;
+  tree *var_p, name, bvar, addr;
+  gimple stmt;
+  gimple_seq stmts;
+
+  /* Since the address of OBJ is invariant, the trees may be shared.
+   Avoid rewriting unrelated parts of the code.  */
+  obj = unshare_expr(obj);
+  for (var_p = &obj; handled_component_p(*var_p); var_p = &TREE_OPERAND (*var_p, 0))
+    continue;
+  uid = DECL_UID (*var_p);
+
+  ielt.uid = uid;
+  dslot = htab_find_slot_with_hash(decl_address, &ielt, uid, INSERT);
+  if (!*dslot)
+    {
+      addr = build_addr(*var_p, current_function_decl);
+      bvar = create_tmp_var(TREE_TYPE (addr), get_name(*var_p));
+      add_referenced_var(bvar);
+      stmt = gimple_build_assign (bvar, addr);
+      name = make_ssa_name(bvar, stmt);
+      gimple_assign_set_lhs(stmt, name);
+      gsi_insert_on_edge_immediate(entry, stmt);
+
+      nielt = XNEW (struct int_tree_map);
+      nielt->uid = uid;
+      nielt->to = name;
+      *dslot = nielt;
+    }
+  else
+    name = ((struct int_tree_map *) *dslot)->to;
+
+  if (var_p != &obj)
+    {
+      *var_p = build1 (INDIRECT_REF, TREE_TYPE (*var_p), name);
+      name = force_gimple_operand(build_addr(obj, current_function_decl),
+          &stmts, true, NULL_TREE);
+          if (!gimple_seq_empty_p (stmts))
+          gsi_insert_seq_on_edge_immediate (entry, stmts);
+        }
+
+      if (TREE_TYPE (name) != type)
+        {
+          name = force_gimple_operand (fold_convert (type, name), &stmts, true,
+              NULL_TREE);
+          if (!gimple_seq_empty_p (stmts))
+          gsi_insert_seq_on_edge_immediate (entry, stmts);
+        }
+
+      return name;
+    }
+
+/* Copied from tree-parloops.c */
+
+struct elv_data
+{
+  struct walk_stmt_info info;
+  edge entry;
+  htab_t decl_address;
+  bool changed;
+};
+
+/* Copied from tree-parloops.c
+   Eliminates references to local variables in *TP out of the single
+   entry single exit region starting at DTA->ENTRY.
+   DECL_ADDRESS contains addresses of the references that had their
+   address taken already.  If the expression is changed, CHANGED is
+   set to true.  Callback for walk_tree.  */
+
+static tree
+eliminate_local_variables_1(tree *tp, int *walk_subtrees, void *data)
+{
+  struct elv_data * const dta = (struct elv_data *) data;
+  tree t = *tp, var, addr, addr_type, type, obj;
+
+  if (DECL_P (t))
+    {
+      *walk_subtrees = 0;
+
+      if (!SSA_VAR_P (t) || DECL_EXTERNAL (t))
+        return NULL_TREE;
+
+      type = TREE_TYPE (t);
+      addr_type = build_pointer_type(type);
+      addr = take_address_of(t, addr_type, dta->entry, dta->decl_address);
+      *tp = build1 (INDIRECT_REF, TREE_TYPE (*tp), addr);
+
+      dta->changed = true;
+      return NULL_TREE;
+    }
+
+  if (TREE_CODE (t) == ADDR_EXPR)
+    {
+      /* ADDR_EXPR may appear in two contexts:
+       -- as a gimple operand, when the address taken is a function invariant
+       -- as gimple rhs, when the resulting address in not a function
+       invariant
+       We do not need to do anything special in the latter case (the base of
+       the memory reference whose address is taken may be replaced in the
+       DECL_P case).  The former case is more complicated, as we need to
+       ensure that the new address is still a gimple operand.  Thus, it
+       is not sufficient to replace just the base of the memory reference --
+       we need to move the whole computation of the address out of the
+       loop.  */
+      if (!is_gimple_val(t))
+        return NULL_TREE;
+
+      *walk_subtrees = 0;
+      obj = TREE_OPERAND (t, 0);
+      var = get_base_address(obj);
+      if (!var || !SSA_VAR_P (var) || DECL_EXTERNAL (var))
+        return NULL_TREE;
+
+      addr_type = TREE_TYPE (t);
+      addr = take_address_of(obj, addr_type, dta->entry, dta->decl_address);
+      *tp = addr;
+
+      dta->changed = true;
+      return NULL_TREE;
+    }
+
+  if (!EXPR_P (t))
+    *walk_subtrees = 0;
+
+  return NULL_TREE;
+}
+
+/* Copied from tree-parloops.c
+   Moves the references to local variables in STMT out of the single
+   entry single exit region starting at ENTRY.  DECL_ADDRESS contains
+   addresses of the references that had their address taken
+   already.  */
+
+static void
+eliminate_local_variables_stmt(edge entry, gimple stmt, htab_t decl_address)
+{
+  struct elv_data dta;
+
+  memset(&dta.info, '\0', sizeof(dta.info));
+  dta.entry = entry;
+  dta.decl_address = decl_address;
+  dta.changed = false;
+
+  walk_gimple_op(stmt, eliminate_local_variables_1, &dta.info);
+
+  if (dta.changed)
+    update_stmt(stmt);
+
+}
+
+/* Copied from tree-parloops.c
+   Eliminates the references to local variables from the single entry
+   single exit region between the ENTRY and EXIT edges.
+
+   This includes:
+   1) Taking address of a local variable -- these are moved out of the
+   region (and temporary variable is created to hold the address if
+   necessary).
+
+   2) Dereferencing a local variable -- these are replaced with indirect
+   references.  */
+
+static void
+eliminate_local_variables(edge entry, edge exit)
+{
+  basic_block bb;
+  VEC (basic_block, heap) *body = VEC_alloc (basic_block, heap, 3);
+  unsigned i;
+  gimple_stmt_iterator gsi;
+  htab_t decl_address = htab_create(10, int_tree_map_hash, int_tree_map_eq,
+      free);
+  basic_block entry_bb = entry->src;
+  basic_block exit_bb = exit->dest;
+
+  gather_blocks_in_sese_region(entry_bb, exit_bb, &body);
+
+  for (i = 0; VEC_iterate (basic_block, body, i, bb); i++)
+    if (bb != entry_bb && bb != exit_bb)
+      for (gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi))
+        eliminate_local_variables_stmt(entry, gsi_stmt(gsi), decl_address);
+
+  htab_delete(decl_address);
+  VEC_free (basic_block, heap, body);
+}
+
+/* Copied from tree-parloops.c
+   Returns true if expression EXPR is not defined between ENTRY and
+   EXIT, i.e. if all its operands are defined outside of the region.  */
+
+static bool
+expr_invariant_in_region_p(edge entry, edge exit, tree expr)
+{
+
+  basic_block entry_bb = entry->src;
+  basic_block exit_bb = exit->dest;
+  basic_block def_bb;
+
+  if (is_gimple_min_invariant (expr))
+    return true;
+
+  if (TREE_CODE (expr) == SSA_NAME)
+    {
+      def_bb = gimple_bb (SSA_NAME_DEF_STMT (expr));
+      if (def_bb
+	  && dominated_by_p (CDI_DOMINATORS, def_bb, entry_bb)
+	  && !dominated_by_p (CDI_DOMINATORS, def_bb, exit_bb))
+	return false;
+
+      return true;
+    }
+
+  return false;
+}
+
+/* Copied from tree-parloops.c
+   If COPY_NAME_P is true, creates and returns a duplicate of NAME.
+   The copies are stored to NAME_COPIES, if NAME was already duplicated,
+   its duplicate stored in NAME_COPIES is returned.
+
+   Regardless of COPY_NAME_P, the decl used as a base of the ssa name is also
+   duplicated, storing the copies in DECL_COPIES.  */
+
+static tree
+separate_decls_in_region_name(tree name, htab_t name_copies,
+    htab_t decl_copies, bool copy_name_p)
+{
+  tree copy, var, var_copy;
+  unsigned idx, uid, nuid;
+  struct int_tree_map ielt, *nielt;
+  struct name_to_copy_elt elt, *nelt;
+  void **slot, **dslot;
+
+  if (TREE_CODE (name) != SSA_NAME)
+    return name;
+
+  idx = SSA_NAME_VERSION (name);
+  elt.version = idx;
+  slot = htab_find_slot_with_hash(name_copies, &elt, idx, copy_name_p ? INSERT
+      : NO_INSERT);
+  if (slot && *slot)
+    return ((struct name_to_copy_elt *) *slot)->new_name;
+
+  var = SSA_NAME_VAR (name);
+  uid = DECL_UID (var);
+  ielt.uid = uid;
+  dslot = htab_find_slot_with_hash(decl_copies, &ielt, uid, INSERT);
+  if (!*dslot)
+    {
+      var_copy = create_tmp_var(TREE_TYPE (var), get_name(var));
+      DECL_GIMPLE_REG_P (var_copy) = DECL_GIMPLE_REG_P (var);
+      add_referenced_var(var_copy);
+      nielt = XNEW (struct int_tree_map);
+      nielt->uid = uid;
+      nielt->to = var_copy;
+      *dslot = nielt;
+
+      /* Ensure that when we meet this decl next time, we won't duplicate
+       it again.  */
+      nuid = DECL_UID (var_copy);
+      ielt.uid = nuid;
+      dslot = htab_find_slot_with_hash(decl_copies, &ielt, nuid, INSERT);
+      gcc_assert (!*dslot);
+      nielt = XNEW (struct int_tree_map);
+      nielt->uid = nuid;
+      nielt->to = var_copy;
+      *dslot = nielt;
+    }
+  else
+    var_copy = ((struct int_tree_map *) *dslot)->to;
+
+  if (copy_name_p)
+    {
+      copy = duplicate_ssa_name(name, NULL);
+      nelt = XNEW (struct name_to_copy_elt);
+      nelt->version = idx;
+      nelt->new_name = copy;
+      nelt->field = NULL_TREE;
+      *slot = nelt;
+    }
+  else
+    {
+      gcc_assert (!slot);
+      copy = name;
+    }
+
+  SSA_NAME_VAR (copy) = var_copy;
+
+  return copy;
+}
+
+/* Copied from tree-parloops.c
+   Finds the ssa names used in STMT that are defined outside the
+   region between ENTRY and EXIT and replaces such ssa names with
+   their duplicates.  The duplicates are stored to NAME_COPIES.  Base
+   decls of all ssa names used in STMT (including those defined in
+   LOOP) are replaced with the new temporary variables; the
+   replacement decls are stored in DECL_COPIES.  */
+
+static void
+separate_decls_in_region_stmt(edge entry, edge exit, gimple stmt,
+    htab_t name_copies, htab_t decl_copies)
+{
+  use_operand_p use;
+  def_operand_p def;
+  ssa_op_iter oi;
+  tree name, copy;
+  bool copy_name_p;
+
+  mark_virtual_ops_for_renaming(stmt);
+
+  FOR_EACH_PHI_OR_STMT_DEF (def, stmt, oi, SSA_OP_DEF)
+    {
+      name = DEF_FROM_PTR (def);
+      gcc_assert (TREE_CODE (name) == SSA_NAME);
+      copy = separate_decls_in_region_name(name, name_copies, decl_copies,
+          false);
+      gcc_assert (copy == name);
+    }
+
+  FOR_EACH_PHI_OR_STMT_USE (use, stmt, oi, SSA_OP_USE)
+    {
+      name = USE_FROM_PTR (use);
+      if (TREE_CODE (name) != SSA_NAME)
+        continue;
+
+      copy_name_p = expr_invariant_in_region_p(entry, exit, name);
+      copy = separate_decls_in_region_name(name, name_copies, decl_copies,
+          copy_name_p);
+      SET_USE (use, copy);
+    }
+
+}
+
+/* Copied from tree-parloops.c
+   Callback for htab_traverse.  Adds a field corresponding to a ssa name
+   described in SLOT. The type is passed in DATA.  */
+
+static int
+add_field_for_name(void **slot, void *data)
+{
+  struct name_to_copy_elt * const elt = (struct name_to_copy_elt *) *slot;
+  tree type = (tree) data;
+  tree name = ssa_name (elt->version);
+  tree var = SSA_NAME_VAR (name);
+  tree field = build_decl (FIELD_DECL, DECL_NAME (var), TREE_TYPE (var));
+
+  insert_field_into_struct(type, field);
+  elt->field = field;
+
+  return 1;
+}
+
+/* Copied from tree-parloops.c */
+
+struct clsn_data
+{
+  tree store;
+  tree load;
+
+  basic_block store_bb;
+  basic_block load_bb;
+};
+
+/* Copied from tree-parloops.c
+   Callback for htab_traverse.  Creates loads to a field of LOAD in LOAD_BB and
+   store to a field of STORE in STORE_BB for the ssa name and its duplicate
+   specified in SLOT.  */
+
+static int
+create_loads_and_stores_for_name(void **slot, void *data)
+{
+  struct name_to_copy_elt * const elt = (struct name_to_copy_elt *) *slot;
+  struct clsn_data * const clsn_data = (struct clsn_data *) data;
+  tree t;
+  gimple stmt;
+  gimple_stmt_iterator gsi;
+  tree type = TREE_TYPE (elt->new_name);
+  tree struct_type = TREE_TYPE (TREE_TYPE (clsn_data->load));
+  tree load_struct;
+
+  gsi = gsi_last_bb(clsn_data->store_bb);
+  t = build3 (COMPONENT_REF, type, clsn_data->store, elt->field, NULL_TREE);
+  stmt = gimple_build_assign (t, ssa_name (elt->version));
+  mark_virtual_ops_for_renaming(stmt);
+  gsi_insert_after(&gsi, stmt, GSI_NEW_STMT);
+
+  gsi = gsi_last_bb(clsn_data->load_bb);
+  load_struct = fold_build1 (INDIRECT_REF, struct_type, clsn_data->load);
+  t = build3 (COMPONENT_REF, type, load_struct, elt->field, NULL_TREE);
+  stmt = gimple_build_assign (elt->new_name, t);
+  SSA_NAME_DEF_STMT (elt->new_name) = stmt;
+  gsi_insert_after(&gsi, stmt, GSI_NEW_STMT);
+
+  return 1;
+}
+
+/* Copied from tree-parloops.c
+   Moves all the variables used in LOOP and defined outside of it (including
+   the initial values of loop phi nodes, and *PER_THREAD if it is a ssa
+   name) to a structure created for this purpose.  The code
+
+   while (1)
+     {
+       use (a);
+       use (b);
+     }
+
+   is transformed this way:
+
+   bb0:
+   old.a = a;
+   old.b = b;
+
+   bb1:
+   a' = new->a;
+   b' = new->b;
+   while (1)
+     {
+       use (a');
+       use (b');
+     }
+
+   `old' is stored to *ARG_STRUCT and `new' is stored to NEW_ARG_STRUCT.  The
+   pointer `new' is intentionally not initialized (the loop will be split to a
+   separate function later, and `new' will be initialized from its arguments).
+   LD_ST_DATA holds information about the shared data structure used to pass
+   information among the threads.  It is initialized here, and 
+   gen_parallel_loop will pass it to create_call_for_reduction that 
+   needs this information.  REDUCTION_LIST describes the reductions 
+   in LOOP.  */
+
+static void
+separate_decls_in_region(edge entry, edge exit, tree *arg_struct,
+    tree *new_arg_struct)
+
+{
+  struct clsn_data clsn_data_struct;
+  struct clsn_data *ld_st_data = &clsn_data_struct;
+  basic_block bb1 = split_edge(entry);
+  basic_block bb0 = single_pred(bb1);
+  htab_t name_copies = htab_create(10, name_to_copy_elt_hash,
+      name_to_copy_elt_eq, free);
+  htab_t decl_copies =
+      htab_create(10, int_tree_map_hash, int_tree_map_eq, free);
+  unsigned i;
+  tree type, type_name, nvar;
+  gimple_stmt_iterator gsi;
+  VEC (basic_block, heap) *body = VEC_alloc (basic_block, heap, 3);
+  basic_block bb;
+  basic_block entry_bb = bb1;
+  basic_block exit_bb = exit->dest;
+
+  entry = single_succ_edge(entry_bb);
+  gather_blocks_in_sese_region(entry_bb, exit_bb, &body);
+
+  for (i = 0; VEC_iterate (basic_block, body, i, bb); i++)
+    {
+      if (bb != entry_bb && bb != exit_bb)
+        {
+          for (gsi = gsi_start_phis(bb); !gsi_end_p(gsi); gsi_next(&gsi))
+            separate_decls_in_region_stmt(entry, exit, gsi_stmt(gsi),
+                name_copies, decl_copies);
+
+          for (gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi))
+            separate_decls_in_region_stmt(entry, exit, gsi_stmt(gsi),
+                name_copies, decl_copies);
+        }
+    }
+
+  VEC_free (basic_block, heap, body);
+
+  if (htab_elements(name_copies) == 0)
+    {
+      /* It may happen that there is nothing to copy (if there are only
+       loop carried and external variables in the loop).  */
+      *arg_struct = NULL;
+      *new_arg_struct = NULL;
+    }
+  else
+    {
+      /* Create the type for the structure to store the ssa names to.  */
+      type = lang_hooks.types.make_type(RECORD_TYPE);
+      type_name = build_decl (TYPE_DECL, create_tmp_var_name (".paral_data"),
+          type);
+      TYPE_NAME (type) = type_name;
+
+      htab_traverse(name_copies, add_field_for_name, type);
+      layout_type(type);
+
+      /* Create the loads and stores.  */
+      *arg_struct = create_tmp_var(type, ".paral_data_store");
+      add_referenced_var(*arg_struct);
+      nvar = create_tmp_var(build_pointer_type(type), ".paral_data_load");
+      add_referenced_var(nvar);
+      *new_arg_struct = make_ssa_name(nvar, NULL);
+
+      ld_st_data->store = *arg_struct;
+      ld_st_data->load = *new_arg_struct;
+      ld_st_data->store_bb = bb0;
+      ld_st_data->load_bb = bb1;
+
+      htab_traverse(name_copies, create_loads_and_stores_for_name, ld_st_data);
+
+    }
+
+  htab_delete(decl_copies);
+  htab_delete(name_copies);
+}
+
+/* Copied from tree-parloops.c
+   Bitmap containing uids of functions created by parallelization.  We cannot
+   allocate it from the default obstack, as it must live across compilation
+   of several functions; we make it gc allocated instead.  */
+
+static GTY(()) bitmap parallelized_functions;
+
+/* Adapted from tree-parloops.c: added comments and add_referenced_var
+   Creates and returns an empty function that will receive the body of
+   a parallelized loop.  */
+
+static tree
+create_loop_fn(void)
+{
+  char buf[MAX_CHAR_LOOPFN_NAME];
+  char *tname;
+  tree decl, type, name, t;
+  struct function *act_cfun = cfun;
+  static unsigned loopfn_num;
+
+  snprintf(buf, MAX_CHAR_LOOPFN_NAME, "%s.$loopfn", current_function_name());
+  ASM_FORMAT_PRIVATE_NAME(tname, buf, loopfn_num++);
+  clean_symbol_name(tname);
+  name = get_identifier(tname);
+  type = build_function_type_list(void_type_node,ptr_type_node, NULL_TREE);
+
+  decl = build_decl (FUNCTION_DECL, name, type);
+
+  if (!parallelized_functions)
+    parallelized_functions = BITMAP_GGC_ALLOC ();
+  bitmap_set_bit(parallelized_functions, DECL_UID (decl));
+
+  /* Nonzero if function has been defined */
+  TREE_STATIC (decl) = 1;
+
+  /* Nonzero if the name is used in its scope */
+  TREE_USED (decl) = 1;
+
+  /* Represents a compiler-generated entity. */
+  DECL_ARTIFICIAL (decl) = 1;
+
+  /* Nonzero for a given ..._DECL node means that the name of this node
+   should be ignored for symbolic debug purposes.  */
+  DECL_IGNORED_P (decl) = 0;
+
+  /* Nonzero if name is accessible from outside this module */
+  TREE_PUBLIC (decl) = 0;
+
+  /* In a FUNCTION_DECL, nonzero if the function cannot be inlined.  */
+  DECL_UNINLINABLE (decl) = 1;
+
+  /* In a VAR_DECL or FUNCTION_DECL, nonzero means external reference:
+   do not allocate storage, and refer to a definition elsewhere.  Note that
+   this does not necessarily imply the entity represented by NODE
+   has no program source-level definition in this translation unit.  For
+   example, for a FUNCTION_DECL, DECL_SAVED_TREE may be non-NULL and
+   DECL_EXTERNAL may be true simultaneously; that can be the case for
+   a C99 "extern inline" function.  */
+  DECL_EXTERNAL (decl) = 0;
+
+  /* This points to either the FUNCTION_DECL for the containing function
+   the RECORD_TYPE or UNION_TYPE for the containing type, or
+   NULL_TREE or a TRANSLATION_UNIT_DECL if the given decl has "file
+   scope".  */
+  DECL_CONTEXT (decl) = NULL_TREE;
+
+  /* Holds the tree of BINDINGs. */
+  DECL_INITIAL (decl) = make_node (BLOCK);
+
+  t = build_decl (RESULT_DECL, NULL_TREE, void_type_node);
+  /* TODO: Is add_referenced_var(t) needed? */
+  DECL_ARTIFICIAL (t) = 1;
+  DECL_IGNORED_P (t) = 1;
+  /* In FUNCTION_DECL, holds the decl for the return value.  */
+  DECL_RESULT (decl) = t;
+
+  t = build_decl (PARM_DECL, get_identifier (".paral_data_param"),
+      ptr_type_node);
+  /* TODO: Why add_referenced_var(t) is needed? */
+  add_referenced_var(t);
+  DECL_ARTIFICIAL (t) = 1;
+  /* For a PARM_DECL, records the data type used to pass the argument,
+   which may be different from the type seen in the program.  */
+  DECL_ARG_TYPE (t) = ptr_type_node;
+
+  DECL_CONTEXT (t) = decl;
+  TREE_USED (t) = 1;
+  DECL_ARGUMENTS (decl) = t;
+
+  allocate_struct_function(decl, false);
+
+  /* The call to allocate_struct_function clobbers CFUN, so we need to restore
+   it.  */
+  set_cfun(act_cfun);
+
+  /*
+   * ALTERNATIVE CODE:
+   *
+   * push_struct_function (decl);
+   * pop_cfun ();
+   */
+
+  return decl;
+}
+
+/* Dummy analysis */
+bool
+is_parallelizable_loop(struct loop * loop)
+{
+  bool analysis_result = false;
+
+  if (flag_xark_debug && dump_file)
+    {
+      hello_from("is_parallelizable_loop", dump_file);
+      fprintf(dump_file, "flag_openmp: %d\n", flag_openmp);
+      fprintf(dump_file, "flag_tree_parallelize_loops: %d\n",
+          flag_tree_parallelize_loops);
+      fprintf(dump_file,
+          "IDENTIFIER_POINTER(DECL_NAME(current_function_decl)): %s\n",
+          IDENTIFIER_POINTER(DECL_NAME(current_function_decl)));
+      fprintf(dump_file, "loop->num: %d\n", loop->num);
+    }
+
+  if (!(flag_openmp) && (flag_tree_parallelize_loops <= 1) && (strcmp(
+      IDENTIFIER_POINTER(DECL_NAME(current_function_decl)), "main") == 0) && (loop->num == 1))
+    {
+      analysis_result = true;
+    }
+  else
+    {
+      analysis_result = false;
+    }
+
+  if (flag_xark_debug && dump_file)
+    {
+      fprintf(dump_file, "analysis_result: %d\n", analysis_result);
+      bye_from("is_parallelizable_loop", dump_file);
+    }
+
+  return analysis_result;
+}
+
+/* Adapted from tree-parloops.c:gen_parallel_loop and tree-parloops.c:create_parallel_loop */
+void
+parallelize_loop(struct loop * loop)
+{
+
+  basic_block bb_latch, bb_header;
+  VEC (edge, heap) *exits;
+  edge exit;
+  unsigned i;
+  edge true_edge, false_edge;
+  edge lpe, lle;
+  edge start;
+  edge region_entry, region_exit;
+  tree arg_struct, new_arg_struct;
+  static tree loop_fn;
+  basic_block bb_omp_parallel_head, bb_omp_parallel_return;
+  gimple_stmt_iterator gsi;
+  gimple stmt_omp_parallel_head, cond_stmt, stmt;
+  tree param;
+  tree cvar, cvar_init, initvar, cvar_next, cvar_base, type;
+  gimple phi;
+  basic_block for_bb, ex_bb;
+  edge guard, end, e;
+  tree t;
+  gimple for_stmt;
+  tree old_current_function_decl;
+  struct cgraph_node * node;
+
+  bb_latch = loop->latch;
+  bb_header = loop->header;
+
+  if (flag_xark_debug && dump_file)
+    {
+      hello_from("parallelize_loop", dump_file);
+      fprintf(dump_file, "Loop number %d (header = %d, latch = %d)\n",
+          loop->num, bb_header->index, bb_latch->index);
+
+      exits = get_loop_exit_edges(loop);
+      for (i = 0; VEC_iterate (edge, exits, i, exit); i++)
+        {
+          fprintf(dump_file,
+              "Loop edge exit number %d -> src: bb%d dest: bb%d\n", i,
+              exit->src->index, exit->dest->index);
+        }
+      VEC_free (edge, heap, exits);
+    }
+
+  extract_true_false_edges_from_block(loop->header, &true_edge, &false_edge);
+
+  if (flag_xark_debug && dump_file)
+    {
+      fprintf(dump_file, "Edges from block loop->header:\n");
+      fprintf(dump_file, "  True: src: bb%d dest: bb%d\n",
+          true_edge->src->index, true_edge->dest->index);
+      fprintf(dump_file, "  False: src: bb%d dest: bb%d\n",
+          false_edge->src->index, false_edge->dest->index);
+    }
+
+  lpe = loop_preheader_edge(loop);
+  lle = loop_latch_edge(loop);
+
+  if (flag_xark_debug && dump_file)
+    {
+      my_print_loops(dump_file);
+      fprintf(dump_file, "Loop %d is going to be cancelled\n", loop->num);
+    }
+  /* FIXME: Do I want to cancel all loop tree or only outermost? */
+  cancel_loop_tree(loop);
+
+  if (flag_xark_debug && dump_file)
+    {
+      my_print_loops(dump_file);
+      dump_dom_info_state(CDI_DOMINATORS, dump_file);
+      dump_dom_info_state(CDI_POST_DOMINATORS, dump_file);
+    }
+
+  /* FIXME: It will fail if lpe->src has various predecessors
+   * create_empty_bb()? */
+  start = single_pred_edge(lpe->src);
+
+  if (flag_xark_debug && dump_file)
+    {
+      fprintf(dump_file, "start edge src: bb%d dest: bb%d\n",
+          start->src->index, start->dest->index);
+    }
+  split_edge(start);
+  if (flag_xark_debug && dump_file)
+    {
+
+      fprintf(dump_file, "start_edge (after splitting) src: bb%d dest: bb%d\n",
+          start->src->index, start->dest->index);
+    }
+
+  region_entry = single_succ_edge(start->dest);
+  region_exit = false_edge;
+
+  if (flag_xark_debug && dump_file)
+    {
+      fprintf(dump_file,
+          "region_entry edge (after splitting) src: bb%d dest: bb%d\n",
+          region_entry->src->index, region_entry->dest->index);
+      my_print_loops(dump_file);
+    }
+
+  eliminate_local_variables(start, region_exit);
+
+  separate_decls_in_region(region_entry, region_exit, &arg_struct,
+      &new_arg_struct);
+
+  if (flag_xark_debug && dump_file)
+    {
+      my_print_loops(dump_file);
+
+      fprintf(dump_file, "true edge: src: bb%d dest: bb%d\n",
+          true_edge->src->index, true_edge->dest->index);
+      fprintf(dump_file, "false edge: src: bb%d dest: bb%d\n",
+          false_edge->src->index, false_edge->dest->index);
+      fprintf(dump_file, "lpe edge: src: bb%d dest: bb%d\n", lpe->src->index,
+          lpe->dest->index);
+      fprintf(dump_file, "lle edge: src: bb%d dest: bb%d\n", lle->src->index,
+          lle->dest->index);
+      fprintf(dump_file, "start edge: src: bb%d dest: bb%d\n",
+          start->src->index, start->dest->index);
+      fprintf(dump_file, "region_entry edge: src: bb%d dest: bb%d\n",
+          region_entry->src->index, region_entry->dest->index);
+      fprintf(dump_file, "region_exit edge: src: bb%d dest: bb%d\n",
+          region_exit->src->index, region_exit->dest->index);
+    }
+
+  loop_fn = create_loop_fn();
+  bb_omp_parallel_head = region_entry->src;
+  gsi = gsi_last_bb(bb_omp_parallel_head);
+  stmt_omp_parallel_head = gimple_build_omp_parallel(NULL, NULL, loop_fn,
+      arg_struct);
+  gsi_insert_after(&gsi, stmt_omp_parallel_head, GSI_NEW_STMT);
+
+  if (arg_struct)
+    {
+      gsi = gsi_after_labels(region_entry->dest);
+
+      param = make_ssa_name(DECL_ARGUMENTS (loop_fn), NULL);
+      stmt = gimple_build_assign (param, build_fold_addr_expr (arg_struct));
+      gsi_insert_before(&gsi, stmt, GSI_SAME_STMT);
+      SSA_NAME_DEF_STMT (param) = stmt;
+
+      stmt = gimple_build_assign (new_arg_struct,
+          fold_convert (TREE_TYPE (new_arg_struct), param));
+      gsi_insert_before(&gsi, stmt, GSI_SAME_STMT);
+      SSA_NAME_DEF_STMT (new_arg_struct) = stmt;
+    }
+
+  /* Emit GIMPLE_OMP_RETURN for GIMPLE_OMP_PARALLEL.  */
+  /* TODO: Would it be enough with split_edge()?
+   * At this moment, I don't have memory tags */
+  bb_omp_parallel_return = split_loop_exit_edge(region_exit);
+  verify_loop_closed_ssa();
+  gsi = gsi_last_bb(bb_omp_parallel_return);
+  gsi_insert_after(&gsi, gimple_build_omp_return(false), GSI_NEW_STMT);
+
+  /* Extract data for GIMPLE_OMP_FOR */
+  gcc_assert(bb_header == region_exit->src);
+  cond_stmt = last_stmt(bb_header);
+
+  /* Iteration var */
+  cvar = gimple_cond_lhs(cond_stmt);
+
+  cvar_base = SSA_NAME_VAR (cvar);
+  /* Phi which defines cvar*/
+  phi = SSA_NAME_DEF_STMT (cvar);
+  /* We get the initial value from lpe
+   * FIXME: I'm not propagating the value of initial assignment. */
+  cvar_init = PHI_ARG_DEF_FROM_EDGE (phi, lpe);
+
+  initvar = make_ssa_name(cvar_base, NULL);
+  SET_USE (PHI_ARG_DEF_PTR_FROM_EDGE (phi, lpe),
+      initvar);
+  /* We get the new value from lle */
+  cvar_next = PHI_ARG_DEF_FROM_EDGE (phi, lle);
+
+  if (flag_xark_debug && dump_file)
+    {
+      fprintf(dump_file, "\n");
+      fprintf(dump_file, "cvar: ");
+      print_generic_expr(dump_file, cvar, 0);
+      fprintf(dump_file, "\n");
+      fprintf(dump_file, "cvar: ");
+      print_generic_expr(dump_file, cvar, 0);
+      fprintf(dump_file, "\n");
+      fprintf(dump_file, "cvar_init: ");
+      print_generic_expr(dump_file, cvar_init, 0);
+      fprintf(dump_file, "\n");
+      fprintf(dump_file, "initvar: ");
+      print_generic_expr(dump_file, initvar, 0);
+      fprintf(dump_file, "\n");
+      fprintf(dump_file, "cvar_next: ");
+      print_generic_expr(dump_file, cvar_next, 0);
+      fprintf(dump_file, "\n");
+    }
+
+  /* Removes incr stmt */
+  gsi = gsi_last_bb(bb_latch);
+  gcc_assert (gsi_stmt (gsi) == SSA_NAME_DEF_STMT (cvar_next));
+  gsi_remove(&gsi, true);
+
+  /* Prepare cfg.  */
+  if (flag_xark_debug && dump_file)
+    {
+      my_print_loops(dump_file);
+    }
+  for_bb = split_edge(lpe);
+  ex_bb = split_loop_exit_edge(region_exit);
+  gcc_assert (false_edge == region_exit);
+
+  guard = make_edge(for_bb, ex_bb, 0);
+  single_succ_edge(bb_latch)->flags = 0;
+  /* Edge between latch and ex_bb */
+  end = make_edge(bb_latch, ex_bb, EDGE_FALLTHRU);
+
+  if (flag_xark_debug && dump_file)
+    {
+      my_print_loops(dump_file);
+    }
+
+  for (gsi = gsi_start_phis(ex_bb); !gsi_end_p(gsi); gsi_next(&gsi))
+    {
+      phi = gsi_stmt(gsi);
+      /* From header */
+      stmt = SSA_NAME_DEF_STMT (PHI_ARG_DEF_FROM_EDGE (phi, false_edge));
+      /* From for_bb if it doesn't enter in loop */
+      add_phi_arg(phi, PHI_ARG_DEF_FROM_EDGE (stmt, lpe), guard);
+      /* From latch */
+      add_phi_arg(phi, PHI_ARG_DEF_FROM_EDGE (stmt, lle),
+      end);
+    }
+
+  e = redirect_edge_and_branch(false_edge, true_edge->dest);
+  PENDING_STMT (e) = NULL;
+
+  /* Emit GIMPLE_OMP_FOR.  */
+  gimple_cond_set_lhs(cond_stmt, cvar_base);
+  type = TREE_TYPE (cvar);
+  t = build_omp_clause(OMP_CLAUSE_SCHEDULE);
+  /* TODO: Isn't it the default? */
+  OMP_CLAUSE_SCHEDULE_KIND (t) = OMP_CLAUSE_SCHEDULE_STATIC;
+
+  for_stmt = gimple_build_omp_for(NULL, t, 1, NULL);
+  gimple_omp_for_set_index(for_stmt, 0, initvar);
+  gimple_omp_for_set_initial(for_stmt, 0, cvar_init);
+  gimple_omp_for_set_final(for_stmt, 0, gimple_cond_rhs(cond_stmt));
+  gimple_omp_for_set_cond(for_stmt, 0, gimple_cond_code(cond_stmt));
+  gimple_omp_for_set_incr(for_stmt, 0, build2 (PLUS_EXPR, type,
+      cvar_base,
+      build_int_cst (type, 1)));
+
+  gsi = gsi_last_bb(for_bb);
+  gsi_insert_after(&gsi, for_stmt, GSI_NEW_STMT);
+  SSA_NAME_DEF_STMT (initvar) = for_stmt;
+
+  /* Emit GIMPLE_OMP_CONTINUE.  */
+  gsi = gsi_last_bb(bb_latch);
+  stmt = gimple_build_omp_continue(cvar_next, cvar);
+  gsi_insert_after(&gsi, stmt, GSI_NEW_STMT);
+  SSA_NAME_DEF_STMT (cvar_next) = stmt;
+
+  /* Emit GIMPLE_OMP_RETURN for GIMPLE_OMP_FOR.  */
+  gsi = gsi_last_bb(ex_bb);
+  gsi_insert_after(&gsi, gimple_build_omp_return(true), GSI_NEW_STMT);
+
+  if (flag_xark_debug && dump_file)
+    {
+      fprintf(dump_file, "\nBefore omp_expand_local\n");
+      my_print_loops(dump_file);
+
+      dump_cgraph(dump_file);
+      dump_cgraph_state(dump_file);
+
+    }
+
+  /* OpenMP expansion */
+  omp_expand_local(bb_omp_parallel_head);
+
+  if (flag_xark_debug && dump_file)
+    {
+      fprintf(dump_file, "\nAfter omp_expand_local\n");
+      my_print_loops(dump_file);
+      dump_cgraph(dump_file);
+      dump_cgraph_state(dump_file);
+    }
+
+  /* At this point, cgraph is innaccurate.
+   * But it doesn't seem to be necessary. */
+  if (flag_xark_debug && dump_file)
+    {
+      fprintf(dump_file, "\nRebuilding cgraph\n");
+    }
+
+  /* Rebuild cgraph_edges with cfun */
+  rebuild_cgraph_edges();
+
+  if (flag_xark_debug && dump_file)
+    {
+      dump_cgraph(dump_file);
+      dump_cgraph_state(dump_file);
+      fprintf(dump_file, "*--*\n");
+    }
+
+  /* Rebuild cgraph_edges with loop_fn */
+  push_cfun(DECL_STRUCT_FUNCTION (loop_fn));
+  old_current_function_decl = current_function_decl;
+  current_function_decl = loop_fn;
+
+  node = cgraph_node(loop_fn);
+  node->analyzed = true;
+  node->lowered = true;
+  if (flag_xark_debug && dump_file)
+    {
+      dump_cgraph_node(dump_file, cgraph_node(loop_fn));
+      fprintf(dump_file, "*--*\n");
+    }
+  rebuild_cgraph_edges();
+  current_function_decl = old_current_function_decl;
+  pop_cfun();
+
+  if (flag_xark_debug && dump_file)
+    {
+      dump_cgraph(dump_file);
+      dump_cgraph_state(dump_file);
+    }
+  verify_cgraph();
+
+  /* Check for unreachable nodes */
+  cgraph_remove_unreachable_nodes(false, dump_file);
+  if (flag_xark_debug && dump_file)
+    {
+      dump_cgraph(dump_file);
+      dump_cgraph_state(dump_file);
+      fprintf(dump_file, "*--*\n");
+      dump_functions_across_cgraph(dump_file);
+    }
+
+  update_ssa(TODO_update_ssa);
+  verify_ssa(true);
+
+  verify_flow_info();
+  verify_dominators(CDI_DOMINATORS);
+  verify_loop_structure();
+  verify_loop_closed_ssa();
+
+}
+
+unsigned int
+execute_xark_backend_omp(void)
+{
+
+  loop_iterator li;
+  struct loop * loop;
+
+  if (flag_xark_debug && dump_file)
+    {
+      hello_from("execute_xark_backend_omp", dump_file);
+      /* print_bbs(dump_file); */
+    }
+
+  loop_optimizer_init(LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES);
+  if (flag_xark_debug && dump_file)
+    {
+      /* print_loops_info(dump_file); */
+      my_print_loops(dump_file);
+    }
+
+  FOR_EACH_LOOP(li, loop, 0)
+    {
+      if (is_parallelizable_loop(loop))
+        {
+          parallelize_loop(loop);
+        }
+    }
+
+  loop_optimizer_finalize();
+
+  update_ssa(TODO_update_ssa);
+  verify_ssa(true);
+  if (flag_xark_debug && dump_file)
+    {
+      bye_from("execute_xark_backend_omp", dump_file);
+    }
+
+  /*
+   * In tree-parloops.c
+   *
+   * return TODO_cleanup_cfg | TODO_rebuild_alias;
+   */
+  return 0;
+}
+
+unsigned int
+execute_xark_print(void)
+{
+
+  if (dump_file)
+  {
+    hello_from("execute_xark_print", dump_file);
+
+    verify_ssa(true);
+
+    dump_cgraph(dump_file);
+    dump_cgraph_state(dump_file);
+    dump_functions_across_cgraph(dump_file);
+
+    bye_from("execute_xark_print", dump_file);
+  }
+
+  return 0;
+}
+
+unsigned int
+execute_xark_ipa_print(void)
+{
+
+  if (dump_file)
+  {
+    hello_from("execute_xark_ipa_print", dump_file);
+
+    dump_cgraph(dump_file);
+    dump_cgraph_state(dump_file);
+    dump_functions_across_cgraph(dump_file);
+
+    bye_from("execute_xark_ipa_print", dump_file);
+  }
+
+  return 0;
+
+}
+
+static bool
+gate_xark_backend_omp(void)
+{
+  return flag_xark_backend_omp;
+}
+
+static bool
+gate_xark_print(void)
+{
+  return flag_xark_debug;
+}
+
+static bool
+gate_xark_ipa_print(void)
+{
+  return flag_xark_debug;
+}
+
+struct gimple_opt_pass pass_xark_backend_omp =
+  {
+    { GIMPLE_PASS, "xark-backend-omp", /* name */
+    gate_xark_backend_omp, /* gate */
+    execute_xark_backend_omp, /* execute */
+    NULL, /* sub */
+    NULL, /* next */
+    0, /* static_pass_number */
+    0, /* tv_id */
+    0, /* properties_required */
+    0, /* properties_provided */
+    0, /* properties_destroyed */
+    0, /* todo_flags_start */
+    0
+    /* | TODO_cleanup_cfg | TODO_update_ssa | TODO_rebuild_alias */
+    | TODO_verify_ssa | TODO_verify_flow | TODO_verify_stmts | TODO_dump_func /* todo_flags_finish */
+    } };
+
+struct gimple_opt_pass pass_xark_print =
+  {
+    { GIMPLE_PASS, "xark-print", /* name */
+    gate_xark_print, /* gate */
+    execute_xark_print, /* execute */
+    NULL, /* sub */
+    NULL, /* next */
+    0, /* static_pass_number */
+    0, /* tv_id */
+    0, /* properties_required */
+    0, /* properties_provided */
+    0, /* properties_destroyed */
+    0, /* todo_flags_start */
+    0 | TODO_dump_func /* todo_flags_finish */
+    } };
+
+struct simple_ipa_opt_pass pass_xark_ipa_print =
+  {
+    { SIMPLE_IPA_PASS, "xark-ipa-print", /* name */
+    gate_xark_ipa_print, /* gate */
+    execute_xark_ipa_print, /* execute */
+    NULL, /* sub */
+    NULL, /* next */
+    0, /* static_pass_number */
+    0, /* tv_id */
+    0, /* properties_required */
+    0, /* properties_provided */
+    0, /* properties_destroyed */
+    0, /* todo_flags_start */
+    0 /* todo_flags_finish */
+    } };
+
+#include "gt-tree-xark-backend-omp.h"
Index: gcc/common.opt
===================================================================
--- gcc/common.opt	(revisiÃn: 147760)
+++ gcc/common.opt	(copia de trabajo)
@@ -1274,6 +1274,18 @@
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees
 
+fxark-debug
+Common Report Var(flag_xark_debug)
+XARK debug information
+
+fxark-backend-omp
+Common Report Var(flag_xark_backend_omp)
+Generates OpenMP code based on XARK recognition
+
+fverify-ssa-debug
+Common Report Var(flag_verify_ssa_debug)
+verify_ssa debug information
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1) Optimization
 Compile whole compilation unit at a time
Index: gcc/tree-ssa.c
===================================================================
--- gcc/tree-ssa.c	(revisiÃn: 147760)
+++ gcc/tree-ssa.c	(copia de trabajo)
@@ -539,7 +539,30 @@
   tree op;
   enum dom_state orig_dom_state = dom_info_state (CDI_DOMINATORS);
   bitmap names_defined_in_bb = BITMAP_ALLOC (NULL);
+  FILE *verify_ssa_dump_file = NULL;
+ 
+  if (flag_verify_ssa_debug)
+    {
+      verify_ssa_dump_file = fopen("verify_ssa.dump", "a");
 
+      fprintf(verify_ssa_dump_file,
+          "*-* *-* *-* *-* *-* *-* *-* *-* *-* *-* *-* *-*\n");
+      fprintf(verify_ssa_dump_file,
+          "*-*      verify_ssa:print_current_pass      *-*\n");
+      print_current_pass(verify_ssa_dump_file);
+      fprintf(verify_ssa_dump_file,
+          "*-*  verify_ssa:dump_current_function_decl  *-*\n");
+      dump_function_to_file(current_function_decl, verify_ssa_dump_file,
+          TDF_RAW);
+      fprintf(verify_ssa_dump_file,
+          "*-*     verify_ssa:dump_referenced_vars     *-*\n");
+      dump_referenced_vars(verify_ssa_dump_file);
+      fprintf(verify_ssa_dump_file,
+          "*-*     verify_ssa:dump_immediate_uses      *-*\n");
+      dump_immediate_uses(verify_ssa_dump_file);
+      fflush(verify_ssa_dump_file);
+    }
+
   gcc_assert (!need_ssa_update_p (cfun));
 
   verify_stmts ();
@@ -706,6 +729,16 @@
 	}
 
       bitmap_clear (names_defined_in_bb);
+
+      if (flag_verify_ssa_debug)
+        {
+          fprintf(verify_ssa_dump_file, "*-* verify_ssa:");
+          fprintf(verify_ssa_dump_file, "bb%d ", bb->index);
+          fprintf(verify_ssa_dump_file, "of %s ", current_function_name());
+          fprintf(verify_ssa_dump_file, "funtion processed *-*\n");
+          fflush(verify_ssa_dump_file);
+        }
+
     }
 
   free (definition_block);
Index: gcc/tree-flow.h
===================================================================
--- gcc/tree-flow.h	(revisiÃn: 147760)
+++ gcc/tree-flow.h	(copia de trabajo)
@@ -586,6 +586,12 @@
 extern void phinodes_print_statistics (void);
 #endif
 
+/* XARK Passes */
+/* In tree-xark-backend-omp.c */
+extern unsigned int execute_xark_backend_omp(void);
+extern unsigned int execute_xark_print(void);
+extern unsigned int execute_xark_ipa_print(void);
+
 /* In gimple-low.c  */
 extern void record_vars_into (tree, tree);
 extern void record_vars (tree);
Index: gcc/Makefile.in
===================================================================
--- gcc/Makefile.in	(revisiÃn: 147760)
+++ gcc/Makefile.in	(copia de trabajo)
@@ -1286,6 +1286,7 @@
         tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	tree-xark-backend-omp.o \
 	tree.o \
 	value-prof.o \
 	var-tracking.o \
@@ -2199,6 +2200,13 @@
    $(TREE_FLOW_H) $(TREE_PASS_H) $(TREE_DUMP_H) $(DIAGNOSTIC_H) $(GGC_H) \
    $(BASIC_BLOCK_H) tree-ssa-propagate.h $(FLAGS_H) $(TREE_DUMP_H) \
    $(CFGLOOP_H) tree-chrec.h $(TIMEVAR_H) $(TOPLEV_H) intl.h
+
+tree-xark-backend-omp.o : tree-xark-backend-omp.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
+   $(RTL_H) $(GIMPLE_H) $(TREE_INLINE_H) langhooks.h $(DIAGNOSTIC_H) \
+   $(TREE_FLOW_H) $(TIMEVAR_H) $(FLAGS_H) $(EXPR_H) $(TOPLEV_H) tree-pass.h \
+   $(GGC_H) except.h $(SPLAY_TREE_H) $(OPTABS_H) $(CFGLOOP_H) $(CGRAPH_H)\
+   tree-iterator.h gt-tree-xark-backend-omp.h
+
 tree-cfg.o : tree-cfg.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
    $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(GGC_H) $(FLAGS_H) output.h \
    $(DIAGNOSTIC_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) coretypes.h \
@@ -3328,6 +3336,7 @@
   $(srcdir)/gimple.h $(srcdir)/gimple.c \
   $(srcdir)/tree-mudflap.c $(srcdir)/tree-flow.h \
   $(srcdir)/tree-ssanames.c $(srcdir)/tree-eh.c $(srcdir)/tree-ssa-address.c \
+  $(srcdir)/tree-xark-backend-omp.c \
   $(srcdir)/tree-cfg.c \
   $(srcdir)/tree-dfa.c \
   $(srcdir)/tree-iterator.c $(srcdir)/gimplify.c \
Index: gcc/passes.c
===================================================================
--- gcc/passes.c	(revisiÃn: 147760)
+++ gcc/passes.c	(copia de trabajo)
@@ -544,6 +544,11 @@
 
       NEXT_PASS (pass_referenced_vars);
       NEXT_PASS (pass_build_ssa);
+      /* XARK starts here */
+      NEXT_PASS (pass_xark_print);
+      NEXT_PASS (pass_xark_backend_omp);
+      NEXT_PASS (pass_xark_print);
+
       NEXT_PASS (pass_early_warn_uninitialized);
       NEXT_PASS (pass_all_early_optimizations);
 	{
@@ -568,7 +573,9 @@
       NEXT_PASS (pass_release_ssa_names);
       NEXT_PASS (pass_rebuild_cgraph_edges);
       NEXT_PASS (pass_inline_parameters);
+      NEXT_PASS (pass_xark_print);
     }
+  NEXT_PASS (pass_xark_ipa_print);
   NEXT_PASS (pass_ipa_increase_alignment);
   NEXT_PASS (pass_ipa_matrix_reorg);
   NEXT_PASS (pass_ipa_cp);

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]