This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
XARK Backend with OpenMP
- From: Josà Manuel AndiÃn FernÃndez <jandion at udc dot es>
- To: gcc at gcc dot gnu dot org
- Date: Fri, 22 May 2009 18:58:51 +0200
- Subject: XARK Backend with OpenMP
Dear GCC developers.
My name is Josà Manuel AndiÃn FernÃndez and I have just started to work at University of A CoruÃa on porting XARK (eXtensible Compiler for Automatic Recognition of Computational Kernels, http://doi.acm.org/10.1145/1391956.1391959) to GCC. The main goal of this project is to be able to generate parallel code from a sequential code automatically.
One of the firsts tasks to accomplish is generating an OpenMP version from a loop which has previously been marked as parallelizable. So I have implemented a new pass named pass_xark_backend_omp (at tree-xark-backend-omp.c, enabled by -fxark-backend-omp) strongly based on pass_parallelize_loops (at tree-parloops.c). The main differences between them are:
- tree-xark-backend-omp is executed in pass_early_local_passes group, while pass_parallelize_loops is executed in pass_tree_loop, in pass_all_optimizations group. We have chosen this location because SSA form has just been calculated and because we will carry out interprocedural anaysis of several functions at same time.
- I don't create two versions of the loop. I manipulate the old loop, inserting OpenMP directives and another statements to achieve only one parallel loop.
- The CFG which reaches my pass is different, so I make different modifications. But the intermediate representantion which reaches omp_expand_local is almost the same.
- In create_loop_fn, I had to add a call to add_referenced_var(t); after t = build_decl (PARM_DECL, get_identifier (".paral_data_param"), ptr_type_node); (tree-xark-backend-omp.c:920)
But when I apply my pass to the simple example code asig_reg.c, I obtain the following ICE:
$ ../build/bin/gcc -v -Wall -fdump-tree-all-raw -fdump-ipa-all-raw -L../build/lib -Wl,-rpath -Wl,../build/lib -fxark-debug -fxark-backend-omp -fverify-ssa-debug -o asig_reg.bin asig_reg.c
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: ../src/configure --prefix=/home/jmandion/Programacion/gcc-gnu/build/ --disable-bootstrap --enable-languages=c,fortran
Thread model: posix
gcc version 4.5.0 20090521 (experimental) (GCC)
COLLECT_GCC_OPTIONS='-v' '-Wall' '-fdump-tree-all-raw' '-fdump-ipa-all-raw' '-L../build/lib' '-fxark-debug' '-fxark-backend-omp' '-fverify-ssa-debug' '-o' 'asig_reg.bin' '-mtune=generic'
/home/jmandion/Programacion/gcc-gnu/build/bin/../libexec/gcc/i686-pc-linux-gnu/4.5.0/cc1 -quiet -v -iprefix /home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/i686-pc-linux-gnu/4.5.0/ asig_reg.c -quiet -dumpbase asig_reg.c -mtune=generic -auxbase asig_reg -Wall -version -fdump-tree-all-raw -fdump-ipa-all-raw -fxark-debug -fxark-backend-omp -fverify-ssa-debug -o /tmp/ccgETP9F.s
GNU C (GCC) version 4.5.0 20090521 (experimental) (i686-pc-linux-gnu)
compiled by GNU C version 4.3.3, GMP version 4.2.4, MPFR version 2.4.0
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
ignoring nonexistent directory "/home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/i686-pc-linux-gnu/4.5.0/../../../../i686-pc-linux-gnu/include"
ignoring duplicate directory "/home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/../../lib/gcc/i686-pc-linux-gnu/4.5.0/include"
ignoring duplicate directory "/home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/../../lib/gcc/i686-pc-linux-gnu/4.5.0/include-fixed"
ignoring nonexistent directory "/home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/../../lib/gcc/i686-pc-linux-gnu/4.5.0/../../../../i686-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
/home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/i686-pc-linux-gnu/4.5.0/include
/home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/i686-pc-linux-gnu/4.5.0/include-fixed
/usr/local/include
/home/jmandion/Programacion/gcc-gnu/build/bin/../lib/gcc/../../include
/usr/include
End of search list.
GNU C (GCC) version 4.5.0 20090521 (experimental) (i686-pc-linux-gnu)
compiled by GNU C version 4.3.3, GMP version 4.2.4, MPFR version 2.4.0
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: ebc62f7555f88405c8bde4c5a1af27a7
asig_reg.c: In function âmain._loopfn.0â:
asig_reg.c:53: error: missing definition
for SSA_NAME: .paral_data_param_2 in statement:
.paral_data_load.5_3 = (struct .paral_data.3 *) .paral_data_param_2;
asig_reg.c:53: internal compiler error: verify_ssa failed
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.
I have tried to solve this problem fixing the Call Graph (omp_expand_local don't do it) and printing all the functions across this graph (enabled with -fxark-debug), but it was unsuccesfull. So I have instrumented verify_ssa (tree-ssa.c:542 and 733, enabled by -fverify-ssa-debug) by dumping in file verify_ssa.dump the pass which invokes the function, the function associated to current_function_decl in that moment, the referenced_vars and the immediate uses. Furthermore, I write a line each time a basic block is processed. With this information and some dummy passes (pass_xark_print and pass_xark_ipa_print, also enabled by -fxark-debug), I have seen the compilation process ends before reaching the ipa_passes group because the referenced_vars and the immediate_uses are empty when analyzing main._loopfn.0. But just "a moment" before, verify_ssa checks succesfully this function too.
I attach to this message a patch tree-xark-backend-omp.diff and the example asig_reg.c
Thanks in advance,
Josà Manuel AndiÃn FernÃndez
Computer Architecture Group
Departement of Electronics and Systems
University of A CoruÃa
Spain
http://gac.des.udc.es/index_en.html
#include <stdio.h>
#include <unistd.h>
#define MAX 1048576
int main() {
int A[MAX];
int i;
for (i = 0; i < MAX; ++i) {
A[i] = 2;
}
printf("A[3] = %d\n", A[3]);
return 0;
}
Index: gcc/tree-pass.h
===================================================================
--- gcc/tree-pass.h (revisiÃn: 147760)
+++ gcc/tree-pass.h (copia de trabajo)
@@ -400,6 +400,11 @@
extern struct gimple_opt_pass pass_local_pure_const;
extern struct gimple_opt_pass pass_tracer;
+/* XARK Passes */
+extern struct gimple_opt_pass pass_xark_backend_omp;
+extern struct gimple_opt_pass pass_xark_print;
+extern struct simple_ipa_opt_pass pass_xark_ipa_print;
+
/* IPA Passes */
extern struct ipa_opt_pass_d pass_ipa_inline;
extern struct ipa_opt_pass_d pass_ipa_cp;
Index: gcc/tree-xark-backend-omp.c
===================================================================
--- gcc/tree-xark-backend-omp.c (revisiÃn: 0)
+++ gcc/tree-xark-backend-omp.c (revisiÃn: 0)
@@ -0,0 +1,1485 @@
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "rtl.h"
+#include "tree-flow.h"
+#include "cfgloop.h"
+#include "ggc.h"
+#include "tree-data-ref.h"
+#include "diagnostic.h"
+#include "tree-pass.h"
+#include "tree-scalar-evolution.h"
+#include "hashtab.h"
+#include "langhooks.h"
+#include "tree-vectorizer.h"
+#include "cgraph.h"
+#include "tree-dump.h"
+
+#define MAX_CHAR_LOOPFN_NAME 100
+
+void
+dump_dom_info_state(enum cdi_direction, FILE *);
+void
+dump_cgraph_state(FILE *);
+void
+dump_functions_across_cgraph(FILE*);
+void
+hello_from(const char *, FILE *);
+void
+bye_from(const char *, FILE *);
+void
+print_bbs(FILE *);
+void
+print_loops_info(FILE *);
+void
+my_print_loops(FILE *);
+bool
+is_parallelizable_loop(struct loop *);
+void
+parallelize_loop(struct loop * loop);
+
+void
+dump_dom_info_state(enum cdi_direction dir, FILE * f)
+{
+ enum dom_state state = dom_info_state(dir);
+ fprintf(f, "dom_info_state(");
+ if (dir == CDI_DOMINATORS)
+ {
+ fprintf(f, "CDI_DOMINATORS");
+ }
+ else if (dir == CDI_POST_DOMINATORS)
+ {
+ fprintf(f, "CDI_POST_DOMINATORS");
+ }
+ else
+ {
+ fprintf(f, "Impossible!!!");
+ }
+ fprintf(f, "): ");
+ if (state == DOM_NONE)
+ {
+ fprintf(f, "DOM_NONE");
+ }
+ else if (state == DOM_NO_FAST_QUERY)
+ {
+ fprintf(f, "DOM_NO_FAST_QUERY");
+ }
+ else if (state == DOM_OK)
+ {
+ fprintf(f, "DOM_OK");
+ }
+ else
+ {
+ fprintf(f, "Impossible!!!");
+ }
+ fprintf(f, "\n");
+}
+
+void
+dump_cgraph_state(FILE * f)
+{
+ fprintf(f, "cgraph state is ");
+ switch (cgraph_state)
+ {
+ case CGRAPH_STATE_CONSTRUCTION:
+ fprintf(f, "CGRAPH_STATE_CONSTRUCTION");
+ break;
+ case CGRAPH_STATE_IPA:
+ fprintf(f, "CGRAPH_STATE_IPA");
+ break;
+ case CGRAPH_STATE_IPA_SSA:
+ fprintf(f, "CGRAPH_STATE_IPA_SSA");
+ break;
+ case CGRAPH_STATE_EXPANSION:
+ fprintf(f, "CGRAPH_STATE_EXPANSION");
+ break;
+ case CGRAPH_STATE_FINISHED:
+ fprintf(f, "CGRAPH_STATE_FINISHED");
+ break;
+ default:
+ gcc_unreachable();
+ }
+ fprintf(f, "\n");
+ fflush(f);
+}
+
+void
+dump_functions_across_cgraph(FILE* f)
+{
+ struct cgraph_node *node;
+
+ for (node = cgraph_nodes; node; node = node->next)
+ {
+ tree temp_fn;
+
+ temp_fn = current_function_decl;
+ current_function_decl = node->decl;
+ push_cfun(DECL_STRUCT_FUNCTION (node->decl));
+
+ fprintf(f, "* Dump of function %s *\n",
+ lang_hooks.decl_printable_name(current_function_decl, 2));
+
+ dump_function_to_file(current_function_decl, f, 0);
+
+ pop_cfun();
+ current_function_decl = temp_fn;
+ }
+}
+
+void
+hello_from(const char * function_name, FILE * f)
+{
+ fprintf(f, "* Hello from %s!!! *\n", function_name);
+ fflush(f);
+}
+
+void
+bye_from(const char * function_name, FILE * f)
+{
+ fprintf(f, "* Bye from %s!!! *\n", function_name);
+ fflush(f);
+}
+
+/* Adapted from tree-blocking */
+
+void
+print_bbs(FILE * f)
+{
+ basic_block bb, prevbb, nextbb;
+ unsigned int i;
+
+ hello_from("print_bbs", f);
+
+ FOR_EACH_BB (bb)
+ {
+ prevbb = bb->prev_bb;
+ nextbb = bb->next_bb;
+ fprintf(f, "-------- BASIC BLOCK %d ---------\n", bb->index);
+ if (get_immediate_dominator(CDI_DOMINATORS, bb) != NULL)
+ {
+ fprintf(f, "--- Immediate dominator is %d ---\n",
+ get_immediate_dominator(CDI_DOMINATORS, bb)->index);
+ }
+ gimple_dump_bb(bb, f, 2, TDF_RAW);
+ fprintf(f, "> EDGE_COUNT (bb->preds): %d basic blocks:\n", EDGE_COUNT (bb->preds));
+
+ for (i = 0; i < (EDGE_COUNT (bb->preds)); i++)
+ {
+
+ prevbb = EDGE_PRED (bb, i)->src;
+ fprintf(f, " - Previous basic block number %d is bb %d\n", i,
+ prevbb->index);
+ }
+
+ fprintf(f, "> EDGE_COUNT (bb->succs): %d basic blocks : \n", EDGE_COUNT (bb->succs));
+
+ for (i = 0; i < (EDGE_COUNT (bb->succs)); i++)
+ {
+
+ nextbb = EDGE_SUCC (bb, i)->dest;
+ fprintf(f, " - Next basic block number %d is bb %d \n", i,
+ nextbb->index);
+ }
+
+ fprintf(f, "\n");
+ }
+
+ fflush(f);
+ return;
+}
+
+/* Adapted from tree-blocking */
+
+void
+print_loops_info(FILE * f)
+{
+ loop_iterator li;
+ struct loop *loop;
+ basic_block *bbs;
+ edge edge;
+ unsigned int j;
+
+ hello_from("print_loops_info", f);
+
+ verify_flow_info();
+
+ fprintf(f, "NUMBER OF LOOPS = %d \n\n", number_of_loops());
+
+ FOR_EACH_LOOP (li, loop, 0)
+ {
+ fprintf(f, "--------");
+ fprintf(f, " LOOP NUMBER %d ", loop->num);
+ fprintf(f, "--------\n");
+ if (loop->next != NULL)
+ {
+ /* Link to the next (sibling) loop. */
+ fprintf(f, " Next loop: %d \n", loop->next->num);
+ }
+ else
+ {
+ fprintf(f, " There isn't next loop \n");
+ }
+ if (loop->inner != NULL)
+ {
+ /* The first inner (child) loop or NULL if innermost loop. */
+ fprintf(f, " Inner loop: %d \n", loop->inner->num);
+ }
+ else
+ {
+ fprintf(f, " There isn't inner loop \n");
+ }
+
+ if (loop_outer(loop) != NULL)
+ {
+ /* The outer (parent) loop or NULL if outermost loop. */
+ fprintf(f, " Outer loop %d \n", loop_outer(loop)->num);
+ }
+ else
+ {
+ fprintf(f, " This is the outermost loop \n");
+ }
+
+ /* The loop nesting depth. */
+ fprintf(f, " Loop depth %d \n", loop_depth(loop));
+ fprintf(f, "--- Body in dom order --- \n");
+ bbs = get_loop_body_in_dom_order(loop);
+ if (f)
+ {
+ for (j = 0; j < loop->num_nodes; j++)
+ {
+ gimple_dump_bb(bbs[j], f, 2, TDF_RAW);
+ }
+ fprintf(f, "--- Header --- \n");
+ gimple_dump_bb(loop->header, f, 2, TDF_RAW);
+
+ fprintf(f, "--- Latch --- \n");
+ gimple_dump_bb(loop->latch, f, 2, TDF_RAW);
+
+ fprintf(f, "\n");
+ edge = BRANCH_EDGE (loop->header);
+ fprintf(f, " Flag of branch edge: %d \n", edge->flags);
+ fprintf(f, " Loop init: ");
+ print_gimple_stmt(f, last_stmt(edge->dest->prev_bb), 0, TDF_RAW);
+ fprintf(f, " Loop step: ");
+ print_gimple_stmt(f, last_stmt(loop->latch), 0, TDF_RAW);
+ fprintf(f, " Loop limit: ");
+ print_generic_expr(f, gimple_cond_rhs(last_stmt(loop->header)),
+ TDF_RAW);
+ fprintf(f, "\n");
+
+ free(bbs);
+ }
+ }
+ bye_from("print_loops_info", f);
+
+ return;
+}
+
+void
+my_print_loops(FILE * file)
+{
+ hello_from("my_print_loops", file);
+ print_loops(file, 3);
+ bye_from("my_print_loops", file);
+}
+
+/* Copied from tree-parloops.c
+ Element of hashtable of names to copy. */
+
+struct name_to_copy_elt
+{
+ unsigned version; /* The version of the name to copy. */
+ tree new_name; /* The new name used in the copy. */
+ tree field; /* The field of the structure used to pass the
+ value. */
+};
+
+/* Copied from tree-parloops.c
+ Equality and hash functions for hashtab code. */
+
+static int
+name_to_copy_elt_eq(const void *aa, const void *bb)
+{
+ const struct name_to_copy_elt *a = (const struct name_to_copy_elt *) aa;
+ const struct name_to_copy_elt *b = (const struct name_to_copy_elt *) bb;
+
+ return a->version == b->version;
+}
+
+/* Copied from tree-parloops.c */
+
+static hashval_t
+name_to_copy_elt_hash(const void *aa)
+{
+ const struct name_to_copy_elt *a = (const struct name_to_copy_elt *) aa;
+
+ return (hashval_t) a->version;
+}
+
+/* Copied from tree-parloops.c
+ Assigns the address of OBJ in TYPE to an ssa name, and returns this name.
+ The assignment statement is placed on edge ENTRY. DECL_ADDRESS maps decls
+ to their addresses that can be reused. The address of OBJ is known to
+ be invariant in the whole function. */
+
+static tree
+take_address_of(tree obj, tree type, edge entry, htab_t decl_address)
+{
+ int uid;
+ void **dslot;
+ struct int_tree_map ielt, *nielt;
+ tree *var_p, name, bvar, addr;
+ gimple stmt;
+ gimple_seq stmts;
+
+ /* Since the address of OBJ is invariant, the trees may be shared.
+ Avoid rewriting unrelated parts of the code. */
+ obj = unshare_expr(obj);
+ for (var_p = &obj; handled_component_p(*var_p); var_p = &TREE_OPERAND (*var_p, 0))
+ continue;
+ uid = DECL_UID (*var_p);
+
+ ielt.uid = uid;
+ dslot = htab_find_slot_with_hash(decl_address, &ielt, uid, INSERT);
+ if (!*dslot)
+ {
+ addr = build_addr(*var_p, current_function_decl);
+ bvar = create_tmp_var(TREE_TYPE (addr), get_name(*var_p));
+ add_referenced_var(bvar);
+ stmt = gimple_build_assign (bvar, addr);
+ name = make_ssa_name(bvar, stmt);
+ gimple_assign_set_lhs(stmt, name);
+ gsi_insert_on_edge_immediate(entry, stmt);
+
+ nielt = XNEW (struct int_tree_map);
+ nielt->uid = uid;
+ nielt->to = name;
+ *dslot = nielt;
+ }
+ else
+ name = ((struct int_tree_map *) *dslot)->to;
+
+ if (var_p != &obj)
+ {
+ *var_p = build1 (INDIRECT_REF, TREE_TYPE (*var_p), name);
+ name = force_gimple_operand(build_addr(obj, current_function_decl),
+ &stmts, true, NULL_TREE);
+ if (!gimple_seq_empty_p (stmts))
+ gsi_insert_seq_on_edge_immediate (entry, stmts);
+ }
+
+ if (TREE_TYPE (name) != type)
+ {
+ name = force_gimple_operand (fold_convert (type, name), &stmts, true,
+ NULL_TREE);
+ if (!gimple_seq_empty_p (stmts))
+ gsi_insert_seq_on_edge_immediate (entry, stmts);
+ }
+
+ return name;
+ }
+
+/* Copied from tree-parloops.c */
+
+struct elv_data
+{
+ struct walk_stmt_info info;
+ edge entry;
+ htab_t decl_address;
+ bool changed;
+};
+
+/* Copied from tree-parloops.c
+ Eliminates references to local variables in *TP out of the single
+ entry single exit region starting at DTA->ENTRY.
+ DECL_ADDRESS contains addresses of the references that had their
+ address taken already. If the expression is changed, CHANGED is
+ set to true. Callback for walk_tree. */
+
+static tree
+eliminate_local_variables_1(tree *tp, int *walk_subtrees, void *data)
+{
+ struct elv_data * const dta = (struct elv_data *) data;
+ tree t = *tp, var, addr, addr_type, type, obj;
+
+ if (DECL_P (t))
+ {
+ *walk_subtrees = 0;
+
+ if (!SSA_VAR_P (t) || DECL_EXTERNAL (t))
+ return NULL_TREE;
+
+ type = TREE_TYPE (t);
+ addr_type = build_pointer_type(type);
+ addr = take_address_of(t, addr_type, dta->entry, dta->decl_address);
+ *tp = build1 (INDIRECT_REF, TREE_TYPE (*tp), addr);
+
+ dta->changed = true;
+ return NULL_TREE;
+ }
+
+ if (TREE_CODE (t) == ADDR_EXPR)
+ {
+ /* ADDR_EXPR may appear in two contexts:
+ -- as a gimple operand, when the address taken is a function invariant
+ -- as gimple rhs, when the resulting address in not a function
+ invariant
+ We do not need to do anything special in the latter case (the base of
+ the memory reference whose address is taken may be replaced in the
+ DECL_P case). The former case is more complicated, as we need to
+ ensure that the new address is still a gimple operand. Thus, it
+ is not sufficient to replace just the base of the memory reference --
+ we need to move the whole computation of the address out of the
+ loop. */
+ if (!is_gimple_val(t))
+ return NULL_TREE;
+
+ *walk_subtrees = 0;
+ obj = TREE_OPERAND (t, 0);
+ var = get_base_address(obj);
+ if (!var || !SSA_VAR_P (var) || DECL_EXTERNAL (var))
+ return NULL_TREE;
+
+ addr_type = TREE_TYPE (t);
+ addr = take_address_of(obj, addr_type, dta->entry, dta->decl_address);
+ *tp = addr;
+
+ dta->changed = true;
+ return NULL_TREE;
+ }
+
+ if (!EXPR_P (t))
+ *walk_subtrees = 0;
+
+ return NULL_TREE;
+}
+
+/* Copied from tree-parloops.c
+ Moves the references to local variables in STMT out of the single
+ entry single exit region starting at ENTRY. DECL_ADDRESS contains
+ addresses of the references that had their address taken
+ already. */
+
+static void
+eliminate_local_variables_stmt(edge entry, gimple stmt, htab_t decl_address)
+{
+ struct elv_data dta;
+
+ memset(&dta.info, '\0', sizeof(dta.info));
+ dta.entry = entry;
+ dta.decl_address = decl_address;
+ dta.changed = false;
+
+ walk_gimple_op(stmt, eliminate_local_variables_1, &dta.info);
+
+ if (dta.changed)
+ update_stmt(stmt);
+
+}
+
+/* Copied from tree-parloops.c
+ Eliminates the references to local variables from the single entry
+ single exit region between the ENTRY and EXIT edges.
+
+ This includes:
+ 1) Taking address of a local variable -- these are moved out of the
+ region (and temporary variable is created to hold the address if
+ necessary).
+
+ 2) Dereferencing a local variable -- these are replaced with indirect
+ references. */
+
+static void
+eliminate_local_variables(edge entry, edge exit)
+{
+ basic_block bb;
+ VEC (basic_block, heap) *body = VEC_alloc (basic_block, heap, 3);
+ unsigned i;
+ gimple_stmt_iterator gsi;
+ htab_t decl_address = htab_create(10, int_tree_map_hash, int_tree_map_eq,
+ free);
+ basic_block entry_bb = entry->src;
+ basic_block exit_bb = exit->dest;
+
+ gather_blocks_in_sese_region(entry_bb, exit_bb, &body);
+
+ for (i = 0; VEC_iterate (basic_block, body, i, bb); i++)
+ if (bb != entry_bb && bb != exit_bb)
+ for (gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi))
+ eliminate_local_variables_stmt(entry, gsi_stmt(gsi), decl_address);
+
+ htab_delete(decl_address);
+ VEC_free (basic_block, heap, body);
+}
+
+/* Copied from tree-parloops.c
+ Returns true if expression EXPR is not defined between ENTRY and
+ EXIT, i.e. if all its operands are defined outside of the region. */
+
+static bool
+expr_invariant_in_region_p(edge entry, edge exit, tree expr)
+{
+
+ basic_block entry_bb = entry->src;
+ basic_block exit_bb = exit->dest;
+ basic_block def_bb;
+
+ if (is_gimple_min_invariant (expr))
+ return true;
+
+ if (TREE_CODE (expr) == SSA_NAME)
+ {
+ def_bb = gimple_bb (SSA_NAME_DEF_STMT (expr));
+ if (def_bb
+ && dominated_by_p (CDI_DOMINATORS, def_bb, entry_bb)
+ && !dominated_by_p (CDI_DOMINATORS, def_bb, exit_bb))
+ return false;
+
+ return true;
+ }
+
+ return false;
+}
+
+/* Copied from tree-parloops.c
+ If COPY_NAME_P is true, creates and returns a duplicate of NAME.
+ The copies are stored to NAME_COPIES, if NAME was already duplicated,
+ its duplicate stored in NAME_COPIES is returned.
+
+ Regardless of COPY_NAME_P, the decl used as a base of the ssa name is also
+ duplicated, storing the copies in DECL_COPIES. */
+
+static tree
+separate_decls_in_region_name(tree name, htab_t name_copies,
+ htab_t decl_copies, bool copy_name_p)
+{
+ tree copy, var, var_copy;
+ unsigned idx, uid, nuid;
+ struct int_tree_map ielt, *nielt;
+ struct name_to_copy_elt elt, *nelt;
+ void **slot, **dslot;
+
+ if (TREE_CODE (name) != SSA_NAME)
+ return name;
+
+ idx = SSA_NAME_VERSION (name);
+ elt.version = idx;
+ slot = htab_find_slot_with_hash(name_copies, &elt, idx, copy_name_p ? INSERT
+ : NO_INSERT);
+ if (slot && *slot)
+ return ((struct name_to_copy_elt *) *slot)->new_name;
+
+ var = SSA_NAME_VAR (name);
+ uid = DECL_UID (var);
+ ielt.uid = uid;
+ dslot = htab_find_slot_with_hash(decl_copies, &ielt, uid, INSERT);
+ if (!*dslot)
+ {
+ var_copy = create_tmp_var(TREE_TYPE (var), get_name(var));
+ DECL_GIMPLE_REG_P (var_copy) = DECL_GIMPLE_REG_P (var);
+ add_referenced_var(var_copy);
+ nielt = XNEW (struct int_tree_map);
+ nielt->uid = uid;
+ nielt->to = var_copy;
+ *dslot = nielt;
+
+ /* Ensure that when we meet this decl next time, we won't duplicate
+ it again. */
+ nuid = DECL_UID (var_copy);
+ ielt.uid = nuid;
+ dslot = htab_find_slot_with_hash(decl_copies, &ielt, nuid, INSERT);
+ gcc_assert (!*dslot);
+ nielt = XNEW (struct int_tree_map);
+ nielt->uid = nuid;
+ nielt->to = var_copy;
+ *dslot = nielt;
+ }
+ else
+ var_copy = ((struct int_tree_map *) *dslot)->to;
+
+ if (copy_name_p)
+ {
+ copy = duplicate_ssa_name(name, NULL);
+ nelt = XNEW (struct name_to_copy_elt);
+ nelt->version = idx;
+ nelt->new_name = copy;
+ nelt->field = NULL_TREE;
+ *slot = nelt;
+ }
+ else
+ {
+ gcc_assert (!slot);
+ copy = name;
+ }
+
+ SSA_NAME_VAR (copy) = var_copy;
+
+ return copy;
+}
+
+/* Copied from tree-parloops.c
+ Finds the ssa names used in STMT that are defined outside the
+ region between ENTRY and EXIT and replaces such ssa names with
+ their duplicates. The duplicates are stored to NAME_COPIES. Base
+ decls of all ssa names used in STMT (including those defined in
+ LOOP) are replaced with the new temporary variables; the
+ replacement decls are stored in DECL_COPIES. */
+
+static void
+separate_decls_in_region_stmt(edge entry, edge exit, gimple stmt,
+ htab_t name_copies, htab_t decl_copies)
+{
+ use_operand_p use;
+ def_operand_p def;
+ ssa_op_iter oi;
+ tree name, copy;
+ bool copy_name_p;
+
+ mark_virtual_ops_for_renaming(stmt);
+
+ FOR_EACH_PHI_OR_STMT_DEF (def, stmt, oi, SSA_OP_DEF)
+ {
+ name = DEF_FROM_PTR (def);
+ gcc_assert (TREE_CODE (name) == SSA_NAME);
+ copy = separate_decls_in_region_name(name, name_copies, decl_copies,
+ false);
+ gcc_assert (copy == name);
+ }
+
+ FOR_EACH_PHI_OR_STMT_USE (use, stmt, oi, SSA_OP_USE)
+ {
+ name = USE_FROM_PTR (use);
+ if (TREE_CODE (name) != SSA_NAME)
+ continue;
+
+ copy_name_p = expr_invariant_in_region_p(entry, exit, name);
+ copy = separate_decls_in_region_name(name, name_copies, decl_copies,
+ copy_name_p);
+ SET_USE (use, copy);
+ }
+
+}
+
+/* Copied from tree-parloops.c
+ Callback for htab_traverse. Adds a field corresponding to a ssa name
+ described in SLOT. The type is passed in DATA. */
+
+static int
+add_field_for_name(void **slot, void *data)
+{
+ struct name_to_copy_elt * const elt = (struct name_to_copy_elt *) *slot;
+ tree type = (tree) data;
+ tree name = ssa_name (elt->version);
+ tree var = SSA_NAME_VAR (name);
+ tree field = build_decl (FIELD_DECL, DECL_NAME (var), TREE_TYPE (var));
+
+ insert_field_into_struct(type, field);
+ elt->field = field;
+
+ return 1;
+}
+
+/* Copied from tree-parloops.c */
+
+struct clsn_data
+{
+ tree store;
+ tree load;
+
+ basic_block store_bb;
+ basic_block load_bb;
+};
+
+/* Copied from tree-parloops.c
+ Callback for htab_traverse. Creates loads to a field of LOAD in LOAD_BB and
+ store to a field of STORE in STORE_BB for the ssa name and its duplicate
+ specified in SLOT. */
+
+static int
+create_loads_and_stores_for_name(void **slot, void *data)
+{
+ struct name_to_copy_elt * const elt = (struct name_to_copy_elt *) *slot;
+ struct clsn_data * const clsn_data = (struct clsn_data *) data;
+ tree t;
+ gimple stmt;
+ gimple_stmt_iterator gsi;
+ tree type = TREE_TYPE (elt->new_name);
+ tree struct_type = TREE_TYPE (TREE_TYPE (clsn_data->load));
+ tree load_struct;
+
+ gsi = gsi_last_bb(clsn_data->store_bb);
+ t = build3 (COMPONENT_REF, type, clsn_data->store, elt->field, NULL_TREE);
+ stmt = gimple_build_assign (t, ssa_name (elt->version));
+ mark_virtual_ops_for_renaming(stmt);
+ gsi_insert_after(&gsi, stmt, GSI_NEW_STMT);
+
+ gsi = gsi_last_bb(clsn_data->load_bb);
+ load_struct = fold_build1 (INDIRECT_REF, struct_type, clsn_data->load);
+ t = build3 (COMPONENT_REF, type, load_struct, elt->field, NULL_TREE);
+ stmt = gimple_build_assign (elt->new_name, t);
+ SSA_NAME_DEF_STMT (elt->new_name) = stmt;
+ gsi_insert_after(&gsi, stmt, GSI_NEW_STMT);
+
+ return 1;
+}
+
+/* Copied from tree-parloops.c
+ Moves all the variables used in LOOP and defined outside of it (including
+ the initial values of loop phi nodes, and *PER_THREAD if it is a ssa
+ name) to a structure created for this purpose. The code
+
+ while (1)
+ {
+ use (a);
+ use (b);
+ }
+
+ is transformed this way:
+
+ bb0:
+ old.a = a;
+ old.b = b;
+
+ bb1:
+ a' = new->a;
+ b' = new->b;
+ while (1)
+ {
+ use (a');
+ use (b');
+ }
+
+ `old' is stored to *ARG_STRUCT and `new' is stored to NEW_ARG_STRUCT. The
+ pointer `new' is intentionally not initialized (the loop will be split to a
+ separate function later, and `new' will be initialized from its arguments).
+ LD_ST_DATA holds information about the shared data structure used to pass
+ information among the threads. It is initialized here, and
+ gen_parallel_loop will pass it to create_call_for_reduction that
+ needs this information. REDUCTION_LIST describes the reductions
+ in LOOP. */
+
+static void
+separate_decls_in_region(edge entry, edge exit, tree *arg_struct,
+ tree *new_arg_struct)
+
+{
+ struct clsn_data clsn_data_struct;
+ struct clsn_data *ld_st_data = &clsn_data_struct;
+ basic_block bb1 = split_edge(entry);
+ basic_block bb0 = single_pred(bb1);
+ htab_t name_copies = htab_create(10, name_to_copy_elt_hash,
+ name_to_copy_elt_eq, free);
+ htab_t decl_copies =
+ htab_create(10, int_tree_map_hash, int_tree_map_eq, free);
+ unsigned i;
+ tree type, type_name, nvar;
+ gimple_stmt_iterator gsi;
+ VEC (basic_block, heap) *body = VEC_alloc (basic_block, heap, 3);
+ basic_block bb;
+ basic_block entry_bb = bb1;
+ basic_block exit_bb = exit->dest;
+
+ entry = single_succ_edge(entry_bb);
+ gather_blocks_in_sese_region(entry_bb, exit_bb, &body);
+
+ for (i = 0; VEC_iterate (basic_block, body, i, bb); i++)
+ {
+ if (bb != entry_bb && bb != exit_bb)
+ {
+ for (gsi = gsi_start_phis(bb); !gsi_end_p(gsi); gsi_next(&gsi))
+ separate_decls_in_region_stmt(entry, exit, gsi_stmt(gsi),
+ name_copies, decl_copies);
+
+ for (gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi))
+ separate_decls_in_region_stmt(entry, exit, gsi_stmt(gsi),
+ name_copies, decl_copies);
+ }
+ }
+
+ VEC_free (basic_block, heap, body);
+
+ if (htab_elements(name_copies) == 0)
+ {
+ /* It may happen that there is nothing to copy (if there are only
+ loop carried and external variables in the loop). */
+ *arg_struct = NULL;
+ *new_arg_struct = NULL;
+ }
+ else
+ {
+ /* Create the type for the structure to store the ssa names to. */
+ type = lang_hooks.types.make_type(RECORD_TYPE);
+ type_name = build_decl (TYPE_DECL, create_tmp_var_name (".paral_data"),
+ type);
+ TYPE_NAME (type) = type_name;
+
+ htab_traverse(name_copies, add_field_for_name, type);
+ layout_type(type);
+
+ /* Create the loads and stores. */
+ *arg_struct = create_tmp_var(type, ".paral_data_store");
+ add_referenced_var(*arg_struct);
+ nvar = create_tmp_var(build_pointer_type(type), ".paral_data_load");
+ add_referenced_var(nvar);
+ *new_arg_struct = make_ssa_name(nvar, NULL);
+
+ ld_st_data->store = *arg_struct;
+ ld_st_data->load = *new_arg_struct;
+ ld_st_data->store_bb = bb0;
+ ld_st_data->load_bb = bb1;
+
+ htab_traverse(name_copies, create_loads_and_stores_for_name, ld_st_data);
+
+ }
+
+ htab_delete(decl_copies);
+ htab_delete(name_copies);
+}
+
+/* Copied from tree-parloops.c
+ Bitmap containing uids of functions created by parallelization. We cannot
+ allocate it from the default obstack, as it must live across compilation
+ of several functions; we make it gc allocated instead. */
+
+static GTY(()) bitmap parallelized_functions;
+
+/* Adapted from tree-parloops.c: added comments and add_referenced_var
+ Creates and returns an empty function that will receive the body of
+ a parallelized loop. */
+
+static tree
+create_loop_fn(void)
+{
+ char buf[MAX_CHAR_LOOPFN_NAME];
+ char *tname;
+ tree decl, type, name, t;
+ struct function *act_cfun = cfun;
+ static unsigned loopfn_num;
+
+ snprintf(buf, MAX_CHAR_LOOPFN_NAME, "%s.$loopfn", current_function_name());
+ ASM_FORMAT_PRIVATE_NAME(tname, buf, loopfn_num++);
+ clean_symbol_name(tname);
+ name = get_identifier(tname);
+ type = build_function_type_list(void_type_node,ptr_type_node, NULL_TREE);
+
+ decl = build_decl (FUNCTION_DECL, name, type);
+
+ if (!parallelized_functions)
+ parallelized_functions = BITMAP_GGC_ALLOC ();
+ bitmap_set_bit(parallelized_functions, DECL_UID (decl));
+
+ /* Nonzero if function has been defined */
+ TREE_STATIC (decl) = 1;
+
+ /* Nonzero if the name is used in its scope */
+ TREE_USED (decl) = 1;
+
+ /* Represents a compiler-generated entity. */
+ DECL_ARTIFICIAL (decl) = 1;
+
+ /* Nonzero for a given ..._DECL node means that the name of this node
+ should be ignored for symbolic debug purposes. */
+ DECL_IGNORED_P (decl) = 0;
+
+ /* Nonzero if name is accessible from outside this module */
+ TREE_PUBLIC (decl) = 0;
+
+ /* In a FUNCTION_DECL, nonzero if the function cannot be inlined. */
+ DECL_UNINLINABLE (decl) = 1;
+
+ /* In a VAR_DECL or FUNCTION_DECL, nonzero means external reference:
+ do not allocate storage, and refer to a definition elsewhere. Note that
+ this does not necessarily imply the entity represented by NODE
+ has no program source-level definition in this translation unit. For
+ example, for a FUNCTION_DECL, DECL_SAVED_TREE may be non-NULL and
+ DECL_EXTERNAL may be true simultaneously; that can be the case for
+ a C99 "extern inline" function. */
+ DECL_EXTERNAL (decl) = 0;
+
+ /* This points to either the FUNCTION_DECL for the containing function
+ the RECORD_TYPE or UNION_TYPE for the containing type, or
+ NULL_TREE or a TRANSLATION_UNIT_DECL if the given decl has "file
+ scope". */
+ DECL_CONTEXT (decl) = NULL_TREE;
+
+ /* Holds the tree of BINDINGs. */
+ DECL_INITIAL (decl) = make_node (BLOCK);
+
+ t = build_decl (RESULT_DECL, NULL_TREE, void_type_node);
+ /* TODO: Is add_referenced_var(t) needed? */
+ DECL_ARTIFICIAL (t) = 1;
+ DECL_IGNORED_P (t) = 1;
+ /* In FUNCTION_DECL, holds the decl for the return value. */
+ DECL_RESULT (decl) = t;
+
+ t = build_decl (PARM_DECL, get_identifier (".paral_data_param"),
+ ptr_type_node);
+ /* TODO: Why add_referenced_var(t) is needed? */
+ add_referenced_var(t);
+ DECL_ARTIFICIAL (t) = 1;
+ /* For a PARM_DECL, records the data type used to pass the argument,
+ which may be different from the type seen in the program. */
+ DECL_ARG_TYPE (t) = ptr_type_node;
+
+ DECL_CONTEXT (t) = decl;
+ TREE_USED (t) = 1;
+ DECL_ARGUMENTS (decl) = t;
+
+ allocate_struct_function(decl, false);
+
+ /* The call to allocate_struct_function clobbers CFUN, so we need to restore
+ it. */
+ set_cfun(act_cfun);
+
+ /*
+ * ALTERNATIVE CODE:
+ *
+ * push_struct_function (decl);
+ * pop_cfun ();
+ */
+
+ return decl;
+}
+
+/* Dummy analysis */
+bool
+is_parallelizable_loop(struct loop * loop)
+{
+ bool analysis_result = false;
+
+ if (flag_xark_debug && dump_file)
+ {
+ hello_from("is_parallelizable_loop", dump_file);
+ fprintf(dump_file, "flag_openmp: %d\n", flag_openmp);
+ fprintf(dump_file, "flag_tree_parallelize_loops: %d\n",
+ flag_tree_parallelize_loops);
+ fprintf(dump_file,
+ "IDENTIFIER_POINTER(DECL_NAME(current_function_decl)): %s\n",
+ IDENTIFIER_POINTER(DECL_NAME(current_function_decl)));
+ fprintf(dump_file, "loop->num: %d\n", loop->num);
+ }
+
+ if (!(flag_openmp) && (flag_tree_parallelize_loops <= 1) && (strcmp(
+ IDENTIFIER_POINTER(DECL_NAME(current_function_decl)), "main") == 0) && (loop->num == 1))
+ {
+ analysis_result = true;
+ }
+ else
+ {
+ analysis_result = false;
+ }
+
+ if (flag_xark_debug && dump_file)
+ {
+ fprintf(dump_file, "analysis_result: %d\n", analysis_result);
+ bye_from("is_parallelizable_loop", dump_file);
+ }
+
+ return analysis_result;
+}
+
+/* Adapted from tree-parloops.c:gen_parallel_loop and tree-parloops.c:create_parallel_loop */
+void
+parallelize_loop(struct loop * loop)
+{
+
+ basic_block bb_latch, bb_header;
+ VEC (edge, heap) *exits;
+ edge exit;
+ unsigned i;
+ edge true_edge, false_edge;
+ edge lpe, lle;
+ edge start;
+ edge region_entry, region_exit;
+ tree arg_struct, new_arg_struct;
+ static tree loop_fn;
+ basic_block bb_omp_parallel_head, bb_omp_parallel_return;
+ gimple_stmt_iterator gsi;
+ gimple stmt_omp_parallel_head, cond_stmt, stmt;
+ tree param;
+ tree cvar, cvar_init, initvar, cvar_next, cvar_base, type;
+ gimple phi;
+ basic_block for_bb, ex_bb;
+ edge guard, end, e;
+ tree t;
+ gimple for_stmt;
+ tree old_current_function_decl;
+ struct cgraph_node * node;
+
+ bb_latch = loop->latch;
+ bb_header = loop->header;
+
+ if (flag_xark_debug && dump_file)
+ {
+ hello_from("parallelize_loop", dump_file);
+ fprintf(dump_file, "Loop number %d (header = %d, latch = %d)\n",
+ loop->num, bb_header->index, bb_latch->index);
+
+ exits = get_loop_exit_edges(loop);
+ for (i = 0; VEC_iterate (edge, exits, i, exit); i++)
+ {
+ fprintf(dump_file,
+ "Loop edge exit number %d -> src: bb%d dest: bb%d\n", i,
+ exit->src->index, exit->dest->index);
+ }
+ VEC_free (edge, heap, exits);
+ }
+
+ extract_true_false_edges_from_block(loop->header, &true_edge, &false_edge);
+
+ if (flag_xark_debug && dump_file)
+ {
+ fprintf(dump_file, "Edges from block loop->header:\n");
+ fprintf(dump_file, " True: src: bb%d dest: bb%d\n",
+ true_edge->src->index, true_edge->dest->index);
+ fprintf(dump_file, " False: src: bb%d dest: bb%d\n",
+ false_edge->src->index, false_edge->dest->index);
+ }
+
+ lpe = loop_preheader_edge(loop);
+ lle = loop_latch_edge(loop);
+
+ if (flag_xark_debug && dump_file)
+ {
+ my_print_loops(dump_file);
+ fprintf(dump_file, "Loop %d is going to be cancelled\n", loop->num);
+ }
+ /* FIXME: Do I want to cancel all loop tree or only outermost? */
+ cancel_loop_tree(loop);
+
+ if (flag_xark_debug && dump_file)
+ {
+ my_print_loops(dump_file);
+ dump_dom_info_state(CDI_DOMINATORS, dump_file);
+ dump_dom_info_state(CDI_POST_DOMINATORS, dump_file);
+ }
+
+ /* FIXME: It will fail if lpe->src has various predecessors
+ * create_empty_bb()? */
+ start = single_pred_edge(lpe->src);
+
+ if (flag_xark_debug && dump_file)
+ {
+ fprintf(dump_file, "start edge src: bb%d dest: bb%d\n",
+ start->src->index, start->dest->index);
+ }
+ split_edge(start);
+ if (flag_xark_debug && dump_file)
+ {
+
+ fprintf(dump_file, "start_edge (after splitting) src: bb%d dest: bb%d\n",
+ start->src->index, start->dest->index);
+ }
+
+ region_entry = single_succ_edge(start->dest);
+ region_exit = false_edge;
+
+ if (flag_xark_debug && dump_file)
+ {
+ fprintf(dump_file,
+ "region_entry edge (after splitting) src: bb%d dest: bb%d\n",
+ region_entry->src->index, region_entry->dest->index);
+ my_print_loops(dump_file);
+ }
+
+ eliminate_local_variables(start, region_exit);
+
+ separate_decls_in_region(region_entry, region_exit, &arg_struct,
+ &new_arg_struct);
+
+ if (flag_xark_debug && dump_file)
+ {
+ my_print_loops(dump_file);
+
+ fprintf(dump_file, "true edge: src: bb%d dest: bb%d\n",
+ true_edge->src->index, true_edge->dest->index);
+ fprintf(dump_file, "false edge: src: bb%d dest: bb%d\n",
+ false_edge->src->index, false_edge->dest->index);
+ fprintf(dump_file, "lpe edge: src: bb%d dest: bb%d\n", lpe->src->index,
+ lpe->dest->index);
+ fprintf(dump_file, "lle edge: src: bb%d dest: bb%d\n", lle->src->index,
+ lle->dest->index);
+ fprintf(dump_file, "start edge: src: bb%d dest: bb%d\n",
+ start->src->index, start->dest->index);
+ fprintf(dump_file, "region_entry edge: src: bb%d dest: bb%d\n",
+ region_entry->src->index, region_entry->dest->index);
+ fprintf(dump_file, "region_exit edge: src: bb%d dest: bb%d\n",
+ region_exit->src->index, region_exit->dest->index);
+ }
+
+ loop_fn = create_loop_fn();
+ bb_omp_parallel_head = region_entry->src;
+ gsi = gsi_last_bb(bb_omp_parallel_head);
+ stmt_omp_parallel_head = gimple_build_omp_parallel(NULL, NULL, loop_fn,
+ arg_struct);
+ gsi_insert_after(&gsi, stmt_omp_parallel_head, GSI_NEW_STMT);
+
+ if (arg_struct)
+ {
+ gsi = gsi_after_labels(region_entry->dest);
+
+ param = make_ssa_name(DECL_ARGUMENTS (loop_fn), NULL);
+ stmt = gimple_build_assign (param, build_fold_addr_expr (arg_struct));
+ gsi_insert_before(&gsi, stmt, GSI_SAME_STMT);
+ SSA_NAME_DEF_STMT (param) = stmt;
+
+ stmt = gimple_build_assign (new_arg_struct,
+ fold_convert (TREE_TYPE (new_arg_struct), param));
+ gsi_insert_before(&gsi, stmt, GSI_SAME_STMT);
+ SSA_NAME_DEF_STMT (new_arg_struct) = stmt;
+ }
+
+ /* Emit GIMPLE_OMP_RETURN for GIMPLE_OMP_PARALLEL. */
+ /* TODO: Would it be enough with split_edge()?
+ * At this moment, I don't have memory tags */
+ bb_omp_parallel_return = split_loop_exit_edge(region_exit);
+ verify_loop_closed_ssa();
+ gsi = gsi_last_bb(bb_omp_parallel_return);
+ gsi_insert_after(&gsi, gimple_build_omp_return(false), GSI_NEW_STMT);
+
+ /* Extract data for GIMPLE_OMP_FOR */
+ gcc_assert(bb_header == region_exit->src);
+ cond_stmt = last_stmt(bb_header);
+
+ /* Iteration var */
+ cvar = gimple_cond_lhs(cond_stmt);
+
+ cvar_base = SSA_NAME_VAR (cvar);
+ /* Phi which defines cvar*/
+ phi = SSA_NAME_DEF_STMT (cvar);
+ /* We get the initial value from lpe
+ * FIXME: I'm not propagating the value of initial assignment. */
+ cvar_init = PHI_ARG_DEF_FROM_EDGE (phi, lpe);
+
+ initvar = make_ssa_name(cvar_base, NULL);
+ SET_USE (PHI_ARG_DEF_PTR_FROM_EDGE (phi, lpe),
+ initvar);
+ /* We get the new value from lle */
+ cvar_next = PHI_ARG_DEF_FROM_EDGE (phi, lle);
+
+ if (flag_xark_debug && dump_file)
+ {
+ fprintf(dump_file, "\n");
+ fprintf(dump_file, "cvar: ");
+ print_generic_expr(dump_file, cvar, 0);
+ fprintf(dump_file, "\n");
+ fprintf(dump_file, "cvar: ");
+ print_generic_expr(dump_file, cvar, 0);
+ fprintf(dump_file, "\n");
+ fprintf(dump_file, "cvar_init: ");
+ print_generic_expr(dump_file, cvar_init, 0);
+ fprintf(dump_file, "\n");
+ fprintf(dump_file, "initvar: ");
+ print_generic_expr(dump_file, initvar, 0);
+ fprintf(dump_file, "\n");
+ fprintf(dump_file, "cvar_next: ");
+ print_generic_expr(dump_file, cvar_next, 0);
+ fprintf(dump_file, "\n");
+ }
+
+ /* Removes incr stmt */
+ gsi = gsi_last_bb(bb_latch);
+ gcc_assert (gsi_stmt (gsi) == SSA_NAME_DEF_STMT (cvar_next));
+ gsi_remove(&gsi, true);
+
+ /* Prepare cfg. */
+ if (flag_xark_debug && dump_file)
+ {
+ my_print_loops(dump_file);
+ }
+ for_bb = split_edge(lpe);
+ ex_bb = split_loop_exit_edge(region_exit);
+ gcc_assert (false_edge == region_exit);
+
+ guard = make_edge(for_bb, ex_bb, 0);
+ single_succ_edge(bb_latch)->flags = 0;
+ /* Edge between latch and ex_bb */
+ end = make_edge(bb_latch, ex_bb, EDGE_FALLTHRU);
+
+ if (flag_xark_debug && dump_file)
+ {
+ my_print_loops(dump_file);
+ }
+
+ for (gsi = gsi_start_phis(ex_bb); !gsi_end_p(gsi); gsi_next(&gsi))
+ {
+ phi = gsi_stmt(gsi);
+ /* From header */
+ stmt = SSA_NAME_DEF_STMT (PHI_ARG_DEF_FROM_EDGE (phi, false_edge));
+ /* From for_bb if it doesn't enter in loop */
+ add_phi_arg(phi, PHI_ARG_DEF_FROM_EDGE (stmt, lpe), guard);
+ /* From latch */
+ add_phi_arg(phi, PHI_ARG_DEF_FROM_EDGE (stmt, lle),
+ end);
+ }
+
+ e = redirect_edge_and_branch(false_edge, true_edge->dest);
+ PENDING_STMT (e) = NULL;
+
+ /* Emit GIMPLE_OMP_FOR. */
+ gimple_cond_set_lhs(cond_stmt, cvar_base);
+ type = TREE_TYPE (cvar);
+ t = build_omp_clause(OMP_CLAUSE_SCHEDULE);
+ /* TODO: Isn't it the default? */
+ OMP_CLAUSE_SCHEDULE_KIND (t) = OMP_CLAUSE_SCHEDULE_STATIC;
+
+ for_stmt = gimple_build_omp_for(NULL, t, 1, NULL);
+ gimple_omp_for_set_index(for_stmt, 0, initvar);
+ gimple_omp_for_set_initial(for_stmt, 0, cvar_init);
+ gimple_omp_for_set_final(for_stmt, 0, gimple_cond_rhs(cond_stmt));
+ gimple_omp_for_set_cond(for_stmt, 0, gimple_cond_code(cond_stmt));
+ gimple_omp_for_set_incr(for_stmt, 0, build2 (PLUS_EXPR, type,
+ cvar_base,
+ build_int_cst (type, 1)));
+
+ gsi = gsi_last_bb(for_bb);
+ gsi_insert_after(&gsi, for_stmt, GSI_NEW_STMT);
+ SSA_NAME_DEF_STMT (initvar) = for_stmt;
+
+ /* Emit GIMPLE_OMP_CONTINUE. */
+ gsi = gsi_last_bb(bb_latch);
+ stmt = gimple_build_omp_continue(cvar_next, cvar);
+ gsi_insert_after(&gsi, stmt, GSI_NEW_STMT);
+ SSA_NAME_DEF_STMT (cvar_next) = stmt;
+
+ /* Emit GIMPLE_OMP_RETURN for GIMPLE_OMP_FOR. */
+ gsi = gsi_last_bb(ex_bb);
+ gsi_insert_after(&gsi, gimple_build_omp_return(true), GSI_NEW_STMT);
+
+ if (flag_xark_debug && dump_file)
+ {
+ fprintf(dump_file, "\nBefore omp_expand_local\n");
+ my_print_loops(dump_file);
+
+ dump_cgraph(dump_file);
+ dump_cgraph_state(dump_file);
+
+ }
+
+ /* OpenMP expansion */
+ omp_expand_local(bb_omp_parallel_head);
+
+ if (flag_xark_debug && dump_file)
+ {
+ fprintf(dump_file, "\nAfter omp_expand_local\n");
+ my_print_loops(dump_file);
+ dump_cgraph(dump_file);
+ dump_cgraph_state(dump_file);
+ }
+
+ /* At this point, cgraph is innaccurate.
+ * But it doesn't seem to be necessary. */
+ if (flag_xark_debug && dump_file)
+ {
+ fprintf(dump_file, "\nRebuilding cgraph\n");
+ }
+
+ /* Rebuild cgraph_edges with cfun */
+ rebuild_cgraph_edges();
+
+ if (flag_xark_debug && dump_file)
+ {
+ dump_cgraph(dump_file);
+ dump_cgraph_state(dump_file);
+ fprintf(dump_file, "*--*\n");
+ }
+
+ /* Rebuild cgraph_edges with loop_fn */
+ push_cfun(DECL_STRUCT_FUNCTION (loop_fn));
+ old_current_function_decl = current_function_decl;
+ current_function_decl = loop_fn;
+
+ node = cgraph_node(loop_fn);
+ node->analyzed = true;
+ node->lowered = true;
+ if (flag_xark_debug && dump_file)
+ {
+ dump_cgraph_node(dump_file, cgraph_node(loop_fn));
+ fprintf(dump_file, "*--*\n");
+ }
+ rebuild_cgraph_edges();
+ current_function_decl = old_current_function_decl;
+ pop_cfun();
+
+ if (flag_xark_debug && dump_file)
+ {
+ dump_cgraph(dump_file);
+ dump_cgraph_state(dump_file);
+ }
+ verify_cgraph();
+
+ /* Check for unreachable nodes */
+ cgraph_remove_unreachable_nodes(false, dump_file);
+ if (flag_xark_debug && dump_file)
+ {
+ dump_cgraph(dump_file);
+ dump_cgraph_state(dump_file);
+ fprintf(dump_file, "*--*\n");
+ dump_functions_across_cgraph(dump_file);
+ }
+
+ update_ssa(TODO_update_ssa);
+ verify_ssa(true);
+
+ verify_flow_info();
+ verify_dominators(CDI_DOMINATORS);
+ verify_loop_structure();
+ verify_loop_closed_ssa();
+
+}
+
+unsigned int
+execute_xark_backend_omp(void)
+{
+
+ loop_iterator li;
+ struct loop * loop;
+
+ if (flag_xark_debug && dump_file)
+ {
+ hello_from("execute_xark_backend_omp", dump_file);
+ /* print_bbs(dump_file); */
+ }
+
+ loop_optimizer_init(LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES);
+ if (flag_xark_debug && dump_file)
+ {
+ /* print_loops_info(dump_file); */
+ my_print_loops(dump_file);
+ }
+
+ FOR_EACH_LOOP(li, loop, 0)
+ {
+ if (is_parallelizable_loop(loop))
+ {
+ parallelize_loop(loop);
+ }
+ }
+
+ loop_optimizer_finalize();
+
+ update_ssa(TODO_update_ssa);
+ verify_ssa(true);
+ if (flag_xark_debug && dump_file)
+ {
+ bye_from("execute_xark_backend_omp", dump_file);
+ }
+
+ /*
+ * In tree-parloops.c
+ *
+ * return TODO_cleanup_cfg | TODO_rebuild_alias;
+ */
+ return 0;
+}
+
+unsigned int
+execute_xark_print(void)
+{
+
+ if (dump_file)
+ {
+ hello_from("execute_xark_print", dump_file);
+
+ verify_ssa(true);
+
+ dump_cgraph(dump_file);
+ dump_cgraph_state(dump_file);
+ dump_functions_across_cgraph(dump_file);
+
+ bye_from("execute_xark_print", dump_file);
+ }
+
+ return 0;
+}
+
+unsigned int
+execute_xark_ipa_print(void)
+{
+
+ if (dump_file)
+ {
+ hello_from("execute_xark_ipa_print", dump_file);
+
+ dump_cgraph(dump_file);
+ dump_cgraph_state(dump_file);
+ dump_functions_across_cgraph(dump_file);
+
+ bye_from("execute_xark_ipa_print", dump_file);
+ }
+
+ return 0;
+
+}
+
+static bool
+gate_xark_backend_omp(void)
+{
+ return flag_xark_backend_omp;
+}
+
+static bool
+gate_xark_print(void)
+{
+ return flag_xark_debug;
+}
+
+static bool
+gate_xark_ipa_print(void)
+{
+ return flag_xark_debug;
+}
+
+struct gimple_opt_pass pass_xark_backend_omp =
+ {
+ { GIMPLE_PASS, "xark-backend-omp", /* name */
+ gate_xark_backend_omp, /* gate */
+ execute_xark_backend_omp, /* execute */
+ NULL, /* sub */
+ NULL, /* next */
+ 0, /* static_pass_number */
+ 0, /* tv_id */
+ 0, /* properties_required */
+ 0, /* properties_provided */
+ 0, /* properties_destroyed */
+ 0, /* todo_flags_start */
+ 0
+ /* | TODO_cleanup_cfg | TODO_update_ssa | TODO_rebuild_alias */
+ | TODO_verify_ssa | TODO_verify_flow | TODO_verify_stmts | TODO_dump_func /* todo_flags_finish */
+ } };
+
+struct gimple_opt_pass pass_xark_print =
+ {
+ { GIMPLE_PASS, "xark-print", /* name */
+ gate_xark_print, /* gate */
+ execute_xark_print, /* execute */
+ NULL, /* sub */
+ NULL, /* next */
+ 0, /* static_pass_number */
+ 0, /* tv_id */
+ 0, /* properties_required */
+ 0, /* properties_provided */
+ 0, /* properties_destroyed */
+ 0, /* todo_flags_start */
+ 0 | TODO_dump_func /* todo_flags_finish */
+ } };
+
+struct simple_ipa_opt_pass pass_xark_ipa_print =
+ {
+ { SIMPLE_IPA_PASS, "xark-ipa-print", /* name */
+ gate_xark_ipa_print, /* gate */
+ execute_xark_ipa_print, /* execute */
+ NULL, /* sub */
+ NULL, /* next */
+ 0, /* static_pass_number */
+ 0, /* tv_id */
+ 0, /* properties_required */
+ 0, /* properties_provided */
+ 0, /* properties_destroyed */
+ 0, /* todo_flags_start */
+ 0 /* todo_flags_finish */
+ } };
+
+#include "gt-tree-xark-backend-omp.h"
Index: gcc/common.opt
===================================================================
--- gcc/common.opt (revisiÃn: 147760)
+++ gcc/common.opt (copia de trabajo)
@@ -1274,6 +1274,18 @@
Common Report Var(flag_tree_vrp) Init(0) Optimization
Perform Value Range Propagation on trees
+fxark-debug
+Common Report Var(flag_xark_debug)
+XARK debug information
+
+fxark-backend-omp
+Common Report Var(flag_xark_backend_omp)
+Generates OpenMP code based on XARK recognition
+
+fverify-ssa-debug
+Common Report Var(flag_verify_ssa_debug)
+verify_ssa debug information
+
funit-at-a-time
Common Report Var(flag_unit_at_a_time) Init(1) Optimization
Compile whole compilation unit at a time
Index: gcc/tree-ssa.c
===================================================================
--- gcc/tree-ssa.c (revisiÃn: 147760)
+++ gcc/tree-ssa.c (copia de trabajo)
@@ -539,7 +539,30 @@
tree op;
enum dom_state orig_dom_state = dom_info_state (CDI_DOMINATORS);
bitmap names_defined_in_bb = BITMAP_ALLOC (NULL);
+ FILE *verify_ssa_dump_file = NULL;
+
+ if (flag_verify_ssa_debug)
+ {
+ verify_ssa_dump_file = fopen("verify_ssa.dump", "a");
+ fprintf(verify_ssa_dump_file,
+ "*-* *-* *-* *-* *-* *-* *-* *-* *-* *-* *-* *-*\n");
+ fprintf(verify_ssa_dump_file,
+ "*-* verify_ssa:print_current_pass *-*\n");
+ print_current_pass(verify_ssa_dump_file);
+ fprintf(verify_ssa_dump_file,
+ "*-* verify_ssa:dump_current_function_decl *-*\n");
+ dump_function_to_file(current_function_decl, verify_ssa_dump_file,
+ TDF_RAW);
+ fprintf(verify_ssa_dump_file,
+ "*-* verify_ssa:dump_referenced_vars *-*\n");
+ dump_referenced_vars(verify_ssa_dump_file);
+ fprintf(verify_ssa_dump_file,
+ "*-* verify_ssa:dump_immediate_uses *-*\n");
+ dump_immediate_uses(verify_ssa_dump_file);
+ fflush(verify_ssa_dump_file);
+ }
+
gcc_assert (!need_ssa_update_p (cfun));
verify_stmts ();
@@ -706,6 +729,16 @@
}
bitmap_clear (names_defined_in_bb);
+
+ if (flag_verify_ssa_debug)
+ {
+ fprintf(verify_ssa_dump_file, "*-* verify_ssa:");
+ fprintf(verify_ssa_dump_file, "bb%d ", bb->index);
+ fprintf(verify_ssa_dump_file, "of %s ", current_function_name());
+ fprintf(verify_ssa_dump_file, "funtion processed *-*\n");
+ fflush(verify_ssa_dump_file);
+ }
+
}
free (definition_block);
Index: gcc/tree-flow.h
===================================================================
--- gcc/tree-flow.h (revisiÃn: 147760)
+++ gcc/tree-flow.h (copia de trabajo)
@@ -586,6 +586,12 @@
extern void phinodes_print_statistics (void);
#endif
+/* XARK Passes */
+/* In tree-xark-backend-omp.c */
+extern unsigned int execute_xark_backend_omp(void);
+extern unsigned int execute_xark_print(void);
+extern unsigned int execute_xark_ipa_print(void);
+
/* In gimple-low.c */
extern void record_vars_into (tree, tree);
extern void record_vars (tree);
Index: gcc/Makefile.in
===================================================================
--- gcc/Makefile.in (revisiÃn: 147760)
+++ gcc/Makefile.in (copia de trabajo)
@@ -1286,6 +1286,7 @@
tree-vect-slp.o \
tree-vectorizer.o \
tree-vrp.o \
+ tree-xark-backend-omp.o \
tree.o \
value-prof.o \
var-tracking.o \
@@ -2199,6 +2200,13 @@
$(TREE_FLOW_H) $(TREE_PASS_H) $(TREE_DUMP_H) $(DIAGNOSTIC_H) $(GGC_H) \
$(BASIC_BLOCK_H) tree-ssa-propagate.h $(FLAGS_H) $(TREE_DUMP_H) \
$(CFGLOOP_H) tree-chrec.h $(TIMEVAR_H) $(TOPLEV_H) intl.h
+
+tree-xark-backend-omp.o : tree-xark-backend-omp.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
+ $(RTL_H) $(GIMPLE_H) $(TREE_INLINE_H) langhooks.h $(DIAGNOSTIC_H) \
+ $(TREE_FLOW_H) $(TIMEVAR_H) $(FLAGS_H) $(EXPR_H) $(TOPLEV_H) tree-pass.h \
+ $(GGC_H) except.h $(SPLAY_TREE_H) $(OPTABS_H) $(CFGLOOP_H) $(CGRAPH_H)\
+ tree-iterator.h gt-tree-xark-backend-omp.h
+
tree-cfg.o : tree-cfg.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
$(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(GGC_H) $(FLAGS_H) output.h \
$(DIAGNOSTIC_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) coretypes.h \
@@ -3328,6 +3336,7 @@
$(srcdir)/gimple.h $(srcdir)/gimple.c \
$(srcdir)/tree-mudflap.c $(srcdir)/tree-flow.h \
$(srcdir)/tree-ssanames.c $(srcdir)/tree-eh.c $(srcdir)/tree-ssa-address.c \
+ $(srcdir)/tree-xark-backend-omp.c \
$(srcdir)/tree-cfg.c \
$(srcdir)/tree-dfa.c \
$(srcdir)/tree-iterator.c $(srcdir)/gimplify.c \
Index: gcc/passes.c
===================================================================
--- gcc/passes.c (revisiÃn: 147760)
+++ gcc/passes.c (copia de trabajo)
@@ -544,6 +544,11 @@
NEXT_PASS (pass_referenced_vars);
NEXT_PASS (pass_build_ssa);
+ /* XARK starts here */
+ NEXT_PASS (pass_xark_print);
+ NEXT_PASS (pass_xark_backend_omp);
+ NEXT_PASS (pass_xark_print);
+
NEXT_PASS (pass_early_warn_uninitialized);
NEXT_PASS (pass_all_early_optimizations);
{
@@ -568,7 +573,9 @@
NEXT_PASS (pass_release_ssa_names);
NEXT_PASS (pass_rebuild_cgraph_edges);
NEXT_PASS (pass_inline_parameters);
+ NEXT_PASS (pass_xark_print);
}
+ NEXT_PASS (pass_xark_ipa_print);
NEXT_PASS (pass_ipa_increase_alignment);
NEXT_PASS (pass_ipa_matrix_reorg);
NEXT_PASS (pass_ipa_cp);