This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH] update edge profile info in nvptx.c


The recent basic block profiling changes broke a couple of libgomp
OpenACC execution tests involving reductions with nvptx offloading. For
gang and worker reductions, the nvptx BE updates the original reduction
variable using a lock-free atomic algorithm. This lock-free algorithm
utilizes a polling loop to check the state of the variable being
updated. This loop introduces a new basic block edge, but that edge was
never assigned a branch probability. Because of the highly threaded
nature of CUDA accelerators, I set the branch probability for that edge
to even (profile_probability::even (), i.e. 50%).
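
For readers unfamiliar with the pattern, the polling loop has roughly the
shape of a compare-and-swap retry loop. The following is only a host-side
sketch using std::atomic, not the code the nvptx BE emits; the name
lockless_update_sketch and its signature are made up for illustration.
The "retry" path of the do/while corresponds to the new self-edge on the
loop block.

```cpp
#include <atomic>

// Hypothetical illustration only: atomically apply OP to VAR without
// taking a lock.  This is not the GCC implementation.
template <typename T, typename Op>
T lockless_update_sketch (std::atomic<T> &var, T operand, Op op)
{
  T expected = var.load ();
  T desired;
  do
    {
      // Compute the new value from the most recently observed one.
      desired = op (expected, operand);
      // If another thread changed VAR in the meantime, the exchange
      // fails, EXPECTED is refreshed with the current value, and we go
      // around the loop again (the analogue of the loop edge).
    }
  while (!var.compare_exchange_weak (expected, desired));
  return desired;
}
```

How often the retry path is taken depends on contention, which is why no
strong prediction in either direction is assumed here.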

Similarly, for nvptx vector reductions, when it comes time to initialize
the reduction variable, the nvptx BE constructs a branch so that only
vector lanes 1 to vector_length-1 are initialized to the default value
for a given reduction type, while vector lane 0 retains the original
value of the reduction variable. For the same reason as with the gang
and worker reductions, I set the probability of the new edge introduced
for the vector reduction to even.
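
As a rough picture of the per-lane initialization described above, here
is a small host-side simulation. It is a sketch under assumptions, not
the GIMPLE the BE generates; init_lanes_sketch and its parameters are
hypothetical names, and "identity" stands for the default value of the
reduction type (e.g. 0 for '+').

```cpp
#include <vector>

// Hypothetical illustration only: compute the starting value each
// vector lane would hold before a reduction.
std::vector<int> init_lanes_sketch (int original, int identity,
                                    int vector_length)
{
  std::vector<int> lanes (vector_length);
  for (int lane = 0; lane < vector_length; ++lane)
    {
      // Lane 0 skips the initialization and keeps the incoming value;
      // every other lane starts from the reduction's default value.
      lanes[lane] = (lane == 0) ? original : identity;
    }
  return lanes;
}
```

With this layout, combining all lanes of a '+' reduction yields the
original value plus the per-lane contributions, with the original
counted exactly once.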

Is this OK for trunk?

Cesar
2017-07-13  Cesar Philippidis  <cesar@codesourcery.com>

	gcc/
	* config/nvptx/nvptx.c (nvptx_lockless_update): Update edge
	profiling information.
	(nvptx_lockfull_update): Likewise.
	(nvptx_goacc_reduction_init): Likewise.


diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index c8847a5dbba..3a24bd375ca 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -4985,6 +4985,7 @@ nvptx_lockless_update (location_t loc, gimple_stmt_iterator *gsi,
 
   post_edge->flags ^= EDGE_TRUE_VALUE | EDGE_FALLTHRU;
   edge loop_edge = make_edge (loop_bb, loop_bb, EDGE_FALSE_VALUE);
+  loop_edge->probability = profile_probability::even ();
   set_immediate_dominator (CDI_DOMINATORS, loop_bb, pre_bb);
   set_immediate_dominator (CDI_DOMINATORS, post_bb, loop_bb);
 
@@ -5057,7 +5058,8 @@ nvptx_lockfull_update (location_t loc, gimple_stmt_iterator *gsi,
   
   /* Create the lock loop ... */
   locked_edge->flags ^= EDGE_TRUE_VALUE | EDGE_FALLTHRU;
-  make_edge (lock_bb, lock_bb, EDGE_FALSE_VALUE);
+  edge e = make_edge (lock_bb, lock_bb, EDGE_FALSE_VALUE);
+  e->probability = profile_probability::even ();
   set_immediate_dominator (CDI_DOMINATORS, lock_bb, entry_bb);
   set_immediate_dominator (CDI_DOMINATORS, update_bb, lock_bb);
 
@@ -5211,6 +5213,7 @@ nvptx_goacc_reduction_init (gcall *call)
       
       /* Create false edge from call_bb to dst_bb.  */
       edge nop_edge = make_edge (call_bb, dst_bb, EDGE_FALSE_VALUE);
+      nop_edge->probability = profile_probability::even ();
 
       /* Create phi node in dst block.  */
       gphi *phi = create_phi_node (lhs, dst_bb);
