This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[og7] backport fix for PR84952
- From: Cesar Philippidis <cesar at codesourcery dot com>
- To: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Cc: Tom de Vries <tom at codesourcery dot com>
- Date: Tue, 20 Mar 2018 11:27:53 -0700
- Subject: [og7] backport fix for PR84952
I've applied this patch to openacc-gcc-7-branch which backports Tom's
fix for the nvptx bar.sync placement bug in PR84952. This patch also
reverts some changes I introduced in git revision 7445a4d40.
Tom's patch didn't apply cleanly because I recently renamed
nvptx_wsync to nvptx_cta_sync so that the function can be used for both
large vector_lengths and workers. Other than that, I didn't have
to make any changes to his patch.
Cesar
2018-03-20 Cesar Philippidis <cesar@codesourcery.com>
gcc/
* config/nvptx/nvptx.c (nvptx_single): Revert changes from
7445a4d40.
Backport from trunk:
2018-03-20 Tom de Vries <tom@codesourcery.com>
PR target/84952
* config/nvptx/nvptx.c (nvptx_single): Don't neuter bar.sync.
(nvptx_process_pars): Emit bar.sync asap and alap.
diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 070d236fa87..b7e3f59fed7 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -3988,7 +3988,9 @@ nvptx_single (unsigned mask, basic_block from, basic_block to)
while (true)
{
/* Find first insn of from block. */
- while (head != BB_END (from) && !INSN_P (head))
+ while (head != BB_END (from)
+ && (!INSN_P (head)
+ || recog_memoized (head) == CODE_FOR_nvptx_barsync))
head = NEXT_INSN (head);
if (from == to)
@@ -4037,6 +4039,7 @@ nvptx_single (unsigned mask, basic_block from, basic_block to)
{
default:
break;
+ case CODE_FOR_nvptx_barsync:
case CODE_FOR_nvptx_fork:
case CODE_FOR_nvptx_forked:
case CODE_FOR_nvptx_joining:
@@ -4056,15 +4059,6 @@ nvptx_single (unsigned mask, basic_block from, basic_block to)
return;
}
- /* NVPTX_BARSYNC barriers are placed immediately before NVPTX_JOIN
- in order to ensure that all of the threads in a CTA reach the
- barrier. Don't nueter BLOCK if head is NVPTX_BARSYNC and tail is
- NVPTX_JOIN. */
- if (from == to
- && recog_memoized (head) == CODE_FOR_nvptx_barsync
- && recog_memoized (tail) == CODE_FOR_nvptx_join)
- return;
-
/* Insert the vector test inside the worker test. */
unsigned mode;
rtx_insn *before = tail;
@@ -4112,17 +4106,7 @@ nvptx_single (unsigned mask, basic_block from, basic_block to)
br = gen_br_true (pred, label);
else
br = gen_br_true_uni (pred, label);
-
- if (recog_memoized (head) == CODE_FOR_nvptx_forked
- && recog_memoized (NEXT_INSN (head)) == CODE_FOR_nvptx_barsync)
- {
- head = NEXT_INSN (head);
- emit_insn_after (br, head);
- }
- else if (recog_memoized (head) == CODE_FOR_nvptx_barsync)
- emit_insn_after (br, head);
- else
- emit_insn_before (br, head);
+ emit_insn_before (br, head);
LABEL_NUSES (label)++;
if (tail_branch)
@@ -4348,7 +4332,7 @@ nvptx_process_pars (parallel *par)
if (!empty || !is_call)
{
/* Insert begin and end synchronizations. */
- emit_insn_after (nvptx_cta_sync (false), par->forked_insn);
+ emit_insn_before (nvptx_cta_sync (false), par->forked_insn);
emit_insn_before (nvptx_cta_sync (true), par->join_insn);
}
}
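
The first hunk changes how nvptx_single finds the head of the region it
neuters: it now steps over bar.sync insns as well as non-insns, so the
barrier stays outside the predicated region. Below is a minimal
standalone sketch of that skipping loop. It is not GCC code: the Kind
enum, the vector standing in for the insn chain, and find_head are all
hypothetical names invented for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for insn kinds; these names do not exist in GCC.
enum class Kind { Note, BarSync, Forked, Other };

// Toy model of the patched "find first insn" loop.  Skip entries that
// are not real insns (Note) or that are barriers (BarSync), but never
// advance past the last entry of the block, which mirrors the
// `head != BB_END (from)` guard in the patch.
std::size_t find_head(const std::vector<Kind>& block) {
    std::size_t head = 0;
    while (head + 1 < block.size()
           && (block[head] == Kind::Note || block[head] == Kind::BarSync))
        ++head;
    return head;
}
```

In this model a block laid out as Note, BarSync, Forked, Other yields
index 2, so a branch emitted before the head would land after the
barrier rather than in front of it.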