This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH] Improve RTL ifcvt for empty else_bb (PR rtl-optimization/80491)
- From: Jakub Jelinek <jakub at redhat dot com>
- To: Richard Biener <rguenther at suse dot de>, Jeff Law <law at redhat dot com>, Eric Botcazou <ebotcazou at adacore dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Tue, 25 Apr 2017 21:31:00 +0200
- Subject: [PATCH] Improve RTL ifcvt for empty else_bb (PR rtl-optimization/80491)
- Authentication-results: sourceware.org; auth=none
- Authentication-results: ext-mx07.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com
- Authentication-results: ext-mx07.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=jakub at redhat dot com
- Dkim-filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 2A9ECC0467F1
- Dmarc-filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 2A9ECC0467F1
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
Hi!
The following patch is a partial fix for PR80491, an improvement for
if-conversion if there is empty else_bb.
What we do right now is in that case we only look at the immediately
preceeding (non-debug/non-note) instruction before cond_earliest
and if it is not the set of x, we just turn it into attempt to
use previous value of x rather than whatever it has been initialized to.
On the testcases from the PR, we have:
(insn 7 4 8 2 (set (reg:DI 97 [ _15+8 ])
(const_int 0 [0])) 81 {*movdi_internal}
(nil))
(insn 8 7 9 2 (set (reg:DI 104 [ a_11(D)->low ])
(mem:DI (reg/v/f:DI 101 [ a ]) [2 a_11(D)->low+0 S8 A64])) 81 {*movdi_internal}
(nil))
(insn 9 8 10 2 (set (reg:DI 105 [ b_12(D)->low ])
(mem:DI (reg/v/f:DI 102 [ b ]) [2 b_12(D)->low+0 S8 A64])) 81 {*movdi_internal}
(nil))
(insn 10 9 11 2 (parallel [
(set (reg:CCC 17 flags)
(compare:CCC (plus:DI (reg:DI 104 [ a_11(D)->low ])
(reg:DI 105 [ b_12(D)->low ]))
(reg:DI 104 [ a_11(D)->low ])))
(set (reg:DI 103)
(plus:DI (reg:DI 104 [ a_11(D)->low ])
(reg:DI 105 [ b_12(D)->low ])))
]) 316 {*adddi3_cc_overflow_1}
(expr_list:REG_DEAD (reg:DI 105 [ b_12(D)->low ])
(expr_list:REG_DEAD (reg:DI 104 [ a_11(D)->low ])
(nil))))
(jump_insn 11 10 14 2 (set (pc)
(if_then_else (ltu (reg:CCC 17 flags)
(const_int 0 [0]))
(label_ref 14)
(pc))) 617 {*jcc_1}
(expr_list:REG_DEAD (reg:CCC 17 flags)
(int_list:REG_BR_PROB 4 (nil)))
-> 14)
(code_label 14 11 35 3 3 (nil) [1 uses])
(note 35 14 15 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 15 35 16 3 (set (reg:DI 97 [ _15+8 ])
(const_int 1 [0x1])) 81 {*movdi_internal}
(nil))
(code_label 16 15 36 4 2 (nil) [0 uses])
If insn 7 would come after insn 9 (which doesn't change the behavior, as
insn 8 and insn 9 don't clobber pseudo 97 and const_int 0 is constant),
we'd turn that into a setcc pattern, but otherwise we fail.
This patch let us search for x's setter earlier in the bb.
During testing I found that modified_in_p/modified_in_between_p don't
actually take into account calls that could change MEMs, so the patch
handles that too.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
During bootstrap/regtest it seems to trigger over 10000 times, but
my statistics collection only noted cases where it found an earlier setter
and the noce_process_if_block has been successful, has not attempted to
check if insn_b/set_b was NULL whether it would otherwise fail (but even
in cases where it doesn't fail perhaps with the patch it can generate
better code).
2017-04-25 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/80491
* ifcvt.c (noce_process_if_block): When looking for x setter
with missing else_bb, don't check only the insn right before
cond_earliest, but look for the last insn that x is modified in
within the same bb.
--- gcc/ifcvt.c.jj 2017-04-24 19:28:02.151975658 +0200
+++ gcc/ifcvt.c 2017-04-25 16:22:16.760928057 +0200
@@ -3440,7 +3440,36 @@ noce_process_if_block (struct noce_if_in
}
else
{
- insn_b = prev_nonnote_nondebug_insn (if_info->cond_earliest);
+ bool call_crossed = false;
+ insn_b = if_info->cond_earliest;
+ do
+ {
+ insn_b = prev_nonnote_nondebug_insn (insn_b);
+ if (!insn_b
+ || (BLOCK_FOR_INSN (insn_b)
+ != BLOCK_FOR_INSN (if_info->cond_earliest))
+ || modified_in_p (x, insn_b))
+ break;
+ /* modified_in_p does not consider calls changing MEMs,
+ so punt on calls if x or SET_SRC (set_b) are MEMs that
+ could be changed by calls. */
+ if (CALL_P (insn_b))
+ {
+ call_crossed = true;
+ if (MEM_P (x) && !MEM_READONLY_P (x))
+ break;
+ }
+ }
+ while (1);
+ if (!call_crossed)
+ for (rtx_insn *insn_c = if_info->cond_earliest; insn_c != jump;
+ insn_c = NEXT_INSN (insn_c))
+ if (CALL_P (insn_c))
+ {
+ call_crossed = true;
+ break;
+ }
+
/* We're going to be moving the evaluation of B down from above
COND_EARLIEST to JUMP. Make sure the relevant data is still
intact. */
@@ -3468,6 +3497,29 @@ noce_process_if_block (struct noce_if_in
insn_b = NULL;
set_b = NULL_RTX;
}
+ else if (call_crossed)
+ {
+ subrtx_iterator::array_type array;
+ /* Punt if there are any non-readonly MEMs and there are any
+ calls in between insn_b and jump. Even for
+ RTL_CONST_OR_PURE_CALL_P calls the call argument stack slots
+ might be modified, so unless we are prepared to carefully
+ analyze what might be modified and what can't, just punt. */
+ FOR_EACH_SUBRTX (iter, array, SET_SRC (set_b), ALL)
+ if (MEM_P (*iter) && !MEM_READONLY_P (*iter))
+ {
+ insn_b = NULL;
+ set_b = NULL_RTX;
+ break;
+ }
+ FOR_EACH_SUBRTX (iter, array, x, ALL)
+ if (MEM_P (*iter) && !MEM_READONLY_P (*iter))
+ {
+ insn_b = NULL;
+ set_b = NULL_RTX;
+ break;
+ }
+ }
}
/* If x has side effects then only the if-then-else form is safe to
Jakub