[PATCH 2/4] [ifcvt] optimize x=c ? (y op z) : y by RISC-V Zicond like insns
Jeff Law
jeffreyalaw@gmail.com
Wed Nov 29 05:26:35 GMT 2023
On 11/27/23 19:32, Fei Gao wrote:
> op=[PLUS, MINUS, IOR, XOR, ASHIFT, ASHIFTRT, LSHIFTRT, ROTATE, ROTATERT]
>
> SIGN_EXTEND, ZERO_EXTEND and SUBREG has been considered
> to support SImode in 64-bit machine.
Let's defer these for now. We're supposed to be wrapping up work that
was posted before stage1 closed. If these opcodes were trivial to
support, then I would let them through, but SUBREGs for example can be
problematical as their semantics can be complex.
>
> Conditional op, if zero
> rd = (rc == 0) ? (rs1 op rs2) : rs1
> -->
> czero.nez rd, rs2, rc
> op rd, rs1, rd
>
> Conditional op, if non-zero
> rd = (rc != 0) ? (rs1 op rs2) : rs1
> -->
> czero.eqz rd, rs2, rc
> op rd, rs1, rd
>
> Co-authored-by: Xiao Zeng<zengxiao@eswincomputing.com>
>
> gcc/ChangeLog:
>
> * ifcvt.cc (noce_try_cond_zero_arith):handler for condtional zero based ifcvt
> (noce_emit_czero): helper for noce_try_cond_zero_arith
> (noce_cond_zero_binary_op_supported): check supported OPs for condtional zero based ifcvt
> (get_base_reg): get base reg of a subreg or the reg itself
> (noce_bbs_ok_for_cond_zero_arith): check if BBs are OK for condtional zero based ifcvt
> (noce_process_if_block): add noce_try_cond_zero_arith
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/zicond_ifcvt_opt.c: New test.
> ---
> gcc/ifcvt.cc | 210 ++++++
> .../gcc.target/riscv/zicond_ifcvt_opt.c | 682 ++++++++++++++++++
> 2 files changed, 892 insertions(+)
> create mode 100644 gcc/testsuite/gcc.target/riscv/zicond_ifcvt_opt.c
>
> diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
> index a0af553b9ff..8f6a0e7f5fe 100644
> --- a/gcc/ifcvt.cc
> +++ b/gcc/ifcvt.cc
> @@ -787,6 +787,7 @@ static rtx noce_get_alt_condition (struct noce_if_info *, rtx, rtx_insn **);
> static bool noce_try_minmax (struct noce_if_info *);
> static bool noce_try_abs (struct noce_if_info *);
> static bool noce_try_sign_mask (struct noce_if_info *);
> +static int noce_try_cond_zero_arith (struct noce_if_info *);
>
> /* Return the comparison code for reversed condition for IF_INFO,
> or UNKNOWN if reversing the condition is not possible. */
> @@ -1831,6 +1832,40 @@ noce_emit_cmove (struct noce_if_info *if_info, rtx x, enum rtx_code code,
> return NULL_RTX;
> }
>
> +/* Emit a conditional zero, returning TARGET or NULL_RTX upon failure.
> + IF_INFO describes the if-conversion scenario under consideration.
> + CZERO_CODE selects the condition (EQ/NE).
> + NON_ZERO_OP is the nonzero operand of the conditional move
> + TARGET is the desired output register. */
> +
> +static rtx
> +noce_emit_czero (struct noce_if_info *if_info, enum rtx_code czero_code,
> + rtx non_zero_op, rtx target)
[ ... ]
The code you wrote is safe in that if constructs a suitable if-then-else
as a single object, starts a new sequence the uses emit_insn to put that
object onto a sequence. Then you extract that one and only one insn
from the sequence and validate it can be recognized.
In cases where you want to do something like this and know you're going
to emit one and only one insn you can use 'make_insn_raw' without
generating a new sequence.
I would suggest you replace all the code starting with start_sequence()
and ending with end_sequence () (inclusive) with something like
insn = make_insn_raw (set);
if (recog_memoized (insn) >= 0)
{
emit_insn (insn);
return target;
}
return NULL_RTX;
Note that in the future (gcc-15) when this code is generalized to use
the expander it will potentially generate multiple insns at which point
we'll have to put them on a sequence and validate they all are
recognizable. But we'll tackle that at the appropriate time.
> +
> + return false;
> +}
> +
> +/* Helper function to return REG itself or inner expression of a SUBREG,
> + otherwise NULL_RTX for other RTX_CODE. */
> +
> +static rtx
> +get_base_reg (rtx exp)
> +{
> + if (REG_P (exp))
> + return exp;
> + else if (SUBREG_P (exp))
> + return SUBREG_REG (exp);
> +
> + return NULL_RTX;
> +
I would advise against handling subregs at this point. What you're
doing is probably too simplistic given the semantics of subregs.
> +
> + /* Strip sign_extend if any. */
> + if (GET_CODE (a) == SIGN_EXTEND || GET_CODE (a) == ZERO_EXTEND)
> + bin_exp = XEXP (a, 0);
> + else
> + bin_exp = a;
Similarly while I do think we're going to want to handle extensions,
let's not try and add them at this point. We want to get this wrapped
up & integrated so that everyone can move their focus to bugfixing for
gcc-14.
> +
> +/* Try to covert if-then-else with conditional zero,
> + returning TURE on success or FALSE on failure.
> + IF_INFO describes the if-conversion scenario under consideration. */
> +
> +static int
> +noce_try_cond_zero_arith (struct noce_if_info *if_info)
> +{
> + rtx target, a;
> + rtx_insn *seq;
> + machine_mode mode = GET_MODE (if_info->x);
> + rtx common = NULL_RTX;
> + enum rtx_code czero_code = UNKNOWN;
> + rtx non_zero_op = NULL_RTX;
> + rtx *to_replace = NULL;
> +
> + if (!noce_bbs_ok_for_cond_zero_arith (if_info, &common, &czero_code, &a,
> + &to_replace))
> + return false;
> +
> + /* Disallow CONST_INT currently for simplicity*/
> + if (to_replace == NULL || !REG_P (*to_replace))
> + return false;If you're not going to allow CONST_INT reject them in the
ok_for_condzero_arith routine. Presumably you're rejecting constants
because zicond doesn't allow them in the arms. We'll want to handle
them in the future and can do so easily once we're using the expander.
> +
> + non_zero_op = *to_replace;
> +
> + start_sequence ();
> +
> + /* If x is used in both input and out like x = c ? x + z : x,
> + use a new reg to avoid modifying x */
> + if (common && rtx_equal_p (common, if_info->x))
> + target = gen_reg_rtx (mode);
> + else
> + target = if_info->x;
> +
> + target = noce_emit_czero (if_info, czero_code, non_zero_op, target);
> + if (!target || !to_replace)
> + {
> + end_sequence ();
> + return false;
> + }
> +
> + *to_replace = target;
> + emit_insn (gen_rtx_SET (if_info->x, a));
This isn't necessarily safe. Use emit_move_insn instead.
This is pretty close. Hopefully one more iteration and it can be
integrated.
jeff
More information about the Gcc-patches
mailing list