[PATCH] RISC-V: Using merge approach to optimize repeating sequence in vec_init
Robin Dapp
rdapp.gcc@gmail.com
Fri May 12 13:08:27 GMT 2023
Hi,
in general LGTM, just minor nits and comments.
> - void set_len_and_policy (rtx len, bool force_vlmax = false)
> - {
> - bool vlmax_p = force_vlmax;
> - gcc_assert (has_dest);
> + void set_len_and_policy (rtx len, bool force_vlmax = false, bool ta_p = true,
> + bool ma_p = true)
> + {
> + bool vlmax_p = force_vlmax;
> + gcc_assert (has_dest);
Indentation?
> m_inner_mode = GET_MODE_INNER (mode);
> - m_inner_size = GET_MODE_BITSIZE (m_inner_mode).to_constant ();
> + m_inner_size = GET_MODE_BITSIZE (m_inner_mode);
> + m_inner_units = GET_MODE_SIZE (m_inner_mode);
I find it a bit misleading to call this "units" here. Granted, it's an inner
mode (so the unit is a byte), but in the context of vector modes I'm likely
to think of a vector "unit" or lane. What about m_inner_size_bytes or
m_inner_size_units?
> +bool
> +rvv_builder::repeating_sequence_use_merge_profitable_p ()
> +{
> + return repeating_sequence_p (0, full_nelts ().to_constant (), npatterns ())
> + && inner_units () <= UNITS_PER_WORD
> + && 3 * npatterns () < full_nelts ().to_constant ();
> +}
I appreciate the explanatory comment, and comparing the number of
instructions is good enough for now. In the future, given the different
uarchs, we will want a proper cost comparison.
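For readers following along without the full patch, the heuristic quoted above can be modeled outside of GCC roughly as below. This is only a sketch: the plain std::vector stands in for rvv_builder, and the names use_merge_profitable_p, elt_bytes and word_bytes are made up for illustration (word_bytes plays the role of UNITS_PER_WORD). The repeating-sequence check is my reading of what repeating_sequence_p is meant to test, not a copy of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: true if elts is its first npatterns elements
// repeated over the whole vector.
static bool repeating_sequence_p(const std::vector<int> &elts,
                                 std::size_t npatterns)
{
  assert(npatterns > 0);
  for (std::size_t i = npatterns; i < elts.size(); ++i)
    if (elts[i] != elts[i % npatterns])
      return false;
  return true;
}

// Mirrors the three conditions in the quoted hunk: the sequence repeats,
// each element fits in a scalar register, and the instruction-count
// estimate (3 * npatterns) stays below the number of elements.
static bool use_merge_profitable_p(const std::vector<int> &elts,
                                   std::size_t npatterns,
                                   std::size_t elt_bytes,
                                   std::size_t word_bytes)
{
  return repeating_sequence_p(elts, npatterns)
         && elt_bytes <= word_bytes
         && 3 * npatterns < elts.size();
}
```

With two patterns, the estimate is 6, so an 8-element vector passes the check while a 4-element one does not.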
> +/* Get the mask for merge approach.
> +
> + Consider such following case:
> + {a, b, a, b, a, b, a, b, a, b, a, b, a, b, a, b}
> + To merge "a", the mask should be 1010....
> + To merge "a", the mask should be 0101....
> +*/
Second line should be "b".
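To make the intended masks concrete, here is a small standalone sketch of the idea described in that comment. It is an assumption-laden model, not the patch's code: merge_mask and build_by_merge are hypothetical names, masks are stored as one bool per element (element 0 first, matching how the comment writes "1010...."), and the "broadcast the first element, then merge the rest" order is just one plausible way to use the masks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Per-element mask for pattern element `which`: true at positions
// which, which + npatterns, which + 2 * npatterns, ...
static std::vector<bool> merge_mask(std::size_t nelts, std::size_t npatterns,
                                    std::size_t which)
{
  assert(npatterns > 0);
  std::vector<bool> mask(nelts, false);
  for (std::size_t i = which; i < nelts; i += npatterns)
    mask[i] = true;
  return mask;
}

// Build {a, b, a, b, ...} by broadcasting pattern[0] to every element
// and then merging each remaining pattern element in under its mask.
static std::vector<int> build_by_merge(const std::vector<int> &pattern,
                                       std::size_t nelts)
{
  assert(!pattern.empty());
  std::vector<int> result(nelts, pattern[0]);
  for (std::size_t p = 1; p < pattern.size(); ++p)
    {
      std::vector<bool> mask = merge_mask(nelts, pattern.size(), p);
      for (std::size_t i = 0; i < nelts; ++i)
        if (mask[i])
          result[i] = pattern[p];
    }
  return result;
}
```

For the {a, b} case, merge_mask (n, 2, 0) selects the even positions (the "a" mask) and merge_mask (n, 2, 1) the odd ones (the "b" mask).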
> +/* Emit merge instruction. */
> +
> +static void
> +emit_merge_op (rtx dest, rtx src1, rtx src2, rtx mask)
> +{
> + insn_expander<8> e;
> + machine_mode mode = GET_MODE (dest);
> + e.set_dest_and_mask (NULL_RTX, dest, GET_MODE (mask), true, true);
> + e.add_input_operand (src1, mode);
> + if (VECTOR_MODE_P (GET_MODE (src2)))
> + e.add_input_operand (src2, mode);
> + else
> + e.add_input_operand (src2, GET_MODE_INNER (mode));
> +
> + e.add_input_operand (mask, GET_MODE (mask));
> + e.set_len_and_policy (NULL_RTX, true, true, false);
> + if (VECTOR_MODE_P (GET_MODE (src2)))
> + e.expand (code_for_pred_merge (mode), false);
> + else
> + e.expand (code_for_pred_merge_scalar (mode), false);
> +}
Looks a lot like binop. Might need another round of wrappers
soon :)
Regards
Robin