This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [AArch64][SVE] Utilize ASRD instruction for division and remainder


Yuliang Wang <Yuliang.Wang@arm.com> writes:
> +;; Unpredicated arithmetic right shift for division by power-of-2.
> +(define_expand "sdiv_pow2<mode>3"
> +  [(set (match_operand:SVE_I 0 "register_operand" "")
> +	(unspec:SVE_I
> +	  [(match_dup 3)
> +	   (unspec:SVE_I
> +	     [(match_operand:SVE_I 1 "register_operand" "")
> +	      (match_operand 2 "aarch64_simd_rshift_imm")]
> +	    UNSPEC_ASRD)]
> +	 UNSPEC_PRED_X))]
> +  "TARGET_SVE"
> +  {
> +    operands[3] = aarch64_ptrue_reg (<VPRED>mode);
> +  }
> +)

Sorry for not noticing last time, but: define_expands shouldn't have
constraints, so it's better to drop the empty "" from the match_operands.

> diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
> index 4ace224a8ff5ed4fafed10a69ef00ffb2d7d8c39..009b8f8db74c7a3bef996ceaba58123f6558221c 100644
> --- a/gcc/doc/sourcebuild.texi
> +++ b/gcc/doc/sourcebuild.texi
> @@ -1446,6 +1446,10 @@ of bytes.
>  Target supports both signed and unsigned multiply-high-with-round-and-scale
>  operations on vectors of half-words.
>  
> +@item vect_sdiv_pow2_si
> +Target supports signed division by constant power-of-2 operations
> +on vectors of words.

"4 bytes" is more accurate than "words".  The problem with "word" is
that it depends on context; e.g. although the AArch64 ISA uses "word"
to mean 32 bits, its words are really 64 bits as far as GCC is concerned.
(It's also worth noting that "4 bytes" isn't necessarily 32 bits, since GCC
supports 16-bit and 32-bit bytes.)

All a biit pedantic, sorry.

> +/* { dg-final { scan-assembler-not {\tand\t%} } } */
> +

Stray newline at end of file.

> diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
> index 414bf80003b9192806f79afed9393f9ef4750a7d..dedee87ced3d4cbed18fe4144282e5b4330a113d 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -6192,6 +6192,14 @@ proc check_effective_target_vect_mulhrs_hi {} {
>  		   && [check_effective_target_aarch64_sve2] }]
>  }
>  
> +# Return 1 if the target plus current options supports signed division
> +# by power-of-2 operations on vectors of half-words.

"words" rather than "half-words", although as above 4 bytes is more
accurate.

> +
> +proc check_effective_target_vect_sdiv_pow2_si {} {
> +    return [expr { [istarget aarch64*-*-*]
> +		   && [check_effective_target_aarch64_sve] }]
> +}
> +
>  # Return 1 if the target plus current options supports a vector
>  # demotion (packing) of shorts (to chars) and ints (to shorts) 
>  # using modulo arithmetic, 0 otherwise.
> diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> index 2f86f9e4fc7039add1b1d7b82574cb8262eb4ba4..f09e9d54701ebee0382742d20d4f5a0db84110de 100644
> --- a/gcc/tree-vect-patterns.c
> +++ b/gcc/tree-vect-patterns.c
> @@ -2925,6 +2925,38 @@ vect_recog_divmod_pattern (stmt_vec_info stmt_vinfo, tree *type_out)
>        /* Pattern detected.  */
>        vect_pattern_detected ("vect_recog_divmod_pattern", last_stmt);
>  
> +      *type_out = vectype;
> +
> +      /* Check if the target supports this internal function.  */
> +      internal_fn ifn = IFN_DIV_POW2;
> +      if (direct_internal_fn_supported_p (ifn, vectype, OPTIMIZE_FOR_SPEED))
> +	{
> +	  tree shift = build_int_cst (itype, tree_log2 (oprnd1));
> +
> +	  tree var_div = vect_recog_temp_ssa_var (itype, NULL);
> +	  gimple *div_stmt = gimple_build_call_internal (ifn, 2, oprnd0, shift);
> +	  gimple_call_set_lhs (div_stmt, var_div);
> +
> +	  if (rhs_code == TRUNC_MOD_EXPR)
> +	    {
> +	      append_pattern_def_seq (stmt_vinfo, div_stmt);
> +	      def_stmt
> +		= gimple_build_assign (vect_recog_temp_ssa_var (itype, NULL),
> +				       LSHIFT_EXPR, var_div, shift);
> +	      append_pattern_def_seq (stmt_vinfo, def_stmt);
> +	      pattern_stmt
> +		= gimple_build_assign (vect_recog_temp_ssa_var (itype, NULL),
> +				       MINUS_EXPR, oprnd0,
> +				       gimple_assign_lhs (def_stmt));
> +	    }
> +	  else
> +	    {
> +	      pattern_stmt = div_stmt;
> +	      gimple_set_location (pattern_stmt, gimple_location (last_stmt));
> +	    }

Again, sorry for not noticing last time, but we should do this outside
the "else".  Although the TRUNC_MOD_EXPR is no longer being calculated as
a modulus operation, the final MINUS_EXPR is still giving the same result.

> +          return pattern_stmt;

The indentation of this line uses 8 spaces instead of tabs.

OK otherwise, thanks.

Richard


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]