This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [Neon intrinsics] Literal vector construction through vcombine is poor


On 16/06/17 22:08, Michael Collison wrote:
> This patch improves code generation for literal vector construction by expanding and exposing the pattern to rtl optimization earlier. The current implementation delays splitting the pattern until after reload which results in poor code generation for the following code:
> 
> 
> #include "arm_neon.h"
> 
> int16x8_t
> foo ()
> {
>   return vcombine_s16 (vdup_n_s16 (0), vdup_n_s16 (8));
> }
> 
> Trunk generates:
> 
> foo:
> 	movi	v1.2s, 0
> 	movi	v0.4h, 0x8
> 	dup	d2, v1.d[0]
> 	ins	v2.d[1], v0.d[0]
> 	orr	v0.16b, v2.16b, v2.16b
> 	ret
> 
> With the patch we now generate:
> 
> foo:
> 	movi	v1.4h, 0x8
> 	movi	v0.4s, 0
> 	ins	v0.d[1], v1.d[0]
> 	ret
> 
> Bootstrapped and tested on aarch64-linux-gnu. Okay for trunk.
> 
> 2017-06-15  Michael Collison  <michael.collison@arm.com>
> 
> 	* config/aarch64/aarch64-simd.md(aarch64_combine_internal<mode>):
> 	Convert from define_insn_and_split into define_expand
> 	* config/aarch64/aarch64.c(aarch64_split_simd_combine):
> 	Allow register and subreg operands.
> 

Your changelog entry is confusing.  You've deleted the
aarch64_combine_internal<mode> pattern entirely, having merged some of
its functionality directly into its caller (aarch64_combine<mode>).

So I think it should read:

* config/aarch64/aarch64-simd.md (aarch64_combine<mode>): Directly call
aarch64_split_simd_combine.
(aarch64_combine_internal<mode>): Delete pattern.
* ...

Note also there should be a space between the file name and the open
bracket for the first function name.

Why don't you need the big-endian code path any more?

R.

> 
> pr7057.patch
> 
> 
> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> index c462164..4a253a9 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -2807,27 +2807,11 @@
>        op1 = operands[1];
>        op2 = operands[2];
>      }
> -  emit_insn (gen_aarch64_combine_internal<mode> (operands[0], op1, op2));
> -  DONE;
> -}
> -)
>  
> -(define_insn_and_split "aarch64_combine_internal<mode>"
> -  [(set (match_operand:<VDBL> 0 "register_operand" "=&w")
> -        (vec_concat:<VDBL> (match_operand:VDC 1 "register_operand" "w")
> -			   (match_operand:VDC 2 "register_operand" "w")))]
> -  "TARGET_SIMD"
> -  "#"
> -  "&& reload_completed"
> -  [(const_int 0)]
> -{
> -  if (BYTES_BIG_ENDIAN)
> -    aarch64_split_simd_combine (operands[0], operands[2], operands[1]);
> -  else
> -    aarch64_split_simd_combine (operands[0], operands[1], operands[2]);
> +  aarch64_split_simd_combine (operands[0], op1, op2);
> +
>    DONE;
>  }
> -[(set_attr "type" "multiple")]
>  )
>  
>  (define_expand "aarch64_simd_combine<mode>"
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 2e385c4..46bd78b 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -1650,7 +1650,8 @@ aarch64_split_simd_combine (rtx dst, rtx src1, rtx src2)
>  
>    gcc_assert (VECTOR_MODE_P (dst_mode));
>  
> -  if (REG_P (dst) && REG_P (src1) && REG_P (src2))
> +  if (register_operand (dst, dst_mode) && register_operand (src1, src_mode)
> +      && register_operand (src2, src_mode))
>      {
>        rtx (*gen) (rtx, rtx, rtx);
>  
> 


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]