[PATCH 13/20] aarch64: Use RTL builtins for FP ml[as][q]_laneq intrinsics

Tue May 4 16:40:57 GMT 2021

Jonathan Wright via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> Hi Richard,
>
> I think you may be referencing an older checkout as we refactored this
> pattern in a previous change to:
>
> (define_insn "mul_lane<mode>3"
>  [(set (match_operand:VMUL 0 "register_operand" "=w")
>        (mult:VMUL
>    (vec_duplicate:VMUL
>      (vec_select:<VEL>
>        (match_operand:VMUL 2 "register_operand" "<h_con>")
>        (parallel [(match_operand:SI 3 "immediate_operand" "i")])))
>    (match_operand:VMUL 1 "register_operand" "w")))]
>   "TARGET_SIMD"
>   {
>     operands[3] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[3]));
>     return "<f>mul\\t%0.<Vtype>, %1.<Vtype>, %2.<Vetype>[%3]";
>   }
>   [(set_attr "type" "neon<fp>_mul_<stype>_scalar<q>")]
> )
>
> which doesn't help us with the 'laneq' intrinsics as the machine mode for
> operands 0 and 1 (of the laneq intrinsics) is narrower than the machine
> mode for operand 2.

Gah, I copied the wrong one, sorry.  The one I meant was:

(define_insn "*aarch64_mul3_elt_<vswap_width_name><mode>"
  [(set (match_operand:VMUL_CHANGE_NLANES 0 "register_operand" "=w")
     (mult:VMUL_CHANGE_NLANES
       (vec_duplicate:VMUL_CHANGE_NLANES
         (vec_select:<VEL>
           (match_operand:<VSWAP_WIDTH> 1 "register_operand" "<h_con>")
           (parallel [(match_operand:SI 2 "immediate_operand")])))
       (match_operand:VMUL_CHANGE_NLANES 3 "register_operand" "w")))]
  "TARGET_SIMD"
  {
    operands[2] = aarch64_endian_lane_rtx (<VSWAP_WIDTH>mode, INTVAL (operands[2]));
    return "<f>mul\\t%0.<Vtype>, %3.<Vtype>, %1.<Vetype>[%2]";
  }
  [(set_attr "type" "neon<fp>_mul_<Vetype>_scalar<q>")]
)

This already provides patterns in which the indexed operand is
wider than the other operands: operand 1 has mode <VSWAP_WIDTH>,
whereas operands 0 and 3 have the VMUL_CHANGE_NLANES mode.
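
As a rough illustration of what such a pattern computes, here is a
scalar C model.  It is only a sketch and not GCC code: the name
mul_laneq_model and the lane counts are hypothetical, and it assumes
the V2SF case, where <VSWAP_WIDTH> would be V4SF.  It also ignores the
lane renumbering that aarch64_endian_lane_rtx does for big-endian.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model, not GCC code.  NARROW_LANES stands in for
   a 64-bit V2SF vector and WIDE_LANES for a 128-bit V4SF vector.  */
enum { NARROW_LANES = 2, WIDE_LANES = 4 };

/* out[i] = a[i] * v[lane].  The vec_select picks one element of the
   wide vector, the vec_duplicate broadcasts it across the narrow mode,
   and the mult is then elementwise.  */
static void
mul_laneq_model (float out[NARROW_LANES], const float a[NARROW_LANES],
                 const float v[WIDE_LANES], size_t lane)
{
  /* The lane index ranges over the wide vector, not the narrow one.  */
  assert (lane < WIDE_LANES);
  for (size_t i = 0; i < NARROW_LANES; i++)
    out[i] = a[i] * v[lane];
}
```

Note that lane 2 or 3 is a valid index here even though the result has
only two elements, which is the "laneq" situation discussed above.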

Thanks,
Richard