[PATCH 13/20] aarch64: Use RTL builtins for FP ml[as][q]_laneq intrinsics
Richard Sandiford
rdsandiford@googlemail.com
Tue May 4 16:40:57 GMT 2021
Jonathan Wright via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> Hi Richard,
>
> I think you may be referencing an older checkout as we refactored this
> pattern in a previous change to:
>
> (define_insn "mul_lane<mode>3"
>   [(set (match_operand:VMUL 0 "register_operand" "=w")
>      (mult:VMUL
>        (vec_duplicate:VMUL
>          (vec_select:<VEL>
>            (match_operand:VMUL 2 "register_operand" "<h_con>")
>            (parallel [(match_operand:SI 3 "immediate_operand" "i")])))
>        (match_operand:VMUL 1 "register_operand" "w")))]
>   "TARGET_SIMD"
>   {
>     operands[3] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[3]));
>     return "<f>mul\\t%0.<Vtype>, %1.<Vtype>, %2.<Vetype>[%3]";
>   }
>   [(set_attr "type" "neon<fp>_mul_<stype>_scalar<q>")]
> )
>
> which doesn't help us with the 'laneq' intrinsics as the machine mode for
> operands 0 and 1 (of the laneq intrinsics) is narrower than the machine
> mode for operand 2.
Gah, I copied the wrong one, sorry. The one I meant was:
(define_insn "*aarch64_mul3_elt_<vswap_width_name><mode>"
  [(set (match_operand:VMUL_CHANGE_NLANES 0 "register_operand" "=w")
     (mult:VMUL_CHANGE_NLANES
       (vec_duplicate:VMUL_CHANGE_NLANES
         (vec_select:<VEL>
           (match_operand:<VSWAP_WIDTH> 1 "register_operand" "<h_con>")
           (parallel [(match_operand:SI 2 "immediate_operand")])))
       (match_operand:VMUL_CHANGE_NLANES 3 "register_operand" "w")))]
  "TARGET_SIMD"
  {
    operands[2] = aarch64_endian_lane_rtx (<VSWAP_WIDTH>mode, INTVAL (operands[2]));
    return "<f>mul\\t%0.<Vtype>, %3.<Vtype>, %1.<Vetype>[%2]";
  }
  [(set_attr "type" "neon<fp>_mul_<Vetype>_scalar<q>")]
)
This already provides patterns in which the indexed operand is
wider than the other operands.
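To make the shape of that pattern concrete, here is a minimal scalar sketch in C (not GCC code) of what a laneq multiply such as vmul_laneq_f32 computes: a 2-lane result, a 2-lane multiplicand, and a lane picked from a 4-lane vector. The names mul_laneq_model and endian_lane are hypothetical, and the big-endian remapping shown is an assumption about what aarch64_endian_lane_rtx does with the lane number.

```c
#include <assert.h>

/* Hypothetical scalar model, not GCC code.  NARROW_LANES plays the role of
   the VMUL_CHANGE_NLANES mode (operands 0 and 3); WIDE_LANES plays the role
   of <VSWAP_WIDTH> (operand 1, the indexed vector).  */
enum { NARROW_LANES = 2, WIDE_LANES = 4 };

/* Assumed behaviour of the lane remapping: on big-endian the lane number
   counts from the other end of the vector, so the lane count used must be
   that of the indexed (wide) operand.  */
static int endian_lane(int nunits, int lane, int big_endian)
{
    return big_endian ? nunits - 1 - lane : lane;
}

/* Multiply every lane of the narrow vector by one lane of the wide vector.
   The valid lane range comes from the wide operand, not from the result.  */
static void mul_laneq_model(float out[NARROW_LANES],
                            const float a[NARROW_LANES],
                            const float v[WIDE_LANES],
                            int lane)
{
    assert(lane >= 0 && lane < WIDE_LANES);
    for (int i = 0; i < NARROW_LANES; i++)
        out[i] = a[i] * v[lane];
}
```

With this model, lane 3 is a legal index even though the result has only two lanes, and endian_lane (WIDE_LANES, 3, 1) remaps it to 0. Remapping with the narrow lane count instead would give a negative lane, which is why the lane count has to be taken from the wide mode.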
Thanks,
Richard