[PATCH][ARM] Remove remaining Neon DImode support
Kyrill Tkachov
kyrylo.tkachov@foss.arm.com
Thu Aug 22 13:46:00 GMT 2019
Hi Wilco,
On 7/22/19 5:18 PM, Wilco Dijkstra wrote:
> Remove the remaining Neon adddi3, subdi3 and negdi2 patterns. As a result
> adddi3, subdi3 and negdi2 can now always be expanded early, irrespective
> of whether Neon is available. Also expand the extenddi patterns at the
> same time. Several Neon arch attributes are no longer used and are removed.
>
> Code generation is improved in all cases, saving another 400-500
> instructions from the PR77308 testcase (total improvement is over 1700
> instructions with -mcpu=cortex-a57 -O2).
>
> Bootstrap & regress OK on arm-none-linux-gnueabihf --with-cpu=cortex-a57
>
Ok.
Thanks,
Kyrill
> ChangeLog:
> 2019-07-19 Wilco Dijkstra <wdijkstr@arm.com>
>
> * config/arm/arm.md (neon_for_64bits): Remove.
> (avoid_neon_for_64bits): Remove.
> (arm_adddi3): Always split early.
> (arm_subdi3): Always split early.
> (negdi2): Remove Neon expansion.
> (split zero_extend): Split before reload.
> (split sign_extend): Split before reload.
> ---
>
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index 10ed70dac4384354c0a2453c5e51a29108c6c062..6d8a5a54997caee0e6956f01018cb5300a9a07e1 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -125,7 +125,7 @@ (define_attr "length" ""
>  ; arm_arch6. "v6t2" for Thumb-2 with arm_arch6 and "v8mb" for ARMv8-M
>  ; Baseline. This attribute is used to compute attribute "enabled",
>  ; use type "any" to enable an alternative in all cases.
> -(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,neon_for_64bits,avoid_neon_for_64bits,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon"
> +(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon"
>    (const_string "any"))
>
>  (define_attr "arch_enabled" "no,yes"
> @@ -168,16 +168,6 @@ (define_attr "arch_enabled" "no,yes"
>                (match_test "TARGET_THUMB1 && arm_arch8"))
>           (const_string "yes")
>
> -         (and (eq_attr "arch" "avoid_neon_for_64bits")
> -              (match_test "TARGET_NEON")
> -              (not (match_test "TARGET_PREFER_NEON_64BITS")))
> -         (const_string "yes")
> -
> -         (and (eq_attr "arch" "neon_for_64bits")
> -              (match_test "TARGET_NEON")
> -              (match_test "TARGET_PREFER_NEON_64BITS"))
> -         (const_string "yes")
> -
>           (and (eq_attr "arch" "iwmmxt2")
>                (match_test "TARGET_REALLY_IWMMXT2"))
>           (const_string "yes")
> @@ -450,13 +440,8 @@ (define_expand "adddi3"
>      (clobber (reg:CC CC_REGNUM))])]
>    "TARGET_EITHER"
>    "
> -  if (TARGET_THUMB1)
> -    {
> -      if (!REG_P (operands[1]))
> -        operands[1] = force_reg (DImode, operands[1]);
> -      if (!REG_P (operands[2]))
> -        operands[2] = force_reg (DImode, operands[2]);
> -     }
> +  if (TARGET_THUMB1 && !REG_P (operands[2]))
> +    operands[2] = force_reg (DImode, operands[2]);
>    "
>  )
>
> @@ -465,9 +450,9 @@ (define_insn_and_split "*arm_adddi3"
>          (plus:DI (match_operand:DI 1 "arm_general_register_operand" "%0, 0, r, 0, r")
>                   (match_operand:DI 2 "arm_general_adddi_operand" "r, 0, r, Dd, Dd")))
>     (clobber (reg:CC CC_REGNUM))]
> -  "TARGET_32BIT && !TARGET_NEON"
> +  "TARGET_32BIT"
>    "#"
> -  "TARGET_32BIT && ((!TARGET_NEON && !TARGET_IWMMXT) || reload_completed)"
> +  "TARGET_32BIT"
>    [(parallel [(set (reg:CC_C CC_REGNUM)
>                     (compare:CC_C (plus:SI (match_dup 1) (match_dup 2))
>                                   (match_dup 1)))
> @@ -1290,24 +1275,16 @@ (define_expand "subdi3"
>      (clobber (reg:CC CC_REGNUM))])]
>    "TARGET_EITHER"
>    "
> -  if (TARGET_THUMB1)
> -    {
> -      if (!REG_P (operands[1]))
> -        operands[1] = force_reg (DImode, operands[1]);
> -      if (!REG_P (operands[2]))
> -        operands[2] = force_reg (DImode, operands[2]);
> -     }
> -  "
> -)
> +")
>
>  (define_insn_and_split "*arm_subdi3"
>    [(set (match_operand:DI           0 "arm_general_register_operand" "=&r,&r,&r")
>          (minus:DI (match_operand:DI 1 "arm_general_register_operand" "0,r,0")
>                    (match_operand:DI 2 "arm_general_register_operand" "r,0,0")))
>     (clobber (reg:CC CC_REGNUM))]
> -  "TARGET_32BIT && !TARGET_NEON"
> +  "TARGET_32BIT"
>    "#"  ; "subs\\t%Q0, %Q1, %Q2\;sbc\\t%R0, %R1, %R2"
> -  "&& (!TARGET_IWMMXT || reload_completed)"
> +  "TARGET_32BIT"
>    [(parallel [(set (reg:CC CC_REGNUM)
>                     (compare:CC (match_dup 1) (match_dup 2)))
>                (set (match_dup 0) (minus:SI (match_dup 1) (match_dup 2)))])
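For anyone reading along, here is a rough C model of what the early split of *arm_adddi3 and *arm_subdi3 amounts to: a low-word operation that produces a carry or borrow, followed by a high-word operation that consumes it. This is only a sketch of the arithmetic, not of the RTL. The names di_pair, add64_split, sub64_split and check_split are made up for illustration and do not exist in GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a DImode value held in two 32-bit core
   registers: lo plays the role of %Q, hi the role of %R.  */
typedef struct { uint32_t lo, hi; } di_pair;

di_pair di_from_u64 (uint64_t v)
{
  di_pair p = { (uint32_t) v, (uint32_t) (v >> 32) };
  return p;
}

uint64_t di_to_u64 (di_pair p)
{
  return ((uint64_t) p.hi << 32) | p.lo;
}

/* Models adds %Q0, %Q1, %Q2 ; adc %R0, %R1, %R2.  */
di_pair add64_split (di_pair a, di_pair b)
{
  di_pair r;
  r.lo = a.lo + b.lo;
  /* The low-word sum wrapped iff it is below one of its inputs; this is
     the condition the CC_C compare in the split expresses.  */
  uint32_t carry = r.lo < a.lo;
  r.hi = a.hi + b.hi + carry;
  return r;
}

/* Models subs %Q0, %Q1, %Q2 ; sbc %R0, %R1, %R2.  */
di_pair sub64_split (di_pair a, di_pair b)
{
  di_pair r;
  uint32_t borrow = a.lo < b.lo;
  r.lo = a.lo - b.lo;
  r.hi = a.hi - b.hi - borrow;
  return r;
}

/* Compare the two-word model against native 64-bit arithmetic.  */
void check_split (uint64_t x, uint64_t y)
{
  assert (di_to_u64 (add64_split (di_from_u64 (x), di_from_u64 (y))) == x + y);
  assert (di_to_u64 (sub64_split (di_from_u64 (x), di_from_u64 (y))) == x - y);
}
```

Calling check_split (0xFFFFFFFF, 1) exercises the carry out of the low word, and check_split (0, 1) exercises the borrow.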
> @@ -4164,13 +4141,6 @@ (define_expand "negdi2"
>          (neg:DI (match_operand:DI 1 "s_register_operand")))
>      (clobber (reg:CC CC_REGNUM))])]
>    "TARGET_EITHER"
> -  {
> -    if (TARGET_NEON)
> -      {
> -        emit_insn (gen_negdi2_neon (operands[0], operands[1]));
> -        DONE;
> -      }
> -  }
>  )
>
>  ;; The constraints here are to prevent a *partial* overlap (where %Q0 == %R1).
> @@ -4182,7 +4152,7 @@ (define_insn_and_split "*negdi2_insn"
>    "TARGET_32BIT"
>    "#"   ; rsbs %Q0, %Q1, #0; rsc %R0, %R1, #0       (ARM)
>          ; negs %Q0, %Q1    ; sbc %R0, %R1, %R1, lsl #1 (Thumb-2)
> -  "&& reload_completed"
> +  "TARGET_32BIT"
>    [(parallel [(set (reg:CC CC_REGNUM)
>                     (compare:CC (const_int 0) (match_dup 1)))
>                (set (match_dup 0) (minus:SI (const_int 0) (match_dup 1)))])
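The negdi2 split follows the same shape. A small C sketch of the ARM sequence in the comment above (rsbs for the low word, rsc for the high word), assuming the usual two-word layout. The function names neg64_split and check_neg are hypothetical and only model the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Models rsbs %Q0, %Q1, #0 ; rsc %R0, %R1, #0 on a value split into
   a low and a high 32-bit word.  */
uint64_t neg64_split (uint64_t x)
{
  uint32_t lo = (uint32_t) x;
  uint32_t hi = (uint32_t) (x >> 32);

  /* 0 - lo borrows exactly when lo is non-zero.  */
  uint32_t borrow = lo != 0;
  uint32_t res_lo = 0u - lo;
  uint32_t res_hi = 0u - hi - borrow;

  return ((uint64_t) res_hi << 32) | res_lo;
}

/* Compare against native 64-bit negation (modulo 2^64).  */
void check_neg (uint64_t x)
{
  assert (neg64_split (x) == (uint64_t) 0 - x);
}
```

Inputs such as 1 (borrow into the high word) and 0x100000000 (no borrow) cover both paths.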
> @@ -4714,25 +4684,17 @@ (define_insn "extend<mode>di2"
>  (define_split
>    [(set (match_operand:DI 0 "s_register_operand" "")
>          (zero_extend:DI (match_operand 1 "nonimmediate_operand" "")))]
> -  "TARGET_32BIT && reload_completed && !IS_VFP_REGNUM (REGNO (operands[0]))"
> +  "TARGET_32BIT"
>    [(set (match_dup 0) (match_dup 1))]
>  {
>    rtx lo_part = gen_lowpart (SImode, operands[0]);
>    machine_mode src_mode = GET_MODE (operands[1]);
>
> -  if (REG_P (operands[0])
> -      && !reg_overlap_mentioned_p (operands[0], operands[1]))
> -    emit_clobber (operands[0]);
> -  if (!REG_P (lo_part) || src_mode != SImode
> -      || !rtx_equal_p (lo_part, operands[1]))
> -    {
> -      if (src_mode == SImode)
> -        emit_move_insn (lo_part, operands[1]);
> -      else
> -        emit_insn (gen_rtx_SET (lo_part,
> -                                gen_rtx_ZERO_EXTEND (SImode, operands[1])));
> -      operands[1] = lo_part;
> -    }
> +  if (src_mode == SImode)
> +    emit_move_insn (lo_part, operands[1]);
> +  else
> +    emit_insn (gen_rtx_SET (lo_part,
> +                            gen_rtx_ZERO_EXTEND (SImode, operands[1])));
>    operands[0] = gen_highpart (SImode, operands[0]);
>    operands[1] = const0_rtx;
>  })
> @@ -4740,26 +4702,18 @@ (define_split
>  (define_split
>    [(set (match_operand:DI 0 "s_register_operand" "")
>          (sign_extend:DI (match_operand 1 "nonimmediate_operand" "")))]
> -  "TARGET_32BIT && reload_completed && !IS_VFP_REGNUM (REGNO (operands[0]))"
> +  "TARGET_32BIT"
>    [(set (match_dup 0) (ashiftrt:SI (match_dup 1) (const_int 31)))]
>  {
>    rtx lo_part = gen_lowpart (SImode, operands[0]);
>    machine_mode src_mode = GET_MODE (operands[1]);
>
> -  if (REG_P (operands[0])
> -      && !reg_overlap_mentioned_p (operands[0], operands[1]))
> -    emit_clobber (operands[0]);
> -
> -  if (!REG_P (lo_part) || src_mode != SImode
> -      || !rtx_equal_p (lo_part, operands[1]))
> -    {
> -      if (src_mode == SImode)
> -        emit_move_insn (lo_part, operands[1]);
> -      else
> -        emit_insn (gen_rtx_SET (lo_part,
> -                                gen_rtx_SIGN_EXTEND (SImode, operands[1])));
> -      operands[1] = lo_part;
> -    }
> +  if (src_mode == SImode)
> +    emit_move_insn (lo_part, operands[1]);
> +  else
> +    emit_insn (gen_rtx_SET (lo_part,
> +                            gen_rtx_SIGN_EXTEND (SImode, operands[1])));
> +  operands[1] = lo_part;
>    operands[0] = gen_highpart (SImode, operands[0]);
>  })
>
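As a reading aid for the two extend splits above, a minimal C sketch of the resulting word-level operations: the low word receives the (already SImode) source, and the high word is either zero or an arithmetic shift of the low word by 31. The function names are hypothetical. The sketch assumes a 32-bit source, so the narrower-mode path that first extends to SImode is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* zero_extend: the low word gets the source and the high word is
   set to 0 (the split's final set uses const0_rtx).  */
uint64_t zero_extend_split (uint32_t src)
{
  uint32_t lo = src;
  uint32_t hi = 0;
  return ((uint64_t) hi << 32) | lo;
}

/* sign_extend: the low word gets the source and the high word models
   (ashiftrt:SI lo 31), i.e. 32 copies of the sign bit.  Written as a
   select to avoid relying on signed right-shift behaviour in C.  */
uint64_t sign_extend_split (int32_t src)
{
  uint32_t lo = (uint32_t) src;
  uint32_t hi = (lo >> 31) ? 0xFFFFFFFFu : 0u;
  return ((uint64_t) hi << 32) | lo;
}

/* Compare against the native C conversions.  */
void check_extend (int32_t v)
{
  assert (zero_extend_split ((uint32_t) v) == (uint64_t) (uint32_t) v);
  assert (sign_extend_split (v) == (uint64_t) (int64_t) v);
}
```

Values such as -1, 0 and INT32_MIN are the interesting inputs for check_extend.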
> diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
> index 757f2c0f5377148c770e061849424aed924a7d7a..0c1ee746b6ada4f83040cd1717f17bef03dc2264 100644
> --- a/gcc/config/arm/neon.md
> +++ b/gcc/config/arm/neon.md
> @@ -527,32 +527,6 @@ (define_insn "add<mode>3_fp16"
>      (const_string "neon_add<q>")))]
>  )
>
> -(define_insn "adddi3_neon"
> -  [(set (match_operand:DI 0 "s_register_operand" "=w,?&r,?&r,?w,?&r,?&r,?&r")
> -        (plus:DI (match_operand:DI 1 "s_register_operand" "%w,0,0,w,r,0,r")
> -                 (match_operand:DI 2 "arm_adddi_operand"     "w,r,0,w,r,Dd,Dd")))
> -   (clobber (reg:CC CC_REGNUM))]
> -  "TARGET_NEON"
> -{
> -  switch (which_alternative)
> -    {
> -    case 0: /* fall through */
> -    case 3: return "vadd.i64\t%P0, %P1, %P2";
> -    case 1: return "#";
> -    case 2: return "#";
> -    case 4: return "#";
> -    case 5: return "#";
> -    case 6: return "#";
> -    default: gcc_unreachable ();
> -    }
> -}
> -  [(set_attr "type" "neon_add,multiple,multiple,neon_add,\
> -                     multiple,multiple,multiple")
> -   (set_attr "conds" "*,clob,clob,*,clob,clob,clob")
> -   (set_attr "length" "*,8,8,*,8,8,8")
> -   (set_attr "arch" "neon_for_64bits,*,*,avoid_neon_for_64bits,*,*,*")]
> -)
> -
>  (define_insn "*sub<mode>3_neon"
>    [(set (match_operand:VDQ 0 "s_register_operand" "=w")
>          (minus:VDQ (match_operand:VDQ 1 "s_register_operand" "w")
> @@ -587,29 +561,6 @@ (define_insn "sub<mode>3_fp16"
>    [(set_attr "type" "neon_sub<q>")]
>  )
>
> -(define_insn "subdi3_neon"
> -  [(set (match_operand:DI 0 "s_register_operand" "=w,?&r,?&r,?&r,?w")
> -        (minus:DI (match_operand:DI 1 "s_register_operand" "w,0,r,0,w")
> -                  (match_operand:DI 2 "s_register_operand" "w,r,0,0,w")))
> -   (clobber (reg:CC CC_REGNUM))]
> -  "TARGET_NEON"
> -{
> -  switch (which_alternative)
> -    {
> -    case 0: /* fall through */
> -    case 4: return "vsub.i64\t%P0, %P1, %P2";
> -    case 1: /* fall through */
> -    case 2: /* fall through */
> -    case 3: return "subs\\t%Q0, %Q1, %Q2\;sbc\\t%R0, %R1, %R2";
> -    default: gcc_unreachable ();
> -    }
> -}
> -  [(set_attr "type" "neon_sub,multiple,multiple,multiple,neon_sub")
> -   (set_attr "conds" "*,clob,clob,clob,*")
> -   (set_attr "length" "*,8,8,8,*")
> -   (set_attr "arch" "neon_for_64bits,*,*,*,avoid_neon_for_64bits")]
> -)
> -
>  (define_insn "*mul<mode>3_neon"
>    [(set (match_operand:VDQW 0 "s_register_operand" "=w")
>          (mult:VDQW (match_operand:VDQW 1 "s_register_operand" "w")
> @@ -886,46 +837,6 @@ (define_insn "neg<mode>2"
>                      (const_string "neon_neg<q>")))]
>  )
>
> -(define_insn "negdi2_neon"
> -  [(set (match_operand:DI 0 "s_register_operand" "=&w, w,r,&r")
> -        (neg:DI (match_operand:DI 1 "s_register_operand" "  w, w,0, r")))
> -   (clobber (match_scratch:DI 2 "= X,&w,X, X"))
> -   (clobber (reg:CC CC_REGNUM))]
> -  "TARGET_NEON"
> -  "#"
> -  [(set_attr "length" "8")
> -   (set_attr "type" "multiple")]
> -)
> -
> -; Split negdi2_neon for vfp registers
> -(define_split
> -  [(set (match_operand:DI 0 "s_register_operand" "")
> -        (neg:DI (match_operand:DI 1 "s_register_operand" "")))
> -   (clobber (match_scratch:DI 2 ""))
> -   (clobber (reg:CC CC_REGNUM))]
> -  "TARGET_NEON && reload_completed && IS_VFP_REGNUM (REGNO (operands[0]))"
> -  [(set (match_dup 2) (const_int 0))
> -   (parallel [(set (match_dup 0) (minus:DI (match_dup 2) (match_dup 1)))
> -              (clobber (reg:CC CC_REGNUM))])]
> -  {
> -    if (!REG_P (operands[2]))
> -      operands[2] = operands[0];
> -  }
> -)
> -
> -; Split negdi2_neon for core registers
> -(define_split
> -  [(set (match_operand:DI 0 "s_register_operand" "")
> -        (neg:DI (match_operand:DI 1 "s_register_operand" "")))
> -   (clobber (match_scratch:DI 2 ""))
> -   (clobber (reg:CC CC_REGNUM))]
> -  "TARGET_32BIT && reload_completed
> -   && arm_general_register_operand (operands[0], DImode)"
> -  [(parallel [(set (match_dup 0) (neg:DI (match_dup 1)))
> -              (clobber (reg:CC CC_REGNUM))])]
> -  ""
> -)
> -
>  (define_insn "<absneg_str><mode>2"
>    [(set (match_operand:VH 0 "s_register_operand" "=w")
>      (ABSNEG:VH (match_operand:VH 1 "s_register_operand" "w")))]