This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Updated the patch per Richard's suggestions to allow scheduling of instructions before reload. Bootstrapped and tested on aarch64-linux-gnu. Okay for trunk?

2017-05-22  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
	    Michael Collison  <michael.collison@arm.com>

	PR target/70119
	* config/aarch64/aarch64.md (*aarch64_<optab>_reg_<mode>3_mask1):
	New pattern.
	(*aarch64_reg_<mode>3_neg_mask2): New pattern.
	(*aarch64_reg_<mode>3_minus_mask): New pattern.
	(*aarch64_<optab>_reg_di3_mask2): New pattern.
	* config/aarch64/aarch64.c (aarch64_rtx_costs): Account for cost
	of shift when the shift amount is masked with constant equal to
	the size of the mode.
	* config/aarch64/predicates.md (subreg_lowpart_operator): New
	predicate.

2017-05-22  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
	    Michael Collison  <michael.collison@arm.com>

	PR target/70119
	* gcc.target/aarch64/var_shift_mask_1.c: New test.

-----Original Message-----
From: Richard Sandiford [mailto:richard.sandiford@linaro.org]
Sent: Thursday, June 15, 2017 6:40 AM
To: Michael Collison <Michael.Collison@arm.com>
Cc: Wilco Dijkstra <Wilco.Dijkstra@arm.com>; Christophe Lyon <christophe.lyon@linaro.org>; GCC Patches <gcc-patches@gcc.gnu.org>; nd <nd@arm.com>
Subject: Re: [PATCH] [Aarch64] Variable shift count truncation issues

Michael Collison <Michael.Collison@arm.com> writes:
> +(define_insn_and_split "*aarch64_reg_<mode>3_neg_mask2"
> +  [(set (match_operand:GPI 0 "register_operand" "=r")
> +	(SHIFT:GPI
> +	  (match_operand:GPI 1 "register_operand" "r")
> +	  (match_operator 4 "subreg_lowpart_operator"
> +	  [(neg:SI (and:SI (match_operand:SI 2 "register_operand" "r")
> +			   (match_operand 3 "const_int_operand" "n")))])))
> +   (clobber (match_scratch:SI 5 "=&r"))]
> +  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode)-1)) == 0)"
> +  "#"
> +  "&& reload_completed"
> +  [(const_int 0)]
> +  {
> +    emit_insn (gen_negsi2 (operands[5], operands[2]));
> +
> +    rtx and_op = gen_rtx_AND (SImode, operands[5], operands[3]);
> +    rtx subreg_tmp = gen_rtx_SUBREG (GET_MODE (operands[4]), and_op,
> +				     SUBREG_BYTE (operands[4]));
> +    emit_insn (gen_<optab><mode>3 (operands[0], operands[1], subreg_tmp));
> +    DONE;
> +  }
> +)

Thanks, I agree this looks correct from the split/reload_completed POV.
I think we can go one better though, either:

(a) Still allow the split when !reload_completed, and use:

	if (GET_MODE (operands[5]) == SCRATCH)
	  operands[5] = gen_reg_rtx (SImode);

    This will allow the individual instructions to be scheduled by sched1.

(b) Continue to restrict the split to reload_completed, change operand 0
    to =&r so that it can be used as a temporary, and drop operand 5
    entirely.

Or perhaps do both:

(define_insn_and_split "*aarch64_reg_<mode>3_neg_mask2"
  [(set (match_operand:GPI 0 "register_operand" "=&r")
	(SHIFT:GPI
	  (match_operand:GPI 1 "register_operand" "r")
	  (match_operator 4 "subreg_lowpart_operator"
	  [(neg:SI (and:SI (match_operand:SI 2 "register_operand" "r")
			   (match_operand 3 "const_int_operand" "n")))])))]
  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode)-1)) == 0)"
  "#"
  "&& 1"
  [(const_int 0)]
  {
    rtx tmp = (can_create_pseudo_p ()
	       ? gen_reg_rtx (<GPI:MODE>mode)
	       : operands[0]);

    emit_insn (gen_negsi2 (tmp, operands[2]));

    rtx and_op = gen_rtx_AND (SImode, tmp, operands[3]);
    rtx subreg_tmp = gen_rtx_SUBREG (GET_MODE (operands[4]), and_op,
				     SUBREG_BYTE (operands[4]));
    emit_insn (gen_<optab><mode>3 (operands[0], operands[1], subreg_tmp));
    DONE;
  }
)

Sorry for the run-around.  I should have realised earlier that these patterns didn't really need a distinct register after RA.

Thanks,
Richard
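[Editorial note: for readers unfamiliar with PR target/70119, the following is a hedged sketch, not the actual testcase from the patch, of the kind of C source these patterns target. AArch64 variable shift instructions already truncate the shift amount to the register width, so an explicit source-level mask equal to the mode size minus one is redundant; the function name `shift_mask` is illustrative only.]

```c
/* Hypothetical example of the redundant-mask pattern that the
   *aarch64_<optab>_reg_<mode>3_mask1 pattern is meant to catch.
   The AArch64 LSL instruction uses only the low 6 bits of the
   shift register for a 64-bit shift, so the "& 63" below adds no
   semantic information and the separate AND can be elided.  */
unsigned long long
shift_mask (unsigned long long x, int n)
{
  return x << (n & 63);
}
```

The neg_mask2 pattern discussed in the reply handles the related idiom `x << ((-n) & 63)`, which arises from rotate-like sequences; there the negation must still be emitted, which is why the split needs a temporary register (a fresh pseudo before RA, or operand 0 itself after RA).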
Attachment:
pr5546v5.patch
Description: pr5546v5.patch