This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] [Aarch64] Variable shift count truncation issues
- From: Richard Sandiford <richard dot sandiford at linaro dot org>
- To: Michael Collison <Michael dot Collison at arm dot com>
- Cc: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>, Christophe Lyon <christophe dot lyon at linaro dot org>, GCC Patches <gcc-patches at gcc dot gnu dot org>, nd <nd at arm dot com>
- Date: Thu, 15 Jun 2017 14:39:39 +0100
- Subject: Re: [PATCH] [Aarch64] Variable shift count truncation issues
- Authentication-results: sourceware.org; auth=none
- References: <AM5PR0802MB2610A4E64B6EB028F6EFFB4483E50@AM5PR0802MB2610.eurprd08.prod.outlook.com> <87d1az1wi2.fsf@linaro.org> <HE1PR0802MB2377EEF9750DC4A795C79C8695F90@HE1PR0802MB2377.eurprd08.prod.outlook.com> <HE1PR0802MB2377B1CFAB6B51E15E76BD5295C00@HE1PR0802MB2377.eurprd08.prod.outlook.com>
Michael Collison <Michael.Collison@arm.com> writes:
> +(define_insn_and_split "*aarch64_reg_<mode>3_neg_mask2"
> +  [(set (match_operand:GPI 0 "register_operand" "=r")
> +	(SHIFT:GPI
> +	  (match_operand:GPI 1 "register_operand" "r")
> +	  (match_operator 4 "subreg_lowpart_operator"
> +	  [(neg:SI (and:SI (match_operand:SI 2 "register_operand" "r")
> +			   (match_operand 3 "const_int_operand" "n")))])))
> +   (clobber (match_scratch:SI 5 "=&r"))]
> +  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode) - 1)) == 0)"
> +  "#"
> +  "&& reload_completed"
> +  [(const_int 0)]
> +  {
> +    emit_insn (gen_negsi2 (operands[5], operands[2]));
> +
> +    rtx and_op = gen_rtx_AND (SImode, operands[5], operands[3]);
> +    rtx subreg_tmp = gen_rtx_SUBREG (GET_MODE (operands[4]), and_op,
> +				     SUBREG_BYTE (operands[4]));
> +    emit_insn (gen_<optab><mode>3 (operands[0], operands[1], subreg_tmp));
> +    DONE;
> +  }
> +)
Thanks, I agree this looks correct from the split/reload_completed POV.
I think we can go one better though, either:
(a) Still allow the split when !reload_completed, and use:
      if (GET_MODE (operands[5]) == SCRATCH)
	operands[5] = gen_reg_rtx (SImode);
This will allow the individual instructions to be scheduled by sched1.
(b) Continue to restrict the split to reload_completed, change operand 0
to =&r so that it can be used as a temporary, and drop operand 5 entirely.
Or perhaps do both:
(define_insn_and_split "*aarch64_reg_<mode>3_neg_mask2"
  [(set (match_operand:GPI 0 "register_operand" "=&r")
	(SHIFT:GPI
	  (match_operand:GPI 1 "register_operand" "r")
	  (match_operator 4 "subreg_lowpart_operator"
	  [(neg:SI (and:SI (match_operand:SI 2 "register_operand" "r")
			   (match_operand 3 "const_int_operand" "n")))])))]
  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode) - 1)) == 0)"
  "#"
  "&& 1"
  [(const_int 0)]
  {
    rtx tmp = (can_create_pseudo_p () ? gen_reg_rtx (<GPI:MODE>mode)
	       : operands[0]);

    emit_insn (gen_negsi2 (tmp, operands[2]));

    rtx and_op = gen_rtx_AND (SImode, tmp, operands[3]);
    rtx subreg_tmp = gen_rtx_SUBREG (GET_MODE (operands[4]), and_op,
				     SUBREG_BYTE (operands[4]));
    emit_insn (gen_<optab><mode>3 (operands[0], operands[1], subreg_tmp));
    DONE;
  }
)
Sorry for the run-around. I should have realised earlier that these
patterns didn't really need a distinct register after RA.
Thanks,
Richard