This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] [Aarch64] Variable shift count truncation issues


Michael Collison <Michael.Collison@arm.com> writes:
> +(define_insn_and_split "*aarch64_reg_<mode>3_neg_mask2"
> +  [(set (match_operand:GPI 0 "register_operand" "=r")
> +	(SHIFT:GPI
> +	  (match_operand:GPI 1 "register_operand" "r")
> +	  (match_operator 4 "subreg_lowpart_operator"
> +	  [(neg:SI (and:SI (match_operand:SI 2 "register_operand" "r")
> +			   (match_operand 3 "const_int_operand" "n")))])))
> +   (clobber (match_scratch:SI 5 "=&r"))]
> +  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode)-1)) == 0)"
> +  "#"
> +  "&& reload_completed"
> +  [(const_int 0)]
> +  {
> +    emit_insn (gen_negsi2 (operands[5], operands[2]));
> +
> +    rtx and_op = gen_rtx_AND (SImode, operands[5], operands[3]);
> +    rtx subreg_tmp = gen_rtx_SUBREG (GET_MODE (operands[4]), and_op,
> +    		     		     SUBREG_BYTE (operands[4]));
> +    emit_insn (gen_<optab><mode>3 (operands[0], operands[1], subreg_tmp));
> +    DONE;
> +  }
> +)

Thanks, I agree this looks correct from the split/reload_completed POV.
I think we can go one better though, either:

(a) Still allow the split when !reload_completed, and use:

     if (GET_MODE (operands[5]) == SCRATCH)
       operands[5] = gen_reg_rtx (SImode);

    This will allow the individual instructions to be scheduled by sched1.

(b) Continue to restrict the split to reload_completed, change operand 0
    to =&r so that it can be used as a temporary, and drop operand 5 entirely.

Or perhaps do both:

(define_insn_and_split "*aarch64_reg_<mode>3_neg_mask2"
  [(set (match_operand:GPI 0 "register_operand" "=&r")
	(SHIFT:GPI
	  (match_operand:GPI 1 "register_operand" "r")
	  (match_operator 4 "subreg_lowpart_operator"
	  [(neg:SI (and:SI (match_operand:SI 2 "register_operand" "r")
			   (match_operand 3 "const_int_operand" "n")))])))]
  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode)-1)) == 0)"
  "#"
  "&& 1"
  [(const_int 0)]
  {
    rtx tmp = (can_create_pseudo_p () ? gen_reg_rtx (<GPI:MODE>mode)
               : operands[0]);
    emit_insn (gen_negsi2 (tmp, operands[2]));

    rtx and_op = gen_rtx_AND (SImode, tmp, operands[3]);
    rtx subreg_tmp = gen_rtx_SUBREG (GET_MODE (operands[4]), and_op,
    		     		     SUBREG_BYTE (operands[4]));
    emit_insn (gen_<optab><mode>3 (operands[0], operands[1], subreg_tmp));
    DONE;
  }
)

Sorry for the run-around.  I should have realised earlier that these
patterns didn't really need a distinct register after RA.

Thanks,
Richard


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]