[PATCH V4] rs6000: Optimize cmp on rotated 16bits constant

Jiufu Guo guojiufu@linux.ibm.com
Thu Aug 25 12:11:31 GMT 2022


Hi,

Segher Boessenkool <segher@kernel.crashing.org> writes:

> On Wed, Aug 24, 2022 at 03:48:49PM +0800, Jiufu Guo wrote:
>> Segher Boessenkool <segher@kernel.crashing.org> writes:
>> >> +  "TARGET_POWERPC64 && !reload_completed && can_create_pseudo_p ()
>> >
>> > reload_completed in splitters is almost always wrong.  It isn't any
>> > better if it is in the insn condition of a define_insn_and_split :-)
>> >
>> Thanks, 'can_create_pseudo_p' would be ok for this patch.
>> Or just FAIL, if !can_create_pseudo_p()?
>
> You usually can split fine if you cannot create new pseudos, by reusing
> existing registers.
>
> FAIL will cause an ICE: the RTL instruction does match, but will fail
> when trying to generate machine code for it.
>
Previous patch is using "gen_reg_rtx (DImode)" to generate a pseudo for
the rotated result to prevent orignal one being changed accidently.
So, an 'assert (can_create_pseudo_p ())' would catch it in after RA.

To enable this splitter works after RA, we may need to reserve one
register (clobber would be ok).  Such as below:

  [(set (pc)
	(if_then_else (eqne (match_operand:DI 1 "gpc_reg_operand" "r")
			    (match_operand:DI 2 "const_int_operand" "n"))
		      (label_ref (match_operand 0 ""))
		      (pc)))
  (clobber (match_scratch:DI 3 "=r"))
  (clobber (match_scratch:CCUNS 4 "=y"))]
  "TARGET_POWERPC64 && num_insns_constant (operands[2], DImode) > 1
   && compare_rotate_immediate_p (UINTVAL (operands[2]))"
   "#"
   "&& 1"
  [(pc)]

>> >> +   && num_insns_constant (operands[2], DImode) > 1
>> >> +   && (rotate_from_leading_zeros_const (~UINTVAL (operands[2]), 49) > 0
>> >> +       || rotate_from_leading_zeros_const (UINTVAL (operands[2]), 48) > 0)"
>> > There must be a better way to describe this.
>> Will update this. I'm thinking to replace this with a meaning function,
>> maybe 'compare_rotate_immediate_p'.
>
> Thanks!
>
>> > Why is this doing a conditional branch at all?  Unpredictable
>> > conditional branches are extremely costly.
>> This optimization needs to check whether the comparison code is ne/eq or
>> not.  To get the comparison code, we need to check the parent insn of
>> the 'cmp' insn.  This is why conditional branch patterns in used here.
>> 
>> This patch should not change the information (about prediction) of the
>> branch insn. I'm  thinking of updating the patch to keep the 'note info
>> REG_BR_PROB' for the branch instruction.
>
> Ah, good.  Explain a bit about that?  In a code comment or in the commit
> message, whichever works best here.
>
Thanks! will add a comment for this.

BR,
Jeff(Jiufu)

> Thanks!
>
>
> Segher


More information about the Gcc-patches mailing list