This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: PATCH: Add SSE4.2 support - pcmpstr part


On 5/30/07, H. J. Lu <hjl@lucon.org> wrote:

> Here is the updated patch. I added OPTION_MASK_ISA_XXX_UNSET so
> that we only need to change one macro when we add a new ISA.  Tested
> on Linux/Intel64.

I think we need 3 patterns for pcmp* insns:


a) This one, when the index is required:

(define_insn "sse4_2_pcmpestri"
 [(set (reg  ...???... )
	(unspec:SI
	  [(match_operand:V16QI 0 "register_operand" "x")
	   (reg:SI 0)
	   (match_operand:V16QI 1 "nonimmediate_operand" "xm")
	   (reg:SI 1)
	   (match_operand:SI 2 "const_0_to_255_operand" "n")]
	  UNSPEC_PCMPESTR))
  (clobber (reg:CC FLAGS_REG))]

b) A similar one, when we need the mask:

(define_insn "sse4_2_pcmpestrm"
 [(set (reg ...???... )
	(unspec:V16QI
	  [(match_operand:V16QI 0 "register_operand" "x")
	   (reg:SI 0)
	   (match_operand:V16QI 1 "nonimmediate_operand" "xm")
	   (reg:SI 1)
	   (match_operand:SI 2 "const_0_to_255_operand" "n")]
	  UNSPEC_PCMPESTR))
 (clobber (reg:CC FLAGS_REG))]

c) A CC-setting insn, CC only (only the "mask" variant is shown here...)

 [(set (reg:CC FLAGS_REG)
	(unspec:CC
	  [(match_operand:V16QI 0 "register_operand" "x")
	   (reg:SI 0)
	   (match_operand:V16QI 1 "nonimmediate_operand" "xm")
	   (reg:SI 1)
	   (match_operand:SI 2 "const_0_to_255_operand" "n")]
	  UNSPEC_PCMPESTR))
  (clobber (match_scratch ...???... ))]

d) IMO we also need a CC-only + reg-setting insn, so that combine can
merge two successive instructions into one (thus automatically doing
the part you process "manually"):

 [(set (reg ...???... )
	(unspec:V16QI
	  [(match_operand:V16QI 0 "register_operand" "x")
	   (reg:SI 0)
	   (match_operand:V16QI 1 "nonimmediate_operand" "xm")
	   (reg:SI 1)
	   (match_operand:SI 2 "const_0_to_255_operand" "n")]
	  UNSPEC_PCMPESTR))
  (set (reg:CC FLAGS_REG)
	(unspec:CC
	  [(match_dup 0)
	   (reg:SI 0)
	   (match_dup 1)
	   (reg:SI 1)
	   (match_dup 2)]
	  UNSPEC_PCMPESTR))]

and

(define_insn "sse4_2_pcmpestri"
 [(set (reg ...???... )
	(unspec:SI
	  [(match_operand:V16QI 0 "register_operand" "x")
	   (reg:SI 0)
	   (match_operand:V16QI 1 "nonimmediate_operand" "xm")
	   (reg:SI 1)
	   (match_operand:SI 2 "const_0_to_255_operand" "n")]
	  UNSPEC_PCMPESTR))
  (set (reg:CC FLAGS_REG)
	(unspec:CC
	  [(match_dup 0)
	   (reg:SI 0)
	   (match_dup 1)
	   (reg:SI 1)
	   (match_dup 2)]
	  UNSPEC_PCMPESTR))]
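
To illustrate what the patterns in d) are for, here is a minimal user-level
sketch in C, assuming the standard <nmmintrin.h> intrinsics. It asks for both
the index and the carry flag of the same comparison; a combined pattern is
what would let the compiler emit one pcmpestri for both results instead of
two. The helper name first_match and the macro PCMP_MODE are made up for this
example and are not part of the patch.

```c
#include <nmmintrin.h>  /* SSE4.2 string-compare intrinsics */
#include <string.h>

/* Immediate operand: unsigned bytes, "equal any" comparison,
   report the least significant matching index.  */
#define PCMP_MODE \
  (_SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ANY | _SIDD_LEAST_SIGNIFICANT)

/* Hypothetical helper: return the index of the first byte of HAY that
   equals any byte of SET (16 if there is none) and store the carry
   flag (1 on a match, 0 otherwise) in *FOUND.  Both lengths must be
   in the range 0..16.  */
__attribute__ ((target ("sse4.2")))
static int
first_match (const char *set, int set_len,
             const char *hay, int hay_len, int *found)
{
  char a[16] = { 0 };
  char b[16] = { 0 };
  memcpy (a, set, (size_t) set_len);
  memcpy (b, hay, (size_t) hay_len);

  __m128i va = _mm_loadu_si128 ((const __m128i *) a);
  __m128i vb = _mm_loadu_si128 ((const __m128i *) b);

  /* Same operands and same immediate in both calls: the index result
     and the flag result come from one and the same comparison.  */
  int idx = _mm_cmpestri (va, set_len, vb, hay_len, PCMP_MODE);
  *found = _mm_cmpestrc (va, set_len, vb, hay_len, PCMP_MODE);
  return idx;
}
```

Calling first_match ("xyz", 3, "hello y!", 8, &found) should return 6 with
found set to 1. Whether the two intrinsics actually end up as a single
instruction depends on the compiler version and optimization level.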

Now for the hardest part ... Since ...???... can be either xmm or ecx,
IMO the best way is to create a new register class, so the register
allocator is free to choose either register, whichever fits best
(hopefully ;).
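
As a rough illustration of the register class idea, here is a toy model in C.
It only assumes that a register class is a set of hard registers, so a class
covering both ecx and xmm0 is the union of the two single-register sets. All
names here (HR_*, CLASS_*, class_contains, class_subset) are hypothetical and
do not correspond to the real definitions in i386.h.

```c
#include <stdint.h>

/* Hypothetical hard register numbers, for illustration only.  */
enum hard_reg { HR_AX, HR_DX, HR_CX, HR_XMM0, HR_XMM1, HR_COUNT };

/* A register class is modelled as a bitmask of hard registers.  */
typedef uint32_t reg_set;
#define REG_BIT(r) ((reg_set) 1 << (r))

#define CLASS_CREG        REG_BIT (HR_CX)
#define CLASS_XMM0        REG_BIT (HR_XMM0)
/* The proposed class: either ecx or xmm0 is acceptable.  */
#define CLASS_PCMPSTR_OUT (CLASS_CREG | CLASS_XMM0)

/* Nonzero if hard register R belongs to class CLS.  */
static int
class_contains (reg_set cls, enum hard_reg r)
{
  return (cls & REG_BIT (r)) != 0;
}

/* Nonzero if every register of class A is also in class B.  */
static int
class_subset (reg_set a, reg_set b)
{
  return (a & ~b) == 0;
}
```

In this model an allocator that is handed CLASS_PCMPSTR_OUT may pick HR_CX or
HR_XMM0, while HR_AX and HR_DX stay excluded.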

Also, I see no reason why we need two new CC modes. CCmode is
enough to use as a generic comparison mode.

Uros.

