PATCH: Add SSE4.1 support

Jan Hubicka hubicka@ucw.cz
Wed Apr 18 22:25:00 GMT 2007


> --- gcc/config/i386/constraints.md.sni	2007-03-27 18:03:26.000000000 -0700
> +++ gcc/config/i386/constraints.md	2007-04-16 07:10:53.000000000 -0700
> @@ -84,13 +84,24 @@
>   "Any SSE register.")
>  
>  ;; We use the Y prefix to denote any number of conditional register sets:
> +;;  0	%xmm0
>  ;;  2	SSE2 enabled
> +;;  4	SSE2 inter-unit moves disabled
>  ;;  i	SSE2 inter-unit moves enabled
>  ;;  m	MMX inter-unit moves enabled
> +;;  n	%xmm1 - %xmm15
> +
> +(define_register_constraint "Y0" "XMM0REG"
> + "The @code{xmm0} register.")

At least Y0/Yn should be documented in the texinfo docs so users writing
assembly can use them.
>  
>  (define_register_constraint "Y2" "TARGET_SSE2 ? SSE_REGS : NO_REGS"
>   "@internal Any SSE register, when SSE2 is enabled.")
>  
> +(define_register_constraint "Y4"
> + "TARGET_SSE4_1 && !TARGET_INTER_UNIT_MOVES ? SSE_REGS : NO_REGS"
> + "@internal Any SSE register, when SSE4.1 is enabled and inter-unit moves
> +  are disabled.")
> +
>  (define_register_constraint "Yi"
>   "TARGET_SSE2 && TARGET_INTER_UNIT_MOVES ? SSE_REGS : NO_REGS"
>   "@internal Any SSE register, when SSE2 and inter-unit moves are enabled.")
> @@ -99,6 +110,9 @@
>   "TARGET_MMX && TARGET_INTER_UNIT_MOVES ? MMX_REGS : NO_REGS"
>   "@internal Any MMX register, when inter-unit moves are enabled.")
>  
> +(define_register_constraint "Yn" "XMMN_REGS"
> + "Any SSE register except for @code{xmm0}.")
> +
>  ;; Integer constant constraints.
>  (define_constraint "I"
>    "Integer constant in the range 0 @dots{} 31, for 32-bit shifts."
> +int
> +ix86_sse_4_operands_ok (rtx operands[4])
> +{
> +  rtx dst = operands[0];
> +  rtx src1 = operands[1];
> +  rtx src2 = operands[2];
> +  rtx src3 = operands[3];
> +
> +  /* The destination and the first source operands must be register.
> +     The second source operand must be register or in memory.  */
> +  if (!REG_P (dst)
> +      || !REG_P (src1)
> +      || (!REG_P (src2) && !MEM_P (src2)))

I think you want register_operand/nonimmediate_operand checks here so
that subregs are not banned.
> +  /* If the third source is register, the destination and both
> +     sources must not be FIRST_SSE_REG since the third source
> +     will be FIRST_SSE_REG.  */
> +  if (REG_P (src3)
> +      && (REGNO (dst) == FIRST_SSE_REG
> +	  || REGNO (src1) == FIRST_SSE_REG
And true_regnum rather than REGNO here.
> +	  || (REG_P (src2) && REGNO (src2) == FIRST_SSE_REG)))

Shouldn't pseudo registers be allowed here?
> -			"=r  ,m  ,*y,*y,?rm,?*y,*x,*x,?r ,m ,?*Yi,*x")
> +			"=r  ,m  ,*y,*y,?rm,?*y,*x,*x,?r ,m ,?*Yi,*x,?r ")
>  	(match_operand:SI 1 "general_operand"
> -			"rinm,rin,C ,*y,*y ,rm ,C ,*x,*Yi,*x,r   ,m "))]
> +			"rinm,rin,C ,*y,*y ,rm ,C ,*x,*Yi,*x,r   ,m ,*Y4"))]
>    "!(MEM_P (operands[0]) && MEM_P (operands[1]))"
>  {
>    switch (get_attr_type (insn))
>      {
> +    case TYPE_SSELOG:
> +      return "pextrd\t{$0, %1, %0|%0, %1, $0}";

This behaviour is a bit odd: when INTER_UNIT_MOVES is enabled we use a
mov instruction, and when it is disabled we go via memory, yet when it
is disabled and SSE4.1 is enabled we use pextrd, which likely just has
a longer encoding.

I would say that with !INTER_UNIT_MOVES the pextrd instruction with an
integer register operand should probably be banned as well (i.e.
expanded into a store to memory). That might give a slightly simpler
implementation with no need for an extra two-letter constraint.

But I am fine with the current implementation until we see how real
hardware behaves.
One might also consider the extra expense of reload having to deal with
an extra alternative on a common move pattern, but it is probably quite
minor.
> +;; Match exactly one bit in 2-bit mask.
> +(define_predicate "const_pow2_1_to_2_operand"
> +  (match_code "const_int")
> +{
> +  unsigned int log = exact_log2 (INTVAL (op));
> +  return log <= 1;

I am not sure I like the trick of casting -1 to unsigned. At the very
least there should be a comment here noting that exact_log2 returns -1.
Perhaps just listing the two allowed values here would be both faster
and cleaner.
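
To make the concern concrete, here is a minimal standalone sketch of the
two forms. It assumes exact_log2 returns -1 for anything that is not an
exact power of two; exact_log2_model and the two predicate function names
are hypothetical stand-ins written for this illustration, not GCC code.

```c
#include <assert.h>

/* Hypothetical stand-in for exact_log2 (an assumption, not GCC's code):
   returns log2 (x) when x is an exact power of two, and -1 otherwise.  */
static int
exact_log2_model (unsigned long long x)
{
  int log = 0;

  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  while ((x >>= 1) != 0)
    log++;
  return log;
}

/* The form used in the patch: the -1 failure value wraps to UINT_MAX
   when stored in an unsigned int, so "log <= 1" rejects it.  */
static int
pow2_1_to_2_unsigned_trick (long long val)
{
  unsigned int log = exact_log2_model ((unsigned long long) val);
  return log <= 1;
}

/* The explicit alternative: list the two values of a 2-bit mask that
   have exactly one bit set.  */
static int
pow2_1_to_2_explicit (long long val)
{
  return val == 1 || val == 2;
}

/* Check that both forms accept and reject the same small values.  */
static void
check_predicates_agree (void)
{
  long long v;

  for (v = -4; v <= 8; v++)
    assert (pow2_1_to_2_unsigned_trick (v) == pow2_1_to_2_explicit (v));
}
```

Calling check_predicates_agree () from a test driver exercises both
forms over -4 .. 8. They agree on that range, so the explicit comparison
could replace the cast without changing which constants match.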

The rest of the patch seems fine; I will just need a bit more time to go
through all the new instruction templates.

Honza


