[Patch AArch64 1/2] Implement vcond_mask/vec_cmp patterns.

James Greenhalgh james.greenhalgh@arm.com
Tue Jun 21 14:33:00 GMT 2016


On Wed, Jun 15, 2016 at 09:21:29AM +0000, Bin Cheng wrote:
> Hi,
> According to review comments, I split the original patch @
> https://gcc.gnu.org/ml/gcc-patches/2016-05/msg01182.html into two, as well as
> refined the comments.  Here is the first one implementing vcond_mask/vec_cmp
> patterns on AArch64.  These new patterns will be used in the second patch for
> vcond.
> 
> +;; Patterns comparing two vectors to produce a mask.
> +
> +;; Internal pattern for vec_cmp.  It returns expected result mask for
> +;; comparison operators other than NE.  For NE operator, it returns
> +;; the opposite result mask.  This is intended behavior so that we
> +;; can save one mask inverting instruction when using this pattern in
> +;; vcond patterns.  In this case, it is the caller's responsibility
> +;; to interpret and use the result mask correctly.

I'm not convinced by this design at all. Having a function that sometimes
generates the inverse of what you expect is difficult to follow, and I'm
not going to OK it unless it is really the only way of implementing these
hooks.

Can we not rely on the compiler spotting that it can simplify two
one's complements that appear beside each other?

The integer case needing negation of NE is almost possible to follow, but
the float unordered/UNGE/etc. cases become very, very difficult to reason
about, particularly seeing UNGT fall through to UNLT.

> +;; Internal pattern for vec_cmp.  Similar to vec_cmp<mode><mode>_internal,
> +;; it returns the opposite result mask for operators NE, UNEQ, UNLT,
> +;; UNLE, UNGT, UNGE and UNORDERED.  This is intended behavior so that
> +;; we can save one mask inverting instruction when using this pattern
> +;; in vcond patterns.  In these cases, it is the caller's responsibility
> +;; to interpret and use the result mask correctly.
> +(define_expand "vec_cmp<mode><v_cmp_result>_internal"
> +  [(set (match_operand:<V_cmp_result> 0 "register_operand")
> +	(match_operator 1 "comparison_operator"
> +	    [(match_operand:VDQF 2 "register_operand")
> +	     (match_operand:VDQF 3 "nonmemory_operand")]))]
> +  "TARGET_SIMD"
> +{
> +  int use_zero_form = 0;
> +  enum rtx_code code = GET_CODE (operands[1]);
> +  rtx tmp = gen_reg_rtx (<V_cmp_result>mode);
> +
> +  rtx (*comparison) (rtx, rtx, rtx);
> +
> +  if (operands[3] == CONST0_RTX (<MODE>mode)
> +      && (code == LE || code == LT || code == GE || code == GT || code == EQ))
> +    {
> +      /* Some instructions have a form taking an immediate zero.  */
> +      use_zero_form = 1;
> +    }
> +  else if (!REG_P (operands[3]))
> +    {
> +      /* Make sure we can handle the last operand.  */
> +      operands[3] = force_reg (<MODE>mode, operands[3]);
> +    }
> +
> +  switch (code)
> +    {
> +    case LT:
> +      if (use_zero_form)
> +	{
> +	  comparison = gen_aarch64_cmlt<mode>;
> +	  break;
> +	}
> +      /* Else, fall through.  */
> +    case UNGE:
> +      std::swap (operands[2], operands[3]);
> +      /* Fall through.  */
> +    case UNLE:
> +    case GT:
> +      comparison = gen_aarch64_cmgt<mode>;
> +      break;
> +    case LE:
> +      if (use_zero_form)
> +	{
> +	  comparison = gen_aarch64_cmle<mode>;
> +	  break;
> +	}
> +      /* Else, fall through.  */
> +    case UNGT:
> +      std::swap (operands[2], operands[3]);
> +      /* Fall through.  */
> +    case UNLT:
> +    case GE:
> +      comparison = gen_aarch64_cmge<mode>;
> +      break;
> +    case NE:
> +    case EQ:
> +      comparison = gen_aarch64_cmeq<mode>;
> +      break;
> +    case UNEQ:
> +    case ORDERED:
> +    case UNORDERED:
> +      break;
> +    default:
> +      gcc_unreachable ();
> +    }
> +
> +  switch (code)
> +    {
> +    case UNGT:
> +    case UNGE:
> +    case NE:
> +    case UNLT:
> +    case UNLE:
> +      /* FCM returns false for lanes which are unordered, so if we use
> +	 the inverse of the comparison we actually want to emit, then
> +	 revert the result, we will end up with the correct result.
> +	 Note that a NE NaN and NaN NE b are true for all a, b.
> +
> +	 Our transformations are:
> +	 a GE b -> !(b GT a)
> +	 a GT b -> !(b GE a)
> +	 a LE b -> !(a GT b)
> +	 a LT b -> !(a GE b)
> +	 a NE b -> !(a EQ b)
> +
> +	 See comment at the beginning of this pattern, we return the
> +	 opposite of the result mask for these operators, and it is the
> +	 caller's responsibility to invert the mask.

These comments don't fit with the code (which does nothing other than
fall through).

> +
> +	 Fall through.  */
> +    case LT:
> +    case LE:
> +    case GT:
> +    case GE:
> +    case EQ:
> +      /* The easy case.  Here we emit one of FCMGE, FCMGT or FCMEQ.
> +	 As a LT b <=> b GE a && a LE b <=> b GT a.  Our transformations are:
> +	 a GE b -> a GE b
> +	 a GT b -> a GT b
> +	 a LE b -> b GE a
> +	 a LT b -> b GT a
> +	 a EQ b -> a EQ b
> +	 Note that there also exist direct comparison against 0 forms,
> +	 so catch those as a special case.  */
> +
> +      emit_insn (comparison (operands[0], operands[2], operands[3]));
> +      break;
> +
> +    case UNEQ:
> +      /* We first check (a > b ||  b > a) which is !UNEQ, reverting

s/reverting/inverting?

> +	 this result will then give us (a == b || a UNORDERED b).
> +
> +	 See comment at the beginning of this pattern, we return the
> +	 opposite of the result mask for this operator, and it is the
> +	 caller's responsibility to invert the mask.  */

Yes, this is exactly what I'm concerned about. You mention the need to
invert the mask here, then leave it for the caller to do. This is not
easy code to read!

Is there really no clearer design that we can come up with?

Thanks,
James


