[PATCH 2/2] PowerPC: Add power10 IEEE 128-bit min/max/cmove.

Michael Meissner meissner@linux.ibm.com
Wed Aug 12 03:32:32 GMT 2020


On Tue, Aug 11, 2020 at 08:01:50PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, Aug 11, 2020 at 12:23:07PM -0400, Michael Meissner wrote:
> > +  /* See if we can use the ISA 3.1 min/max/compare instructions for IEEE
> > +     128-bit floating point.  At present, don't worry about doing conditional
> > +     moves with different types for the comparison and movement (unlike SF/DF,
> > +     where you can do a conditional test between double and use float as the
> > +     if/then parts. */
> 
> Why is that?  That makes the code quite different too (harder to
> review), but also, it could just use existing code more.

It is a combinatorial expansion problem.

In order to do:

	double cmove (double x, double y, float a, float b)
	{
	  return (a == b) ? x : y;
	}

You need two pair of SFDF iterators.  You use SFDF for the target, and SFDF2
for the comparison.

	(define_insn_and_split "*mov<SFDF:mode><SFDF2:mode>cc_p9"
	  [(set (match_operand:SFDF 0 "vsx_register_operand" "=&<SFDF:Fv>,<SFDF:Fv>")
		(if_then_else:SFDF
		 (match_operator:CCFP 1 "fpmask_comparison_operator"
			[(match_operand:SFDF2 2 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")
			 (match_operand:SFDF2 3 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")])
		 (match_operand:SFDF 4 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")
		 (match_operand:SFDF 5 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")))
	   (clobber (match_scratch:V2DI 6 "=0,&wa"))]
	  "TARGET_P9_MINMAX"
	  "#"
	  ""
	  [(set (match_dup 6)
		(if_then_else:V2DI (match_dup 1)
				   (match_dup 7)
				   (match_dup 8)))
	   (set (match_dup 0)
		(if_then_else:SFDF (ne (match_dup 6)
				       (match_dup 8))
				   (match_dup 4)
				   (match_dup 5)))]
	{
	  if (GET_CODE (operands[6]) == SCRATCH)
	    operands[6] = gen_reg_rtx (V2DImode);

	  operands[7] = CONSTM1_RTX (V2DImode);
	  operands[8] = CONST0_RTX (V2DImode);
	}
	 [(set_attr "length" "8")
	  (set_attr "type" "vecperm")])

This means there are 4 versions of the pattern:
    1: DF target, DF comparison
    2: DF target, SF comparison
    3: SF target, DF comparison
    4: SF target, SF comparison

The dueling iterators was added in May 26th, 2016.  I believe we had separate
insns for the 4 cases for fsel before I added the xscmpeqdp, etc. support.

To grow this for IEEE 128, you would need two iterators, each with 4 elements
(DF, SF, KF, and optionally TF).  Thus you would need 16 patterns to represent
all of the patterns.  I can do this, I just didn't think it was worth it.

In addition, it becomes more involved due to constraints.  Currently for SF/DF
you essentially want to use any vector register at this point (when I added it
in 2016, there was still the support for limiting whether SF/DF could use the
Altivec registers).  But for IEEE 128-bit fp types, you want "v" for the
registers used for comparison.  You might want "v" for the contitional move
result, or you might want "wa".

I looked at combining the SF/DF and IEEE 128-bit cases, but it was becoming too
complex to describe these cases.  It is doable, but it makes the diffs even
harder to read.

> > +;; IEEE 128-bit min/max
> > +(define_insn "s<minmax><mode>3"
> > +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> > +	(fp_minmax:IEEE128
> > +	 (match_operand:IEEE128 1 "altivec_register_operand" "v")
> > +	 (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
> > +  "TARGET_FLOAT128_HW && TARGET_POWER10 && FLOAT128_IEEE_P (<MODE>mode)"
> > +  "xs<minmax>cqp %0,%1,%2"
> > +  [(set_attr "type" "fp")
> > +   (set_attr "size" "128")])
> 
> So why do we want the asymmetrical xsmincqp instead of xsminqp?  This
> should be documented at the very least.  (We also should have min/max
> that work correctly without -ffast-math, of course :-( ).

We don't have an xsminqp or xsmaxqp instruction in power10.  We only have
xsmincqp and xsmaxcqp instructions.

> > +;; IEEE 128-bit conditional move.  At present, don't worry about doing
> > +;; conditional moves with different types for the comparison and movement
> > +;; (unlike SF/DF, where you can do a conditional test between double and use
> > +;; float as the if/then parts.
> 
> (Unmatched brackets).  So why is this?  "At present" doesn't belong in
> the code btw, but in your patch description.

Typo.

> > +(define_insn_and_split "*mov<mode>cc_hardware"
> 
> Please don't use meaningless names like this.
> 
> > +(define_insn "*xxsel<mode>"

Yes, this was more due to the fact that I was cloning the SF/DF code, and that
doesn't have XXSEL support.  I'll look at that.

> This already exists for other vector modes.  Put it together with that
> please.  Same for all other insn patterns.
> 
> Ideally you can make those one pattern then, even.
> 
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> > @@ -0,0 +1,70 @@
> > +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
> > +/* { dg-require-effective-target power10_ok } */
> > +/* { dg-options "-mdejagnu-cpu=power10 -O2 -ffast-math" } */
> > +/* { dg-final { scan-assembler-not "xscmpuqp"  } } */
> > +/* { dg-final { scan-assembler     "xscmpeqqp" } } */
> > +/* { dg-final { scan-assembler     "xscmpgtqp" } } */
> > +/* { dg-final { scan-assembler     "xscmpgeqp" } } */
> > +/* { dg-final { scan-assembler     "xsmaxcqp"  } } */
> > +/* { dg-final { scan-assembler     "xsmincqp"  } } */
> > +/* { dg-final { scan-assembler     "xxsel"     } } */
> 
> Use \m \M please.  Don't ask for powerpc* in gcc.target/powerpc/ (it is
> implied there).  Why does it need lp64?  There should be some test for
> __float128, instead.
> 
> All in all, this is very hard to review :-(

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797


More information about the Gcc-patches mailing list