[PATCH] Add power10 IEEE 128-bit minimum, maximum, and compare with mask instructions

Michael Meissner meissner@linux.ibm.com
Thu Aug 27 02:41:42 GMT 2020


The following patches are a rewrite of the previous set of patches to add
support for the power10 IEEE 128-bit C minimum, C maximum, and compare/set mask
instructions that are similar to the instructions added in power9.

There are 4 patches in this series.

The first patch is a cosmetic patch.  In the previous patches, Segher said the
new functions should return bool instead of int.  In adding the support, I
noticed the two existing functions processing conditional moves
(rs6000_emit_cmove and rs6000_emit_int_cmove) returned int rather than bool.
The first patch just changes the return type and return statements to now
return bool.

The second patch renames the functions that generate the ISA 3.0 C minimum, C
maximum, and conditional move instructions to use a better name than just using
a _p9 suffix.  As Segher suggested, the names should be of the form
"maybe_emit" instead of "generate_", since both functions can fail.

The third patch adds the minimum and maximum support without adding the
conditional move support (the 4th patch will add the conditional move support).
Because of the NaN differences, the built-in functions will only generate these
instructions if -ffast-math is used.
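
As a rough illustration of the NaN differences mentioned above, here is a
minimal sketch in portable C.  It uses double rather than __float128 so that it
builds anywhere.  The helper names (c_min, fmin_like, check_nan_difference) are
made up for this example and are not taken from the patches, and I am assuming
that "C minimum" refers to the semantics of the ternary expression
(a < b) ? a : b.

```c
#include <assert.h>
#include <math.h>

/* "C minimum" as written with the ternary operator.  Any comparison with a
   NaN is false, so the second operand is returned whenever either operand
   is a NaN.  (Hypothetical helper, not from the patches.)  */
double c_min (double a, double b)
{
  return (a < b) ? a : b;
}

/* Model of the C library fmin rule: if exactly one operand is a NaN, the
   other operand is returned.  Written by hand so no libm call is needed.  */
double fmin_like (double a, double b)
{
  if (a != a)		/* a is a NaN.  */
    return b;
  if (b != b)		/* b is a NaN.  */
    return a;
  return (a < b) ? a : b;
}

/* The two definitions agree on ordinary numbers but differ when the second
   operand is a NaN.  */
void check_nan_difference (void)
{
  double r;

  assert (c_min (1.0, 2.0) == fmin_like (1.0, 2.0));

  assert (fmin_like (1.0, NAN) == 1.0);	/* NaN is ignored.  */
  r = c_min (1.0, NAN);
  assert (r != r);			/* NaN is returned.  */

  assert (c_min (NAN, 1.0) == 1.0);	/* Second operand is returned.  */
  assert (fmin_like (NAN, 1.0) == 1.0);
}
```

With -ffast-math the compiler may assume that NaNs do not occur, so the two
definitions become interchangeable, which is consistent with only generating
the instructions for the built-in functions in that mode.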

The fourth patch adds the conditional move support.  In adding the conditional
move support, the optimizers will be able to convert things like:

	a = (b > c) ? b : c;

into the instructions.  Unlike the previous set of patches, this patch merges
together the scalar SF/DF conditional move with the scalar KF/TF conditional
move.  It extends the optimization that was previously used for SFmode and
DFmode to allow the comparison to be a different scalar floating point mode
than the move.  I.e.

	__float128 a, b, c;
	float x, y;

	/* ... */

	a = (x == y) ? b : c;

I did have to add an XXPERMDI when the comparison mode is SFmode or DFmode and
the move mode is KFmode or TFmode, because the XSCMP{EQ,GT,GE}DP instructions
explicitly set the bottom 64 bits of the vector register to 0.

I have built compilers on a little endian power9 Linux system with all 4
patches applied.  I did bootstrap builds and ran the testsuite, with no
regressions.  Previous versions of the patches were also tested on a little
endian power8 Linux system.  I would like to check these patches into the master
branch for GCC 11.  At this time, I do not anticipate needing to backport these
changes to GCC 10.3.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

