[PATCH] rs6000/test: Add emulated gather test case

Kewen.Lin linkw@linux.ibm.com
Thu Nov 25 03:20:57 GMT 2021


Hi,

This patch adds a test case, similar to the one for i386,
to provide test coverage for the 510.parest_r hotspots.

As evaluated, the vectorizer's emulated gather capability
(r12-2733) speeds up SPEC2017 510.parest_r on Power8/9/10
by 5% to 9% with the option sets Ofast unroll and Ofast lto.
But since rs6000 lacked unpacking support for unsigned int
before r12-3134, the hotspots could not be vectorized until
that revision.

While checking why r12-2733 didn't immediately show its impact
on SPEC2017 510.parest_r, even though the associated test case
could already be vectorized on rs6000 at that time, I realized
that the associated test case uses int as INDEXTYPE while the
hotspots actually use unsigned int.  So unlike the one in i386,
this patch uses unsigned int as INDEXTYPE, since the unpack
support for unsigned int (r12-3134) also matters for vectorizing
the hotspots.  I'm not sure whether it's worth updating the one
in i386 as well?
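
For illustration only, here is a rough hand-written sketch of what
emulated gather amounts to for this loop.  It assumes a vectorization
factor of 2 (two double lanes); the name vmul_emulated and the explicit
two-lane split are hypothetical and are not what GCC actually emits.
The point is that the indices are loaded and zero-extended as scalars,
and the "gather" is a pair of scalar loads feeding the vector lanes:

```c
#include <stddef.h>

/* Hypothetical scalar model of the vectorized loop, assuming VF = 2.
   The reassociated reduction is only valid under -Ofast semantics.  */
double vmul_emulated(const unsigned int *rowstart, const unsigned int *rowend,
                     const double *luval, const double *dst)
{
  size_t n = (size_t) (rowend - rowstart);
  double acc0 = 0.0, acc1 = 0.0;   /* the two lanes of the accumulator */
  size_t i = 0;

  for (; i + 2 <= n; i += 2)
    {
      /* Index loads and promotions stay scalar (zero-extension).  */
      size_t idx0 = rowstart[i];
      size_t idx1 = rowstart[i + 1];
      /* Emulated gather: one scalar load per lane.  */
      double g0 = dst[idx0];
      double g1 = dst[idx1];
      /* Lane-wise multiply and accumulate.  */
      acc0 += luval[i] * g0;
      acc1 += luval[i + 1] * g1;
    }

  /* Final reduction across lanes, then the scalar epilogue.  */
  double res = acc0 + acc1;
  for (; i < n; ++i)
    res += luval[i] * dst[rowstart[i]];
  return res;
}
```

With int indices the promotion would be a sign-extension instead,
which is why the two INDEXTYPE choices exercise different paths.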

Tested on powerpc64le-linux-gnu P9 and powerpc64-linux-gnu P8.

Is it ok for trunk?

BR,
Kewen
-----
gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/vect-gather-1.c: New test.

diff --git a/gcc/testsuite/gcc.target/powerpc/vect-gather-1.c b/gcc/testsuite/gcc.target/powerpc/vect-gather-1.c
new file mode 100644
index 00000000000..bf98045ab03
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vect-gather-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* Profitable from Power8 since it supports efficient unaligned load.  */
+/* { dg-options "-Ofast -mdejagnu-cpu=power8 -fdump-tree-vect-details -fdump-tree-forwprop4" } */
+
+#ifndef INDEXTYPE
+#define INDEXTYPE unsigned int
+#endif
+double vmul(INDEXTYPE *rowstart, INDEXTYPE *rowend,
+	    double *luval, double *dst)
+{
+  double res = 0;
+  for (const INDEXTYPE * col = rowstart; col != rowend; ++col, ++luval)
+    res += *luval * dst[*col];
+  return res;
+}
+
+/* With gather emulation this should be profitable to vectorize from Power8.  */
+/* { dg-final { scan-tree-dump "loop vectorized" "vect" } } */
+/* The index vector loads and promotions should be scalar after forwprop.  */
+/* { dg-final { scan-tree-dump-not "vec_unpack" "forwprop4" } } */
--
2.25.1


