This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH, rs6000] Generate correct constant permutes using xxpermdi
- From: Bill Schmidt <wschmidt at linux dot vnet dot ibm dot com>
- To: gcc-patches at gcc dot gnu dot org
- Cc: dje dot gcc at gmail dot com
- Date: Fri, 22 Nov 2013 09:41:48 -0600
- Subject: [PATCH, rs6000] Generate correct constant permutes using xxpermdi
Hi,
Most of our constant vector permutes use the vperm instructions, but for
V2DImode and V2DFmode we use xxpermdi. This patch adjusts the generated
xxpermdi so that it is correct for little endian, which fixes failures
of the test cases gcc.dg/torture/vshuf-v2d[fi].c. Note that we can't
fix this directly in the pattern for xxpermdi, because that pattern is
also used by the corresponding intrinsic.
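As a rough illustration of why the little endian adjustment works, here
is a small stand-alone C model. It is only a sketch and not GCC code:
the vreg type and the helpers model_xxpermdi, load_le, permute_le and
check_all are hypothetical names made up for this example. It assumes
the xxpermdi pattern numbers doublewords in big endian register order,
that array element i sits in register doubleword 1 - i on little endian,
and that the selectors have already been normalized so that perm0 is in
0..1 and perm1 is in 2..3.

```c
#include <assert.h>
#include <stdint.h>

/* Register image of a two-element vector.  dw[0] is the most
   significant doubleword (big endian register numbering).  */
typedef struct { uint64_t dw[2]; } vreg;

/* Hypothetical model of the xxpermdi pattern: pick two doublewords
   from the concatenation {a.dw[0], a.dw[1], b.dw[0], b.dw[1]}.
   Assumes sel0 is in 0..1 and sel1 is in 2..3.  */
static vreg
model_xxpermdi (vreg a, vreg b, int sel0, int sel1)
{
  uint64_t cat[4] = { a.dw[0], a.dw[1], b.dw[0], b.dw[1] };
  vreg r = { { cat[sel0], cat[sel1] } };
  return r;
}

/* Assumed little endian layout: array element i lives in register
   doubleword 1 - i.  */
static vreg
load_le (const uint64_t arr[2])
{
  vreg r = { { arr[1], arr[0] } };
  return r;
}

/* Permute two little endian arrays, applying the same selector and
   operand adjustment as the patch before calling the model.  */
static void
permute_le (const uint64_t in0[2], const uint64_t in1[2],
            int perm0, int perm1, uint64_t out[2])
{
  vreg op0 = load_le (in0), op1 = load_le (in1), x, r;
  int n;

  perm0 = 3 - perm0;                        /* invert selectors */
  perm1 = 3 - perm1;
  n = perm0, perm0 = perm1, perm1 = n;      /* swap selectors */
  x = op0, op0 = op1, op1 = x;              /* swap operands */

  r = model_xxpermdi (op0, op1, perm0, perm1);
  out[0] = r.dw[1];                         /* store back in array order */
  out[1] = r.dw[0];
}

/* Return 1 if every normalized selector pair gives
   out[i] == concat (in0, in1)[perm_i] in array order, else 0.  */
static int
check_all (void)
{
  const uint64_t a[2] = { 10, 11 }, b[2] = { 20, 21 };
  const uint64_t cat[4] = { 10, 11, 20, 21 };
  uint64_t out[2];
  int perm0, perm1;

  for (perm0 = 0; perm0 <= 1; perm0++)
    for (perm1 = 2; perm1 <= 3; perm1++)
      {
        permute_le (a, b, perm0, perm1, out);
        if (out[0] != cat[perm0] || out[1] != cat[perm1])
          return 0;
      }
  return 1;
}
```

Under those assumptions, check_all returns 1 for all four normalized
selector pairs. Without the adjustment the model would select from the
wrong doublewords, which is consistent with the vshuf failures.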
Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no
regressions. Ok for trunk?
Thanks,
Bill
2013-11-22 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
* config/rs6000/rs6000.c (rs6000_expand_vec_perm_const_1): Correct
for little endian.
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 205243)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -30021,6 +30021,21 @@ rs6000_expand_vec_perm_const_1 (rtx target, rtx op
gcc_assert (GET_MODE_NUNITS (vmode) == 2);
dmode = mode_for_vector (GET_MODE_INNER (vmode), 4);
+ /* For little endian, swap operands and invert/swap selectors
+ to get the correct xxpermdi. The operand swap sets up the
+ inputs as a little endian array. The selectors are swapped
+ because they are defined to use big endian ordering. The
+ selectors are inverted to get the correct doublewords for
+ little endian ordering. */
+ if (!BYTES_BIG_ENDIAN)
+ {
+ int n;
+ perm0 = 3 - perm0;
+ perm1 = 3 - perm1;
+ n = perm0, perm0 = perm1, perm1 = n;
+ x = op0, op0 = op1, op1 = x;
+ }
+
x = gen_rtx_VEC_CONCAT (dmode, op0, op1);
v = gen_rtvec (2, GEN_INT (perm0), GEN_INT (perm1));
x = gen_rtx_VEC_SELECT (vmode, x, gen_rtx_PARALLEL (VOIDmode, v));