[Bug target/97019] New: rs6000:redundant rldicr fed to lvx/stvx
linkw at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Fri Sep 11 10:29:20 GMT 2020
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97019
Bug ID: 97019
Summary: rs6000:redundant rldicr fed to lvx/stvx
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: linkw at gcc dot gnu.org
Target Milestone: ---
When we do the early expansion for the altivec built-in functions vec_ld/vec_st,
we can leave behind a redundant rldicr x,y,0,59, whose purpose is to AND the
vector access address with -16. Since lvx/stvx already AND the effective
address with -16 themselves, these rldicr instructions are useless.
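The redundancy can be illustrated with a small model in plain C. This is only a sketch: the helper names lvx_effective_address, rldicr_0_59 and premask_is_redundant are hypothetical and exist purely for illustration, and the model assumes 64-bit addresses. It shows that masking the address with -16 before an access that masks with -16 anyway does not change the address.

```c
#include <stdint.h>

/* Model of the lvx/stvx effective address: the low four bits of
   (base + offset) are forced to zero by the instruction itself. */
static uint64_t lvx_effective_address(uint64_t base, uint64_t offset)
{
    return (base + offset) & ~(uint64_t)15;
}

/* Model of "rldicr x,y,0,59": rotate by 0 and clear the low four bits,
   which is the same as AND with -16. */
static uint64_t rldicr_0_59(uint64_t value)
{
    return value & (uint64_t)-16;
}

/* Returns 1 when masking the address before the access gives the same
   effective address as letting lvx/stvx do the masking alone. */
static int premask_is_redundant(uint64_t base, uint64_t offset)
{
    uint64_t with_mask = lvx_effective_address(0, rldicr_0_59(base + offset));
    uint64_t without_mask = lvx_effective_address(base, offset);
    return with_mask == without_mask;
}
```

Because AND with -16 is idempotent, premask_is_redundant holds for any base and offset in this model.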
===== test case ====
extern int a, b, c;
extern vector unsigned long long ev5, ev6, ev7, ev8;

int test(unsigned char *pe) {
  vector unsigned long long v1, v2, v3, v4, v9;
  vector unsigned long long v5 = ev5;
  vector unsigned long long v6 = ev6;
  vector unsigned long long v7 = ev7;
  vector unsigned long long v8 = ev8;
  unsigned char *e = pe;
  do {
    if (a) {
      asm("memory");
      v1 = __builtin_vec_ld(16, (unsigned long long *)e);
      v2 = __builtin_vec_ld(32, (unsigned long long *)e);
      v3 = __builtin_vec_ld(48, (unsigned long long *)e);
      e = e + 8;
      for (int i = 0; i < a; i++) {
        v4 = v5;
        v5 = __builtin_crypto_vpmsumd(v1, v6);
        v6 = __builtin_crypto_vpmsumd(v2, v7);
        v7 = __builtin_crypto_vpmsumd(v3, v8);
        e = e + 8;
      }
    }
    v5 = __builtin_vec_ld(16, (unsigned long long *)e);
    v6 = __builtin_vec_ld(32, (unsigned long long *)e);
    v7 = __builtin_vec_ld(48, (unsigned long long *)e);
    if (c)
      b = 1;
  } while (b);
  v9 = v4;
  int p = __builtin_unpack_vector_int128((vector __int128_t)v9, 0);
  return p;
}
==== command ====
-m64 -O2 -mcpu=power8
Currently the function find_alignment_op in the RTL swaps pass only handles the
case where there is one single AND operation definition. We can extend it to
check that all definitions are AND operations that align the address with -16.
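The proposed extension can be sketched as a standalone check in plain C. This is not GCC code: struct reaching_def, enum def_kind and all_defs_align_to_16 are hypothetical names used only to illustrate the idea, and the real pass works on RTL and def-use chains rather than on an array like this. The point is that the address is known to be aligned only if every definition reaching the use is an AND with -16.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one definition reaching the address register. */
enum def_kind { DEF_AND_CONST, DEF_OTHER };

struct reaching_def {
    enum def_kind kind;
    int64_t and_mask;   /* meaningful only when kind == DEF_AND_CONST */
};

/* Returns 1 only if there is at least one definition and every
   definition is an AND with the constant -16; returns 0 otherwise. */
static int all_defs_align_to_16(const struct reaching_def *defs, size_t count)
{
    if (count == 0)
        return 0;
    for (size_t i = 0; i < count; i++) {
        if (defs[i].kind != DEF_AND_CONST || defs[i].and_mask != -16)
            return 0;
    }
    return 1;
}
```

A single non-AND definition, or an AND with a different mask, makes the check fail, so the sketch stays conservative in the same way the single-definition case is.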