[Bug c/106146] New: [instcombine] a redundant movprfx insn compare to llvm
zhongyunde at huawei dot com
gcc-bugzilla@gcc.gnu.org
Thu Jun 30 12:29:46 GMT 2022
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106146
Bug ID: 106146
Summary: [instcombine] a redundant movprfx insn compare to llvm
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: zhongyunde at huawei dot com
Target Milestone: ---
* Test case: GCC emits a redundant movprfx instruction in the kernel loop body.
For details, see https://gcc.godbolt.org/z/8vG4PzM18.
```
#include <arm_sve.h>

#define ARRAY_ALIGNMENT 64
#define LEN_2D 128ll
#define LEN_1D 8000ll
#define iterations 10000

typedef float real_t;
__attribute__((aligned(ARRAY_ALIGNMENT))) real_t a[LEN_1D],b[LEN_1D];

void s113_tuned(void) {
    for (int nl = 0; nl < 4*iterations; nl++) {
        int64_t i = 1;
        svbool_t pg = svwhilelt_b32(i, LEN_1D);
        svfloat32_t a0v = svdup_f32(a[0]);
        do {
            svfloat32_t bv = svld1_f32(pg, &b[i]);
            svfloat32_t res = svadd_z(pg, bv, a0v);
            svst1(pg, &a[i], res);
            i += svcntw();
            pg = svwhilelt_b32(i, LEN_1D);
        } while (svptest_any(svptrue_b32(), pg));
    }
    return;
}
```
* gcc's kernel loop:
```
.L2:
        ld1w    z0.s, p0/z, [x3, x0, lsl 2]
        movprfx z0.s, p0/z, z0.s
        fadd    z0.s, p0/m, z0.s, z1.s
        st1w    z0.s, p0, [x1, x0, lsl 2]
        incw    x0
        whilelt p0.s, x0, x2
        b.any   .L2
```
* llvm's kernel loop:
```
.LBB0_2:                                // Parent Loop BB0_1 Depth=1
        ld1w    { z1.s }, p2/z, [x13, x14, lsl #2]
        fadd    z1.s, p2/m, z1.s, z0.s
        st1w    { z1.s }, p2, [x12, x14, lsl #2]
        add     x14, x10, x14
        whilelt p2.s, x14, x9
        b.ne    .LBB0_2
```
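* For illustration only (not part of the original test case): a scalar C sketch of why the movprfx looks redundant in gcc's loop. The ld1w with p0/z already zeroes the inactive lanes, so a merging fadd under the same predicate leaves those lanes at zero, with or without the zeroing movprfx in front. The names LANES, model_ld1w_z, model_movprfx_z, model_fadd_m and movprfx_is_redundant are hypothetical, and the fixed 4-lane vector is an assumption made to keep the model small.
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define LANES 4 /* hypothetical vector length, chosen only for this model */

/* ld1w zd, pg/z, [mem]: active lanes load, inactive lanes are zeroed. */
static void model_ld1w_z(float *zd, const bool *pg, const float *mem) {
    for (size_t i = 0; i < LANES; i++)
        zd[i] = pg[i] ? mem[i] : 0.0f;
}

/* movprfx zd, pg/z, zn: active lanes copy zn, inactive lanes are zeroed. */
static void model_movprfx_z(float *zd, const bool *pg, const float *zn) {
    for (size_t i = 0; i < LANES; i++)
        zd[i] = pg[i] ? zn[i] : 0.0f;
}

/* fadd zdn, pg/m, zdn, zm: active lanes add, inactive lanes keep zdn. */
static void model_fadd_m(float *zdn, const bool *pg, const float *zm) {
    for (size_t i = 0; i < LANES; i++)
        if (pg[i])
            zdn[i] += zm[i];
}

/* Returns 1 when the sequence with movprfx and the sequence without it
   leave bit-identical contents in the result register. */
int movprfx_is_redundant(const bool *pg, const float *mem, const float *z1) {
    float with_prfx[LANES], without_prfx[LANES];

    assert(pg && mem && z1);

    /* gcc's sequence: ld1w, movprfx, fadd */
    model_ld1w_z(with_prfx, pg, mem);
    model_movprfx_z(with_prfx, pg, with_prfx);
    model_fadd_m(with_prfx, pg, z1);

    /* llvm's sequence: ld1w, fadd */
    model_ld1w_z(without_prfx, pg, mem);
    model_fadd_m(without_prfx, pg, z1);

    return memcmp(with_prfx, without_prfx, sizeof with_prfx) == 0;
}
```
The model relies on the load and the fadd sharing one governing predicate, as they do in the loop above; with different predicates the movprfx could be needed.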