[Bug target/66433] New: Arm NEON postincrement optimization missed
y.usishchev at samsung dot com
gcc-bugzilla@gcc.gnu.org
Fri Jun 5 15:00:00 GMT 2015
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66433
Bug ID: 66433
Summary: Arm NEON postincrement optimization missed
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: y.usishchev at samsung dot com
Target Milestone: ---
Created attachment 35701
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35701&action=edit
test with vld and vst
GCC trunk, configured with --target=armv7l-tizen-linux-gnueabi, does not
generate post-increment addressing for vld/vst instructions when compiling the
attached testcase with "-O2 -mfpu=neon".
The auto-inc-dec pass misses the opportunity to optimize vld/vst instructions.
For code such as
for () { // some loop
    s0_32x4 = vld1q_u32(s);
    s1_32x4 = vld1q_u32(s+4);
    s += 8;
    ...
}
gcc generates
vld1.32 {d6-d7}, [r1]
add.w r4, r1, #16
adds r1, #32
vld1.32 {d28-d29}, [r4]
instead of
vld1.32 {d6-d7}, [r1]!
vld1.32 {d28-d29}, [r1]!
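The equivalence between the two addressing forms can be modelled in portable C, without NEON intrinsics. This is only a sketch: `vec4`, `load4`, `load4_postinc`, `step_explicit` and `step_postinc` are hypothetical names standing in for a 128-bit vector load and its post-increment form; they are not GCC or NEON APIs, and the attached testcase may look different.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit NEON register holding four uint32_t. */
typedef struct { uint32_t lane[4]; } vec4;

/* Models "vld1.32 {..}, [rX]": load four words, leave the pointer alone. */
static vec4 load4(const uint32_t *p)
{
    vec4 v;
    memcpy(v.lane, p, sizeof v.lane);
    return v;
}

/* Models "vld1.32 {..}, [rX]!": load four words, then advance the
   pointer by four elements (16 bytes). */
static vec4 load4_postinc(const uint32_t **p)
{
    vec4 v = load4(*p);
    *p += 4;
    return v;
}

/* Addressing as written in the source loop: s, s + 4, then s += 8. */
static const uint32_t *step_explicit(const uint32_t *s, vec4 *a, vec4 *b)
{
    *a = load4(s);
    *b = load4(s + 4);
    return s + 8;
}

/* The same two loads expressed as two post-increment accesses. */
static const uint32_t *step_postinc(const uint32_t *s, vec4 *a, vec4 *b)
{
    *a = load4_postinc(&s);
    *b = load4_postinc(&s);
    return s;
}
```

Both step functions load the same eight words and return the same advanced pointer, which is why the post-increment form needs no separate add instructions.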
This is presumably caused by a wrong cost estimate: a vld1.32 instruction
without increment costs 4, but with increment it costs 16
(gcc/config/arm/arm.c:9415):
    case MEM:
      if (REG_P (XEXP (x, 0)))
        *cost = COSTS_N_INSNS (1);
      ...
      else
        *cost = COSTS_N_INSNS (ARM_NUM_REGS (mode));
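The figures 4 and 16 can be reproduced with a small standalone model of the two branches quoted above. This is a sketch under stated assumptions, not the code in arm.c: it assumes that COSTS_N_INSNS (N) expands to N * 4, that ARM_NUM_REGS rounds the mode size up to 4-byte words, and that a post-increment address falls into the final else branch as the report describes. The branches elided by "..." are not modelled, and `arm_num_regs` and `mem_address_cost` are hypothetical names.

```c
/* Assumption: mirrors GCC's COSTS_N_INSNS, one instruction costs 4 units. */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Assumption: a core register on 32-bit ARM holds 4 bytes. */
#define UNITS_PER_WORD 4

/* Hypothetical helper modelling ARM_NUM_REGS (mode): the number of
   word-sized registers needed for a mode of the given size in bytes. */
static int arm_num_regs(int mode_size_bytes)
{
    return (mode_size_bytes + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
}

/* Hypothetical helper modelling only the first and last branches of the
   quoted MEM case: a plain register address costs one instruction, any
   other address (here, post-increment) costs one instruction per word. */
static int mem_address_cost(int address_is_plain_reg, int mode_size_bytes)
{
    if (address_is_plain_reg)
        return COSTS_N_INSNS(1);
    return COSTS_N_INSNS(arm_num_regs(mode_size_bytes));
}
```

For the 16-byte vector loaded by vld1q_u32, `mem_address_cost(1, 16)` gives 4 and `mem_address_cost(0, 16)` gives 16, matching the costs in the report. Under this model the post-increment form looks four times as expensive even though it is a single instruction.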