This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH][ARM] PR/66433: Reduce cost of memory instructions with autoincrement
- From: Yury Usishchev <y dot usishchev at samsung dot com>
- To: gcc-patches <gcc-patches at gcc dot gnu dot org>
- Cc: Vyacheslav Barinov <v dot barinov at samsung dot com>
- Date: Tue, 16 Jun 2015 17:04:42 +0300
- Subject: [PATCH][ARM] PR/66433: Reduce cost of memory instructions with autoincrement
Hello!
The following patch fixes PR target/66433.
As described in the PR, the cost of a memory operation with autoincrement
is currently considered greater than that of the same operation without
autoincrement. This prevents the auto-inc-dec pass from optimizing vector
memory operations such as vld and vst.
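To illustrate the effect, here is a minimal toy model of the cost decision, not the actual arm.c code. The names `toy_mem`, `addr_kind`, `cost_before` and `cost_after` are hypothetical, costs are counted in plain insns rather than COSTS_N_INSNS units, and the flag_pic and other branches are left out. It only sketches why an autoincrement address on a multi-word access used to look more expensive than a plain register address.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the address forms of a MEM.  */
enum addr_kind { ADDR_REG, ADDR_AUTOINC, ADDR_OTHER };

struct toy_mem
{
  enum addr_kind kind;  /* Shape of the address expression.  */
  bool base_is_reg;     /* For ADDR_AUTOINC: is the inner operand a register?  */
  int words;            /* Number of words accessed.  */
};

/* Before the patch (simplified): only a plain register address costs
   one insn, everything else costs one insn per word.  */
int
cost_before (const struct toy_mem *m)
{
  if (m->kind == ADDR_REG)
    return 1;
  return m->words;
}

/* After the patch (simplified): an autoincrement of a register is
   treated like a plain register address.  */
int
cost_after (const struct toy_mem *m)
{
  if (m->kind == ADDR_REG
      || (m->kind == ADDR_AUTOINC && m->base_is_reg))
    return 1;
  return m->words;
}

/* Small self-check: a 4-word access through a post-incremented register.  */
void
toy_self_check (void)
{
  struct toy_mem vld = { ADDR_AUTOINC, true, 4 };
  assert (cost_before (&vld) == 4);
  assert (cost_after (&vld) == 1);
}
```

In this model a 128-bit load through an autoincrement address drops from 4 insns to 1, the same as the plain register form, so a pass comparing the two costs no longer has a reason to reject the autoincrement form.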
Bootstrapped and regtested on armv7l-linux-gnueabi on trunk.
OK for trunk?
--
BR,
Yury Usishchev
gcc/
2015-06-16 Yury Usishchev <y.usishchev@samsung.com>
PR target/66433
* config/arm/arm.c (arm_new_rtx_costs): Reduce cost of memory instructions
with autoincrement.
gcc/testsuite/
2015-06-16 Yury Usishchev <y.usishchev@samsung.com>
PR target/66433
* gcc.target/arm/pr66433.c: New test.
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index f5050cb..a8dc0ed 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -9444,7 +9444,9 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
     case MEM:
       /* A memory access costs 1 insn if the mode is small, or the address is
          a single register, otherwise it costs one insn per word.  */
-      if (REG_P (XEXP (x, 0)))
+      if (REG_P (XEXP (x, 0))
+          || (GET_RTX_CLASS (GET_CODE (XEXP (x, 0))) == RTX_AUTOINC
+              && REG_P (XEXP (XEXP (x, 0), 0))))
         *cost = COSTS_N_INSNS (1);
       else if (flag_pic
                && GET_CODE (XEXP (x, 0)) == PLUS
diff --git a/gcc/testsuite/gcc.target/arm/pr66433.c b/gcc/testsuite/gcc.target/arm/pr66433.c
new file mode 100644
index 0000000..22ba158
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr66433.c
@@ -0,0 +1,21 @@
+/* Test the optimization of `vld*' ARM NEON intrinsic with autoincrement. */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O2" } */
+/* { dg-add-options arm_neon } */
+
+#include <arm_neon.h>
+
+void test_vld_autoinc (uint32_t *__restrict__ a, uint32_t *__restrict__ b)
+{
+  int i;
+  for (i = 0; i < 1000000; i++) {
+    vst1q_u32 (b, vld1q_u32 (a));
+    a += 4;
+    b += 4;
+  }
+}
+
+/* { dg-final { scan-assembler "vld1\.32.*!.*\n" } } */
+/* { dg-final { scan-assembler "vst1\.32.*!.*\n" } } */