This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH v2] [ARM] PR61551 RFC: Improve costs for NEON addressing modes


Hi

Following on from the previous discussion at
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg03464.html
and on IRC.

I'm going to try once more to make the case for fixing the worst
problem for GCC 6, pending a rewrite of the address_cost
infrastructure for GCC 7. I think the rewrite you're describing is
overkill for this problem. There is one specific problem which I would
like to fix for GCC 6: the ARM backend fails to use post-indexed
addressing for some vector modes.

Test program:
#include <arm_neon.h>
char *f(char *p, int8x8x4_t v, int r)
{ vst4_s8(p, v); p+=32; return p; }

Desired code:
f:
        vst4.8  {d0-d3}, [r0]!
        bx      lr

Currently generated code:
f:
        mov     r3, r0
        adds    r0, r0, #32
        vst4.8  {d0-d3}, [r3]
        bx      lr

The auto-inc-dec pass does not apply in this case, because the costs
for RTXs which use POST_INC are wrong. Using gdb to poke at this, we
can see:

$ arm-unknown-linux-gnueabihf-gcc -mfpu=neon -O3 -S /tmp/foo.c -wrapper gdb,--args
GNU gdb (Ubuntu 7.9-1ubuntu1) 7.9
<snip>
Reading symbols from /home/charles.baylis/tools/tools-arm-unknown-linux-gnueabihf-git/bin/../libexec/gcc/arm-unknown-linux-gnueabihf/6.0.0/cc1...done.
(gdb) b auto-inc-dec.c:473
Breakpoint 1 at 0x102c253: file /home/charles.baylis/srcarea/gcc/gcc-git/gcc/auto-inc-dec.c, line 473.
(gdb) r
<snip, gdb has an assertion failure here, but it works to continue>
(gdb) print debug_rtx(mem)
(mem:OI (reg/v/f:SI 112 [ p ]) [0 MEM[(signed char[32] *)p_2(D)]+0 S32 A8])
$1 = void
(gdb) print rtx_cost(mem, V16QImode, SET, 1, false)
$2 = 4
(gdb) print debug_rtx(mem_tmp)
(mem:OI (post_inc:SI (reg/f:SI 115 [ p ])) [0  S32 A64])
$3 = void
(gdb) print rtx_cost(mem_tmp, V16QImode, SET, 1, false)
$4 = 32

So, the cost of
     (mem:OI (reg/v/f:SI 112 [ p ]))
is 4, while the cost of
    (mem:OI (post_inc:SI (reg/f:SI 115 [ p ])))
is 32.

That is a difference equivalent to 7 insns, which has no basis in
reality. It is just a bug.
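
For reference, the "7 insns" figure follows from the cost units. The
sketch below repeats GCC's COSTS_N_INSNS macro (defined in rtl.h as
((N) * 4)) so that it stands alone; cost_gap_in_insns is a hypothetical
helper written for this illustration and is not part of GCC.

```c
#include <assert.h>

/* Same definition as in GCC's rtl.h, repeated so the sketch is standalone.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical helper: express the gap between two rtx_cost values as a
   whole number of instructions.  */
static int
cost_gap_in_insns (int cost_a, int cost_b)
{
  int gap = cost_b - cost_a;

  /* The costs seen in the gdb session are whole multiples of one insn.  */
  assert (gap % COSTS_N_INSNS (1) == 0);
  return gap / COSTS_N_INSNS (1);
}
```

With the values from the gdb session, cost_gap_in_insns (4, 32) is
(32 - 4) / 4, i.e. 7.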

Addressing some specific review points from the previous version:

> > +    {
> > +      0,
> > +      COSTS_N_INSNS (15),
> > +      COSTS_N_INSNS (15),
> > +      COSTS_N_INSNS (15),
> > +      COSTS_N_INSNS (15)
> > +    } /* vec512 */
> >    }
> >  };
>
> I'm curious as to the numbers here - The costs should reflect the relative costs of the
> addressing modes not the costs of the loads and stores - thus having high numbers
> here for vector modes may just prevent this from even triggering in auto-inc-dec
> code ? In my experience with GCC I've never satisfactorily answered the question
> whether these should be comparable to rtx_costs or not. In an ideal world they should
> be but I'm never sure. IOW I'm not sure if using COSTS_N_INSNS or plain numbers
> here is appropriate.

That's the point of the patch. These numbers give the same behaviour
as the current arm_rtx_costs code, and they are obviously wrong.
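
As a rough illustration of "the same behaviour", here is a standalone
sketch for a 256-bit (OImode, 8 core registers) access in the non-PIC
case. The ACC_* enumerators and the two functions are hypothetical names
made up for this example; only the vec256 row is copied from
generic_extra_mem_costs in patch 1/4, and the old rule is simplified
from the existing "case MEM:" code.

```c
#include <assert.h>

/* Same definition as in GCC's rtl.h, repeated so the sketch is standalone.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical stand-in for extra_mem_cost_table::access_type.  */
enum access_type
{
  ACC_REG,
  ACC_POST_INCDEC,
  ACC_PRE_INCDEC,
  ACC_POST_MODIFY,
  ACC_PLUS,
  ACC_LAST = ACC_PLUS
};

/* Old rule, ignoring the PIC special case: a plain register address costs
   one insn, anything else costs one insn per core register.  */
static int
old_mem_cost (enum access_type op, int num_regs)
{
  return op == ACC_REG ? COSTS_N_INSNS (1) : COSTS_N_INSNS (num_regs);
}

/* The vec256 row from patch 1/4, before the tuning in patch 3/4.  */
static const int vec256_extra[ACC_LAST + 1] =
{
  0,
  COSTS_N_INSNS (7),
  COSTS_N_INSNS (7),
  COSTS_N_INSNS (7),
  COSTS_N_INSNS (7)
};

/* Table-driven rule: a base cost of one insn plus the per-mode extra.  */
static int
new_mem_cost_vec256 (enum access_type op)
{
  assert (op <= ACC_LAST);
  return COSTS_N_INSNS (1) + vec256_extra[op];
}
```

Both rules give 4 for a plain register address and 32 for POST_INC,
matching the gdb output above. Zeroing the row, as patch 3/4 does, would
make the table-driven rule return 4 for every addressing mode.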

> 17:45 < ramana> My problem is that the mid-end in a number of other places
>         compares the cost coming out of rtx_cost and address_cost and if the 2
>         are not in sync we get funny values.

At present there is no correspondence at all between the two.
My patch doesn't attempt to fix that, but I think it at least makes
things better.

However, I don't really understand this comment - as you point out
above, address_cost and rtx_cost return values measured in different
units. I don't see how they can be made to correspond, given that.

> Right, but this does not change arm_address_costs - so how is this going to work?
> I would like this moved into a new function aarch_address_costs and that replacing
> arm_address_costs only to be called from here.

I could do that, but if I did, I would have to resubmit the patch at
https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00387.html along with a
reimplementation of arm_address_costs which used a table without
changing its numerical results (pending subsequent tuning). Since the
former would already solve my problem, and the latter would then be a
pure code clean up of a separate function, why not accept the '387
patch as is, and leave the clean up until GCC 7?

Alternatively, this is an updated patch series which changes the costs
for MEMs in arm_rtx_costs using the table. Passes make check with no
regressions for arm-unknown-linux-gnueabihf on qemu.
From d8110f141a449c62f1ba2c4f47832ee2633d3998 Mon Sep 17 00:00:00 2001
From: Charles Baylis <charles.baylis@linaro.org>
Date: Wed, 28 Oct 2015 18:48:16 +0000
Subject: [PATCH 1/4] Add table-driven implementation of "case MEM:" in
 arm_new_rtx_costs.

This patch replicates the existing cost calculation using a table, so that
the costs can be tuned cleanly. The old implementation is retained for
comparison, and a check is made that both methods give the same result.

Change-Id: If349ffd7dbbe13a814be4a0d022382ddc8270973
---
 gcc/config/arm/aarch-common-protos.h |  28 ++++++++
 gcc/config/arm/aarch-cost-tables.h   |  95 +++++++++++++++++++++++++--
 gcc/config/arm/arm.c                 | 124 ++++++++++++++++++++++++++++++-----
 3 files changed, 226 insertions(+), 21 deletions(-)

diff --git a/gcc/config/arm/aarch-common-protos.h b/gcc/config/arm/aarch-common-protos.h
index 348ae74..5bba777 100644
--- a/gcc/config/arm/aarch-common-protos.h
+++ b/gcc/config/arm/aarch-common-protos.h
@@ -130,6 +130,33 @@ struct vector_cost_table
   const int alu;
 };
 
+struct extra_mem_cost_table
+{
+  enum access_type
+  {
+    REG,
+    POST_INCDEC,
+    PRE_INCDEC,
+    /*PRE_MODIFY,*/
+    POST_MODIFY,
+    PLUS,
+    ACCESS_TYPE_LAST = PLUS
+  };
+  const int si[ACCESS_TYPE_LAST + 1];
+  const int di[ACCESS_TYPE_LAST + 1];
+  const int cdi[ACCESS_TYPE_LAST + 1];
+  const int sf[ACCESS_TYPE_LAST + 1];
+  const int df[ACCESS_TYPE_LAST + 1];
+  const int cdf[ACCESS_TYPE_LAST + 1];
+  const int blk[ACCESS_TYPE_LAST + 1];
+  const int vec64[ACCESS_TYPE_LAST + 1];
+  const int vec128[ACCESS_TYPE_LAST + 1];
+  const int vec192[ACCESS_TYPE_LAST + 1];
+  const int vec256[ACCESS_TYPE_LAST + 1];
+  const int vec384[ACCESS_TYPE_LAST + 1];
+  const int vec512[ACCESS_TYPE_LAST + 1];
+};
+
 struct cpu_cost_table
 {
   const struct alu_cost_table alu;
@@ -137,6 +164,7 @@ struct cpu_cost_table
   const struct mem_cost_table ldst;
   const struct fp_cost_table fp[2]; /* SFmode and DFmode.  */
   const struct vector_cost_table vect;
+  const struct extra_mem_cost_table * const extra_mem;
 };
 
 
diff --git a/gcc/config/arm/aarch-cost-tables.h b/gcc/config/arm/aarch-cost-tables.h
index 66e09a8..62ec874 100644
--- a/gcc/config/arm/aarch-cost-tables.h
+++ b/gcc/config/arm/aarch-cost-tables.h
@@ -22,6 +22,89 @@
 #ifndef GCC_AARCH_COST_TABLES_H
 #define GCC_AARCH_COST_TABLES_H
 
+const struct extra_mem_cost_table generic_extra_mem_costs =
+{
+  { 0, 0, 0, 0, 0 },		/* si */
+  {
+    0,
+    COSTS_N_INSNS (1),
+    COSTS_N_INSNS (1),
+    COSTS_N_INSNS (1),
+    COSTS_N_INSNS (1)
+  },				/* di */
+  {
+    0,
+    COSTS_N_INSNS (3),
+    COSTS_N_INSNS (3),
+    COSTS_N_INSNS (3),
+    COSTS_N_INSNS (3)
+  },				/* cdi */
+  { 0, 0, 0, 0, 0 },		/* sf */
+  {
+    0,
+    COSTS_N_INSNS (1),
+    COSTS_N_INSNS (1),
+    COSTS_N_INSNS (1),
+    COSTS_N_INSNS (1)
+  },				/* df */
+  {
+    0,
+    COSTS_N_INSNS (3),
+    COSTS_N_INSNS (3),
+    COSTS_N_INSNS (3),
+    COSTS_N_INSNS (3)
+  },				/* cdf */
+  {
+    0,
+    - COSTS_N_INSNS (1),
+    - COSTS_N_INSNS (1),
+    - COSTS_N_INSNS (1),
+    - COSTS_N_INSNS (1),
+  },				/* blk */
+  {
+    0,
+    COSTS_N_INSNS (1),
+    COSTS_N_INSNS (1),
+    COSTS_N_INSNS (1),
+    COSTS_N_INSNS (1)
+  },				/* vec64 */
+  {
+    0,
+    COSTS_N_INSNS (3),
+    COSTS_N_INSNS (3),
+    COSTS_N_INSNS (3),
+    COSTS_N_INSNS (3)
+  },				/* vec128 */
+  {
+    0,
+    COSTS_N_INSNS (5),
+    COSTS_N_INSNS (5),
+    COSTS_N_INSNS (5),
+    COSTS_N_INSNS (5)
+  },				/* vec192 */
+  {
+    0,
+    COSTS_N_INSNS (7),
+    COSTS_N_INSNS (7),
+    COSTS_N_INSNS (7),
+    COSTS_N_INSNS (7)
+  },				/* vec256 */
+  {
+    0,
+    COSTS_N_INSNS (11),
+    COSTS_N_INSNS (11),
+    COSTS_N_INSNS (11),
+    COSTS_N_INSNS (11)
+  },				/* vec384 */
+  {
+    0,
+    COSTS_N_INSNS (15),
+    COSTS_N_INSNS (15),
+    COSTS_N_INSNS (15),
+    COSTS_N_INSNS (15)
+  }				/* vec512 */
+};
+
 const struct cpu_cost_table generic_extra_costs =
 {
   /* ALU */
@@ -122,7 +205,8 @@ const struct cpu_cost_table generic_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  &generic_extra_mem_costs
 };
 
 const struct cpu_cost_table cortexa53_extra_costs =
@@ -225,7 +309,8 @@ const struct cpu_cost_table cortexa53_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  &generic_extra_mem_costs	/* extra_mem */
 };
 
 const struct cpu_cost_table cortexa57_extra_costs =
@@ -328,7 +413,8 @@ const struct cpu_cost_table cortexa57_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)  /* alu.  */
-  }
+  },
+  &generic_extra_mem_costs	/* extra_mem */
 };
 
 const struct cpu_cost_table xgene1_extra_costs =
@@ -431,7 +517,8 @@ const struct cpu_cost_table xgene1_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (2)  /* alu.  */
-  }
+  },
+  &generic_extra_mem_costs	/* extra_mem */
 };
 
 #endif /* GCC_AARCH_COST_TABLES_H */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 7093694..e36f34e 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1058,7 +1058,8 @@ const struct cpu_cost_table cortexa9_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  &generic_extra_mem_costs	/* extra_mem */
 };
 
 const struct cpu_cost_table cortexa8_extra_costs =
@@ -1161,7 +1162,8 @@ const struct cpu_cost_table cortexa8_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  &generic_extra_mem_costs	/* extra_mem */
 };
 
 const struct cpu_cost_table cortexa5_extra_costs =
@@ -1265,7 +1267,8 @@ const struct cpu_cost_table cortexa5_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  &generic_extra_mem_costs	/* extra_mem */
 };
 
 
@@ -1370,7 +1373,8 @@ const struct cpu_cost_table cortexa7_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  &generic_extra_mem_costs	/* extra_mem */
 };
 
 const struct cpu_cost_table cortexa12_extra_costs =
@@ -1473,7 +1477,8 @@ const struct cpu_cost_table cortexa12_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  &generic_extra_mem_costs	/* extra_mem */
 };
 
 const struct cpu_cost_table cortexa15_extra_costs =
@@ -1576,7 +1581,8 @@ const struct cpu_cost_table cortexa15_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  &generic_extra_mem_costs	/* extra_mem */
 };
 
 const struct cpu_cost_table v7m_extra_costs =
@@ -1679,7 +1685,8 @@ const struct cpu_cost_table v7m_extra_costs =
   /* Vector */
   {
     COSTS_N_INSNS (1)	/* alu.  */
-  }
+  },
+  &generic_extra_mem_costs	/* extra_mem */
 };
 
 const struct tune_params arm_slowmul_tune =
@@ -9434,6 +9441,22 @@ arm_unspec_cost (rtx x, enum rtx_code /* outer_code */, bool speed_p, int *cost)
 	  }								\
 	while (0);
 
+/* Return TRUE if MODE is any of the large INT modes.  */
+static bool
+arm_vect_struct_mode_p (machine_mode mode)
+{
+  return mode == OImode || mode == EImode || mode == TImode || mode == CImode
+    || mode == XImode;
+}
+
+
+/* Return TRUE if MODE is any of the vector modes.  */
+static bool
+arm_vector_mode_p (machine_mode mode)
+{
+  return arm_vector_mode_supported_p (mode) || arm_vect_struct_mode_p (mode);
+}
+
 /* RTX costs.  Make an estimate of the cost of executing the operation
    X, which is contained with an operation with code OUTER_CODE.
    SPEED_P indicates whether the cost desired is the performance cost,
@@ -9516,16 +9539,83 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
     case MEM:
       /* A memory access costs 1 insn if the mode is small, or the address is
 	 a single register, otherwise it costs one insn per word.  */
-      if (REG_P (XEXP (x, 0)))
-	*cost = COSTS_N_INSNS (1);
-      else if (flag_pic
-	       && GET_CODE (XEXP (x, 0)) == PLUS
-	       && will_be_in_index_register (XEXP (XEXP (x, 0), 1)))
-	/* This will be split into two instructions.
-	   See arm.md:calculate_pic_address.  */
-	*cost = COSTS_N_INSNS (2);
-      else
-	*cost = COSTS_N_INSNS (ARM_NUM_REGS (mode));
+      {
+	int cost_old;
+	int cost_new;
+	extra_mem_cost_table::access_type op;
+	if (REG_P (XEXP (x, 0)))
+	  cost_old = COSTS_N_INSNS (1);
+	else if (flag_pic
+		 && GET_CODE (XEXP (x, 0)) == PLUS
+		 && will_be_in_index_register (XEXP (XEXP (x, 0), 1)))
+	  /* This will be split into two instructions.
+	     See arm.md:calculate_pic_address.  */
+	  cost_old = COSTS_N_INSNS (2);
+	else
+	  cost_old = COSTS_N_INSNS (ARM_NUM_REGS (mode));
+	switch (GET_CODE (XEXP (x, 0)))
+	  {
+	  case REG:
+	    op = extra_mem_cost_table::REG;
+	    break;
+	  case POST_INC:
+	  case POST_DEC:
+	    op = extra_mem_cost_table::POST_INCDEC;
+	    break;
+	  case PRE_INC:
+	  case PRE_DEC:
+	    op = extra_mem_cost_table::PRE_INCDEC;
+	    break;
+	  case POST_MODIFY:
+	    op = extra_mem_cost_table::POST_MODIFY;
+	    break;
+	  default:
+	  case PLUS:
+	    op = extra_mem_cost_table::PLUS;
+	    break;
+	  }
+	if (flag_pic
+	    && GET_CODE (XEXP (x, 0)) == PLUS
+	    && will_be_in_index_register (XEXP (XEXP (x, 0), 1)))
+	  cost_new = COSTS_N_INSNS (2);
+	else
+	  {
+            cost_new = COSTS_N_INSNS (1);
+	    if (arm_vector_mode_p (mode))
+	      {
+		cost_new +=
+		  (ARM_NUM_REGS (mode) <= 2 ? extra_cost->extra_mem->vec64[op]
+		  : ARM_NUM_REGS (mode) <= 4 ? extra_cost->extra_mem->vec128[op]
+		  : ARM_NUM_REGS (mode) <= 6 ? extra_cost->extra_mem->vec192[op]
+		  : ARM_NUM_REGS (mode) <= 8 ? extra_cost->extra_mem->vec256[op]
+		  : ARM_NUM_REGS (mode) <= 12 ? extra_cost->extra_mem->vec384[op]
+		  : extra_cost->extra_mem->vec512[op]);
+	      }
+	    else if (FLOAT_MODE_P (mode))
+	      {
+		cost_new +=
+		  (ARM_NUM_REGS (mode) <= 1 ? extra_cost->extra_mem->sf[op]
+		  : ARM_NUM_REGS (mode) <= 2 ? extra_cost->extra_mem->df[op]
+					     : extra_cost->extra_mem->cdf[op]);
+	      }
+	    else if (mode == BLKmode)
+	      cost_new += extra_cost->extra_mem->blk[op];
+            else
+	      { /* integer modes */
+		cost_new +=
+		  (ARM_NUM_REGS (mode) <= 1 ? extra_cost->extra_mem->si[op]
+		  : ARM_NUM_REGS (mode) <= 2 ? extra_cost->extra_mem->di[op]
+					     : extra_cost->extra_mem->cdi[op]);
+	      }
+	  }
+	*cost = cost_old;
+        if (cost_old != cost_new)
+        {
+            debug_rtx(x);
+ fprintf(stderr,"old(%d) new(%d)\n", cost_old, cost_new);
+	    gcc_assert (cost_old == cost_new);
+        }
+      }
 
       /* For speed optimizations, add the costs of the address and
 	 accessing memory.  */
-- 
1.9.1

From cb8694117c4af002b00feae8f68da6c8c45cef2d Mon Sep 17 00:00:00 2001
From: Charles Baylis <charles.baylis@linaro.org>
Date: Fri, 13 Nov 2015 12:28:24 +0000
Subject: [PATCH 2/4] Remove cost_old calculation

Clean up the code by removing the original cost calculation.

Change-Id: Ia7283424867cf244884b9af9126abf4173daa101
---
 gcc/config/arm/arm.c | 31 ++++++-------------------------
 1 file changed, 6 insertions(+), 25 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index e36f34e..101ff28 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -9540,19 +9540,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
       /* A memory access costs 1 insn if the mode is small, or the address is
 	 a single register, otherwise it costs one insn per word.  */
       {
-	int cost_old;
-	int cost_new;
 	extra_mem_cost_table::access_type op;
-	if (REG_P (XEXP (x, 0)))
-	  cost_old = COSTS_N_INSNS (1);
-	else if (flag_pic
-		 && GET_CODE (XEXP (x, 0)) == PLUS
-		 && will_be_in_index_register (XEXP (XEXP (x, 0), 1)))
-	  /* This will be split into two instructions.
-	     See arm.md:calculate_pic_address.  */
-	  cost_old = COSTS_N_INSNS (2);
-	else
-	  cost_old = COSTS_N_INSNS (ARM_NUM_REGS (mode));
 	switch (GET_CODE (XEXP (x, 0)))
 	  {
 	  case REG:
@@ -9577,13 +9565,13 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
 	if (flag_pic
 	    && GET_CODE (XEXP (x, 0)) == PLUS
 	    && will_be_in_index_register (XEXP (XEXP (x, 0), 1)))
-	  cost_new = COSTS_N_INSNS (2);
+	  *cost = COSTS_N_INSNS (2);
 	else
 	  {
-            cost_new = COSTS_N_INSNS (1);
+            *cost = COSTS_N_INSNS (1);
 	    if (arm_vector_mode_p (mode))
 	      {
-		cost_new +=
+		*cost +=
 		  (ARM_NUM_REGS (mode) <= 2 ? extra_cost->extra_mem->vec64[op]
 		  : ARM_NUM_REGS (mode) <= 4 ? extra_cost->extra_mem->vec128[op]
 		  : ARM_NUM_REGS (mode) <= 6 ? extra_cost->extra_mem->vec192[op]
@@ -9593,28 +9581,21 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
 	      }
 	    else if (FLOAT_MODE_P (mode))
 	      {
-		cost_new +=
+		*cost +=
 		  (ARM_NUM_REGS (mode) <= 1 ? extra_cost->extra_mem->sf[op]
 		  : ARM_NUM_REGS (mode) <= 2 ? extra_cost->extra_mem->df[op]
 					     : extra_cost->extra_mem->cdf[op]);
 	      }
 	    else if (mode == BLKmode)
-	      cost_new += extra_cost->extra_mem->blk[op];
+	      *cost += extra_cost->extra_mem->blk[op];
             else
 	      { /* integer modes */
-		cost_new +=
+		*cost +=
 		  (ARM_NUM_REGS (mode) <= 1 ? extra_cost->extra_mem->si[op]
 		  : ARM_NUM_REGS (mode) <= 2 ? extra_cost->extra_mem->di[op]
 					     : extra_cost->extra_mem->cdi[op]);
 	      }
 	  }
-	*cost = cost_old;
-        if (cost_old != cost_new)
-        {
-            debug_rtx(x);
- fprintf(stderr,"old(%d) new(%d)\n", cost_old, cost_new);
-	    gcc_assert (cost_old == cost_new);
-        }
       }
 
       /* For speed optimizations, add the costs of the address and
-- 
1.9.1

From e7190c915ea460940dced3dd58678545e0505ed0 Mon Sep 17 00:00:00 2001
From: Charles Baylis <charles.baylis@linaro.org>
Date: Fri, 13 Nov 2015 12:30:45 +0000
Subject: [PATCH 3/4] Make vector costs more sensible

Tune the costs for the vector modes.

Change-Id: Ifdee53041b1deeefa1ca78d1efb2a123a9a662e7
---
 gcc/config/arm/aarch-cost-tables.h | 40 +++++---------------------------------
 1 file changed, 5 insertions(+), 35 deletions(-)

diff --git a/gcc/config/arm/aarch-cost-tables.h b/gcc/config/arm/aarch-cost-tables.h
index 62ec874..8be2c05 100644
--- a/gcc/config/arm/aarch-cost-tables.h
+++ b/gcc/config/arm/aarch-cost-tables.h
@@ -68,41 +68,11 @@ const struct extra_mem_cost_table generic_extra_mem_costs =
     COSTS_N_INSNS (1),
     COSTS_N_INSNS (1)
   },				/* vec64 */
-  {
-    0,
-    COSTS_N_INSNS (3),
-    COSTS_N_INSNS (3),
-    COSTS_N_INSNS (3),
-    COSTS_N_INSNS (3)
-  },				/* vec128 */
-  {
-    0,
-    COSTS_N_INSNS (5),
-    COSTS_N_INSNS (5),
-    COSTS_N_INSNS (5),
-    COSTS_N_INSNS (5)
-  },				/* vec192 */
-  {
-    0,
-    COSTS_N_INSNS (7),
-    COSTS_N_INSNS (7),
-    COSTS_N_INSNS (7),
-    COSTS_N_INSNS (7)
-  },				/* vec256 */
-  {
-    0,
-    COSTS_N_INSNS (11),
-    COSTS_N_INSNS (11),
-    COSTS_N_INSNS (11),
-    COSTS_N_INSNS (11)
-  },				/* vec384 */
-  {
-    0,
-    COSTS_N_INSNS (15),
-    COSTS_N_INSNS (15),
-    COSTS_N_INSNS (15),
-    COSTS_N_INSNS (15)
-  }				/* vec512 */
+  { 0, 0, 0, 0, 0 },		/* vec128 */
+  { 0, 0, 0, 0, 0 },		/* vec192 */
+  { 0, 0, 0, 0, 0 },		/* vec256 */
+  { 0, 0, 0, 0, 0 },		/* vec384 */
+  { 0, 0, 0, 0, 0 }		/* vec512 */
 };
 
 const struct cpu_cost_table generic_extra_costs =
-- 
1.9.1

From 68f4318327e75a709f6a3bea327915c0558127df Mon Sep 17 00:00:00 2001
From: Charles Baylis <charles.baylis@linaro.org>
Date: Fri, 13 Nov 2015 12:24:08 +0000
Subject: [PATCH 4/4] Use integer costs for soft float

When using soft float, the cost of accessing FP values is actually the same
as the cost of accessing integers of the same size.

Change-Id: Icb672b2b599ea4e433bc0b29c228e9f910aeb3ee
---
 gcc/config/arm/arm.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 101ff28..726a385 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -9579,15 +9579,15 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
 		  : ARM_NUM_REGS (mode) <= 12 ? extra_cost->extra_mem->vec384[op]
 		  : extra_cost->extra_mem->vec512[op]);
 	      }
-	    else if (FLOAT_MODE_P (mode))
+	    else if (mode == BLKmode)
+	      *cost += extra_cost->extra_mem->blk[op];
+	    else if (FLOAT_MODE_P (mode) && TARGET_HARD_FLOAT)
 	      {
 		*cost +=
 		  (ARM_NUM_REGS (mode) <= 1 ? extra_cost->extra_mem->sf[op]
 		  : ARM_NUM_REGS (mode) <= 2 ? extra_cost->extra_mem->df[op]
 					     : extra_cost->extra_mem->cdf[op]);
 	      }
-	    else if (mode == BLKmode)
-	      *cost += extra_cost->extra_mem->blk[op];
             else
 	      { /* integer modes */
 		*cost +=
-- 
1.9.1

