This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] [ARM] PR61551 RFC: Improve costs for NEON addressing modes


Hi Charles,

Sorry I missed this completely in my inbox.

On 31/10/15 03:34, Charles Baylis wrote:
> Hi Ramana,
> 
> [revisiting https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01593.html]
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61551
> 
> This patch is an initial attempt to rework the ARM rtx costs to better
> handle the costs of various addressing modes, in particular to remove
> the incorrect large costs associated with post-indexed addressing in
> NEON memory operations.
> 
> This patch introduces per-core tables for the costs of using different
> addressing modes for different access modes. I have retained the
> original code so that the calculated costs can be compared. Currently,
> the tables replicate the costs calculated by the original code, and a
> debug assert is left in place.
> 
> Obviously, a fair amount of clean up is needed before this can be
> applied, but I would like a quick comment on the general approach to
> check that I haven't completely missed the point before continuing.

No you haven't missed the point - this is the direction I wanted this taken in though not expecting this degree of detail.



> +struct cbmem_cost_table
> +{
> +  enum access_type
> +  {
> +    REG,
> +    POST_INCDEC,
> +    PRE_INCDEC,
> +    /*PRE_MODIFY,*/
> +    POST_MODIFY,
> +    PLUS,
> +    ACCESS_TYPE_LAST = PLUS
> +  };
> +  const int si[ACCESS_TYPE_LAST + 1];
> +  const int di[ACCESS_TYPE_LAST + 1];
> +  const int cdi[ACCESS_TYPE_LAST + 1];
> +  const int sf[ACCESS_TYPE_LAST + 1];
> +  const int df[ACCESS_TYPE_LAST + 1];
> +  const int cdf[ACCESS_TYPE_LAST + 1];
> +  const int blk[ACCESS_TYPE_LAST + 1];
> +  const int vec64[ACCESS_TYPE_LAST + 1];
> +  const int vec128[ACCESS_TYPE_LAST + 1];
> +  const int vec192[ACCESS_TYPE_LAST + 1];
> +  const int vec256[ACCESS_TYPE_LAST + 1];
> +  const int vec384[ACCESS_TYPE_LAST + 1];
> +  const int vec512[ACCESS_TYPE_LAST + 1];
> +};
> +
> 
> After that, I will clean up the coding style, check for impact on the
> AArch64 backend, remove the debug code and in a separate patch improve
> the tuning for the vector modes.

I think adding additional costs for zero / sign extension of registers would be appropriate for the AArch64 backend. Further more I think Alan recently had patches to change the use of vector modes to BLKmode in the AArch64 backend, so some of the vector costing might become interesting.

If you can start turning this around quickly I'd like to keep the review momentum going but it will need time and effort from a number of parties to get this working. This is however likely to be a high impact change on the backends as this is an invasive change and I'm not sure if it will meet the Stage3 cutoff point.



> 
> Thanks
> Charles
> 
> 
> 0001-WIP.patch
> 
> 
> From b10c6dd7af1f5b9821946783ba9d96b08c751f2b Mon Sep 17 00:00:00 2001
> From: Charles Baylis <charles.baylis@linaro.org>
> Date: Wed, 28 Oct 2015 18:48:16 +0000
> Subject: [PATCH] WIP
> 
> Change-Id: If349ffd7dbbe13a814be4a0d022382ddc8270973
> ---
>  gcc/config/arm/aarch-common-protos.h |  28 ++
>  gcc/config/arm/aarch-cost-tables.h   | 328 +++++++++++++++++
>  gcc/config/arm/arm.c                 | 677 ++++++++++++++++++++++++++++++++++-
>  3 files changed, 1023 insertions(+), 10 deletions(-)
> 
> diff --git a/gcc/config/arm/aarch-common-protos.h b/gcc/config/arm/aarch-common-protos.h
> index 348ae74..dae42d7 100644
> --- a/gcc/config/arm/aarch-common-protos.h
> +++ b/gcc/config/arm/aarch-common-protos.h
> @@ -130,6 +130,33 @@ struct vector_cost_table
>    const int alu;
>  };
>  
> +struct cbmem_cost_table
> +{
> +  enum access_type
> +  {
> +    REG,
> +    POST_INCDEC,
> +    PRE_INCDEC,
> +    /*PRE_MODIFY,*/
> +    POST_MODIFY,
> +    PLUS,
> +    ACCESS_TYPE_LAST = PLUS
> +  };
> +  const int si[ACCESS_TYPE_LAST + 1];
> +  const int di[ACCESS_TYPE_LAST + 1];
> +  const int cdi[ACCESS_TYPE_LAST + 1];
> +  const int sf[ACCESS_TYPE_LAST + 1];
> +  const int df[ACCESS_TYPE_LAST + 1];
> +  const int cdf[ACCESS_TYPE_LAST + 1];
> +  const int blk[ACCESS_TYPE_LAST + 1];
> +  const int vec64[ACCESS_TYPE_LAST + 1];
> +  const int vec128[ACCESS_TYPE_LAST + 1];
> +  const int vec192[ACCESS_TYPE_LAST + 1];
> +  const int vec256[ACCESS_TYPE_LAST + 1];
> +  const int vec384[ACCESS_TYPE_LAST + 1];
> +  const int vec512[ACCESS_TYPE_LAST + 1];
> +};




I was considering a single table for scalar integer , scalar fp and vector modes mapping scalar fp and vector modes down to scalar integer modes in case of soft float mode or in the absence of a vector unit (i.e. TARGET_NEON was false.) I also wasn't sure what the impact would be by adding address_cost in with the computation of rtx_cost for MEM expressions and whether the 2 needed to be added or not. This needs plenty of analysis and tweaking over a range of benchmarks and mcpu options.

> +
>  struct cpu_cost_table
>  {
>    const struct alu_cost_table alu;
> @@ -137,6 +164,7 @@ struct cpu_cost_table
>    const struct mem_cost_table ldst;
>    const struct fp_cost_table fp[2]; /* SFmode and DFmode.  */
>    const struct vector_cost_table vect;
> +  const struct cbmem_cost_table addr;
>  };
>  

Can we make this a pointer instead and have simple tables that sort of abstract the same meaning - I would like to see if we can share the data here between multiple cores rather than creating 20 copies for the same thing. Initially atleast it would make life much easier if we only played around with 1 cost model on one core and had everything else map to the same thing.

>  
> diff --git a/gcc/config/arm/aarch-cost-tables.h b/gcc/config/arm/aarch-cost-tables.h
> index 66e09a8..c5ecdcf 100644
> --- a/gcc/config/arm/aarch-cost-tables.h
> +++ b/gcc/config/arm/aarch-cost-tables.h
> @@ -122,6 +122,88 @@ const struct cpu_cost_table generic_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (1)	/* alu.  */
> +  },
> +  /* Memory */
> +  {
> +    { 0, 0, 0, 0, 0 },		/* si */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* di */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdi */
> +    { 0, 0, 0, 0, 0 },		/* sf */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* df */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdf */
> +    {
> +      0,
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +    },				/* blk */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* vec64 */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* vec128 */
> +    {
> +      0,
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5)
> +    },				/* vec192 */
> +    {
> +      0,
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7)
> +    },				/* vec256 */
> +    {
> +      0,
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11)
> +    },				/* vec384 */
> +    {
> +      0,
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15)
> +    }				/* vec512 */
>    }
>  };

I'm curious as to the numbers here - The costs should reflect the relative costs of the addressing modes not the costs of the loads and stores - thus having high numbers here for vector modes may just prevent this from even triggering in auto-inc-dec code ? In my experience with GCC I've never satisfactorily answered the question whether these should be comparable to rtx_costs or not. In an ideal world they should be but I'm never sure. IOW I'm not sure if using COSTS_N_INSNS or plain numbers here is appropriate.

>  
> @@ -225,6 +307,88 @@ const struct cpu_cost_table cortexa53_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (1)	/* alu.  */
> +  },
> +  /* Memory */
> +  {
> +    { 0, 0, 0, 0, 0 },		/* si */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* di */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdi */
> +    { 0, 0, 0, 0, 0 },		/* sf */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* df */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdf */
> +    {
> +      0,
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +    },				/* blk */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* vec64 */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* vec128 */
> +    {
> +      0,
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5)
> +    },				/* vec192 */
> +    {
> +      0,
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7)
> +    },				/* vec256 */
> +    {
> +      0,
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11)
> +    },				/* vec384 */
> +    {
> +      0,
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15)
> +    }				/* vec512 */
>    }
>  };
>  
> @@ -328,6 +492,88 @@ const struct cpu_cost_table cortexa57_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (1)  /* alu.  */
> +  },
> +  /* Memory */
> +  {
> +    { 0, 0, 0, 0, 0 },		/* si */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* di */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdi */
> +    { 0, 0, 0, 0, 0 },		/* sf */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* df */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdf */
> +    {
> +      0,
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +    },				/* blk */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* vec64 */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* vec128 */
> +    {
> +      0,
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5)
> +    },				/* vec192 */
> +    {
> +      0,
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7)
> +    },				/* vec256 */
> +    {
> +      0,
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11)
> +    },				/* vec384 */
> +    {
> +      0,
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15)
> +    }				/* vec512 */
>    }
>  };
>  
> @@ -431,6 +677,88 @@ const struct cpu_cost_table xgene1_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (2)  /* alu.  */
> +  },
> +  /* Memory */
> +  {
> +    { 0, 0, 0, 0, 0 },		/* si */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* di */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdi */
> +    { 0, 0, 0, 0, 0 },		/* sf */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* df */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdf */
> +    {
> +      0,
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +    },				/* blk */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* vec64 */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* vec128 */
> +    {
> +      0,
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5)
> +    },				/* vec192 */
> +    {
> +      0,
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7)
> +    },				/* vec256 */
> +    {
> +      0,
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11)
> +    },				/* vec384 */
> +    {
> +      0,
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15)
> +    }				/* vec512 */
>    }
>  };
>  
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index a598c84..29e94ed 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -1063,6 +1063,88 @@ const struct cpu_cost_table cortexa9_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (1)	/* alu.  */
> +  },
> +  /* Memory */
> +  {
> +    { 0, 0, 0, 0, 0 },		/* si */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* di */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdi */
> +    { 0, 0, 0, 0, 0 },		/* sf */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* df */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdf */
> +    {
> +      0,
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +    },				/* blk */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* vec64 */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* vec128 */
> +    {
> +      0,
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5)
> +    },				/* vec192 */
> +    {
> +      0,
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7)
> +    },				/* vec256 */
> +    {
> +      0,
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11)
> +    },				/* vec384 */
> +    {
> +      0,
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15)
> +    }				/* vec512 */
>    }
>  };
>  
> @@ -1166,6 +1248,88 @@ const struct cpu_cost_table cortexa8_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (1)	/* alu.  */
> +  },
> +  /* Memory */
> +  {
> +    { 0, 0, 0, 0, 0 },		/* si */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* di */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdi */
> +    { 0, 0, 0, 0, 0 },		/* sf */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* df */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdf */
> +    {
> +      0,
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +    },				/* blk */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* vec64 */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* vec128 */
> +    {
> +      0,
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5)
> +    },				/* vec192 */
> +    {
> +      0,
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7)
> +    },				/* vec256 */
> +    {
> +      0,
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11)
> +    },				/* vec384 */
> +    {
> +      0,
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15)
> +    }				/* vec512 */
>    }
>  };
>  
> @@ -1270,6 +1434,88 @@ const struct cpu_cost_table cortexa5_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (1)	/* alu.  */
> +  },
> +  /* Memory */
> +  {
> +    { 0, 0, 0, 0, 0 },		/* si */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* di */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdi */
> +    { 0, 0, 0, 0, 0 },		/* sf */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* df */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdf */
> +    {
> +      0,
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +    },				/* blk */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* vec64 */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* vec128 */
> +    {
> +      0,
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5)
> +    },				/* vec192 */
> +    {
> +      0,
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7)
> +    },				/* vec256 */
> +    {
> +      0,
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11)
> +    },				/* vec384 */
> +    {
> +      0,
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15)
> +    }				/* vec512 */
>    }
>  };
>  
> @@ -1375,6 +1621,88 @@ const struct cpu_cost_table cortexa7_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (1)	/* alu.  */
> +  },
> +  /* Memory */
> +  {
> +    { 0, 0, 0, 0, 0 },		/* si */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* di */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdi */
> +    { 0, 0, 0, 0, 0 },		/* sf */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* df */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdf */
> +    {
> +      0,
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +    },				/* blk */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* vec64 */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* vec128 */
> +    {
> +      0,
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5)
> +    },				/* vec192 */
> +    {
> +      0,
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7)
> +    },				/* vec256 */
> +    {
> +      0,
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11)
> +    },				/* vec384 */
> +    {
> +      0,
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15)
> +    }				/* vec512 */
>    }
>  };
>  
> @@ -1478,6 +1806,88 @@ const struct cpu_cost_table cortexa12_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (1)	/* alu.  */
> +  },
> +  /* Memory */
> +  {
> +    { 0, 0, 0, 0, 0 },		/* si */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* di */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdi */
> +    { 0, 0, 0, 0, 0 },		/* sf */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* df */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdf */
> +    {
> +      0,
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +    },				/* blk */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* vec64 */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* vec128 */
> +    {
> +      0,
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5)
> +    },				/* vec192 */
> +    {
> +      0,
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7)
> +    },				/* vec256 */
> +    {
> +      0,
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11)
> +    },				/* vec384 */
> +    {
> +      0,
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15)
> +    }				/* vec512 */
>    }
>  };
>  
> @@ -1581,6 +1991,88 @@ const struct cpu_cost_table cortexa15_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (1)	/* alu.  */
> +  },
> +  /* Memory */
> +  {
> +    { 0, 0, 0, 0, 0 },		/* si */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* di */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdi */
> +    { 0, 0, 0, 0, 0 },		/* sf */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* df */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdf */
> +    {
> +      0,
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +    },				/* blk */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* vec64 */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* vec128 */
> +    {
> +      0,
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5)
> +    },				/* vec192 */
> +    {
> +      0,
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7)
> +    },				/* vec256 */
> +    {
> +      0,
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11)
> +    },				/* vec384 */
> +    {
> +      0,
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15)
> +    }				/* vec512 */
>    }
>  };
>  
> @@ -1684,6 +2176,88 @@ const struct cpu_cost_table v7m_extra_costs =
>    /* Vector */
>    {
>      COSTS_N_INSNS (1)	/* alu.  */
> +  },
> +  /* Memory */
> +  {
> +    { 0, 0, 0, 0, 0 },		/* si */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* di */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdi */
> +    { 0, 0, 0, 0, 0 },		/* sf */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* df */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* cdf */
> +    {
> +      0,
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +      - COSTS_N_INSNS (1),
> +    },				/* blk */
> +    {
> +      0,
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1),
> +      COSTS_N_INSNS (1)
> +    },				/* vec64 */
> +    {
> +      0,
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3),
> +      COSTS_N_INSNS (3)
> +    },				/* vec128 */
> +    {
> +      0,
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5),
> +      COSTS_N_INSNS (5)
> +    },				/* vec192 */
> +    {
> +      0,
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7),
> +      COSTS_N_INSNS (7)
> +    },				/* vec256 */
> +    {
> +      0,
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11),
> +      COSTS_N_INSNS (11)
> +    },				/* vec384 */
> +    {
> +      0,
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15),
> +      COSTS_N_INSNS (15)
> +    }				/* vec512 */
>    }
>  };
>  
> @@ -9442,6 +10016,22 @@ arm_unspec_cost (rtx x, enum rtx_code /* outer_code */, bool speed_p, int *cost)
>  	  }								\
>  	while (0);
>  
> +/* Return TRUE if MODE is any of the large INT modes.  */
> +static bool
> +arm_vect_struct_mode_p (machine_mode mode)
> +{
> +  return mode == OImode || mode == EImode || mode == TImode || mode == CImode
> +    || mode == XImode;
> +}
> +
> +
> +/* Return TRUE if MODE is any of the vector modes.  */
> +static bool
> +arm_vector_mode_p (machine_mode mode)
> +{
> +  return arm_vector_mode_supported_p (mode) || arm_vect_struct_mode_p (mode);
> +}
> +
>  /* RTX costs.  Make an estimate of the cost of executing the operation
>     X, which is contained with an operation with code OUTER_CODE.
>     SPEED_P indicates whether the cost desired is the performance cost,
> @@ -9524,16 +10114,83 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
>      case MEM:
>        /* A memory access costs 1 insn if the mode is small, or the address is
>  	 a single register, otherwise it costs one insn per word.  */
> -      if (REG_P (XEXP (x, 0)))
> -	*cost = COSTS_N_INSNS (1);
> -      else if (flag_pic
> -	       && GET_CODE (XEXP (x, 0)) == PLUS
> -	       && will_be_in_index_register (XEXP (XEXP (x, 0), 1)))
> -	/* This will be split into two instructions.
> -	   See arm.md:calculate_pic_address.  */
> -	*cost = COSTS_N_INSNS (2);
> -      else
> -	*cost = COSTS_N_INSNS (ARM_NUM_REGS (mode));
> +      {
> +	int cost_old;
> +	int cost_new;
> +	cbmem_cost_table::access_type op;
> +	if (REG_P (XEXP (x, 0)))
> +	  cost_old = COSTS_N_INSNS (1);
> +	else if (flag_pic
> +		 && GET_CODE (XEXP (x, 0)) == PLUS
> +		 && will_be_in_index_register (XEXP (XEXP (x, 0), 1)))
> +	  /* This will be split into two instructions.
> +	     See arm.md:calculate_pic_address.  */
> +	  cost_old = COSTS_N_INSNS (2);
> +	else
> +	  cost_old = COSTS_N_INSNS (ARM_NUM_REGS (mode));
> +	switch (GET_CODE (XEXP (x, 0)))
> +	  {
> +	  case REG:
> +	    op = cbmem_cost_table::REG;
> +	    break;
> +	  case POST_INC:
> +	  case POST_DEC:
> +	    op = cbmem_cost_table::POST_INCDEC;
> +	    break;
> +	  case PRE_INC:
> +	  case PRE_DEC:
> +	    op = cbmem_cost_table::PRE_INCDEC;
> +	    break;
> +	  case POST_MODIFY:
> +	    op = cbmem_cost_table::POST_MODIFY;
> +	    break;
> +	  default:
> +	  case PLUS:
> +	    op = cbmem_cost_table::PLUS;
> +	    break;
> +	  }
> +	if (flag_pic
> +	    && GET_CODE (XEXP (x, 0)) == PLUS
> +	    && will_be_in_index_register (XEXP (XEXP (x, 0), 1)))
> +	  cost_new = COSTS_N_INSNS (2);
> +	else
> +	  {
> +            cost_new = COSTS_N_INSNS (1);
> +	    if (arm_vector_mode_p (mode))
> +	      {
> +		cost_new +=
> +		  (ARM_NUM_REGS (mode) <= 2 ? extra_cost->addr.vec64[op]
> +		  : ARM_NUM_REGS (mode) <= 4 ? extra_cost->addr.vec128[op]
> +		  : ARM_NUM_REGS (mode) <= 6 ? extra_cost->addr.vec192[op]
> +		  : ARM_NUM_REGS (mode) <= 8 ? extra_cost->addr.vec256[op]
> +		  : ARM_NUM_REGS (mode) <= 12 ? extra_cost->addr.vec384[op]
> +		  : extra_cost->addr.vec512[op]);
> +	      }
> +	    else if (FLOAT_MODE_P (mode))
> +	      {
> +		cost_new +=
> +		  (ARM_NUM_REGS (mode) <= 1 ? extra_cost->addr.sf[op]
> +		  : ARM_NUM_REGS (mode) <= 2 ? extra_cost->addr.df[op]
> +					     : extra_cost->addr.cdf[op]);
> +	      }
> +	    else if (mode == BLKmode)
> +	      cost_new += extra_cost->addr.blk[op];
> +            else
> +	      { /* integer modes */
> +		cost_new +=
> +		  (ARM_NUM_REGS (mode) <= 1 ? extra_cost->addr.si[op]
> +		  : ARM_NUM_REGS (mode) <= 2 ? extra_cost->addr.di[op]
> +					     : extra_cost->addr.cdi[op]);
> +	      }
> +	  }
> +	*cost = cost_old;
> +        if (cost_old != cost_new)
> +        {
> +            debug_rtx(x);
> + fprintf(stderr,"old(%d) new(%d)\n", cost_old, cost_new);
> +	    gcc_assert (cost_old == cost_new);
> +        }
> +      }

Right, but this does not change arm_address_costs - so how is this going to work ? I would like this moved into a new function aarch_address_costs and that replacing arm_address_costs only to be called from here.

>  
>        /* For speed optimizations, add the costs of the address and
>  	 accessing memory.  */
> -- 1.9.1
> 


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]