[Patch, ARM] New feature to minimize the literal load for armv7-m target
Richard Earnshaw
rearnsha@arm.com
Wed Nov 20 15:58:00 GMT 2013
On 06/11/13 06:10, Terry Guo wrote:
> Hi,
>
> This patch intends to minimize the use of the literal pool for some armv7-m
> targets that are slower to load data from flash than to fetch
> instructions from flash. The normal literal load instruction is now replaced
> by MOVW/MOVT instructions. A new option -mslow-flash-data is created for
> this purpose. So far this feature doesn't support PIC code or targets that
> aren't based on armv7-m.
>
> Tested with GCC regression test on QEMU for cortex-m3. No new regressions.
> Is it OK for trunk?
>
> BR,
> Terry
>
> 2013-11-06 Terry Guo <terry.guo@arm.com>
>
> * doc/invoke.texi (-mslow-flash-data): Document new option.
> * config/arm/arm.opt (mslow-flash-data): New option.
> * config/arm/arm-protos.h
> (arm_max_const_double_inline_cost): Declare it.
> * config/arm/arm.h (TARGET_USE_MOVT): Always true when
> disable literal pools.
literal pools are disabled.
> (arm_disable_literal_pool): Declare it.
> * config/arm/arm.c (arm_disable_literal_pool): New
> variable.
> (arm_option_override): Handle new option.
> (thumb2_legitimate_address_p): Invalid certain address
> format.
Invalidate. What address formats?
> (arm_max_const_double_inline_cost): New function.
> * config/arm/arm.md (types.md): Include it a little
> earlier.
Include it before ...
> (use_literal_pool): New attribute.
> (enabled): Use new attribute.
> (split pattern): Replace symbol+offset with MOVW/MOVT.
>
>
Comments inline.
> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
> index 1781b75..25927a1 100644
> --- a/gcc/config/arm/arm.h
> +++ b/gcc/config/arm/arm.h
> @@ -554,6 +556,9 @@ extern int arm_arch_thumb_hwdiv;
> than core registers. */
> extern int prefer_neon_for_64bits;
>
> +/* Nonzero if shouldn't use literal pool in generated code. */
'if we shouldn't use literal pools'
> +extern int arm_disable_literal_pool;
This should be a bool, values stored in it should be true/false not 1/0.
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 78554e8..de2a9c0 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -864,6 +864,9 @@ int arm_arch_thumb_hwdiv;
> than core registers. */
> int prefer_neon_for_64bits = 0;
>
> +/* Nonzero if shouldn't use literal pool in generated code. */
> +int arm_disable_literal_pool = 0;
Similar comments to above.
> @@ -6348,6 +6361,25 @@ thumb2_legitimate_address_p (enum machine_mode mode, rtx x, int strict_p)
> && thumb2_legitimate_index_p (mode, xop0, strict_p)));
> }
>
> + /* Normally we can assign constant values to its target register without
'to target registers'
> + the help of constant pool. But there are cases we have to use constant
> + pool like:
> + 1) assign a label to register.
> + 2) sign-extend an 8-bit value to 32 bits and then assign it to a register.
> +
> + Constant pool access in format:
> + (set (reg r0) (mem (symbol_ref (".LC0"))))
> + will cause the use of literal pool (later in function arm_reorg).
> + So here we mark such format as an invalid format, then compiler
'then the compiler'
> @@ -16114,6 +16146,18 @@ push_minipool_fix (rtx insn, HOST_WIDE_INT address, rtx *loc,
> minipool_fix_tail = fix;
> }
>
> +/* Return maximum allowed cost of synthesizing a 64-bit constant VAL inline.
> + Returns 99 if we always want to synthesize the value. */
Needs to mention that the cost is in terms of 'insns' (see the function
below it).
> +int
> +arm_max_const_double_inline_cost ()
> +{
> + /* Let the value get synthesized to avoid the use of literal pools. */
> + if (arm_disable_literal_pool)
> + return 99;
> +
> + return ((optimize_size || arm_ld_sched) ? 3 : 4);
> +}
> +
> /* Return the cost of synthesizing a 64-bit constant VAL inline.
> Returns the number of insns needed, or 99 if we don't know how to
> do it. */
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index adbc45b..a5991cb 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -534,6 +534,7 @@ Objective-C and Objective-C++ Dialects}.
> -mfix-cortex-m3-ldrd @gol
> -munaligned-access @gol
> -mneon-for-64bits @gol
> +-mslow-flash-data @gol
> -mrestrict-it}
>
> @emph{AVR Options}
> @@ -12295,6 +12296,12 @@ Enables using Neon to handle scalar 64-bits operations. This is
> disabled by default since the cost of moving data from core registers
> to Neon is high.
>
> +@item -mslow-flash-data
> +@opindex mslow-flash-data
> +Assume loading data from flash is slower than fetching instructions.
> +Therefore literal loads are minimized for better performance.
> +This option is off by default.
> +
Needs to mention the limitation on which processors can support this, ie
only v7 m-profile.
R.