This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH, ARM] Suppress Redundant Flag Setting for Cortex-A15
- From: Christophe Lyon <christophe dot lyon at linaro dot org>
- To: Ramana Radhakrishnan <ramrad01 at arm dot com>
- Cc: Ian Bolton <ian dot bolton at arm dot com>, gcc-patches <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 23 Apr 2014 14:46:15 +0200
- Subject: Re: [PATCH, ARM] Suppress Redundant Flag Setting for Cortex-A15
- References: <52e2a013 dot 89e8440a dot 49e1 dot fffffee3SMTPIN_ADDED_BROKEN at mx dot google dot com> <CAJA7tRYw7+4+RbeD+ugiQj7NbKxsgxSUk4XtCdeAyjxG4++0XA at mail dot gmail dot com>
Hi,
On 28 January 2014 13:10, Ramana Radhakrishnan
<ramana.gcc@googlemail.com> wrote:
> On Fri, Jan 24, 2014 at 5:16 PM, Ian Bolton <ian.bolton@arm.com> wrote:
>> Hi there!
>>
>> An existing optimisation for Thumb-2 converts t32 encodings to
>> t16 encodings to reduce codesize, at the expense of causing
>> redundant flag setting for ADD, AND, etc. This redundant flag
>> setting can have negative performance impact on cortex-a15.
>>
>> This patch introduces two new tuning options so that the conversion
>> from t32 to t16, which takes place in thumb2_reorg, can be suppressed
>> for cortex-a15.
>>
>> To maintain some of the original benefit (reduced codesize), the
>> suppression is only done where the enclosing basic block is deemed
>> worthy of optimising for speed.
>>
>> This was tested with no regressions, and performance has improved for
>> the workloads tested on cortex-a15. (It might be beneficial to
>> other processors too, but that has not been investigated yet.)
>>
>> OK for stage 1?
>
> This is OK for stage1.
>
> Ramana
>
>>
>> Cheers,
>> Ian
>>
>>
>> 2014-01-24 Ian Bolton <ian.bolton@arm.com>
>>
>> gcc/
>> * config/arm/arm-protos.h (tune_params): New struct members.
>> * config/arm/arm.c: Initialise tune_params per processor.
>> (thumb2_reorg): Suppress conversion from t32 to t16 when
>> optimizing for speed, based on new tune_params.

This patch causes
gcc.target/arm/negdi-1.c
gcc.target/arm/negdi-2.c
to FAIL when GCC is configured with:
--with-mode=arm
--with-cpu=cortex-a15
--with-fpu=neon-vfpv4
Both tests used to PASS.
(see http://cbuild.validation.linaro.org/build/cross-validation/gcc/209561/report-build-info.html)
Christophe.