This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH] Test arm_tune_xscale, not arm_arch_xscale
- From: Richard Earnshaw <rearnsha at gcc dot gnu dot org>
- To: Ian Lance Taylor <ian at airs dot com>
- Cc: gcc-patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 14 Dec 2004 11:02:38 +0000
- Subject: Re: [PATCH] Test arm_tune_xscale, not arm_arch_xscale
- Organization: GNU
- References: <email@example.com>
On Tue, 2004-12-14 at 05:49, Ian Lance Taylor wrote:
> On XScale, instructions like muls, which both do a multiplication and
> set the condition codes, can be slower than doing a mul and then
> testing the result. This is because muls will stall the pipeline
> until the muls completes, whereas mul will permit non-multiplication
> instructions to be executed provided they do not refer to the result
> of mul.
> Some five years ago one Jeff Law changed the condition for these
> patterns in arm.md to add "&& !arm_is_xscale". Last year Philip
> Blundell split arm_is_xscale into two variables, arm_arch_xscale and
> arm_tune_xscale. When he did this, he changed the conditions on the
> instructions to check arm_arch_xscale. Since the instructions are in
> fact supported on XScale, I think it would be more correct to check
> arm_tune_xscale.
> This is not strictly speaking a regression, so it is presumably not
> suitable for mainline. How about csl-arm-branch?
> 2004-12-14 Ian Lance Taylor <firstname.lastname@example.org>
> * config/arm/arm.md (mulsi3_compare0): Check arm_tune_xscale, not
> arm_arch_xscale.
> (mulsi_compare0_scratch, mulsi3_addsi_compare0): Likewise.
> (mulsi3addsi_compare0_scratch): Likewise.
The original change was done in the days before combine took into
account the cost of a combined insn relative to its non-combined cost.
This should now be done by changing the rtx costs for XScale rather than
hacking the patterns.
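
As a rough illustration of the cost-based approach, here is a minimal sketch in plain C. It is not GCC code: the struct, the function name and the cost numbers are all hypothetical, chosen only to show the shape of the comparison a cost-aware combiner makes between a fused muls and a separate mul followed by cmp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-core tuning record.  The fields and the numbers
   below are illustrative assumptions, not values from GCC or from
   any ARM core's documentation.  */
struct tune_costs {
    int mul;   /* cost of a plain mul */
    int cmp;   /* cost of comparing the result against zero */
    int muls;  /* cost of the flag-setting muls, including any stall */
};

/* Return true when the fused muls is no more expensive than the
   separate mul + cmp sequence under the given tuning costs.  */
bool prefer_muls(const struct tune_costs *t)
{
    assert(t != NULL);
    return t->muls <= t->mul + t->cmp;
}

/* A core where muls costs the same as mul: fusing saves the cmp.  */
const struct tune_costs generic_tune = { 4, 1, 4 };

/* A core where muls stalls the pipeline: the split sequence wins.  */
const struct tune_costs stalling_tune = { 4, 1, 8 };
```

With the decision driven by a per-core cost record, supporting another core where muls is slow means adding one more table entry rather than another condition on each pattern.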
There are a number of newer ARM cores for which muls is significantly
slower than mul+cmp; I don't want to see a long list of cores in this