This is the mail archive of the
mailing list for the GCC project.
Re: [csl-asm?] PR middle-end/11821: Tweak arm_rtx_costs_1
- From: Roger Sayle <roger at eyesopen dot com>
- To: Richard dot Earnshaw at arm dot com
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Thu, 20 Nov 2003 19:40:19 -0700 (MST)
- Subject: Re: [csl-asm?] PR middle-end/11821: Tweak arm_rtx_costs_1
On Tue, 18 Nov 2003, Richard Earnshaw wrote:
> Ultimately, the problem here is that the expanders often do cost metrics
> based on the number of insns that they will emit at that stage in the
> compilation, with no account made of how later stages may combine
> instructions to produce more efficient code. This is a particular problem
> on ARM where most shift instructions will end up being combined with
> either an arithmetic or a logical operation, such as shift-and-add. The
> multiplication synthesis algorithm is particularly bad in this respect:
> it makes describing the costs accurately almost impossible.
I've just looked a bit deeper into this, and once again it looks like
this is a failing of the ARM back-end rather than of GCC's synthetic
multiplication (and other RTL expansion) routines.
Inspection of init_expmed reveals that the middle-end not only anticipates
the availability of combined shift-and-add operations but goes to a great
deal of trouble to query the back-end and construct accurate cost
tables (shift_cost, shiftadd_cost and shiftsub_cost), which are used by
the middle-end expanders to choose the best insn sequences to generate.
Unfortunately, it looks like the special RTXen it generates to query the
back-end, PLUS (reg, MULT (reg, const_int)), etc., aren't handled
specially by the ARM back-end's arm_rtx_costs_1 at all. The x86
back-end's ix86_rtx_costs, on the other hand, does check for these
patterns, makes sure the constant multiplier is 2, 4 or 8, and returns
the CPU's cost for using a single "lea" instruction.
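
To make the pattern check concrete, here is a rough sketch of the kind of
test I mean. It is a toy model, not the real ix86_rtx_costs: the toy_rtx
type, the toy_* names and the cost numbers are all invented for
illustration, and GCC's actual rtx accessors and cost scaling are
deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for GCC's RTL; these are NOT the real rtx definitions. */
enum toy_code { TOY_REG, TOY_CONST_INT, TOY_PLUS, TOY_MULT };

struct toy_rtx {
    enum toy_code code;
    long value;                  /* used only by TOY_CONST_INT */
    const struct toy_rtx *op0;   /* NULL for leaf nodes */
    const struct toy_rtx *op1;   /* NULL for leaf nodes */
};

/* Hypothetical per-insn costs, in arbitrary units. */
enum { TOY_COST_ADD = 1, TOY_COST_LEA = 1, TOY_COST_MULT = 4 };

/* Return 1 if X is MULT (reg, const_int) with a multiplier of 2, 4 or 8. */
static int toy_scaled_index_p(const struct toy_rtx *x)
{
    if (x->code != TOY_MULT)
        return 0;
    if (x->op0->code != TOY_REG || x->op1->code != TOY_CONST_INT)
        return 0;
    long m = x->op1->value;
    return m == 2 || m == 4 || m == 8;
}

/* Cost an expression tree, special-casing the shift-and-add query form. */
static int toy_rtx_cost(const struct toy_rtx *x)
{
    assert(x != NULL);
    switch (x->code) {
    case TOY_REG:
    case TOY_CONST_INT:
        return 0;
    case TOY_MULT:
        return TOY_COST_MULT + toy_rtx_cost(x->op0) + toy_rtx_cost(x->op1);
    case TOY_PLUS:
        /* PLUS (reg, MULT (reg, 2|4|8)), either operand order: one "lea". */
        if (x->op0->code == TOY_REG && toy_scaled_index_p(x->op1))
            return TOY_COST_LEA;
        if (x->op1->code == TOY_REG && toy_scaled_index_p(x->op0))
            return TOY_COST_LEA;
        return TOY_COST_ADD + toy_rtx_cost(x->op0) + toy_rtx_cost(x->op1);
    }
    return 0;
}
```

With these made-up numbers, PLUS (reg, MULT (reg, 4)) costs the same as a
plain add, whereas PLUS (reg, MULT (reg, 3)) falls through to the generic
case and is charged for a full multiply plus an add. A back-end that lacks
the special case reports the second, pessimistic figure for both.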
I suspect that GCC's synthetic multiplication routines would
generate better code on the ARM if arm_rtx_costs was tweaked to
reveal that using the shift-and-add instructions gets the
shift "for free". The existing infrastructure could
probably be tweaked to meet the ARM's needs.
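
As a back-of-the-envelope illustration of why the reported shiftadd cost
matters, the sketch below costs a multiply-by-constant as a chain of
shift-and-add steps and compares it with a hardware multiply. This is a
much-simplified stand-in for what expmed.c does (no subtraction, no
factoring), and the toy_costs structure, the function names and all cost
values are assumptions of mine rather than anything taken from GCC.

```c
#include <assert.h>

/* Hypothetical cost table, loosely analogous to the middle-end's tables. */
struct toy_costs {
    int add;       /* plain add */
    int shift;     /* plain shift */
    int shiftadd;  /* combined shift-and-add, as reported by the back-end */
    int mult;      /* hardware multiply */
};

/* Cost of computing x * m using only shifts and adds.
   For odd m, start with acc = x and, for each lower set bit, do
   acc = (acc << gap) + x, i.e. one shift-and-add step per extra set bit.
   Trailing zero bits of m are handled by one final plain shift. */
static int toy_synth_cost(unsigned long m, const struct toy_costs *c)
{
    assert(m != 0);
    int cost = 0;

    if ((m & 1) == 0) {
        cost += c->shift;
        while ((m & 1) == 0)
            m >>= 1;
    }

    int bits = 0;
    for (unsigned long t = m; t != 0; t >>= 1)
        bits += (int)(t & 1);

    /* Use the combined insn only if it is reported as cheaper. */
    int separate = c->shift + c->add;
    int step = c->shiftadd < separate ? c->shiftadd : separate;

    return cost + (bits - 1) * step;
}

/* Return 1 if the hardware multiply is at least as cheap as synthesis. */
static int toy_prefer_multiply_p(unsigned long m, const struct toy_costs *c)
{
    return c->mult <= toy_synth_cost(m, c);
}
```

Take m = 45 (binary 101101, so three shift-and-add steps). If the back-end
reports no saving for the combined form (say add 1, shift 1, shiftadd 2,
mult 4), synthesis costs 6 and the multiply wins. If the back-end instead
reports shiftadd as 1, synthesis costs 3 and the shift-and-add sequence is
chosen, which is the behaviour one would want on the ARM.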