This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Fix ARM ldm/stm peephole2 loop
On Thu, 2010-08-05 at 08:46 +0100, Richard Earnshaw wrote:
> On Thu, 2010-08-05 at 03:09 +0200, Bernd Schmidt wrote:
> > With -mtune=xscale, arm_gen_load_multiple_1 doesn't actually generate a
> > load-multiple for two loads, but single instructions. This causes a
> > peephole2 loop where we try to optimize the same instructions over and
> > over again.
> >
> > Fixed by teaching multiple_operation_profitable_p about this situation. Ok?
> >
> >
> > Bernd
>
> + if (nops <= 2 && arm_tune_xscale && !optimize_size)
>
> It seems wrong to me to have an explicit test for XScale here. Other
> cores could surely exist that result in a similar loop. Ie the test
> should be on the cause, not on 'it's an xscale'
I think the equivalent condition in arm_gen_load_multiple (and
arm_gen_store_multiple, fwiw) is also written in terms of
arm_tune_xcale. So, to that extent, "it's an xscale" is actually the
cause here, at least until and unless another CPU appears which has the
same pipeline behaviour.
It does seem a little bit fragile to require the conditions in the two
places to match in order to avoid loops, though. Maybe there should be
a comment at the appropriate place in arm_gen_xx_multiple to say that it
needs to stay in sync with the code in multiple_operation_profitable_p,
or maybe those two functions could be reworked to actually use
multiple_operation_profitable_p() rather than duplicating its logic.
p.