This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: i386.md splits 1
- To: law at cygnus dot com
- Subject: Re: i386.md splits 1
- From: Andi Kleen <ak at muc dot de>
- Date: 27 Oct 1998 08:42:35 +0100
- Cc: Jan Hubicka <hubicka at atrey dot karlin dot mff dot cuni dot cz>, Richard Henderson <rth at cygnus dot com>, egcs-patches at cygnus dot com
- References: <18710.909443555@hurl.cygnus.com>
In article <18710.909443555@hurl.cygnus.com>,
Jeffrey A Law <law@cygnus.com> writes:
> In message <19981002124128.51289@atrey.karlin.mff.cuni.cz> you write:
>> > > + (define_insn "ashrsi3_31"
>> > > + [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,d")
>> > > + (ashiftrt:SI (match_operand:SI 1 "nonimmediate_operand" "0,a")
>> > > + (const_int 31)))]
>> > > + "ix86_cpu != PROCESSOR_PENTIUM || optimize_size"
>> > > + "@
>> > > + sar%L0 $31,%0
>> > > + cltd")
>> >
>> > Do you really want to limit this pattern to the pentium?
>> Yes. On the Pentium cltd takes two cycles, as the mov/sar combination does.
>> But the mov/sar combination is pairable, so it usually results in faster
>> (and longer) code.
> What about pentiumpro? Seems like that pattern would be used on the PPro.
> Does anyone know what the right thing to do for PPro is?
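For reference, the quoted ashrsi3_31 pattern computes an arithmetic right
shift by 31, which leaves 0 for a non-negative input and -1 for a negative
one. That is the same value cltd puts in %edx when it sign-extends %eax,
which is why the pattern can offer both forms. A minimal C sketch of that
equivalence follows. The function names are made up for illustration and are
not from the patch, and the sketch assumes a compiler such as GCC where >> on
a negative signed value is an arithmetic shift (ISO C leaves this
implementation-defined).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the value cltd leaves in %edx for an input in %eax,
   written as a plain comparison so it does not depend on shift behaviour. */
static int32_t sign_mask_compare(int32_t x)
{
    return x < 0 ? -1 : 0;
}

/* Hypothetical helper: the split mov/sar form. Copy the value, then shift
   it right by 31. Assumes >> on a signed value is an arithmetic shift,
   which GCC guarantees but ISO C does not. */
static int32_t sign_mask_shift(int32_t x)
{
    int32_t copy = x;   /* corresponds to the mov */
    return copy >> 31;  /* corresponds to sar $31 */
}

/* Hypothetical self-check: both forms agree on a few sample inputs. */
static void check_sign_mask(void)
{
    const int32_t samples[] = { 0, 1, -1, 12345, -12345, INT32_MAX, INT32_MIN };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(sign_mask_shift(samples[i]) == sign_mask_compare(samples[i]));
}
```

Both helpers return the same value, so the choice between the two forms is
purely a matter of scheduling and code size on the target CPU.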
Note that at least for the Cyrix M6 it is generally a loss to split complex
instructions into simple ones (they get split into the same RISC ops anyway,
but with the complex instruction the instruction decoder has less work to do).
Cyrix also gives a few recommendations that sound surprising to old Intel
optimizers, for example that it is not worth trying too hard to keep values
in registers (with costly spill code etc.) because L1 cache accesses are just
as fast.
Now egcs doesn't support the Cyrix directly yet, but it is worth keeping
these issues in mind when changing the i386 backend. Not everything that
is good for Pentiums is good for other x86 CPUs too. The P5 is the worst
case: a lot of the things that are good for the P5 (e.g. all the effort
spent getting instructions to pair) are useless or even harmful on more
advanced x86 implementations.
-Andi