This is the mail archive of the mailing list for the GCC project.
Re: The new scheduler and x86 CPUs
- To: Bernd Schmidt <bernds at redhat dot com>
- Subject: Re: The new scheduler and x86 CPUs
- From: "Vladimir N. Makarov" <vmakarov at cygnus dot com>
- Date: Mon, 28 Aug 2000 22:56:33 -0400
- CC: Vladimir Makarov <vmakarov at redhat dot com>, "jh at suse dot cz" <jh at suse dot cz>, "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- References: <3B8C2E14.41CD486B@redhat.com>
Vladimir Makarov wrote:
> Subject: Re: The new scheduler and x86 CPUs
> Date: Tue, 28 Aug 2001 23:39:12 +0100 (BST)
> From: Bernd Schmidt <firstname.lastname@example.org>
> To: Vladimir Makarov <email@example.com>
> CC: <firstname.lastname@example.org>, <email@example.com>
> On Tue, 28 Aug 2001, Vladimir Makarov wrote:
> > > Why is OOO a dead-end approach? If there's parallelism to extract
> > > in a program, then if a compiler can find it, an OOO core can find it
> > > just as well.
> > I see
> > (http://www.intel.com/eBusiness/products/ia64/overview/bm012101.htm#5)
> > that Itanium has the same SPECint as UltrasparcIII practically on the
> > same frequency.
> Is Ultrasparc III out-of-order? I thought it was in-order too. There's
> nothing in
> that suggests it's OOO. So it isn't all that surprising to me that it's
> slow as well.
Even UltrasparcI was an OOO processor.
> > OOO has constraints on speculation (branches to look through, look
> > ahead buffer). Of course you could increase these characteristics, but
> > in that case the misprediction penalty is bigger. The control logic
> > to implement OOO is very big (even now, as I remember, it amounts to 1/3-1/2
> > of all control logic). Therefore the maximal issue rate for them is 6-8.
> How much parallelism are you likely to extract out of a typical integer
> program, anyway? IMO, 6-8 is quite near or possibly over the limit for
> typical programs.
> > There is no other way to achieve the potential (fine grain)
> > parallelism, which even for SPECint reaches 30-40 insns per cycle.
> I don't believe this. Can you back up this number? Even if it were
> realistic, is there any known strategy for a compiler to get anywhere
> close? And, is it feasible that actual hardware could be built to make
> use of it?
Potential fine grain parallelism is described in e.g.
Limits of Instruction Level Parallelism: David W. Wall, Proc. 4th ASPLOS.
All potential parallelism is investigated in
On the Limits of Program Parallelism and its Smoothability (1992) Kevin B.
Theobald, Guang R. Gao, Laurie J. Hendren
There the parallelism can reach 1000 for some SPEC programs.
Of course, many assumptions in such studies are not realistic.
I am not saying that the limits can be reached, but we know what they are:
they are far away, and designers of processor architectures and compilers
will keep trying to get closer to them. That cannot be stopped. I only said
that EPIC has more potential to get close to them than OOO.
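
To make "potential parallelism" concrete, here is a minimal sketch of the
kind of number a limit study reports: instructions executed divided by the
length of the longest dependence chain. The trace format and the name
dataflow_limit are made up for illustration, and the model (unit latency,
unlimited functional units, perfect branch prediction and renaming) is only
an assumption; the cited papers use far more detailed models.

```python
def dataflow_limit(trace):
    """Return (critical_path_cycles, insns_per_cycle) for a trace.

    trace: list of (dest, sources) tuples, one per executed instruction,
    in program order. Assumes unit latency, unlimited functional units,
    perfect branch prediction and perfect register renaming, so only
    true (read-after-write) dependences constrain the schedule.
    """
    ready = {}   # register name -> cycle in which its latest value is produced
    depth = 0    # length of the longest dependence chain seen so far
    for dest, sources in trace:
        # An instruction issues one cycle after its latest producer;
        # registers never written in the trace count as ready at cycle 0.
        cycle = 1 + max((ready.get(s, 0) for s in sources), default=0)
        if dest is not None:
            ready[dest] = cycle
        depth = max(depth, cycle)
    ipc = len(trace) / depth if depth else 0.0
    return depth, ipc


# Hypothetical 6-instruction trace: two independent chains of length 3.
trace = [
    ("a", []), ("b", []),
    ("c", ["a"]), ("d", ["b"]),
    ("e", ["c"]), ("f", ["d"]),
]
print(dataflow_limit(trace))  # (3, 2.0)
```

A fully serial chain gives 1 insn per cycle under this model, however wide
the machine; the limit studies report how far real traces are from that.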
> > Also
> > this approach is simpler from the hardware point of view (the logic
> > used for OOO control could instead be used for more useful functional
> > units such as ALUs). A typical example of this for embedded systems
> > (with low power consumption) could be Transmeta processors.
> Current ia64 systems don't seem to be very low power designs.