This is the mail archive of the
mailing list for the GCC project.
Re: Combined top-down and bottom-up instruction scheduler
- From: Jeff Law <law at redhat dot com>
- To: Vladimir Makarov <vmakarov at redhat dot com>, Aditya K <hiraditya at msn dot com>, "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Date: Tue, 8 Sep 2015 14:38:16 -0600
- Subject: Re: Combined top-down and bottom-up instruction scheduler
- References: <BLU179-W6188AF5C5B442B4282F2F0B6530 at phx dot gbl> <55EF2E2C dot 4000601 at redhat dot com> <55EF39A9 dot 5060002 at redhat dot com>
On 09/08/2015 01:40 PM, Vladimir Makarov wrote:
As I remember, it was written by Mike Tiemann. Bottom-up scheduler as
a rule generates worse code than top-down one.
Indeed, that was one of the key things we were looking to get from the
Haifa scheduler, along with improved superscalar support and some support
for region scheduling & speculation.
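For readers unfamiliar with the two directions being compared, here is a toy sketch of list scheduling on a single-issue machine. It is not GCC's Haifa scheduler code; the function names, the (producer, consumer, latency) edge format and the example insns are all assumptions made for illustration. The bottom-up variant is expressed by scheduling the reversed dependence graph and mirroring the resulting cycles.

```python
def list_schedule(nodes, edges):
    """Forward list scheduling, one insn per cycle. Assumes an acyclic graph.

    edges is a list of (producer, consumer, latency) tuples (hypothetical format).
    Returns {insn: start_cycle}.
    """
    preds = {n: [] for n in nodes}
    succs = {n: [] for n in nodes}
    for src, dst, lat in edges:
        preds[dst].append((src, lat))
        succs[src].append((dst, lat))

    memo = {}

    def height(n):
        # Longest latency-weighted path from n to the end; used as priority.
        if n not in memo:
            memo[n] = max((lat + height(d) for d, lat in succs[n]), default=0)
        return memo[n]

    start, cycle = {}, 0
    while len(start) < len(nodes):
        ready = [n for n in nodes if n not in start and
                 all(p in start and start[p] + lat <= cycle for p, lat in preds[n])]
        if ready:
            # Highest priority first; the name breaks ties deterministically.
            start[max(ready, key=lambda n: (height(n), n))] = cycle
        cycle += 1
    return start


def top_down(nodes, edges):
    return list_schedule(nodes, edges)


def bottom_up(nodes, edges):
    # Schedule from the last insn backwards, then mirror the cycle numbers.
    reversed_edges = [(dst, src, lat) for src, dst, lat in edges]
    back = list_schedule(nodes, reversed_edges)
    last = max(back.values())
    return {n: last - c for n, c in back.items()}


def is_valid(start, edges):
    # One insn per cycle, and every consumer waits for its producer's latency.
    one_per_cycle = len(set(start.values())) == len(start)
    deps_ok = all(start[s] + lat <= start[d] for s, d, lat in edges)
    return one_per_cycle and deps_ok


if __name__ == "__main__":
    nodes = ["load_a", "load_b", "add", "mul", "store"]
    edges = [("load_a", "add", 3), ("load_b", "add", 3),
             ("add", "mul", 1), ("mul", "store", 2)]
    print("top-down :", top_down(nodes, edges))
    print("bottom-up:", bottom_up(nodes, edges))
```

Both directions yield valid schedules here. The sketch only shows where the two approaches make their greedy choices (from the first insn forward versus from the last insn backward); it makes no claim about which produces better code in practice.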
Correct. Latency scheduling just isn't that important for OOO and
instead you look at scheduling to mitigate costs for large latency
operations (i.e., cache misses and transcendental functions). You might
also attack secondary issues like throughput at the retirement stage for
Yes, that is true for OOO execution processors, which can rearrange insns
and execute them speculatively, looking through several branches. For
such processors, software pipelining is more important, as the processors
can look through only a few branches while software pipelining could look
through any number of branches. That is why the Intel compiler did not have
any insn scheduler (but had software pipelining) until the introduction of
Intel Atom, which was originally an in-order processor.
Agreed. This is in line with what the HP guys were seeing as they
transitioned to the PA8000.
Actually, I believe dealing with the variable/unknown latency of load insns
(depending on where data are placed in a cache or memory) would be more
important than a bottom-up or hybrid scheduler. Balanced scheduling,
which deals with this problem, was implemented by Alexander Monakov about
7-8 years ago as Google internship work, but it was not included because at
that time its advantages were not confirmed on SPEC2000. It would be
interesting to reconsider and re-evaluate it on modern processors and
scientific benchmarks with big data.
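To make the idea of scheduling around unknown load latency concrete, here is a minimal sketch of one simplified reading of balanced scheduling: instead of a fixed latency, each load is given a weight that grows with the amount of independent work available to hide it. This is not the implementation mentioned above. The weighting rule (each insn shares one unit of credit among the loads it is independent of), the function names and the example insns are assumptions for illustration only.

```python
def reachable(succs, start):
    """Return every insn that transitively depends on start."""
    seen, stack = set(), [start]
    while stack:
        n = stack.pop()
        for s in succs[n]:
            if s not in seen:
                seen.add(s)
                stack.append(s)
    return seen


def balanced_load_weights(nodes, edges, loads):
    """Assign each load a latency weight based on available parallelism.

    edges is a list of (producer, consumer) pairs (hypothetical format).
    Simplified rule (an assumption): every insn contributes 1/k to each of
    the k loads it is independent of, on top of a base weight of 1.
    """
    succs = {n: [] for n in nodes}
    for src, dst in edges:
        succs[src].append(dst)
    desc = {n: reachable(succs, n) for n in nodes}

    def independent(a, b):
        # Neither insn depends on the other, so they may overlap in time.
        return a != b and b not in desc[a] and a not in desc[b]

    weights = {l: 1.0 for l in loads}
    for n in nodes:
        hideable = [l for l in loads if independent(n, l)]
        for l in hideable:
            weights[l] += 1.0 / len(hideable)
    return weights


if __name__ == "__main__":
    nodes = ["ld_x", "ld_y", "t1", "t2", "t3", "use_x"]
    edges = [("ld_y", "t1"), ("t1", "t2"), ("t2", "t3"), ("ld_x", "use_x")]
    print(balanced_load_weights(nodes, edges, loads=["ld_x", "ld_y"]))
```

In this example `ld_x` ends up with a larger weight than `ld_y`, because the whole `t1`..`t3` chain can execute while `ld_x` is outstanding, whereas almost nothing is independent of `ld_y`. A list scheduler using these weights as load latencies would therefore try to place more work between `ld_x` and its consumer.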
For in-order processors, we also have another scheduler (the selective one)
which does additional transformations (like register renaming and
non-modulo software pipelining) which could be more important than
top-down/bottom-up scheduling. It gave a 1-2% improvement on Itanium
SPEC2000 in comparison with the Haifa scheduler.