This is the mail archive of the
mailing list for the GCC project.
Re: RFC: mips SB-1 DFA scheduler
- From: Jim Wilson <wilson at specifixinc dot com>
- To: Paul Koning <pkoning at equallogic dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Wed, 03 Mar 2004 16:25:10 -0800
- Subject: Re: RFC: mips SB-1 DFA scheduler
- References: <email@example.com> <firstname.lastname@example.org.HOWL>
Paul Koning wrote:
> If we have four issue slots to fill, don't we want to look ahead a
> fair amount MORE than 4 insns to find the four that are the best
> choices to issue right now? (If that's not what this "lookahead"
> thing is meant to do, its docs could stand some improvement.) I
> wonder if that accounts for the fact that you didn't see an
> improvement with an issue_rate of 4, as one would expect.
If you look at every target that uses the lookahead parameter, it is set
to the number of instructions that can be issued per cycle. The IA-64
port in particular does it this way, and the IA-64 DFA scheduler was
written by Vlad, who ought to know, so clearly this is the way it is
intended to be used. I haven't looked at the details of what this
actually does, so I can't comment on whether using a larger number would
be more helpful.
I had trouble with an issue rate of 4 because aggressive cross-block
scheduling moved too many instructions trying to fill issue slots. In
the process of doing this, it actually made the critical path longer for
the kernel of one of the benchmarks I was looking at. Then there were
interactions with other optimizations, if-conversion in particular,
which made things worse. So I got better code by using an issue rate of
3 because there was less cross-block movement. I consider this a
tunable parameter though, and it may turn out later that 4 is a better
Some things to keep in mind about my testing so far. I haven't used any
real benchmarks yet, like SPEC, so some of the tunable parameters may
not be at the optimal setting. Also, the DFA scheduler as written can
not model the simple alu instructions that can issue to either the
load/store or alu execute units. This would take a lot more work than I
have put into it so far. Since the DFA scheduler does not exactly match
the hardware, optimal settings for some numbers may differ from the
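
[Editorial sketch, not part of the original message.] The mismatch
described above can be illustrated with a toy model. This is not GCC
code and not an SB-1 pipeline description: the unit counts (two ALU
units, two load/store units), the names cycles_to_issue, UNITS,
FLEXIBLE and RESTRICTED, and the greedy issue policy are all
assumptions chosen for illustration. It compares a machine where simple
ALU ops may issue to either kind of unit against a restricted model
where they may only use the ALU units.

```python
def cycles_to_issue(insns, units, can_use):
    """Count cycles needed to issue a list of independent instructions.

    insns:   list of instruction kinds, e.g. "simple" or "load"
    units:   dict mapping unit name -> number of such units per cycle
    can_use: dict mapping kind -> unit names it may issue to, in
             preference order
    Assumes every kind can use at least one unit with a nonzero count.
    """
    pending = list(insns)
    cycles = 0
    while pending:
        free = dict(units)          # all units are free at cycle start
        remaining = []
        for kind in pending:
            for unit in can_use[kind]:
                if free[unit] > 0:
                    free[unit] -= 1  # reserve the unit for this cycle
                    break
            else:
                remaining.append(kind)  # no unit free, retry next cycle
        pending = remaining
        cycles += 1
    return cycles


# Hypothetical machine: two ALU units and two load/store units.
UNITS = {"alu": 2, "ls": 2}

# Assumed "hardware" behaviour: simple ALU ops may use either unit kind.
FLEXIBLE = {"simple": ["alu", "ls"], "load": ["ls"]}

# Assumed restricted description: simple ALU ops only use the ALU units.
RESTRICTED = {"simple": ["alu"], "load": ["ls"]}

if __name__ == "__main__":
    stream = ["simple"] * 8
    print(cycles_to_issue(stream, UNITS, FLEXIBLE))
    print(cycles_to_issue(stream, UNITS, RESTRICTED))
```

With a stream of eight independent simple ALU ops, the restricted model
needs twice as many cycles as the flexible one, so a parameter tuned
against the restricted description need not be the best choice for the
real machine.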
> Out of curiosity, does this use any of the DFA scheduling work I did?
> I guess it doesn't; looking back at what Chris posted a while back,
> that was only the bare skeleton.
I don't know what you are referring to. I wrote this all from scratch.
If there is previous work, I can take a look at it. I saw a non-DFA
scheduler for SB-1 in some old Broadcom tools releases, but I don't know
about any previous DFA scheduler work for the SB-1.
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com