This is the mail archive of the gcc-patches at gcc dot gnu dot org
mailing list for the GCC project.
RE: [PATCH][MIPS] P5600 scheduling
- From: Jaydeep Patil <Jaydeep dot Patil at imgtec dot com>
- To: Richard Sandiford <rdsandiford at googlemail dot com>
- Cc: Rich Fuhler <Rich dot Fuhler at imgtec dot com>, Matthew Fortune <Matthew dot Fortune at imgtec dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Fri, 23 May 2014 04:07:33 +0000
- Subject: RE: [PATCH][MIPS] P5600 scheduling
- Authentication-results: sourceware.org; auth=none
- References: <BD7773622145634B952E5B54ACA8E34936A97AF1 at PUMAIL01 dot pu dot imgtec dot org> <87r43pjx4f dot fsf at talisman dot default> <BD7773622145634B952E5B54ACA8E34936A981D8 at PUMAIL01 dot pu dot imgtec dot org> <87d2f5sw4u dot fsf at sandifor-thinkpad dot stglab dot manchester dot uk dot ibm dot com>
Thanks for the review.
Let me get back to you on the results of -fsched-pressure --param sched-pressure-algorithm=2 options.
Yes, we start the EBB with pressure 0.
From: Richard Sandiford [mailto:email@example.com]
Sent: 22 May 2014 PM 08:23
To: Jaydeep Patil
Cc: Rich Fuhler; Matthew Fortune; firstname.lastname@example.org
Subject: Re: [PATCH][MIPS] P5600 scheduling
Thanks for the write-up and updated patches. I'll try to get to them this weekend. In the meantime...
Jaydeep Patil <Jaydeep.Patil@imgtec.com> writes:
> The -msched-weight option:
> We are using ~650 hot-spot functions from VP9/VP8/H264/MPEG4 codes
> available to us as a test suite. The default Haifa-scheduler worked
> well for most of the functions, but excess spilling was observed in
> cases where register pressure was more than ~20. The -fsched-pressure
> flag proved good in some cases, but the algorithm focuses more on
> reducing register pressure.
> We observed an increase in stalls (but less spilling) with the
> -fsched-pressure option. When the register pressure goes beyond a certain
> threshold, the -msched-weight option tries to keep it down by
> promoting certain instructions up in the order. It has been
> implemented as part of TARGET_SCHED_REORDER target hook (function
> mips_sched_weight). The change is generic and there is nothing
> specific to MIPS.
> When the register pressure goes beyond 15 then an instruction with
> maximum forward dependency is promoted ahead of the instruction at
> READY[NREADY-1]. Scheduling of an INSN with maximum forward dependency
> enables early scheduling of instructions dependent on it.
> When the register pressure goes beyond 25 and if the consumer of the
> instruction in question (INSN) has higher priority than the
> instruction at READY[NREADY-1] then INSN is promoted. This chooses an
> INSN which has a high priority instruction dependent on it. This
> triggers the scheduling of that consumer instruction as early as
> possible to free up the registers used by that instruction.
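The two thresholds described above can be sketched as a small ready-list reordering routine. This is only an illustration in Python, not the code from the patch: the `Insn` fields, the function name `reorder_ready`, and the choice to fall back to the forward-dependency rule when no consumer-priority candidate exists above 25 are all assumptions made for the example. The last element of `ready` plays the role of READY[NREADY-1].

```python
from dataclasses import dataclass


@dataclass
class Insn:
    name: str
    priority: int               # scheduler priority (higher = more urgent)
    forward_deps: int           # number of instructions that depend on this one
    max_consumer_priority: int  # highest priority among its consumers


def reorder_ready(ready, pressure, low=15, high=25):
    """Return a reordered copy of `ready`; ready[-1] is issued next.

    Hypothetical sketch of the heuristic described in the mail:
      pressure > low  : promote the insn with the most forward dependencies
      pressure > high : prefer an insn whose consumer outranks ready[-1]
    """
    if len(ready) < 2 or pressure <= low:
        return list(ready)

    head = ready[-1]
    others = ready[:-1]
    pick = None

    if pressure > high:
        # Candidates whose consumer has higher priority than the head insn.
        candidates = [i for i in others if i.max_consumer_priority > head.priority]
        pick = max(candidates, key=lambda i: i.max_consumer_priority, default=None)

    if pick is None:
        # Assumed fallback: the forward-dependency rule still applies.
        best = max(others, key=lambda i: i.forward_deps)
        if best.forward_deps > head.forward_deps:
            pick = best

    if pick is None:
        return list(ready)

    # Move the chosen insn to the end so that it is issued next.
    return [i for i in ready if i is not pick] + [pick]
```

With a ready list `[b, c, a]` where `b` has the most forward dependencies and `c` feeds a high-priority consumer, a pressure of 20 would move `b` to the end and a pressure of 30 would move `c` there instead; at a pressure of 10 the order is left alone.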
Yeah, this sounds similar to what I was seeing for Cortex-A8 with the default -fsched-pressure (which is tuned for and known to work well on x86). Did you try with:
-fsched-pressure --param sched-pressure-algorithm=2
? That isn't advertised in the documentation because we don't want to make it a user-level option that would then need to be supported in future. But if you find that the above works better than plain -fsched-pressure, which is equivalent to:
-fsched-pressure --param sched-pressure-algorithm=1
then we could consider making sched-pressure-algorithm=2 the default for MIPS. I'd certainly be interested to know how it compares with -msched-weight.
FWIW the write-up of the alternative -fsched-pressure is here:
It sounds like the Linaro guys have a performance fix pending, so if running the tests takes a lot of your time or a lot of resources, it might be worth waiting until that's submitted.
E.g. one of the things I noticed with the default -fsched-pressure at the time -- not sure whether this has changed since -- is that the pressure at the start of the EBB was based on the total number of live values, including those that are live across a loop but not used in it. So in many cases the starting pressure was too high and so like you say the heuristic was too conservative. That was one of the main things that the alternative heuristic was supposed to help with.
It looks like you solve that by starting with a pressure of 0 for each EBB, is that right? That's a bit more aggressive than what I did, since AIUI starting with 0 will ignore loop invariants.
On the other hand, being more aggressive (i.e. closer to what you'd get with the default scheduling heuristic) also means that it's more likely to be usable by default.
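To make the difference between the starting-pressure choices concrete, here is a minimal Python sketch. It is not taken from GCC or from the patch: the function name, the policy labels, and in particular the "live-and-used" policy (count only values that are live into the block and also referenced in it) are assumptions based on a reading of the discussion above.

```python
def starting_pressure(live_in, used_in_block, policy):
    """Estimate register pressure at the start of an EBB.

    live_in       : set of values live on entry to the block
    used_in_block : set of values referenced inside the block
    policy        : hypothetical label for one of three choices
    """
    if policy == "all-live":
        # Every live value counts, even if only live across the block.
        return len(live_in)
    if policy == "live-and-used":
        # Assumed middle ground: ignore values that merely pass through.
        return len(live_in & used_in_block)
    if policy == "zero":
        # Start from 0, which also ignores loop invariants used in the block.
        return 0
    raise ValueError(f"unknown policy: {policy}")


# r2 is a loop invariant used in the block; r1, r3, r4 only pass through.
live_in = {"r1", "r2", "r3", "r4"}
used = {"r2", "r5"}
for policy in ("all-live", "live-and-used", "zero"):
    print(policy, starting_pressure(live_in, used, policy))
```

In this toy case the three policies give 4, 1 and 0. The invariant `r2` is the only value that separates the middle policy from starting at 0.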