This is the mail archive of the
mailing list for the GCC project.
Re: Finding insns to reorder using dataflow
- From: Jeff Law <law at redhat dot com>
- To: Kyrill Tkachov <kyrylo dot tkachov at arm dot com>, "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Date: Fri, 14 Aug 2015 09:31:19 -0600
- Subject: Re: Finding insns to reorder using dataflow
- Authentication-results: sourceware.org; auth=none
- References: <55CC7A43 dot 5010209 at arm dot com> <55CCC3E2 dot 3030003 at redhat dot com> <55CDAF74 dot 7010004 at arm dot com>
On 08/14/2015 03:05 AM, Kyrill Tkachov wrote:
The problem I'm trying to solve can be expressed in this way: "An
insn that satisfies predicate pred_p (insn) cannot appear exactly N
insns apart from another insn 'insn2' that satisfies pred_p (insn2).
N is a constant". So, the problem here is that this restriction is
not something expressed in terms of cycles or DFA states, but rather
distance in the instruction stream.
I wasn't really suggesting to model it in DFA states, but instead use
the dependency analysis + hooks. The dependency analysis in particular
when it's safe to interchange two insns.
Given the additional information, I think you'd want to note when an
insn fires and satisfies pred_p, and associate a counter with each
firing. THe active counters are bumped (decremented?) at each firing
(so you can track how many insns appear after the one that satisfied
pred_p). Note that for insns which generate multiple assembly
instructions, you need to decrement the counter by the number of
assembly instructions they emit.
Then when sorting the ready list, if you have an insn that satisfies
pred_p and an active counter has just reached zero, make sure some other
insn fires (what if there aren't any other ready insns? Is this a
correctness or performance issue?)
I don't think I can do this reliably during sched2 because there is
still splitting that can be done that will create more insns that
will invalidate any book keeping that I do there.
Right. You need everything split and you need accurate insn length
information for every insn in the backend that isn't split. If this is
a correctness issue, then you also have to deal with final deleting
insns behind your back as well.
Many years ago I did something which required 100% accurate length
information from the backend. It was painful, very very painful.
Ultimately it didn't work out and the code was scrapped.
You'll have to work through them -- I haven't kept close tabs on the
various hooks we have, I just know we have them.
However, during TARGET_MACHINE_DEPENDENT_REORG I can first split all
insns and then call schedule_insns () to do another round of
scheduling. However, I'm a bit confused by all the different
scheduler hooks and when each one is called in relation to the
The problem I see is that once one of these insns fire, other new insns
will be added to the ready list. So you have to keep some kind of state
about how many instructions back one of these insns fired and consult
that data when making a decision about the next instruction to fire.
I'd need to keep some kind of bitfield recording for the previous N
instructions in the stream whether they satisfy pred_p. Where would I
record that? Can I just do everything in TARGET_SCHED_REORDER? i.e.
given a ready list check that no pred_p insns in it appear N insns
apart from another such insn (using my bitfield as a lookup helper),
reorder insns as appropriate and then record the order of the pred_p
insns in the bitfield. Would the scheduler respect the order of the
insns that was set by TARGET_SCHED_REORDER and not do any further
All this will fall apart if this is a correctness issue since you'd have
to issue a nop or somesuch. Though I guess you might be able to arrange
to get a nop into the scheduled stream. If this is a correctness issue,
tackling it in the assembler may make more sense.