Opportunities to do this are discovered by a mini-pass over the
instructions after generating dependencies and before scheduling a
block. At that point we have all the information required to ensure that
a candidate dep between two instructions is only used to show the
register dependence, and to ensure that every insn with a memory
reference is only subject to at most one dep causing a pattern change.
The dep_t structure is extended to hold an optional pointer to a
"replacement description", which holds information about what to change
when a dependency is broken. The time when this replacement is applied
differs depending on whether the changed insn is the DEP_CON (in which
case the pattern is changed whenever the broken dependency becomes the
last one), or the DEP_PRO, in which case we make the change when the
corresponding DEP_CON has been scheduled. This ensures that the ready
list always contains insns with the correct pattern.
A few additional bits are needed in the dep structure: one to hold
information about whether a dependency occurs multiple times, and one to
distinguish dependencies that are purely for register values from those
with other meanings (e.g. memory references).
Also, sched-rgn was changed to use a new bit, DEP_POSTPONED, rather than
HARD_DEP to indicate that we don't want to schedule an insn in the
current block.
A possible future extension would be to also allow autoinc addressing
modes as the increment insn.
Bootstrapped and tested on x86_64-linux, and also tested on c6x-elf
(quite a number of changes were necessary to make it work there). It was
originally written for a mips target and tested there in the context of
a 4.6 tree. I've also run spec2000 on x86_64, with no change that looked
like anything other than noise. Ok?