[PATCH,RFC] Disallow reordering of x87 insns while scheduling
Roger Sayle
roger@eyesopen.com
Mon Apr 18 05:39:00 GMT 2005
On Sun, 17 Apr 2005, Jeffrey A Law wrote:
> I also vaguely remember that the DFA didn't really model FXCH correctly.
The problem is that the schedulers (both the original and the new
DFA descriptions) run on the RTL stream before reg-stack.c, so they
don't even see any FXCH instructions. The difficulty is that they
then reorder insns without appreciation that what look like arbitrary
choices sometimes require compensation code to be inserted afterwards.
It might be possible to fake some of this in the DFA descriptions,
perhaps by requiring more benefit before permuting the order of two FP
insns. I also toyed with the idea of "as-late-as-possible" scheduling
instead of GCC's current "as-soon-as-possible", or even other non-greedy
variants such as "force-directed-scheduling".
Of course, the original RTL order prior to scheduling isn't
guaranteed to be optimal with respect to FXCH placement, and
potentially scheduling could eliminate FXCHs. Far more commonly,
however, the depth-first traversal of operands during gimplification
and RTL expansion results in a stack-friendly instruction ordering.
Binary operators commonly have their two required operands as the
last thing on the stack, etc...
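As a rough illustration of the point above, the following sketch compares a depth-first (post-order) instruction order with one in which all loads have been hoisted, as a latency-driven scheduler might do. It uses a deliberately simplified stack model, not reg-stack.c's actual algorithm: a binary op needs one operand at st(0), both operands die, and the result replaces the other operand's slot (in the spirit of the popping forms such as "faddp st(i), st"). The names expand and count_fxch are hypothetical.

```python
from itertools import count

def expand(tree, out, ids):
    """Emit insns for `tree` in depth-first (post-order) operand order."""
    if isinstance(tree, str):            # leaf: load a value onto the stack
        out.append(("load", tree))
        return tree
    op, lhs, rhs = tree
    a = expand(lhs, out, ids)
    b = expand(rhs, out, ids)
    dst = f"t{next(ids)}"                # hypothetical temporary name
    out.append((op, dst, a, b))
    return dst

def count_fxch(insns):
    """Count the swaps needed under the simplified stack model (an assumption)."""
    stack = []                           # top of stack is the last element
    swaps = 0
    for insn in insns:
        if insn[0] == "load":
            stack.append(insn[1])
            continue
        _, dst, a, b = insn
        if stack[-1] not in (a, b):      # neither operand on top: need an fxch
            i = max(stack.index(a), stack.index(b))
            stack[i], stack[-1] = stack[-1], stack[i]
            swaps += 1
        top = stack.pop()                # operand at st(0) is popped
        other = b if top == a else a
        stack[stack.index(other)] = dst  # result overwrites the other operand
    return swaps

tree = ("mul", ("add", "a", "b"), ("add", "c", "d"))
expanded = []
expand(tree, expanded, count(1))

# Same insns with every load hoisted to the front.
scheduled = ([i for i in expanded if i[0] == "load"]
             + [i for i in expanded if i[0] != "load"])

print(count_fxch(expanded), count_fxch(scheduled))
```

Under this model the depth-first order needs no swap for (a+b)*(c+d), while the hoisted order needs one.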
I'm not claiming that the new DFA isn't an improvement over the
old one, just that GCC's algorithms have little awareness of the
x87 register stack compared to rival compilers for Intel/AMD CPUs.
One of my earlier investigations was whether reg-stack should
perform some reordering itself. Currently we can generate code
such as "push const1; push const2; fxch", where reg-stack could be
taught that if the last two instructions before a "swap" are
independent, then reordering them would avoid the need to swap.
However, my initial analysis revealed that many of these strange
operand orderings were not in the original RTL, but an unfortunate
accidental side-effect of scheduling.
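The reordering described above can be sketched as a small peephole over an instruction list. This is only an illustration, not code from reg-stack.c: the tuple encoding and the name drop_redundant_fxch are hypothetical, and every "push" is assumed to load a constant or memory operand, so that two adjacent pushes are independent of each other.

```python
def drop_redundant_fxch(insns):
    """Replace "push x; push y; fxch st(1)" with "push y; push x"."""
    out = []
    for insn in insns:
        if (insn == ("fxch", 1)
                and len(out) >= 2
                and out[-1][0] == "push"
                and out[-2][0] == "push"):
            # The two pushes are assumed independent, so emitting them in
            # the opposite order leaves the stack as the swap would have.
            out[-1], out[-2] = out[-2], out[-1]
        else:
            out.append(insn)
    return out

before = [("push", "const1"), ("push", "const2"), ("fxch", 1), ("faddp", 1)]
after = drop_redundant_fxch(before)
print(after)
```

A swap that does not directly follow two pushes is left alone, since the preceding instructions may then depend on each other through the stack.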
A longer-term approach, hinted at in my previous e-mails, might
be to move reg-stack before scheduling, and use the proposed
patch to prevent the current DFA from reordering FP insns, but
interleave integer instructions. This could then evolve into
reg-stack scheduling the floating point insns itself, using
the usual dependency chains, whilst leaving the current scheduler
to fine-tune things. As mentioned before, the current scheduler
doesn't even see the fxch and pop instructions generated by
reg-stack, and certainly isn't able to overlap them with integer
ops.
The fp-reg to fp-reg dependency thing is effectively equivalent
to RTH's proposal to use index registers, REG_REF. His idea was
to introduce an autoincremented/autodecremented top-of-stack
pseudo that modeled stack state, to prevent reordering. Of
course, the same effect can be achieved by having all FP insns
clobber (use&set) the same "tos" register, and even simpler still
(i.e. not requiring a back-end rewrite) is to implicitly model
this "tos" register, as done in my patch.
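A minimal sketch of the "tos" idea, under stated assumptions: each insn is a hypothetical (name, is_fp, uses, sets) tuple, and the dependencies function below is a toy stand-in for a scheduler's dependency analysis, not the code in the patch. Making every FP insn implicitly use and set one shared register orders all FP insns relative to each other, while integer insns remain free to move.

```python
def dependencies(insns, implicit_tos=True):
    """Return the set of (earlier, later) index pairs that must stay ordered."""
    regs = []
    for _name, is_fp, uses, sets in insns:
        uses, sets = set(uses), set(sets)
        if implicit_tos and is_fp:
            # Every FP insn implicitly reads and writes the same register.
            uses.add("tos")
            sets.add("tos")
        regs.append((uses, sets))

    deps = set()
    for j, (uses_j, sets_j) in enumerate(regs):
        for i in range(j):
            uses_i, sets_i = regs[i]
            # True (RAW), anti (WAR), or output (WAW) dependency.
            if sets_i & uses_j or uses_i & sets_j or sets_i & sets_j:
                deps.add((i, j))
    return deps

insns = [
    ("fld a",      True,  [],           ["f0"]),
    ("fld b",      True,  [],           ["f1"]),
    ("add eax, 1", False, ["eax"],      ["eax"]),
    ("fadd",       True,  ["f0", "f1"], ["f2"]),
]

print(sorted(dependencies(insns, implicit_tos=False)))
print(sorted(dependencies(insns, implicit_tos=True)))
```

With the implicit register, the two loads gain an ordering edge between them, whereas the integer add has no edge to any FP insn and can still be interleaved.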
I'm sure RTH will correct my exaggerated oversimplification.
Roger