[RFC] PR/27390, latent complex-mode bug in regstack
Roger Sayle
roger@eyesopen.com
Tue Jun 6 00:33:00 GMT 2006
On Mon, 5 Jun 2006, Paolo Bonzini wrote:
> This patch fixes two problems in regstack involving complex modes.
Hi Paolo,
Sorry again for the delay in commenting on PR target/27390, it's
another case of someone Cc'ing sayle@gcc.gnu.org in bugzilla. :-(
I believe something like your first?/second? patch, that only affects
subst_stack_regs_pat is the correct fix. You describe them as patch
one and patch two in your original message, but they appear in reversed
order in the ENOPATCH follow-up. :-)
The issue (which it sounds like you've already figured out), is that
move_nan_for_stack_reg must be called in the same order as the
instructions are processed/encountered in the insn stream in order to
keep the "regstack" state (which is updated in move_nan_for_stack_reg)
current/accurate.
The problems you described in bugzilla, are from your original attempt
calling "move_nan_for_stack_reg" on insn, then inserting a new insn
immediately before it, and calling "move_nan_for_stack_reg" on that:
which confused the "regstack" state as to which registers were where
on the stack after this sequence.
Notice, that this resulted in the invalid code:
flds .LC0
flds .LC0
fstp %st
fstp %st(1)
when what the compiler intended to generate was
flds .LC0
flds .LC0
fstp %st(1)
fstp %st
or
flds .LC0
flds .LC0
fstp %st
fstp %st
Hence your revised patch that inserts the new insn first, calls
move_nan_for_stack_reg on it, and then calls move_nan_for_stack
on the current/final insn is the correct strategy.
Ideally, we can probably do even better by converting this into
flds .LC0
fld %st
avoiding two references to the constant pool, by first loading
a NaN, and then duplicating the top of stack. Not that uninitialized
complex values are a common occurence :-). By peaking into the
inner workings for "move_nan_for_stack_reg" you can probably avoid
creating the CLOBBER pattern you currently use, and call directly
to "move_for_stack_reg", first for the NaN then for the copy/dup.
This avoids creating dummy RTL only to immediately overwrite/leak it.
Does all this make sense? (and rationalizes the strange behaviours
you've been seeing). Let me know if your happy to investigate the above
refinements or you'd prefer to commit something similar to what you have
(to resolve the regressions) and leave it to someone else to tidy up
a bit.
Many thanks for investigating this. It's now clear that this was a
pre-existing latent bug that your reload preference changes had
inadvertantly exposed.
Roger
--
More information about the Gcc-patches
mailing list