[Bug rtl-optimization/58679] [4.9 regression] ICE in create_pre_exit, at mode-switching.c:421 with -mavx after r202915

Sat Oct 26 13:25:00 GMT 2013

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58679

--- Comment #6 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #1)

> There is no return value copy insn.
> 
> The assumption in the mode switching pass, that there is a return value copy
> is not correct anymore, due to recent enhancements in the RA. The mode
> switching pass should perform life analysis by itself. Simply looking for a
> return copy is not correct anymore.

You got that backwards.  The point of most of the code in create_pre_exit
is to look for a return value copy, to make sure it is not separated any
more from the return statement unless that can't be helped.

The reason is that for likely_spilled registers, there is a double
hazard of clobbering the return value:
- The mode switching code itself might clobber the return value.
- A successive pass might clobber the return value.
  At the time the mode switching code was written - not sure if that
  has changed in the meantime - the value in a likely_spilled FUNCTION_VALUE
  register was only guaranteed to be preserved if the return value copy was
  the very last instruction before the return insn (or the end of the exit
  block).  Every pass is supposed to keep the return value copy in that
  position if a likely_spilled FUNCTION_VALUE register is involved.

Thus, we still have to look for a return value copy.
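
To illustrate the invariant described above, here is a toy sketch.  It is
not the code in create_pre_exit; the enum, the function name and the
flat array of insn kinds are all made up for illustration.  It only checks
whether the return value copy is the very last instruction before the
return insn (or the end of the block).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical insn classification; real RTL is far richer than this.  */
enum toy_kind { TOY_OTHER, TOY_RET_COPY, TOY_RETURN };

/* Return the index of the return value copy if it sits in the final
   position of the exit block, i.e. directly before the return insn or
   at the very end of the block.  Return -1 otherwise.  */
static int
toy_find_final_ret_copy (const enum toy_kind *insns, size_t n)
{
  size_t i = n;

  assert (insns != NULL || n == 0);

  /* Skip a trailing return insn, if the block has one.  */
  if (i > 0 && insns[i - 1] == TOY_RETURN)
    i--;

  /* The copy must be the insn immediately before that point.  */
  if (i > 0 && insns[i - 1] == TOY_RET_COPY)
    return (int) (i - 1);

  return -1;
}
```

A result of -1 in this model corresponds to the situation under discussion:
either there is no copy at all, or something has been placed between the
copy and the return.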

The change in the rtl representation means that we can no longer use
REG_DEAD / REG_UNUSED notes in optimize_mode_switching to keep track of
the set of live registers that EMIT_MODE_SET has to avoid using as scratch
registers.
Also, new cases of disappearing return value copies can mean one of two
things:
- the removal is a legit optimization, and the sanity checks in
  create_pre_exit should be accordingly relaxed.
- the removal increases likely_spilled register pressure in a manner that can
  lead to wrong code, and must be suppressed.

There is also a wider issue: the documentation in passes.texi is incorrect now.
It says:
"Unlike the reload pass, intermediate LRA decisions are reflected in
 RTL as much as possible."
Now, with the removal of REG_DEAD / REG_UNUSED notes, the complete opposite
is true.  These notes are a bit fuzzy after reload, but their fuzziness is
well-defined - i.e. the notes pertain to a death in the same BB, unless
superseded by a simultaneous (with use) or later set.
But with lra, we don't have them at all anymore - that means a lot of target
code has to be re-written to use DF.
Some of this seems quite preposterous to do in place, e.g. using df inside
an output pattern.  So maybe we need to put more clobbers for opportunistic
register usage into rtl - we might use additional peephole2 patterns,
e.g. one after branch shortening for the arc casesi patterns.

SH splitters could also use such inserted clobbers, not sure where the extra
peephole2 pass should be inserted.

That raises the question - should we be able to tag peephole2 patterns to
be run during specific peephole2 passes / pass sets?
Or should we call them peepholen, with n being an integer?

Or should we use a different approach, and use DF to re-create REG_DEAD notes?
Will that work now in the later stages of the compiler too?
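
As a rough picture of what re-creating the notes would involve, here is a
toy sketch of a backward liveness scan over one basic block.  It does not
use GCC's DF API or RTL; struct toy_insn, the 32-register bitmask model and
toy_recompute_notes are assumptions made up for this example.  The "dead"
mask plays the role of REG_DEAD and the "unused" mask the role of
REG_UNUSED, with a use not counted as a death when the same insn also sets
the register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy insn: bitmasks over at most 32 hard registers (hypothetical).  */
struct toy_insn
{
  uint32_t uses;    /* in: registers read by the insn */
  uint32_t sets;    /* in: registers written by the insn */
  uint32_t dead;    /* out: registers that die here (like REG_DEAD) */
  uint32_t unused;  /* out: registers set but never read (like REG_UNUSED) */
};

/* Walk the block backwards from its live-out set and fill in the
   dead/unused masks of every insn.  Return the live-in set.  */
static uint32_t
toy_recompute_notes (struct toy_insn *insns, size_t n, uint32_t live_out)
{
  uint32_t live = live_out;  /* registers live after the current insn */

  assert (insns != NULL || n == 0);

  for (size_t i = n; i-- > 0; )
    {
      struct toy_insn *insn = &insns[i];

      /* A set whose value is not live afterwards is unused.  */
      insn->unused = insn->sets & ~live;

      /* A use dies here if the register is not live afterwards and the
         insn does not set that same register simultaneously.  */
      insn->dead = insn->uses & ~live & ~insn->sets;

      /* Liveness before the insn: kill the sets, then add the uses.  */
      live = (live & ~insn->sets) | insn->uses;
    }

  return live;
}
```

In this model, the complement of the running live set at a given insn is
the set of registers that could be picked as scratch there, which is the
information that was previously derived from the notes.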