Hi Guys,
I recently discovered a bug in the current Redundant Extension
Elimination optimization. If the candidate extension instruction
increases the number of hard registers used, the pass does not check
to see if these extra registers are live between the definition and
the extension.
For example:
(insn 44 (set (reg:QI r11) (mem:QI (reg:HI r20)))
(insn 45 (set (reg:QI r10) (mem:QI (reg:HI r18)))
[...]
(insn 71 (set (reg:HI r14) (zero_extend:HI (reg:QI r11)))
[...]
(insn 88 (set (reg:HI r10) (zero_extend:HI (reg:QI r10)))
(This is on the RL78 target where HImode values occupy two hard
registers and QImode values only one. The bug however is generic, not
RL78 specific).
The REE pass transforms this into:
(insn 44 (set (reg:QI r11) (mem:QI (reg:HI r20)))
(insn 45 (set (reg:HI r10) (zero_extend:HI (mem:QI (reg:HI r18))))
[...]
(insn 71 (set (reg:HI r14) (zero_extend:HI (reg:QI r11)))
[...]
(insn 88 deleted)
Note how the new set at insn 45 clobbers the value loaded by insn 44
into r11.
The patch below is my attempt to fix this. It is not correct
however. I could not work out how to determine if a given hard
register is live at any point between two insns. So instead I used
the liveness information associated with the basic block containing
the definition. If one of the extended registers is live or becomes
live in this block, and this block is not the same block as the one
containing the extension insn, then I stop the optimization. This
works for the RL78 for all of the cases that I could find.
So ... comments please ?
Will gcc ever generate a single basic block that contains both the
definition and the extension instructions, but also where the extra
hard registers are used for another purpose as well ?
Tested with no regressions (or fixes) on an x86-pc-linux-gnu target.
Also tested with no regression and 7 fixes on an rl78-elf target.
Cheers
Nick
gcc/ChangeLog
2015-11-18 Nick Clifton <nickc@redhat.com>
* ree.c (regs_live_between): New function.
(add_removable_extension): If the extension uses more hard
registers than the definition then check that the extra hard
registers are not live between the definition and the
extension.
@@ -1080,6 +1123,30 @@
}
}
+ /* Fourth, if the extended version occupies more registers than
+ the original version then make sure that these extra registers
+ are not live between the definition and the extension. */
+ if (HARD_REGNO_NREGS (REGNO (dest), mode)
+ > HARD_REGNO_NREGS (REGNO (reg), GET_MODE (reg)))