This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Combiner fix for PR79910
- From: Jeff Law <law at redhat dot com>
- To: Bernd Schmidt <bschmidt at redhat dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 14 Mar 2017 17:03:41 -0600
- Subject: Re: Combiner fix for PR79910
- Authentication-results: sourceware.org; auth=none
- References: <15d38062-fd33-7de7-5bba-79f17adefc65@redhat.com>
On 03/10/2017 04:24 PM, Bernd Schmidt wrote:
In this PR, we have a few insns involved in two instruction combinations:
insn 16: set r100
insn 27: some calculation
insn 28: some calculation
insn 32: using r100
insn 33: using r100
insn 35: some calculation
Then we combine insns 27, 28 and 33, producing two output insns, As a
result, insn 28 (i2) now obtains a use of r100. But insn 32, which is
not involved in this combination, has the LOG_LINKS entry for that
register, and we don't know that we need to update it. As a result, the
second combination, involving regs 16, 32 and 35 (based on the remaining
LOG_LINK for r100), produces incorrect code, as we don't realize there's
a use of r100 before insn 32.
The following fixes it. Bootstrapped and tested on x86_64-linux, ok (on
all branches)?
Bernd
79910.diff
PR rtl-optimization/79910
* combine.c (record_used_regs): New static function.
(try_combine): Handle situations where there is an additional
instruction between I2 and I3 which needs to have a LOG_LINK
updated.
PR rtl-optimization/79910
* gcc.dg/torture/pr79910.c: New test.
What a nasty little problem.
I don't like that we have to build these bitmaps due to the compile-time
cost. Would it make sense to only build them when i0/i1 exist?
We don't do 4->3 combinations, just 4->2 and 3->2, so it's only the
i2pattern where we might need to conjure up a LOG_LINK, right?
We don't do 4->3 combinations, right? So we only have to care about
when there's going to be an newi2pat, right (3->2 or 4->2).
We don't ever create a newi1pat (for a 4->3 combination), right? So we
only have to worry about testing when there's a newi2pat, right?
Do you need to handle 4->3, in which case the new bits could be on newi1pat?
Index: gcc/combine.c
===================================================================
--- gcc/combine.c (revision 245685)
+++ gcc/combine.c (working copy)
@@ -2559,6 +2560,57 @@ can_split_parallel_of_n_reg_sets (rtx_in
return true;
}
@@ -4051,6 +4124,33 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
return 0;
}
+ auto_bitmap new_regs_in_i2;
+ if (newi2pat)
+ {
+ /* We need to discover situations where we introduce a use of a
+ register into I2, where none of the existing LOG_LINKS contain
+ a reference to it. This can happen if previously I3 referenced
+ the reg, and there is an additional use between I2 and I3. We
+ must remove the LOG_LINKS entry from that additional use and
+ distribute it along with our own ones. */
+ note_uses (&newi2pat, record_used_regs, (bitmap)new_regs_in_i2);
+ bitmap_and_compl_into (new_regs_in_i2, i2_regset);
+ bitmap_and_compl_into (new_regs_in_i2, links_regset);
+
+ /* Here, we first look for situations where a hard register use
+ moved, and just give up. This should happen approximately
+ never, and it's not worth it to deal with possibilities like
+ multi-word registers. Later, when fixing up LOG_LINKS, we
+ deal with the case where a pseudo use moved. */
+ if (prev_nonnote_insn (i3) != i2)
+ for (unsigned r = 0; r < FIRST_PSEUDO_REGISTER; r++)
+ if (bitmap_bit_p (new_regs_in_i2, r))
if (prev_nonnote_insn (i3) != i2
&& bitmap_first_set_bit (new_regs_in_i2) < FIRST_PSEUDO_REGISTER)
?
Overall it seems reasonable. If possible, let's avoid the calls to
create the bitmaps in case where we know we'll never be creating a new
use in i2.
I'm not wed to using bitmap_first_set_bit. It's clearer and may be
faster since you're not doing a ton of calls into bitmap_bit_p. The
bitmap_first_set_bit implementation under the hood finds the word within
the first bitmap element that is nonzero, then uses ctzl on that to get
its bit.
Jeff