This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][RTL-ree] PR rtl-optimization/68194: Restrict copy instruction in presence of conditional moves
- From: Kyrill Tkachov <kyrylo dot tkachov at arm dot com>
- To: Bernd Schmidt <bschmidt at redhat dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Cc: Jeff Law <law at redhat dot com>
- Date: Tue, 17 Nov 2015 10:17:29 +0000
- Subject: Re: [PATCH][RTL-ree] PR rtl-optimization/68194: Restrict copy instruction in presence of conditional moves
- Authentication-results: sourceware.org; auth=none
- References: <5649E333 dot 4090904 at arm dot com> <564A2339 dot 3030308 at redhat dot com> <564AEE94 dot 3070708 at arm dot com> <564AF80C dot 3010605 at arm dot com>
On 17/11/15 09:49, Kyrill Tkachov wrote:
On 17/11/15 09:08, Kyrill Tkachov wrote:
Hi Bernd,
On 16/11/15 18:40, Bernd Schmidt wrote:
On 11/16/2015 03:07 PM, Kyrill Tkachov wrote:
I've explained in the comments in the patch what's going on but the
short version is trying to change the destination of a defining insn
that feeds into an extend insn is not valid if the defining insn
doesn't feed directly into the extend insn. In the ree pass the only
way this can happen is if there is an intermediate conditional move
that the pass tries to handle in a special way. An equivalent fix
would have been to check on that path (when copy_needed in
combine_reaching_defs is true) that the state->copies_list vector
(that contains the conditional move insns feeding into the extend
insn) is empty.
I ran this through gdb, and I think I see what's going on. For reference, here's a comment from the source:
/* Considering transformation of
(set (reg1) (expression))
...
(set (reg2) (any_extend (reg1)))
into
(set (reg2) (any_extend (expression)))
(set (reg1) (reg2))
... */
I was thinking that another possible fix would be to also check !reg_used_between_p for reg1 to ensure it's not used. I'm thinking this might be a little clearer - what is your opinion?
Yes, I had considered that as well. It should be equivalent. I didn't use !reg_used_between_p because I thought
it'd be more expensive than checking reg_overlap_mentioned_p since we must iterate over a number of instructions
and call reg_overlap_mentioned_p on each one. But I suppose this case is rare enough that it wouldn't make any
measurable difference.
Actually, I tried it out. And while a check reg_used_between_p fixed the testcase, it caused code quality regressions
on aarch64. Seems it's too aggressive in restricting ree.
I'll have a closer look.
Ok, so the testcases that regress code-quality-wise on aarch64 look like this before ree:
(insn 48 57 49 7 (set (reg:SI 7 x7)
(zero_extend:SI (mem:QI (plus:DI (reg/f:DI 1 x1)
(const_int 2 [0x2]))))))
(insn 49 48 52 7 (set (reg/v:SI 2 x2)
(reg:SI 7 x7)))
(insn 52 49 53 7 (set (reg:DI 8 x8)
(zero_extend:DI (reg:SI 7 x7))))
ree wants to transform this into:
(insn 48 57 296 7 (set (reg:DI 8 x8)
(zero_extend:DI (mem:QI (plus:DI (reg/f:DI 1 x1)
(const_int 2 [0x2]))))))
(insn 296 48 49 7 (set (reg:DI 7 x7)
(reg:DI 8 x8)))
(insn 49 296 53 7 (set (reg/v:SI 2 x2)
(reg:SI 7 x7)))
which is valid, but we reject that with the reg_used_between_p check because x7 is used in
the intermediate insn 49. Note that no conditional move is present here.
So, I think that the crucial element here is that the destination of the def_insn should
feed directly into the extend, and that is what my original patch was testing for.
So, I'd like to keep my original proposed patch as is, although I think I'll add a couple of
testcases from the duplicate PRs to it for the testsuite.
What do you think?
Thanks,
Kyrill
Kyrill
Would you prefer to use !reg_used_between_p here?
The added comment could lead to some confusion since it's placed in front of an existing if statement that also tests a different condition. Also, if we go with your fix,
+ || !reg_overlap_mentioned_p (tmp_reg, SET_SRC (PATTERN (cand->insn))))
Shouldn't this really be !rtx_equal_p?
Maybe, will it behave the right way if the two regs have different modes or when subregs are involved?
(will we even hit such a case in this path?)
Thanks,
Kyrill
Bernd