This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Conditional negation elimination in tree-ssa-phiopt.c
- From: Richard Earnshaw <rearnsha at arm dot com>
- To: Kyrill Tkachov <kyrylo dot tkachov at arm dot com>
- Cc: Jeff Law <law at redhat dot com>, Richard Biener <richard dot guenther at gmail dot com>, "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Date: Wed, 13 Aug 2014 15:57:31 +0100
- Subject: Re: Conditional negation elimination in tree-ssa-phiopt.c
- Authentication-results: sourceware.org; auth=none
- References: <53E8C81B dot 1070303 at arm dot com> <53E91FD8 dot 2000005 at redhat dot com> <CAFiYyc0fk5GybcAZRujABGAssB5udHVfAabpH=pbZVi2N2HQyg at mail dot gmail dot com> <53E9ED14 dot 2080405 at arm dot com> <53EA2338 dot 7010004 at redhat dot com> <53EA2EAA dot 9060402 at arm dot com> <53EA2FD2 dot 3040700 at arm dot com> <53EA48FC dot 3090907 at arm dot com>
On 12/08/14 18:03, Kyrill Tkachov wrote:
>
> On 12/08/14 16:16, Kyrill Tkachov wrote:
>> On 12/08/14 16:11, Kyrill Tkachov wrote:
>>> On 12/08/14 15:22, Jeff Law wrote:
>>>> On 08/12/14 04:31, Kyrill Tkachov wrote:
>>>>> On 12/08/14 10:39, Richard Biener wrote:
>>>>>> On Mon, Aug 11, 2014 at 9:56 PM, Jeff Law <law@redhat.com> wrote:
>>>>>>> On 08/11/14 07:41, Kyrill Tkachov wrote:
>>>>>>>> I haven't been able to get combine to match the comparison+xor+neg+plus
>>>>>>>> RTL and it seems like it would be just a workaround to undo the
>>>>>>>> tree-level transformation.
>>>>>>> Yea, it'd just be a workaround, but it's probably the easiest way to
>>>>>>> deal
>>>>>>> with this problem. Can you describe in further detail why you
>>>>>>> weren't able
>>>>>>> to get this to work?
>>>>>> Too many instructions to combine I guess. You might want to add
>>>>>> intermediate "combine" insn-and-splits. If that's still a no-go then
>>>>>> read on.
>>>> My guess was too many insns as well.. But that's often solvable.
>>>>
>>>>> From the combine dump I can see that it tried to combine up to:
>>>>> (set (reg/i:SI 0 x0)
>>>>> (plus:SI (xor:SI (neg:SI (reg:SI 84 [ D.2565 ]))
>>>>> (reg:SI 73 [ D.2564 ]))
>>>>> (reg:SI 84 [ D.2565 ])))
>>>> And did it find a match for this? What happens if (just for testing
>>>> purposes), you create a pattern for this? Does combine then try
>>>> something even more complex, possibly getting your conditional negation?
>>> I managed to get combine to recognise this pattern:
>>> (set (match_operand:GPI 0 "register_operand" "=r")
>>> (plus:GPI (xor:GPI (neg:GPI (match_operand:GPI 1
>>> "register_operand" "r"))
>>> (match_operand:GPI 2 "register_operand" "r"))
>>> (match_dup 1)))
>>>
>>> Now what I need is for operand 1 to instead be a cc_reg comparison, but
>>> when I change operand 1 to this pattern:
>>>
>>> (set (match_operand:GPI 0 "register_operand" "=r")
>>> (plus:GPI (xor:GPI (neg:GPI (match_operator:GPI 1
>>> "aarch64_comparison_operator"
>>> [(match_operand:CC 2 "cc_register" "")
>>> (const_int 0)]))
>>> (match_operand:GPI 3 "register_operand" "r"))
>>> (match_dup 1)))
>>>
>>> This doesn't match. Is there any way to express this in a combineable
>>> pattern?
>> argh, Thunderbird enforced the 80 character limit...
>>
>> The pattern that matched was:
>> (set (match_operand:GPI 0 "register_operand" "=r")
>> (plus:GPI
>> (xor:GPI
>> (neg:GPI (match_operand:GPI 1 "register_operand" "r"))
>> (match_operand:GPI 2 "register_operand" "r"))
>> (match_dup 1)))
>
> If I match that to a define_and_split and split it into two sets:
> (set (reg1) (neg reg2))
> (set (plus (xor (reg1) (reg3))
> (reg2)))
>
> and then add a define_insn with the full pattern:
>
> (set (match_operand:GPI 0 "register_operand" "=r")
> (plus:GPI
> (xor:GPI
> (neg:GPI
> (match_operator:GPI 1 "aarch64_comparison_operator"
> [(match_operand:CC 2 "cc_register" "")
> (const_int 0)]))
> (match_operand:GPI 3 "register_operand" "r"))
> (match_dup 1)))
>
> Then this does manage to match the full thing that I want. But I had to
> add another define_split to break up the plus+xor Frankenstein's monster
> back into two separate insns for the cases where it picks up
> non-conditional-negate patterns.
>
> Does that seem reasonable or too much of a hack? Any plus+xor rtxs that
> get left after combine should be split up again relatively quickly in
> split1 and shouldn't inhibit further optimisation too badly, no?
>
>
The problem with the frankenmonster patterns is that they tend to
proliferate into the machine description, and before you know where you
are the back-end is full of them.
Furthermore, they are very sensitive to the greedy first-match nature of
combine: a better, later, combination is missed because a less good,
earlier, optimization matched. If the first insn in the sequence is
merged into an earlier instruction then you can end up with a junk
sequence that completely fails to simplify. That ends up with
super-frankenmonster patterns to deal with all the subcases and the
problems grow exponentially from there.
I really do think that the best solution would be to try and catch this
during expand if possible and generate the right pattern from the start;
then you don't risk combine failing to come to the rescue after several
intermediate transformations have taken place.
> Kyrill
>>
>> And the one that failed to combine (with operand 1 substituted for a
>> cc_reg comparison) was:
>>
>> (set (match_operand:GPI 0 "register_operand" "=r")
>> (plus:GPI
>> (xor:GPI
>> (neg:GPI
>> (match_operator:GPI 1 "aarch64_comparison_operator"
>> [(match_operand:CC_ZERO 2 "cc_register" "")
>> (const_int 0)]))
>> (match_operand:GPI 3 "register_operand" "r"))
>> (match_dup 1)))
>>
>>> Thanks,
>>> Kyrill
>>>
>>>
>>>>> On the other hand, I did manage to write a peephole2 that detected the
>>>>> sequence of compare+neg+xor+plus and transformed it into the
>>>>> if_then_else form that our current conditional negation pattern has,
>>>>> although I'm not sure how flexible this is.
>>>> Probably not very. We really should be looking at combine. In fact, I
>>>> would argue that we should be looking at combine regardless of whether
>>>> or not we twiddle expansion as humans or machine generated code could
>>>> look like this...
>>>>
>>>> jeff
>>>>
>>>>
>>>
>>
>>
>
>
>