[PATCH][RTL-ifcvt] Improve conditional select ops on immediates
Kyrill Tkachov
kyrylo.tkachov@arm.com
Mon Aug 3 14:12:00 GMT 2015
On 03/08/15 14:43, Kyrill Tkachov wrote:
> On 03/08/15 14:37, Uros Bizjak wrote:
>> On Mon, Aug 3, 2015 at 3:02 PM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
>>> On 03/08/15 13:33, Uros Bizjak wrote:
>>>> Hello!
>>>>
>>>>> 2015-07-30 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>>>>>
>>>>> * ifcvt.c (noce_try_store_flag_constants): Make logic of the case
>>>>> when diff == STORE_FLAG_VALUE or diff == -STORE_FLAG_VALUE more
>>>>> explicit. Prefer to add the flag whenever possible.
>>>>> (noce_process_if_block): Try noce_try_store_flag_constants before
>>>>> noce_try_cmove.
>>>>>
>>>>> 2015-07-30 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>>>>>
>>>>> * gcc.target/aarch64/csel_bfx_1.c: New test.
>>>>> * gcc.target/aarch64/csel_imms_inc_1.c: Likewise.
>>>> This patch regressed following tests on x86_64:
>>>>
>>>> FAIL: gcc.target/i386/cmov2.c scan-assembler sbb
>>>> FAIL: gcc.target/i386/cmov3.c scan-assembler cmov[^3]
>>> Sorry for that :(
>>> I could have sworn they passed for me during my run...
>>>
>>>> While cmov3 case is questionable by itself in light of PR56309 [1],
>>>> the cnov2 case regressed from:
>>>>
>>>> cmpl %edi, %esi
>>>> sbbl %eax, %eax
>>>> andl $-10, %eax
>>>> addl $5, %eax
>>>> ret
>>>>
>>>> to:
>>>>
>>>> xorl %eax, %eax
>>>> cmpl %esi, %edi
>>>> setbe %al
>>>> negl %eax
>>>> andl $10, %eax
>>>> subl $5, %eax
>>>> ret
>>>>
>>>> Please note that sbb (aka {*x86_movsicc_0_m1} ) is not generated
>>>> anymore. Non-QImode setcc instructions are problematic on x86, since
>>>> they need to be zero-extended to their full length.
>> IMO, we can check if the target is able to generate insn that
>> implements conditional move between 0 and -1 for certain comparison
>> mode, like:
>>
>> #(insn:TI 27 26 28 2 (parallel [
>> # (set (reg:SI 0 ax [orig:87 D.1845 ] [87])
>> # (if_then_else:SI (ltu:SI (reg:CC 17 flags)
>> # (const_int 0 [0]))
>> # (const_int -1 [0xffffffffffffffff])
>> # (const_int 0 [0])))
>> # (clobber (reg:CC 17 flags))
>> # ]) cmov2.c:9 947 {*x86_movsicc_0_m1}
>> # (expr_list:REG_DEAD (reg:CC 17 flags)
>> # (expr_list:REG_UNUSED (reg:CC 17 flags)
>> # (nil))))
>> sbbl %eax, %eax # 27 *x86_movsicc_0_m1 [length = 2]
>>
>> this is the key insn, and as shown above, further asm sequence is
>> similar between patched and unpatched compiler.
> Hmm yes.
> We have a HAVE_conditional_move check, but that's not fine grained enough.
> How about trying an emit_conditional_move and checking that the result is a
> single insn?
Hmm, looking deeper into it, I see that:
/* if (test) x = a; else x = b;
=> x = (-(test != 0) & (b - a)) + a; */
is exactly the transformation that:
cmpl %edi, %esi
sbbl %eax, %eax
andl $-10, %eax
addl $5, %eax
is supposed to represent, right? It seems that noce_try_store_flag_constants is just
not emitting the optimal rtx forms for x86
So, if before the patch the code in noce_try_store_flag_constants was never being called
what performed this transformation? Turns out it's the conditional move expander in
emit_conditional_move that ends up calling the ix86_expand_int_movcc that performs lots of
these complex transformations, including this one in an x86-specific way.
I suppose that's why noce_try_store_flag_constants was tried after noce_try_cmove.
Because, it is expected that noce_try_store_flag_constants will call the conditional move
expander and the target-specific conditional move expander is expected to do all the complex
transformations beneficial to the target. I suppose the comment "Try only simple constants and registers here"
above noce_try_cmove is somewhat misleading then.
I see a couple of avenues here. We try both noce_try_cmove and noce_try_store_flag_constants, compare their
seq_cost and pick the cheapest, which sounds somewhat ugly to me, or in noce_try_store_flag_constants we
try to be more fine-grained and try to restrict some of the transformations, but on what criteria?
Kyrill
>
> Kyrill
>
>
>> Uros.
>>
More information about the Gcc-patches
mailing list