[PATCH][10/n] Remove GENERIC stmt combining from SCCVN
Kyrill Tkachov
kyrylo.tkachov@arm.com
Mon Jul 6 14:56:00 GMT 2015
On 06/07/15 15:46, Richard Biener wrote:
> On Mon, 6 Jul 2015, Kyrill Tkachov wrote:
>
>> Hi Richard,
>>
>> On 01/07/15 14:03, Richard Biener wrote:
>>> This merges the complete comparison patterns from the match-and-simplify
>>> branch, leaving incomplete implementations of fold-const.c code alone.
>>>
>>> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
>>>
>>> Richard.
>>>
>>> 2015-07-01 Richard Biener <rguenther@suse.de>
>>>
>>> * fold-const.c (fold_comparison): Move X - Y CMP 0 -> X CMP Y,
>>> X * C1 CMP 0 -> X CMP 0, X CMP X, ~X CMP ~Y -> Y CMP X and
>>> ~X CMP C -> X CMP' ~C to ...
>>> * match.pd: ... patterns here.
>>>
>>>
>>> +/* Transform comparisons of the form X - Y CMP 0 to X CMP Y.
>>> + ??? The transformation is valid for the other operators if overflow
>>> + is undefined for the type, but performing it here badly interacts
>>> + with the transformation in fold_cond_expr_with_comparison which
>>> + attempts to synthesize ABS_EXPR. */
>>> +(for cmp (eq ne)
>>> + (simplify
>>> + (cmp (minus @0 @1) integer_zerop)
>>> + (cmp @0 @1)))
>> This broke some tests on aarch64:
>> FAIL: gcc.target/aarch64/subs.c scan-assembler subs\tw[0-9]
>> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, w[0-9]+
>> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, w[0-9]+, lsl 3
>> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+, x[0-9]+
>> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+, x[0-9]+, lsl 3
>>
>> To take subs.c as an example, there's something odd going on:
>> the X - Y CMP 0 -> X CMP Y transformation gets triggered only for the
>> int case, not the long long case, yet the int case (foo) is where the
>> rtl ends up being:
>>
>> (insn 9 4 10 2 (set (reg/v:SI 74 [ l ])
>>         (minus:SI (reg/v:SI 76 [ x ])
>>             (reg/v:SI 77 [ y ]))) subs.c:9 254 {subsi3}
>>      (nil))
>> (insn 10 9 11 2 (set (reg:CC 66 cc)
>>         (compare:CC (reg/v:SI 76 [ x ])
>>             (reg/v:SI 77 [ y ])))
>>
>> instead of the previous:
>>
>> (insn 9 4 10 2 (set (reg/v:SI 74 [ l ])
>>         (minus:SI (reg/v:SI 76 [ x ])
>>             (reg/v:SI 77 [ y ]))) subs.c:9 254 {subsi3}
>>
>> (insn 10 9 11 2 (set (reg:CC 66 cc)
>>         (compare:CC (reg/v:SI 74 [ l ])
>>             (const_int 0 [0])))
>>
>>
>> so the transformed X CMP Y does not get matched by combine into a subs.
>> Was the transformation before the patch in fold-const.c not getting triggered?
> It was prevented from getting triggered by restricting the transform
> to single uses (a fix I am testing right now).
>
> Note that in case you'd write
>
> int l = x - y;
> if (l == 0)
> return 5;
>
> /* { dg-final { scan-assembler "subs\tw\[0-9\]" } } */
> z = x - y ;
>
> the simplification will happen anyway, because the redundant
> computation of z has not yet been eliminated (a reason why such
> single-use checks are not 100% the "correct" thing to do).
Ok, thanks. Andreas pointed out PR 66739 to me. I had not noticed it.
Sorry for the noise.
Kyrill
>
>> In aarch64 we have patterns to match:
>> [(set (reg:CC_NZ CC_REGNUM)
>> (compare:CC_NZ (minus:GPI (match_operand:GPI 1 "register_operand" "r")
>> (match_operand:GPI 2 "register_operand" "r"))
>> (const_int 0)))
>> (set (match_operand:GPI 0 "register_operand" "=r")
>> (minus:GPI (match_dup 1) (match_dup 2)))]
>>
>>
>> Should we add a pattern to match:
>> [(set (reg:CC CC_REGNUM)
>>       (compare:CC (match_operand:GPI 1 "register_operand" "r")
>>                   (match_operand:GPI 2 "register_operand" "r")))
>>  (set (match_operand:GPI 0 "register_operand" "=r")
>>       (minus:GPI (match_dup 1) (match_dup 2)))]
>>
>> as well?
> No, I don't think so.
>
> Richard.
>
>> Kyrill
>>
>>> +
>>> +/* Transform comparisons of the form X * C1 CMP 0 to X CMP 0 in the
>>> + signed arithmetic case. That form is created by the compiler
>>> + often enough for folding it to be of value. One example is in
>>> + computing loop trip counts after Operator Strength Reduction. */
>>> +(for cmp (tcc_comparison)
>>> + scmp (swapped_tcc_comparison)
>>> + (simplify
>>> + (cmp (mult @0 INTEGER_CST@1) integer_zerop@2)
>>> + /* Handle unfolded multiplication by zero. */
>>> + (if (integer_zerop (@1))
>>> + (cmp @1 @2))
>>> + (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
>>> + && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
>>> + /* If @1 is negative we swap the sense of the comparison. */
>>> + (if (tree_int_cst_sgn (@1) < 0)
>>> + (scmp @0 @2))
>>> + (cmp @0 @2))))
>>> +
>>> +/* Simplify comparison of something with itself. For IEEE
>>> + floating-point, we can only do some of these simplifications. */
>>> +(simplify
>>> + (eq @0 @0)
>>> + (if (! FLOAT_TYPE_P (TREE_TYPE (@0))
>>> + || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0))))
>>> + { constant_boolean_node (true, type); }))
>>> +(for cmp (ge le)
>>> + (simplify
>>> + (cmp @0 @0)
>>> + (eq @0 @0)))
>>> +(for cmp (ne gt lt)
>>> + (simplify
>>> + (cmp @0 @0)
>>> + (if (cmp != NE_EXPR
>>> + || ! FLOAT_TYPE_P (TREE_TYPE (@0))
>>> + || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0))))
>>> + { constant_boolean_node (false, type); })))
>>> +
>>> +/* Fold ~X op ~Y as Y op X. */
>>> +(for cmp (tcc_comparison)
>>> + (simplify
>>> + (cmp (bit_not @0) (bit_not @1))
>>> + (cmp @1 @0)))
>>> +
>>> +/* Fold ~X op C as X op' ~C, where op' is the swapped comparison. */
>>> +(for cmp (tcc_comparison)
>>> + scmp (swapped_tcc_comparison)
>>> + (simplify
>>> + (cmp (bit_not @0) CONSTANT_CLASS_P@1)
>>> + (if (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST)
>>> + (scmp @0 (bit_not @1)))))
>>> +
>>> +
>>> /* Unordered tests if either argument is a NaN. */
>>> (simplify
>>> (bit_ior (unordered @0 @0) (unordered @1 @1))
>>>
>>