This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][AArch64] PR target/84164: Relax predicate in *aarch64_<optab>_reg_di3_mask2


On 02/02/18 15:43, Kyrill Tkachov wrote:
> Hi Richard,
> 
> On 02/02/18 15:25, Richard Earnshaw (lists) wrote:
>> On 02/02/18 15:10, Kyrill Tkachov wrote:
>>> Hi all,
>>>
>>> In this [8 Regression] PR we ICE because we can't recognise the insn:
>>> (insn 59 58 43 7 (set (reg:DI 124)
>>>          (rotatert:DI (reg:DI 125 [ c ])
>>>              (subreg:QI (and:SI (reg:SI 128)
>>>                      (const_int 65535 [0xffff])) 0)))
>>
>> Aren't we heading off down the wrong path here?
>>
>> (subreg:QI (and:SI (reg:SI 128) (const_int 65535 [0xffff])) 0))
>>
>> can be simplified to
>>
>> (subreg:QI (reg:SI 128) 0)
>>
>> since the AND operation is redundant as we're only looking at the bottom
>> 8 bits.
> 
> I have tried implementing such a transformation in combine [1]
> but it was not clear that it was universally beneficial.
> See the linked thread and PR 70119 for the discussion (the thread
> continues into the next month).

Is that really the same thing?  The example there was using a mask that
was narrower than the subreg and thus not redundant.  The case here is
where the mask is completely redundant because we are only looking at
the bottom 8 bits of the result (which are not changed by the AND
operation).

R.

> 
> This patch fixes the existing patterns, such as they are,
> with the explicit subreg matching and associated warts.
> 
> We can resurrect the combine simplification discussion at some point
> but for the GCC 8 it would be safer to fix the existing pattern.
> 
> Thanks,
> Kyrill
> 
> [1] https://gcc.gnu.org/ml/gcc/2016-02/msg00357.html
> 
>> R.
>>
>>> This was created by the *aarch64_reg_<mode>3_neg_mask2 splitter.
>>> The point of these splitters and patterns is to eliminate redundant
>>> masking of the shift (or rotate in this case) amount when shifting
>>> by a variable amount.  For example in AND x3, x3, 0xffff ; ROR x1,
>>> x2, x3
>>> we can eliminate the AND because the ROR instruction implicitly "MOD"s
>>> the shift amount in X3 by 64. So this pattern is supposed to match the
>>> following:
>>>
>>> (define_insn "*aarch64_<optab>_reg_di3_mask2"
>>>    [(set (match_operand:DI 0 "register_operand" "=r")
>>>      (SHIFT:DI
>>>        (match_operand:DI 1 "register_operand" "r")
>>>        (match_operator 4 "subreg_lowpart_operator"
>>>         [(and:SI (match_operand:SI 2 "register_operand" "r")
>>>              (match_operand 3 "aarch64_shift_imm_di" "Usd"))])))]
>>>
>>> The rotation amount mask is supposed to be operand 3 but the predicate
>>> for it here
>>> is aarch64_shift_imm_di which only allows values in [0, 63], whereas we
>>> want to allow
>>> any value that doesn't touch the bottom GET_MODE_BITSIZE (DImode) bits,
>>> which is what
>>> the pattern predicate tests. So the predicate on operands 3 is too
>>> strict.
>>>
>>> This patch just makes it const_int_operand since the pattern predicates
>>> has the correct
>>> validation for its values. This is in line with what the
>>> *aarch64_reg_<mode>3_neg_mask2
>>> and *aarch64_reg_<mode>3_minus_mask splitters accept (and they are the
>>> ones that generate
>>> this insn pattern).
>>>
>>> Bootstrapped and tested on aarch64-none-linux-gnu.
>>>
>>> Ok for trunk?
>>> Thanks,
>>> Kyrill
>>>
>>> 2018-02-02  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>>
>>>      PR target/84164
>>>      * config/aarch64/aarch64.md (*aarch64_<optab>_reg_di3_mask2):
>>>      Use const_int_operand predicate for operand 3.
>>>
>>> 2018-02-02  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>>
>>>      PR target/84164
>>>      * gcc.c-torture/compile/pr84164.c: New test.
>>>
>>> aarch64-mask.patch
>>>
>>>
>>> diff --git a/gcc/config/aarch64/aarch64.md
>>> b/gcc/config/aarch64/aarch64.md
>>> index
>>> 04b2d203fa168bebcf6f93a13e3bd67a5998935a..eef0d1a780dd1c886e1bebb9992c552fb9d5b57c
>>> 100644
>>> --- a/gcc/config/aarch64/aarch64.md
>>> +++ b/gcc/config/aarch64/aarch64.md
>>> @@ -4358,8 +4358,8 @@ (define_insn "*aarch64_<optab>_reg_di3_mask2"
>>>         (match_operand:DI 1 "register_operand" "r")
>>>         (match_operator 4 "subreg_lowpart_operator"
>>>          [(and:SI (match_operand:SI 2 "register_operand" "r")
>>> -             (match_operand 3 "aarch64_shift_imm_di" "Usd"))])))]
>>> -  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (DImode)-1)) == 0)"
>>> +            (match_operand 3 "const_int_operand" "n"))])))]
>>> +  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (DImode) - 1)) == 0)"
>>>   {
>>>     rtx xop[3];
>>>     xop[0] = operands[0];
>>> diff --git a/gcc/testsuite/gcc.c-torture/compile/pr84164.c
>>> b/gcc/testsuite/gcc.c-torture/compile/pr84164.c
>>> new file mode 100644
>>> index
>>> 0000000000000000000000000000000000000000..d663fe5d6649e495f3e956e6a3bc938236a4bf91
>>>
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.c-torture/compile/pr84164.c
>>> @@ -0,0 +1,17 @@
>>> +/* PR target/84164.  */
>>> +
>>> +int b, d;
>>> +unsigned long c;
>>> +
>>> +static inline __attribute__ ((always_inline)) void
>>> +bar (int e)
>>> +{
>>> +  e += __builtin_sub_overflow_p (b, d, c);
>>> +  c = c << ((short) ~e & 3) | c >> (-((short) ~e & 3) & 63);
>>> +}
>>> +
>>> +int
>>> +foo (void)
>>> +{
>>> +  bar (~1);
>>> +}
>>>
> 


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]