[PATCH, v2] Recognize a missed usage of a sbfiz instruction

Tue May 15 08:17:00 GMT 2018

Hi Luis,

On 14/05/18 21:41, Luis Machado wrote:
> On 05/11/2018 06:46 AM, Kyrill Tkachov wrote:
>> Hi Luis,
>>
>> On 10/05/18 11:31, Luis Machado wrote:
>>>
>>> On 05/09/2018 10:44 AM, Kyrill Tkachov wrote:
>>>>
>>>> On 09/05/18 13:30, Luis Machado wrote:
>>>>> Hi Kyrill,
>>>>>
>>>>> On 05/08/2018 11:09 AM, Kyrill Tkachov wrote:
>>>>>> Hi Luis,
>>>>>>
>>>>>> On 07/05/18 15:28, Luis Machado wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 02/08/2018 10:45 AM, Luis Machado wrote:
>>>>>>>> Hi Kyrill,
>>>>>>>>
>>>>>>>> On 02/08/2018 09:48 AM, Kyrill Tkachov wrote:
>>>>>>>>> Hi Luis,
>>>>>>>>>
>>>>>>>>> On 06/02/18 15:04, Luis Machado wrote:
>>>>>>>>>> Thanks for the feedback Kyrill. I've adjusted the v2 patch based on your
>>>>>>>>>> suggestions and re-tested the changes. Everything is still sane.
>>>>>>>>>
>>>>>>>>> Thanks! This looks pretty good to me.
>>>>>>>>>
>>>>>>>>>> Since this is ARM-specific and fairly specific, i wonder if it would be
>>>>>>>>>> reasonable to consider it for inclusion at the current stage.
>>>>>>>>>
>>>>>>>>> It is true that the target maintainers can choose to take
>>>>>>>>> such patches at any stage. However, any patch at this stage increases
>>>>>>>>> the risk of regressions being introduced and these regressions
>>>>>>>>> can come bite us in ways that are very hard to anticipate.
>>>>>>>>>
>>>>>>>>> Have a look at some of the bugs in bugzilla (or a quick scan of the gcc-bugs list)
>>>>>>>>> for examples of the ways that things can go wrong with any of the myriad of GCC components
>>>>>>>>> and the unexpected ways in which they can interact.
>>>>>>>>>
>>>>>>>>> For example, I am now working on what I initially thought was a one-liner fix for
>>>>>>>>> PR 84164 but it has expanded into a 3-patch series with a midend component and
>>>>>>>>> target-specific changes for 2 ports.
>>>>>>>>>
>>>>>>>>> These issues are very hard to catch during review and normal testing, and can sometimes take months of deep testing by
>>>>>>>>> fuzzing and massive codebase rebuilds to expose, so the closer the commit is to a release
>>>>>>>>> the higher the risk is that an obscure edge case will be unnoticed and unfixed in the release.
>>>>>>>>>
>>>>>>>>> So the priority at this stage is to minimise the risk of destabilising the codebase,
>>>>>>>>> as opposed to taking in new features and desirable performance improvements (like your patch!)
>>>>>>>>>
>>>>>>>>> That is the rationale for delaying committing such changes until the start
>>>>>>>>> of GCC 9 development. But again, this is up to the aarch64 maintainers.
>>>>>>>>> I'm sure the patch will be a perfectly fine and desirable commit for GCC 9.
>>>>>>>>> This is just my perspective as maintainer of the arm port.
>>>>>>>>
>>>>>>>> Thanks. Your explanation makes the situation pretty clear and it sounds very reasonable. I'll put the patch on hold until development is open again.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Luis
>>>>>>>
>>>>>>> With GCC 9 development open, i take it this patch is worth considering again?
>>>>>>>
>>>>>>
>>>>>> Yes, I believe the latest version is at:
>>>>>> https://gcc.gnu.org/ml/gcc-patches/2018-02/msg00239.html ?
>>>>>>
>>>>>> +(define_insn "*ashift<mode>_extv_bfiz"
>>>>>> +  [(set (match_operand:GPI 0 "register_operand" "=r")
>>>>>> +    (ashift:GPI (sign_extract:GPI (match_operand:GPI 1 "register_operand" "r")
>>>>>> +                      (match_operand 2 "aarch64_simd_shift_imm_offset_<mode>" "n")
>>>>>> +                      (match_operand 3 "aarch64_simd_shift_imm_<mode>" "n"))
>>>>>> +             (match_operand 4 "aarch64_simd_shift_imm_<mode>" "n")))]
>>>>>> +  ""
>>>>>> +  "sbfiz\\t%<w>0, %<w>1, %4, %2"
>>>>>> +  [(set_attr "type" "bfx")]
>>>>>> +)
>>>>>> +
>>>>>
>>>>> Indeed.
>>>>>
>>>>>>
>>>>>> Can you give a bit more information about what are the values for operands 2,3 and 4 in your example testcases?
>>>>>
>>>>> For sbfiz32 we have 3, 0 and 19 respectively. For sbfiz64 we have 6, 0 and 38.
>>>>>
>>>>>> I'm trying to understand why the value of operand 3 (the bit position the sign-extract starts from) doesn't get validated
>>>>>> in any way and doesn't play any role in the output...
>>>>>
>>>>> This may be an oversight. It seems operand 3 will always be 0 in this particular case i'm covering. It starts from 0, gets shifted x bits to the left and then y < x bits to the right). The operation is essentially an ashift of the bitfield followed by a sign-extension of the msb of the bitfield being extracted.
>>>>>
>>>>> Having a non-zero operand 3 from RTL means the shift amount won't translate directly to operand 3 of sbfiz (the position). Then we'd need to do a calculation where we take into account operand 3 from RTL.
>>>>>
>>>>> I'm wondering when such a RTL pattern, with a non-zero operand 3, would be generated though.
>>>>
>>>> I think it's best to enforce that operand 3 is a zero. Maybe just match const_int 0 here directly.
>>>> Better safe than sorry with these things.
>>>
>>> Indeed. I've updated the original patch with that change now.
>>>
>>> Bootstrapped and regtested on aarch64-linux.
>>>
>>
>> I think you might also want to enforce that the sum of operands 2 and 3 doesn't exceed the width of the mode
>> (see other patterns in aarch64.md that generate bfx-style instructions for similar restrictions).
>
> Updated now in the attached patch. Everything still sane with bootstrap and testsuite on aarch64-linux.
>

Thanks!
This looks good to me now.
You'll need a final approval from a maintainer.

Kyrill

>>
>> Otherwise the patch looks good to me (it will still need approval from a maintainer).
>>
>> For the future I think there's also an opportunity to match the SImode version of this pattern that zero-extends to DImode,
>> making use of the implicit zero-extension that writing to a W-dreg provides. But that would be a separate patch.
>
> Indeed. I'll have a TODO for this.
>
>>
>> Thanks, and sorry for the respins,
>
> Thanks for reviewing it!
>
> Luis