[PATCH v3] arm: Low overhead loop handle long range branches [PR98931]

Richard Earnshaw (lists) Richard.Earnshaw@arm.com
Thu Feb 11 14:29:58 GMT 2021


On 10/02/2021 17:44, Andrea Corallo via Gcc-patches wrote:
> Andrea Corallo via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> 
>> "Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com> writes:
>>
>>> On 09/02/2021 16:27, Andrea Corallo via Gcc-patches wrote:
>>>> Jakub Jelinek <jakub@redhat.com> writes:
>>>>
>>>>> On Tue, Feb 09, 2021 at 03:09:43PM +0100, Jakub Jelinek via Gcc-patches wrote:
>>>>>>>    "TARGET_32BIT && TARGET_HAVE_LOB"
>>>>>>> -  "le\t%|lr, %l0")
>>>>>>> +  "*
>>>>>>> +  if (get_attr_length (insn) == 4)
>>>>>>> +    return \"le\\t%|lr, %l0\";
>>>>>>> +  else
>>>>>>> +    return \"subs\\t%|lr, #1\;bne\\t%l0\";
>>>>>>> +  "
>>>>>>
>>>>>> Why not
>>>>>> {
>>>>>>   if (get_attr_length (insn) == 4)
>>>>>>     return "le\t%|lr, %l0";
>>>>>>   else
>>>>>>     return "subs\t%|lr, #1;bne\t%l0";
>>>>>> }
>>>>>> instead?  Seems the arm backend uses "*..." more than the more modern {},
>>>>>> but one needs to backslash prefix a lot which makes it less readable?
>>>>>
>>>>> Where "more modern" is introduced 19.5 years ago ;)
>>>>>
>>>>> 	Jakub
>>>>
>>>> I guess we really like traditions :)
>>>>
>>>> Attached second version addressing this.
>>>>
>>>> Thanks
>>>>
>>>>   Andrea
>>>>
>>>
>>> You're missing a clobber of the condition codes in the RTL.  This wasn't
>>> needed before, but is now.
>>>
>>> R.
>>
>> Hi Richard,
>>
>> thanks for reviewing, I guess this is going to be a good learning moment
>> for me.
>>
>> What we originally expand is:
>>
>> (insn 2396 2360 2397 3 (parallel [
>>             (set (reg:CC_NZ 100 cc)
>>                 (compare:CC_NZ (plus:SI (reg:SI 14 lr)
>>                         (const_int -1 [0xffffffffffffffff]))
>>                     (const_int 0 [0])))
>>             (set (reg:SI 14 lr)
>>                 (plus:SI (reg:SI 14 lr)
>>                     (const_int -1 [0xffffffffffffffff])))
>>         ]) "p1.c":4:21 -1
>>      (nil))
>> (jump_insn 2397 2396 2365 3 (set (pc)
>>         (if_then_else (ne (reg:CC_NZ 100 cc)
>>                 (const_int 0 [0]))
>>             (label_ref:SI 2361)
>>             (pc))) "p1.c":4:21 273 {arm_cond_branch}
>>      (expr_list:REG_DEAD (reg:CC_NZ 100 cc)
>>         (int_list:REG_BR_PROB 1062895996 (nil)))
>>  -> 2361)
>>
>> Combine recognizing cc:CC_NZ as a dead reg and rewriting the two insns
>> as:
>>
>> (jump_insn 2397 2396 2365 3 (parallel [
>>             (set (pc)
>>                 (if_then_else (ne (reg:SI 14 lr)
>>                         (const_int 1 [0x1]))
>>                     (label_ref:SI 2361)
>>                     (pc)))
>>             (set (reg:SI 14 lr)
>>                 (plus:SI (reg:SI 14 lr)
>>                     (const_int -1 [0xffffffffffffffff])))
>>         ]) "p1.c":4:21 1047 {*doloop_end_internal}
>>      (int_list:REG_BR_PROB 1062895996 (nil))
>>  -> 2361)
>>
>> I originally thought that because the write of reg:CC_NZ is explicit in
>> the first pattern we expand this was sufficient, but I now understand
>> I'm wrong and combine should produce a pattern still expressing this.
>> Now the question is how to do that.
>>
>> If I add the clobber '(clobber (reg:CC CC_REGNUM))' inside the parallel
>> of *doloop_end_internal as last element of the vector we ICE in
>> 'add_clobbers' called during combine, apparently 'add_clobbers' does not
>> handle the insn_code_number.
>>
>> If I add it as second element of the parallel combine is not combining
>> the two insns.
>>
>> If I place the clobber outside the parallel as a second element of the
>> insn vector combine is crashing in 'recog_for_combine_1'.
>>
>> So the question is probably: where should the clobber be positioned
>> canonically to have this working?
>>
>> Thanks!
>>
>>   Andrea
> 
> Righ, I've been explained by a knowledgeable colleague that the
> 'parallel' is implicit in the 'define_insn' and there's no need to
> express it (interestgly this is confusing the code generating
> 'add_clobbers').
> 
> The attached patch rewrites the pattern as such and adds the missing
> clobber.
> 
> Tests are running, okay for trunk when done with these?
> 
> Regards
> 
>   Andrea
> 
+  [(set (attr "length")
+        (if_then_else
+            (lt (minus (pc) (match_dup 0)) (const_int 1024))
+	    (const_int 4)
+	    (const_int 6)))
+   (set_attr "type" "branch")])

Shouldn't that be using "ltu" rather than "lt", so that if, for some
reason, the branch has been retargeted to come after the branch, then
the test will still fail and we'll get the comparison variant back.

Otherwise OK.

R.



More information about the Gcc-patches mailing list