[PATCH v2] arm: Low overhead loop handle long range branches [PR98931]
Andrea Corallo
andrea.corallo@arm.com
Wed Feb 10 10:13:32 GMT 2021
"Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com> writes:
> On 09/02/2021 16:27, Andrea Corallo via Gcc-patches wrote:
>> Jakub Jelinek <jakub@redhat.com> writes:
>>
>>> On Tue, Feb 09, 2021 at 03:09:43PM +0100, Jakub Jelinek via Gcc-patches wrote:
>>>>> "TARGET_32BIT && TARGET_HAVE_LOB"
>>>>> - "le\t%|lr, %l0")
>>>>> + "*
>>>>> + if (get_attr_length (insn) == 4)
>>>>> + return \"le\\t%|lr, %l0\";
>>>>> + else
>>>>> + return \"subs\\t%|lr, #1\;bne\\t%l0\";
>>>>> + "
>>>>
>>>> Why not
>>>> {
>>>> if (get_attr_length (insn) == 4)
>>>> return "le\t%|lr, %l0";
>>>> else
>>>> return "subs\t%|lr, #1;bne\t%l0";
>>>> }
>>>> instead? Seems the arm backend uses "*..." more than the more modern {},
>>>> but one needs to backslash prefix a lot which makes it less readable?
>>>
>>> Where "more modern" is introduced 19.5 years ago ;)
>>>
>>> Jakub
>>
>> I guess we really like traditions :)
>>
>> Attached second version addressing this.
>>
>> Thanks
>>
>> Andrea
>>
>
> You're missing a clobber of the condition codes in the RTL. This wasn't
> needed before, but is now.
>
> R.
Hi Richard,
thanks for reviewing, I guess this is going to be a good learning moment
for me.
What we originally expand is:
(insn 2396 2360 2397 3 (parallel [
(set (reg:CC_NZ 100 cc)
(compare:CC_NZ (plus:SI (reg:SI 14 lr)
(const_int -1 [0xffffffffffffffff]))
(const_int 0 [0])))
(set (reg:SI 14 lr)
(plus:SI (reg:SI 14 lr)
(const_int -1 [0xffffffffffffffff])))
]) "p1.c":4:21 -1
(nil))
(jump_insn 2397 2396 2365 3 (set (pc)
(if_then_else (ne (reg:CC_NZ 100 cc)
(const_int 0 [0]))
(label_ref:SI 2361)
(pc))) "p1.c":4:21 273 {arm_cond_branch}
(expr_list:REG_DEAD (reg:CC_NZ 100 cc)
(int_list:REG_BR_PROB 1062895996 (nil)))
-> 2361)
Combine recognizing cc:CC_NZ as a dead reg and rewriting the two insns
as:
(jump_insn 2397 2396 2365 3 (parallel [
(set (pc)
(if_then_else (ne (reg:SI 14 lr)
(const_int 1 [0x1]))
(label_ref:SI 2361)
(pc)))
(set (reg:SI 14 lr)
(plus:SI (reg:SI 14 lr)
(const_int -1 [0xffffffffffffffff])))
]) "p1.c":4:21 1047 {*doloop_end_internal}
(int_list:REG_BR_PROB 1062895996 (nil))
-> 2361)
I originally thought that because the write of reg:CC_NZ is explicit in
the first pattern we expand this was sufficient, but I now understand
I'm wrong and combine should produce a pattern still expressing this.
Now the question is how to do that.
If I add the clobber '(clobber (reg:CC CC_REGNUM))' inside the parallel
of *doloop_end_internal as last element of the vector we ICE in
'add_clobbers' called during combine, apparently 'add_clobbers' does not
handle the insn_code_number.
If I add it as second element of the parallel combine is not combining
the two insns.
If I place the clobber outside the parallel as a second element of the
insn vector combine is crashing in 'recog_for_combine_1'.
So the question is probably: where should the clobber be positioned
canonically to have this working?
Thanks!
Andrea
More information about the Gcc-patches
mailing list