This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][ARM] PR target/71436: Restrict *load_multiple pattern till after LRA
- From: "Richard Earnshaw (lists)" <Richard dot Earnshaw at arm dot com>
- To: Kyrill Tkachov <kyrylo dot tkachov at foss dot arm dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Cc: Ramana Radhakrishnan <ramana dot radhakrishnan at arm dot com>
- Date: Thu, 15 Dec 2016 10:00:14 +0000
- Subject: Re: [PATCH][ARM] PR target/71436: Restrict *load_multiple pattern till after LRA
- Authentication-results: sourceware.org; auth=none
- References: <583F02B0.3030406@foss.arm.com> <325eb8f2-a5e0-62af-be0c-0d85a53812a4@arm.com>
On 15/12/16 09:55, Richard Earnshaw (lists) wrote:
> On 30/11/16 16:47, Kyrill Tkachov wrote:
>> Hi all,
>>
>> In this awkward ICE we have a *load_multiple pattern that is being
>> transformed in reload from:
>> (insn 55 67 151 3 (parallel [
>> (set (reg:SI 0 r0)
>> (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>> (set (reg:SI 158 [ c+4 ])
>> (mem/u/c:SI (plus:SI (reg/f:SI 147)
>> (const_int 4 [0x4])) [2 c+4 S4 A32]))
>> ]) arm-crash.c:25 393 {*load_multiple}
>> (expr_list:REG_UNUSED (reg:SI 0 r0)
>> (nil)))
>>
>>
>> into the invalid:
>> (insn 55 67 70 3 (parallel [
>> (set (reg:SI 0 r0)
>> (mem/u/c:SI (reg/f:SI 5 r5 [147]) [2 c+0 S4 A32]))
>> (set (mem/c:SI (plus:SI (reg/f:SI 102 sfp)
>> (const_int -4 [0xfffffffffffffffc])) [4 %sfp+-12
>> S4 A32])
>> (mem/u/c:SI (plus:SI (reg/f:SI 5 r5 [147])
>> (const_int 4 [0x4])) [2 c+4 S4 A32]))
>> ]) arm-crash.c:25 393 {*load_multiple}
>> (nil))
>>
>> The operands of *load_multiple are not validated through constraints
>> like LRA is used to, but rather through
>> a match_parallel predicate which ends up calling ldm_stm_operation_p to
>> validate the multiple sets.
>> But this means that LRA cannot reason about the constraints properly.
>> This two-regiseter load should not have used *load_multiple anyway, it
>> should have used *ldm2_ from ldmstm.md
>> and indeed it did until the loop2_invariant pass which copied the ldm2_
>> pattern:
>> (insn 27 23 28 4 (parallel [
>> (set (reg:SI 0 r0)
>> (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>> (set (reg:SI 1 r1)
>> (mem/u/c:SI (plus:SI (reg/f:SI 147)
>> (const_int 4 [0x4])) [2 c+4 S4 A32]))
>> ]) "ldm.c":25 385 {*ldm2_}
>> (nil))
>>
>> into:
>> (insn 55 19 67 3 (parallel [
>> (set (reg:SI 0 r0)
>> (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>> (set (reg:SI 158)
>> (mem/u/c:SI (plus:SI (reg/f:SI 147)
>> (const_int 4 [0x4])) [2 c+4 S4 A32]))
>> ]) "ldm.c":25 404 {*load_multiple}
>> (expr_list:REG_UNUSED (reg:SI 0 r0)
>> (nil)))
>>
>> Note that it now got recognised as load_multiple because the second
>> register is not a hard register but the pseudo 158.
>> In any case, the solution suggested in the PR (and I agree with it) is
>> to restrict *load_multiple to after reload.
>> The similar pattern *load_multiple_with_writeback also has a similar
>> condition and the comment above *load_multiple says that
>> it's used to generate epilogues, which is done after reload anyway. For
>> pre-reload load-multiples the patterns in ldmstm.md
>> should do just fine.
>>
>> Bootstrapped and tested on arm-none-linux-gnueabihf.
>>
>> Ok for trunk?
>>
>
> I don't think this is right. Firstly, these patterns look to me like
> the ones used for memcpy expansion, so not recognizing them could lead
> to compiler aborts.
>
> Secondly, the bug is when we generate
>
> (insn 55 67 70 3 (parallel [
> (set (reg:SI 0 r0)
> (mem/u/c:SI (reg/f:SI 5 r5 [147]) [2 c+0 S4 A32]))
> (set (mem/c:SI (plus:SI (reg/f:SI 102 sfp)
> (const_int -4 [0xfffffffffffffffc])) [4 %sfp+-12
> S4 A32])
> (mem/u/c:SI (plus:SI (reg/f:SI 5 r5 [147])
> (const_int 4 [0x4])) [2 c+4 S4 A32]))
> ]) arm-crash.c:25 393 {*load_multiple}
> (nil))
sorry, pasted the wrong bit of code.
That should read when we generate:
(insn 55 19 67 3 (parallel [
(set (reg:SI 0 r0)
(mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
(set (reg:SI 158)
(mem/u/c:SI (plus:SI (reg/f:SI 147)
(const_int 4 [0x4])) [2 c+4 S4 A32]))
]) "ldm.c":25 404 {*load_multiple}
(expr_list:REG_UNUSED (reg:SI 0 r0)
(nil)))
ie when we put a pseudo into the register load list.
>
> These patterns are supposed to enforce that the load (store) target
> register is a hard register that is higher numbered than the hard
> register in the memory slot that precedes it (thus satisfying the
> constraints for a normal ldm/stm instruction.
>
> The real question is why did ldm_stm_operation_p permit this modified
> pattern through, or was the pattern validation bypassed incorrectly by
> the loop2 invariant code when it copied the insn and made changes?
>
> R.
>
>> Thanks,
>> Kyrill
>>
>> 2016-11-30 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>>
>> PR target/71436
>> * config/arm/arm.md (*load_multiple): Add reload_completed to
>> matching condition.
>>
>> 2016-11-30 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>>
>> PR target/71436
>> * gcc.c-torture/compile/pr71436.c: New test.
>>
>> arm-ldm.patch
>>
>>
>> commit 996d28e2353badd1b29ef000f94d40c7dab9010f
>> Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>> Date: Tue Nov 29 15:07:30 2016 +0000
>>
>> [ARM] Restrict *load_multiple pattern till after LRA
>>
>> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
>> index 74c44f3..22d2a84 100644
>> --- a/gcc/config/arm/arm.md
>> +++ b/gcc/config/arm/arm.md
>> @@ -11807,12 +11807,15 @@ (define_insn "<crc_variant>"
>>
>> ;; Patterns in ldmstm.md don't cover more than 4 registers. This pattern covers
>> ;; large lists without explicit writeback generated for APCS_FRAME epilogue.
>> +;; The operands are validated through the load_multiple_operation
>> +;; match_parallel predicate rather than through constraints so enable it only
>> +;; after reload.
>> (define_insn "*load_multiple"
>> [(match_parallel 0 "load_multiple_operation"
>> [(set (match_operand:SI 2 "s_register_operand" "=rk")
>> (mem:SI (match_operand:SI 1 "s_register_operand" "rk")))
>> ])]
>> - "TARGET_32BIT"
>> + "TARGET_32BIT && reload_completed"
>> "*
>> {
>> arm_output_multireg_pop (operands, /*return_pc=*/false,
>> diff --git a/gcc/testsuite/gcc.c-torture/compile/pr71436.c b/gcc/testsuite/gcc.c-torture/compile/pr71436.c
>> new file mode 100644
>> index 0000000..ab08d5d
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.c-torture/compile/pr71436.c
>> @@ -0,0 +1,35 @@
>> +/* PR target/71436. */
>> +
>> +#pragma pack(1)
>> +struct S0
>> +{
>> + volatile int f0;
>> + short f2;
>> +};
>> +
>> +void foo (struct S0 *);
>> +int a, d;
>> +static struct S0 b[5];
>> +static struct S0 c;
>> +void fn1 ();
>> +void
>> +main ()
>> +{
>> + {
>> + struct S0 e;
>> + for (; d; fn1 ())
>> + {
>> + {
>> + a = 3;
>> + for (; a >= 0; a -= 1)
>> + {
>> + {
>> + e = c;
>> + }
>> + b[a] = e;
>> + }
>> + }
>> + }
>> + }
>> + foo (b);
>> +}
>>
>