This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][ARM] PR target/71436: Restrict *load_multiple pattern till after LRA


On 15/12/16 09:55, Richard Earnshaw (lists) wrote:
> On 30/11/16 16:47, Kyrill Tkachov wrote:
>> Hi all,
>>
>> In this awkward ICE we have a *load_multiple pattern that is being
>> transformed in reload from:
>> (insn 55 67 151 3 (parallel [
>>             (set (reg:SI 0 r0)
>>                 (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>>             (set (reg:SI 158 [ c+4 ])
>>                 (mem/u/c:SI (plus:SI (reg/f:SI 147)
>>                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
>>         ]) arm-crash.c:25 393 {*load_multiple}
>>      (expr_list:REG_UNUSED (reg:SI 0 r0)
>>         (nil)))
>>
>>
>> into the invalid:
>> (insn 55 67 70 3 (parallel [
>>             (set (reg:SI 0 r0)
>>                 (mem/u/c:SI (reg/f:SI 5 r5 [147]) [2 c+0 S4 A32]))
>>             (set (mem/c:SI (plus:SI (reg/f:SI 102 sfp)
>>                         (const_int -4 [0xfffffffffffffffc])) [4 %sfp+-12
>> S4 A32])
>>                 (mem/u/c:SI (plus:SI (reg/f:SI 5 r5 [147])
>>                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
>>         ]) arm-crash.c:25 393 {*load_multiple}
>>      (nil))
>>
>> The operands of *load_multiple are not validated through constraints
>> like LRA is used to, but rather through
>> a match_parallel predicate which ends up calling ldm_stm_operation_p to
>> validate the multiple sets.
>> But this means that LRA cannot reason about the constraints properly.
>> This two-regiseter load should not have used *load_multiple anyway, it
>> should have used *ldm2_ from ldmstm.md
>> and indeed it did until the loop2_invariant pass which copied the ldm2_
>> pattern:
>> (insn 27 23 28 4 (parallel [
>>             (set (reg:SI 0 r0)
>>                 (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>>             (set (reg:SI 1 r1)
>>                 (mem/u/c:SI (plus:SI (reg/f:SI 147)
>>                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
>>         ]) "ldm.c":25 385 {*ldm2_}
>>      (nil))
>>
>> into:
>> (insn 55 19 67 3 (parallel [
>>             (set (reg:SI 0 r0)
>>                 (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>>             (set (reg:SI 158)
>>                 (mem/u/c:SI (plus:SI (reg/f:SI 147)
>>                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
>>         ]) "ldm.c":25 404 {*load_multiple}
>>      (expr_list:REG_UNUSED (reg:SI 0 r0)
>>         (nil)))
>>
>> Note that it now got recognised as load_multiple because the second
>> register is not a hard register but the pseudo 158.
>> In any case, the solution suggested in the PR (and I agree with it) is
>> to restrict *load_multiple to after reload.
>> The similar pattern *load_multiple_with_writeback also has a similar
>> condition and the comment above *load_multiple says that
>> it's used to generate epilogues, which is done after reload anyway. For
>> pre-reload load-multiples the patterns in ldmstm.md
>> should do just fine.
>>
>> Bootstrapped and tested on arm-none-linux-gnueabihf.
>>
>> Ok for trunk?
>>
> 
> I don't think this is right.  Firstly, these patterns look to me like
> the ones used for memcpy expansion, so not recognizing them could lead
> to compiler aborts.
> 
> Secondly, the bug is when we generate
> 
>  (insn 55 67 70 3 (parallel [
>              (set (reg:SI 0 r0)
>                  (mem/u/c:SI (reg/f:SI 5 r5 [147]) [2 c+0 S4 A32]))
>              (set (mem/c:SI (plus:SI (reg/f:SI 102 sfp)
>                          (const_int -4 [0xfffffffffffffffc])) [4 %sfp+-12
>  S4 A32])
>                  (mem/u/c:SI (plus:SI (reg/f:SI 5 r5 [147])
>                          (const_int 4 [0x4])) [2 c+4 S4 A32]))
>          ]) arm-crash.c:25 393 {*load_multiple}
>       (nil))

sorry, pasted the wrong bit of code.

That should read when we generate:

(insn 55 19 67 3 (parallel [
            (set (reg:SI 0 r0)
                (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
            (set (reg:SI 158)
                (mem/u/c:SI (plus:SI (reg/f:SI 147)
                        (const_int 4 [0x4])) [2 c+4 S4 A32]))
        ]) "ldm.c":25 404 {*load_multiple}
     (expr_list:REG_UNUSED (reg:SI 0 r0)
        (nil)))

ie when we put a pseudo into the register load list.

> 
> These patterns are supposed to enforce that the load (store) target
> register is a hard register that is higher numbered than the hard
> register in the memory slot that precedes it (thus satisfying the
> constraints for a normal ldm/stm instruction.
> 
> The real question is why did ldm_stm_operation_p permit this modified
> pattern through, or was the pattern validation bypassed incorrectly by
> the loop2 invariant code when it copied the insn and made changes?
> 
> R.
> 
>> Thanks,
>> Kyrill
>>
>> 2016-11-30  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>     PR target/71436
>>     * config/arm/arm.md (*load_multiple): Add reload_completed to
>>     matching condition.
>>
>> 2016-11-30  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>     PR target/71436
>>     * gcc.c-torture/compile/pr71436.c: New test.
>>
>> arm-ldm.patch
>>
>>
>> commit 996d28e2353badd1b29ef000f94d40c7dab9010f
>> Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>> Date:   Tue Nov 29 15:07:30 2016 +0000
>>
>>     [ARM] Restrict *load_multiple pattern till after LRA
>>
>> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
>> index 74c44f3..22d2a84 100644
>> --- a/gcc/config/arm/arm.md
>> +++ b/gcc/config/arm/arm.md
>> @@ -11807,12 +11807,15 @@ (define_insn "<crc_variant>"
>>  
>>  ;; Patterns in ldmstm.md don't cover more than 4 registers. This pattern covers
>>  ;; large lists without explicit writeback generated for APCS_FRAME epilogue.
>> +;; The operands are validated through the load_multiple_operation
>> +;; match_parallel predicate rather than through constraints so enable it only
>> +;; after reload.
>>  (define_insn "*load_multiple"
>>    [(match_parallel 0 "load_multiple_operation"
>>      [(set (match_operand:SI 2 "s_register_operand" "=rk")
>>            (mem:SI (match_operand:SI 1 "s_register_operand" "rk")))
>>          ])]
>> -  "TARGET_32BIT"
>> +  "TARGET_32BIT && reload_completed"
>>    "*
>>    {
>>      arm_output_multireg_pop (operands, /*return_pc=*/false,
>> diff --git a/gcc/testsuite/gcc.c-torture/compile/pr71436.c b/gcc/testsuite/gcc.c-torture/compile/pr71436.c
>> new file mode 100644
>> index 0000000..ab08d5d
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.c-torture/compile/pr71436.c
>> @@ -0,0 +1,35 @@
>> +/* PR target/71436.  */
>> +
>> +#pragma pack(1)
>> +struct S0
>> +{
>> +  volatile int f0;
>> +  short f2;
>> +};
>> +
>> +void foo (struct S0 *);
>> +int a, d;
>> +static struct S0 b[5];
>> +static struct S0 c;
>> +void fn1 ();
>> +void
>> +main ()
>> +{
>> +  {
>> +    struct S0 e;
>> +    for (; d; fn1 ())
>> +      {
>> +        {
>> +          a = 3;
>> +          for (; a >= 0; a -= 1)
>> +            {
>> +              {
>> +                e = c;
>> +              }
>> +              b[a] = e;
>> +            }
>> +        }
>> +      }
>> +  }
>> +  foo (b);
>> +}
>>
> 


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]