[PATCH][ARM] FAIL: gcc.target/arm/pr58041.c scan-assembler ldrb

Wed May 28 09:37:00 GMT 2014

On Wed, May 28, 2014 at 9:30 AM, Richard Earnshaw <rearnsha@arm.com> wrote:
> Ah, light dawns (maybe).

>
> I guess the problems stem from the attempts to combine Neon with ARMv5.
>  Neon shouldn't be used with anything prior to ARMv7, since that's the
> earliest version of the architecture that can support it.
>
> I guess that what is happening is that we see we have Neon, so start to
> generate a Neon-based copy sequence, but then notice that we don't have
> misaligned access (something that must exist if we have Neon) and
> generate VLDR instructions in a mistaken attempt to work around the
> first inconsistency.

>
> Maybe we should tie -mfpu=neon to having at least ARMv7 (though ARMv6
> also has misaligned access support).

Sigh. I can think of 2-3 options.

1. Allow Neon to imply unaligned access if the core architecture is below v7-a.
2.  Disallow Neon for any earlier options - options breakage and
regression from earlier releases ? We just have to document this some
place - This is what you suggest.
3. Allow Neon to imply unaligned access only in the Neon unit but not
on the integer side but I suspect that breaks the original testcase.

And :

1 is confusing because there is no real implementation unless you want
to drop down the architecture level on real v7-a implementation and
you probably want lower architecture levels to imply no unaligned
access on the core. (i.e. no unaligned ldr(h) instructions).

2 - clean break but probably need an override switch of
I-know-what-I-am-doing in case people try such combinations for tuning
purposes.

3 - Again confusing and a source of more testsuite failures and other cases.

Sounds like #2 maybe the least worst option.

regards
Ramana

>
> R.
>
> On 28/05/14 00:03, Maciej W. Rozycki wrote:
>> On Tue, 27 May 2014, Kyrill Tkachov wrote:
>>
>>>>>   This change however has regressed gcc.dg/vect/vect-72.c on the
>>>>> arm-linux-gnueabi target, -march=armv5te, in particular in 4.8.
>>>> And what are all the configure flags you are using in case some one
>>>> has to reproduce this issue ?
>>>
>>> Second that. My recently built 4.8 (gcc version 4.8.2 20130531) for vect-72
>>> with options:
>>>  -O2 -ftree-vectorize -march=armv5te -mfpu=neon -mfloat-abi=hard
>>> -fno-vect-cost-model -fno-common
>>>
>>> gives code the same as your original one:
>>> .L14:
>>>         sub     r1, r3, #16
>>>         add     r3, r3, #16
>>>         vld1.8  {q8}, [r1]
>>>         cmp     r3, r0
>>>         vst1.64 {d16-d17}, [r2:64]!
>>>         bne     .L14
>>>         ldr     r3, .L22+12
>>>         add     ip, r3, #128
>>>         add     r2, r3, #129
>>
>>  I have this:
>>
>> .L14:
>>       sub     r1, r3, #16     @ 130   *arm_addsi3/7   [length = 4]
>>       add     r3, r3, #16     @ 135   *arm_addsi3/2   [length = 4]
>>       vld1.8  {q8}, [r1]      @ 131   *movmisalignv16qi_neon_load     [length = 4]
>>       cmp     r3, r0  @ 136   *arm_cmpsi_insn/3       [length = 4]
>>       vst1.64 {d16-d17}, [r2:64]!     @ 133   *neon_movv16qi/2        [length = 8]
>>       bne     .L14    @ 137   arm_cond_branch [length = 4]
>>
>> without and this:
>>
>> .L14:
>>       vldr    d16, [r3, #-16] @ 130   *neon_movv16qi/4        [length = 8]
>>       vldr    d17, [r3, #-8]
>>       add     r3, r3, #16     @ 133   *arm_addsi3/2   [length = 4]
>>       cmp     r3, r1  @ 134   *arm_cmpsi_insn/3       [length = 4]
>>       vst1.64 {d16-d17}, [r2:64]!     @ 131   *neon_movv16qi/2        [length = 8]
>>       bne     .L14    @ 135   arm_cond_branch [length = 4]
>>
>> with your change applied respectively so clearly it's making its intended
>> effect of disabling the use of movmisalignv16qi_neon_load.
>>
>>  However VLD1.8 can also be produced from other RTL patterns, or maybe you
>> don't have `unaligned_access' set to zero for some reason, which you
>> should (for ARMv5TE) according to this gcc/config/arm/arm.c piece:
>>
>>   /* Enable -munaligned-access by default for
>>      - all ARMv6 architecture-based processors
>>      - ARMv7-A, ARMv7-R, and ARMv7-M architecture-based processors.
>>      - ARMv8 architecture-base processors.
>>
>>      Disable -munaligned-access by default for
>>      - all pre-ARMv6 architecture-based processors
>>      - ARMv6-M architecture-based processors.  */
>>
>>   if (unaligned_access == 2)
>>     {
>>       if (arm_arch6 && (arm_arch_notm || arm_arch7))
>>       unaligned_access = 1;
>>       else
>>       unaligned_access = 0;
>>     }
>>   else if (unaligned_access == 1
>>          && !(arm_arch6 && (arm_arch_notm || arm_arch7)))
>>     {
>>       warning (0, "target CPU does not support unaligned accesses");
>>       unaligned_access = 0;
>>     }
>>
>> -- can you build vect-72.c with -dp and see which pattern VLD1.8 is
>> produced from in your case?
>>
>>  As to the GCC configure options I have as I say nothing there beyond
>> making -march=armv5te the default (surely you're not interested in
>> --prefix, etc.).  For the record the test framework adds these options on
>> top of that for this particular case:
>>
>> -fno-diagnostics-show-caret -mfpu=neon -mfloat-abi=softfp -ffast-math
>> -ftree-vectorize -fno-vect-cost-model -fno-common -O2
>> -fdump-tree-vect-details
>>
>> but these should be no different in your case (except perhaps from
>> `-mfloat-abi=', but that shouldn't matter as this is no FP code).
>>
>>  I have double-checked with current (r210984) 4.8 now and the issue is
>> still there.
>>
>>   Maciej
>>
>
>