[[ARM/AArch64][testsuite] 03/36] Add vmax, vmin, vhadd, vhsub and vrhadd tests.

Fri Jan 23 15:21:00 GMT 2015

On 23 January 2015 at 12:42, Christophe Lyon <christophe.lyon@linaro.org> wrote:
> On 23 January 2015 at 11:18, Tejas Belagod <tejas.belagod@arm.com> wrote:
>> On 22/01/15 21:31, Christophe Lyon wrote:
>>>
>>> On 22 January 2015 at 16:22, Tejas Belagod <tejas.belagod@arm.com> wrote:
>>>>
>>>> On 22/01/15 14:28, Christophe Lyon wrote:
>>>>>
>>>>>
>>>>> On 22 January 2015 at 12:19, Tejas Belagod <tejas.belagod@arm.com>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> On 21/01/15 15:07, Christophe Lyon wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 19 January 2015 at 17:54, Marcus Shawcroft
>>>>>>> <marcus.shawcroft@gmail.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 19 January 2015 at 15:43, Christophe Lyon
>>>>>>>> <christophe.lyon@linaro.org>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 19 January 2015 at 14:29, Marcus Shawcroft
>>>>>>>>> <marcus.shawcroft@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 16 January 2015 at 17:52, Christophe Lyon
>>>>>>>>>> <christophe.lyon@linaro.org> wrote:
>>>>>>>>>>
>>>>>>>>>>>> OK provided, as per the previous couple, that we don;t regression
>>>>>>>>>>>> or
>>>>>>>>>>>> introduce new fails on aarch64[_be] or aarch32.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> This patch shows failures on aarch64 and aarch64_be for vmax and
>>>>>>>>>>> vmin
>>>>>>>>>>> when the input is -NaN.
>>>>>>>>>>> It's a corner case, and my reading of the ARM ARM is that the
>>>>>>>>>>> result
>>>>>>>>>>> should the same as on aarch32.
>>>>>>>>>>> I haven't had time to look at it in more details though.
>>>>>>>>>>> So, not OK?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> They should have the same behaviour in aarch32 and aarch64. Did you
>>>>>>>>>> test on HW or a model?
>>>>>>>>>>
>>>>>>>>> I ran the tests on qemu for aarch32 and aarch64-linux, and on the
>>>>>>>>> foundation model for aarch64*-elf.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Leave this one out until we understand why it fails. /Marcus
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I've looked at this a bit more.
>>>>>>> We have
>>>>>>> fmax    v0.4s, v0.4s, v1.4s
>>>>>>> where v0 is a vector of -NaN (0xffc00000) and v1 is a vector of 1.
>>>>>>>
>>>>>>> The output is still -NaN (0xffc00000), while the test expects
>>>>>>> defaultNaN (0x7fc00000).
>>>>>>>
>>>>>>
>>>>>> In the AArch32 execution state, Advanced SIMD FP arithmetic always uses
>>>>>> the
>>>>>> DefaultNaN setting regardless of the DN-bit value in the FPSCR. In
>>>>>> AArch64
>>>>>> execution state, result of Advanced SIMD FP arithmetic operations
>>>>>> depend
>>>>>> on
>>>>>> the value of the DN-bit i.e. either propagate the input NaN or generate
>>>>>> DefaultNaN depending on the value of DN.
>>>>>
>>>>>
>>>>>
>>>>> Maybe I'm using an outdated doc. On page 2282 of ARMv8 ARM rev C, I
>>>>> can see only the latter (no diff between aarch32 and aarch64 in
>>>>> FPProcessNan pseudo-code)
>>>>>
>>>>
>>>> If you see pg. 4005 in the same doc(rev C), you'll see the FPSCR spec -
>>>> under DN:
>>>>
>>>> "The value of this bit only controls scalar floating-point arithmetic.
>>>> Advanced SIMD arithmetic always uses the Default NaN setting, regardless
>>>> of
>>>> the value of the DN bit."
>>>>
>>>> Also on page 3180 for the description of VMAX(vector FP), it says:
>>>> "
>>>> *  max(+0.0, -0.0) = +0.0
>>>> * If any input is a NaN, the corresponding result element is the default
>>>> NaN.
>>>> "
>>>>
>>> Oops I was looking at FMAX (vector) pg 936.
>>>
>>>> The pseudocode for FPMax () on pg. 3180 passes StandardFPSCRValue() to
>>>> FPMax() which is on pg. 2285
>>>>
>>>> // StandardFPSCRValue()
>>>> // ====================
>>>> FPCRType StandardFPSCRValue()
>>>> return ‘00000’ : FPSCR.AHP : ‘11000000000000000000000000’
>>>>
>>>> Here bit-25(FPSCR.DN) is set to 1.
>>>>
>>>
>>> So, we should get defaultNaN too on aarch64, and no need to try to
>>> force DN to 1 in gdb?
>>>
>>> What can be wrong?
>>>
>>
>> On pg 3180, I see VMAX(FPSIMD) for A32/T32, not A64. I hope we're reading
>> the same document.
>>
>> Regardless of the page number, if you see the pseudocode for VMAX(FPSIMD)
>> for AArch32, StandardFPSCRValue() (i.e. DN = 1) is passed to FPMax() which
>> means generate DefaultNaN() regardless.
>>
>> OTOH, on pg 936, you have FMAX(vector) for A64 where FPMax() in the
>> pseudocode gets just FPCR.
>>
>>
> Ok, that was my initial understanding but our discussion confused me.
>
> And that's why I tried to force DN = 1 in gdb before single-stepping over
> fmax    v0.4s, v0.4s, v1.4s
>
> but it changed nothing :-(
> Hence my question about a gdb possible bug or misuse.

Hmm... user error, I missed one bit
set $fpcr=0x2000000
works under gdb.

> I'll try modifying the test to have it force DN=1.
>
Forcing DN=1 in the test makes it pass.

I am going to look at adding that cleanly to my test, and resubmit it.

Thanks, and sorry for the noise.

>> Thanks,
>> Tejas.
>>
>>
>>>> Thanks,
>>>> Tejas.
>>>>
>>>>
>>>>>> If you're running your test in the AArch64 execution state, you'd want
>>>>>> to
>>>>>> define the DN bit and modify the expected results accordingly or have
>>>>>> the
>>>>>> test poll at runtime what the DN-bit is set to and check expected
>>>>>> results
>>>>>> dynamically.
>>>>>
>>>>>
>>>>> Makes sense, I hadn't noticed the different aarch64 spec here.
>>>>>
>>>>>> I think the test already has expected behaviour for AArch32 execution
>>>>>> state
>>>>>> by expecting DefaultNaN regardless.
>>>>>
>>>>>
>>>>> Yes.
>>>>>
>>>>>>> I have executed the test under GDB on AArch64 HW, and noticed that
>>>>>>> fpcr
>>>>>>> was 0.
>>>>>>> I forced it to have DN==1:
>>>>>>> set $fpcr=0x1000000
>>>>>>> but this didn't change the result.
>>>>>>>
>>>>>>> Does setting fpcr.dn under gdb actually work?
>>>>>>>
>>>>>>
>>>>>> It should. Possibly a bug, patches welcome :-).
>>>>>>
>>>>> :-)
>>>>>
>>>>
>>>>
>>>
>>
>>