[Patch ,gcc/MIPS] add an build-time/runtime option to disable madd.fmt

Fri Dec 23 07:49:00 GMT 2016

> 在 2016年12月23日，10:47，Paul Hua <paul.hua.gm@gmail.com> 写道：
> 
> On aarch64 target the result are 1.332268e-17.
> On x86 with fma target the result are also 1.332268e-17.
> 
> so, I don't think the Loongson's madd.fmt/msub.fmt is incorrect.

OMG. Will this behavior make some app wrong working abnormal?

> 
> We should do something for usage of fused madd, the all things has
> been tested an fedora21 remix for loongson(1).

We met lots of ftbfs on Debian with madd.fmt enabled.
So maybe FMA should not be used for normal applications at all?
and for generic linux distributions FMA shouldn’t be enabled by default.

> 
> 1, gcc：add loongson to the list of targets with a fused implementation.

If so, I think we should do this.

> 2, glibc：__fma() function，we should use madd.fmt like aarch64.
> see patch: http://www.loongnix.org/cgit/glibc-2.20/commit/?id=14023742e6ef571b61439d0d7bb7939e663fe624

Your patch shouldn’t be apply to upstream, as you don’t consider old CPUs without FMA.
In fact all MIPS <= MIPSr5 don’t have FMA.

> 3, kernel: the emulation  when a float exception taken.

The big problem is that Loongson use the same encode for (unfused) madd.fmt to (fused) madd.fmt.
We cannot trap this in kernel.

> 
> (1),http://www.loongnix.org/index.php/Loongnix-20161130%E7%89%88%E6%9C%AC%E5%9C%A8Fedora21_remix%E7%B3%BB%E7%BB%9F%E4%B8%8A%E5%8F%91%E5%B8%83
> 
> Thanks,
> Paul
> 
> On Fri, Dec 23, 2016 at 12:38 AM, Yunqiang Su <Yunqiang.Su@imgtec.com> wrote:
>> 
>>> 在 2016年12月23日，00:18，Richard Sandiford <rdsandiford@googlemail.com> 写道：
>>> 
>>> Yunqiang Su <Yunqiang.Su@imgtec.com> writes:
>>>>> 在 2016年12月22日，23:48，Yunqiang Su <Yunqiang.Su@imgtec.com> 写道：
>>>>> 
>>>>>> 
>>>>>> 在 2016年12月22日，23:31，Richard Sandiford
>>>>>> <rdsandiford@googlemail.com> 写道：
>>>>>> 
>>>>>> Matthew Fortune <Matthew.Fortune@imgtec.com> writes:
>>>>>>> Sandra Loosemore <sandra@codesourcery.com> writes:
>>>>>>>> On 12/21/2016 11:54 AM, Yunqiang Su wrote:
>>>>>>>>> By this patch, I add a build-time option ` --with-unfused-madd4=yes/no',
>>>>>>>>> and runtime option -m(no-)unfused-madd4,
>>>>>>>>> to disable generate madd.fmt instructions.
>>>>>>>> 
>>>>>>>> Your patch also needs a documentation change so that the new
>>>>>>>> command-line option is listed in the GCC manual with other MIPS target
>>>>>>>> options.
>>>>>>> 
>>>>>>> Any opinions on option names to control this? Is it best to target
>>>>>>> the specific
>>>>>>> feature that is non-compliant on loongson or apply a general -mfix-loongson
>>>>>>> type option?
>>>>>>> 
>>>>>>> I'm not sure I have a strong opinion either way but there do seem to be
>>>>>>> multiple possible variants.
>>>>>> 
>>>>>> Wasn't sure from this thread whether Loongson simply had a fused
>>>>>> implementation (without intermediate rounding) or whether the
>>>>>> instructions gave numerically incorrect results for some inputs.
>>>>> 
>>>>> I test to define ISA_HAS_FUSED_MADD4 true and
>>>>> define ISA_HAS_UNFUSED_MADD4 false, and try to build a test case.
>>>>> With ISA_HAS_FUSED_MADD4, the result is about 1e-17,
>>>>> and with ISA_HAS_UNFUSED_MADD4, the result is about 1e-17,
>>>>> both of the are incorrect (the expect value is 0).
>>>>> 
>>>>> The test case is
>>>>> 
>>>>> #include <stdio.h>
>>>>> 
>>>>> double a = 0.6;
>>>>> double b = 0.4;
>>>>> double c = 0.6;
>>>>> double d = 0.4;
>>>>> 
>>>>> int main(void)
>>>>> {
>>>>>      double x = a * b - c * d;
>>>>>      printf("%le\n", x);
>>>>>      return 0;
>>>>> }
>>>>> 
>>>>> 
>>>>>> It sounds from a later thread like it's generating incorrect results,
>>>>>> is that right?  If so, then FWIW I agree an -mfix option would be more
>>>>>> consistent.  E.g. one of the -mfix-vr4120 errata was an incorrect
>>>>>> integer division result and one of the -mfix-sb1 errata was an incorrect
>>>>>> single-precision float division result.  The latter case could have been
>>>>>> handled by an option to disable DIV.S and DIV.PS, but the -mfix option
>>>>>> gave more control.
>>>>>> 
>>>>>> If instead the problem is that the instructions are fused then that's
>>>>>> also what the original MIPS 4 parts did, so maybe an option to control
>>>>>> fusedness would make sense.
>>>>> 
>>>>> The result to thread it fused or unfused, is different, while neither of them
>>>>> is correct.
>>>> 
>>>> ohh, the result are same, and neither is correct.
>>>> both of them are 1.332268e-17.
>>> 
>>> That's the expected result for an implementation in which the subtraction
>>> is fused with the first multiplication without intermediate rounding.
>>> The second 0.4 * 0.6 isn't exactly representable and rounds down, to a
>>> value slightly less than 0.24.  Then the fused operation subtracts this
>>> value from the exact result of the first 0.4 * 0.6 (0.24), giving a
>>> value slightly greater than 0.
>> 
>> I see. But I think 1.332268e-17 is too big than expected.
>> 1-e17 for double is totally unacceptable.
>> 
>>> 
>>> I see Paul Hua's patch does add Loongson to the list of targets
>>> with a fused implementation (should have checked earlier, sorry).
>>> So I think after that patch we would do the right thing.  (In particular,
>>> -ffp-contract=off would then disable the fusing.)
>> 
>> will -ffp-contract=off disable some other optimization?
>> If so, I don’t think that will be an ideal choice for distributions, like Debian.
>> 
>>> 
>>> Thanks,
>>> Richard
>>