This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH, spu] Improve precision of divsf3 on SPU
I see. Given that the original implementation is inaccurate, I think
it is ok to have a new implementation that is inaccurate in a different
way.
I'm ok with either of the patches, as long as -ffast-math generates the
previous version, i.e., without the extra adjustment.
I assume we will eventually add the -mfloat=accurate and
-mdouble=accurate options to generate fully accurate answers.
Trevor
* Ulrich Weigand <uweigand@de.ibm.com> [2008-06-26 04:55]:
> Trevor Smigiel wrote:
>
> > Is this result still consistent with round-to-zero?
>
> Not in all cases. I understand the original code always guarantees the
> result is smaller than or equal in magnitude to the true quotient. However,
> it is not always equal to the *nearest* such number (i.e. the result
> to be expected in round-to-zero mode), but sometimes 1 ulp less than
> this.
>
> This 1 ulp off tends to occur very frequently if the true quotient is
> actually representable exactly. This leads to the quite surprising
> behaviour that "simple" divisors like 1.0 or 2.0 nearly always yield
> wrong results (e.g. the identity x / 1.0 == x does not hold).
>
> The patch tries to fix this by checking whether the number 1 ulp larger
> fits the real quotient better. The intent is to still remain lower than
> the true result (to respect round-to-zero), but the code does not always
> achieve this.
>
> There are two reasons for this:
>
> - If the dividend is negative, the check is actually incorrect; we'd
> have to check the error term for <= 0 in this case, but we always
> check it for >= 0.
>
> - If the dividend is very small in magnitude (< 2^-100), the computation
> of the error term can underflow to zero, so we accidentally treat a
> too-large result as if it were the exact result.
>
> The first of these problems can be fixed by multiplying the error term
> by -1.0 for negative dividends. The patch below implements this; it
> is slightly less efficient than the original patch, but it may be
> preferable to the original version as it avoids that systematic error.
>
> It seems the second problem can only be fixed by much more elaborate
> code (e.g. normalizing the input operands and computing the result
> exponent by hand, as the simdmath _divf4.h code does) ... I don't
> think we should do that for the "fast" inline implementation -- if this
> deviates from round-to-zero for input values near the limits of
> representable values, that should be an acceptable trade-off.
>
> The alternative would be to provide a fully exact algorithm (along the
> lines of the simdmath implementation) as a libgcc function.
>
> What do you think?
>
> Bye,
> Ulrich
>
>
> The patch below was tested on spu-elf with no regressions; it fixes the same
> set of test cases that were fixed by the initial patch.
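
For illustration, the 1-ulp adjustment and the sign problem described in
the quoted message can be sketched in portable C. This is not the patch
itself and does not use the SPU instructions. The names low_estimate,
div_check_unsigned and div_check_signed are hypothetical, and
low_estimate is only a stand-in that deliberately returns a quotient 1 ulp
short of the host's result, so that it never exceeds the true quotient in
magnitude. The error term is computed in double here, which is an
assumption made for the host; as a result the sketch does not reproduce
the underflow problem for very small dividends.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Move a nonzero finite float by n ulps in magnitude (n > 0: away from
   zero, n < 0: toward zero) by adjusting its IEEE-754 bit pattern.
   Not valid across zero or infinity. */
static float step_ulps(float x, int n)
{
    uint32_t bits;
    assert(x != 0.0f);
    memcpy(&bits, &x, sizeof bits);
    bits += (uint32_t)n;
    memcpy(&x, &bits, sizeof bits);
    return x;
}

/* Hypothetical stand-in for the original inline sequence: a quotient
   whose magnitude is always below the true one, and which is 1 ulp
   short whenever the true quotient is exactly representable. */
static float low_estimate(float a, float b)
{
    float q = a / b;              /* host result, round-to-nearest */
    return step_ulps(q, -1);
}

/* Try the candidate 1 ulp larger in magnitude, always testing the
   error term for >= 0 (the check that is wrong for negative dividends). */
static float div_check_unsigned(float a, float b)
{
    float q0 = low_estimate(a, b);
    float q1 = step_ulps(q0, 1);
    double err = (double)a - (double)q1 * (double)b;
    return err >= 0.0 ? q1 : q0;
}

/* Same, but negate the error term for negative dividends, i.e. the
   "multiply by -1.0" fix described in the message. */
static float div_check_signed(float a, float b)
{
    float q0 = low_estimate(a, b);
    float q1 = step_ulps(q0, 1);
    double err = (double)a - (double)q1 * (double)b;
    if (a < 0.0f)
        err = -err;
    return err >= 0.0 ? q1 : q0;
}
```

With these definitions, low_estimate(3.0f, 1.0f) is 1 ulp below 3.0f, and
both adjusted variants return exactly 3.0f. For -1.0f / 3.0f the unsigned
check accepts a candidate whose magnitude exceeds 1/3, while the signed
check keeps the truncated value.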