[PATCH] [PR86710][PR116826] match.pd: Fold logarithmic identities.
Jennifer Schmitz
jschmitz@nvidia.com
Fri Oct 11 08:00:56 GMT 2024
> On 8 Oct 2024, at 10:44, Richard Biener <rguenther@suse.de> wrote:
>
> External email: Use caution opening links or attachments
>
>
> On Thu, 3 Oct 2024, Jennifer Schmitz wrote:
>
>>
>>
>>> On 1 Oct 2024, at 14:27, Richard Biener <rguenther@suse.de> wrote:
>>>
>>> External email: Use caution opening links or attachments
>>>
>>>
>>> On Tue, 1 Oct 2024, Jennifer Schmitz wrote:
>>>
>>>> This patch implements 4 rules for logarithmic identities in match.pd
>>>> under -funsafe-math-optimizations:
>>>> 1) logN(1.0/a) -> -logN(a). This avoids the division instruction.
>>>> 2) logN(C/a) -> logN(C) - logN(a), where C is a real constant. Same as 1).
>>>> 3) logN(a) + logN(b) -> logN(a*b). This reduces the number of calls to
>>>> log function.
>>>> 4) logN(a) - logN(b) -> logN(a/b). Same as 4).
>>>> Tests were added for float, double, and long double.
>>>>
>>>> The patch was bootstrapped and regtested on aarch64-linux-gnu and
>>>> x86_64-linux-gnu, no regression.
>>>> Additionally, SPEC 2017 fprate was run. While the transform does not seem
>>>> to be triggered, we also see no non-noise impact on performance.
>>>> OK for mainline?
>>>
>>> Since log can set errno we have the builtins affect global memory and
>>> thus have VDEFs, this posses issues for match.pd which does not assign
>>> new VDEFs upon materializing the result, esp. for the case where
>>> you duplicate a call. There's a similar issue for -frounding-math
>>> where intermediate FP status changes can be lost. match.pd simply
>>> follows the SSA use-def chains without regarding memory side-effects.
>>>
>>> The transforms are guarded by flag_unsafe_math_optimizations but here
>>> I think we need !HONOR_SIGN_DEPENDENT_ROUNDING, !flag_trapping_math
>>> (exception state might be different for logN(a) - logN(b) -> logN(a/b),
>>> at least WRT INEXACT?), and !flag_errno_math (because of the VDEFs).
>>>
>>> + /* Simplify logN(C/a) into logN(C)-logN(a). */
>>> + (simplify
>>> + (logs (rdiv:s REAL_CST@0 @1))
>>> + (minus (logs @0) (logs @1)))
>>>
>>> I think you want
>>>
>>> (minus (logs! @0) (logs @1))
>>>
>>> here to make sure we constant-fold.
>>>
>>> + (simplify
>>> + (minus (logs:s @0) (logs:s @1))
>>> + (logs (rdiv @0 @1))))
>>>
>>> I think that's somewhat dangerous for @1 == 0 given log for
>>> zero arg results in -HUGE_VAL but a FP division by gives a NaN.
>>> I'm not exactly sure whether !HONOR_INFINITIES && !HONOR_NANS
>>> is good enough here.
>>>
>>> Your testcases probably all trigger during GENERIC folding,
>>> bypassing the VDEF issue - you might want to try assigning
>>> the comparison operands to tempoaries to run into the actual
>>> issues.
>> Dear Richard,
>> Thanks for the review and suggesting the additional flags. I added
>> - !HONOR_SIGN_DEPENDENT_ROUNDING
>> - !flag_trapping_math
>> - !flag_errno_math
>> - !HONOR_INFINITIES
>> - !HONOR_NANS
>> as guard before the patterns.
>> Can we add anything else to account for HUGE_VAL or will !HONOR_INFINITIES && !HONOR_NANS be enough? Or do you have a suggestion how I can check this?
>> I validated again on aarch64 and x86_64.
>
> Can you change your patch attachment format to be at least text/plain
> and ideally just inline it or use git send-mail? This way reviewing
> is much easier.
>
> I'll quote it here for reference:
>
> (if (flag_unsafe_math_optimizations)
>
> [...]
>
> + (if (! HONOR_SIGN_DEPENDENT_ROUNDING (type)
> + && ! HONOR_NANS (type) && ! HONOR_INFINITIES (type)
> + && ! flag_trapping_math
> + && ! flag_errno_math)
> + (for logs (LOG LOG2 LOG10)
> + /* Simplify logN(1.0/a) into -logN(a). */
> + (simplify
> + (logs (rdiv:s real_onep@0 @1))
> + (negate (logs @1)))
> +
> + /* Simplify logN(C/a) into logN(C)-logN(a). */
> + (simplify
> + (logs (rdiv:s REAL_CST@0 @1))
> + (minus (logs! @0) (logs @1)))
> +
> + /* Simplify logN(a)+logN(b) into logN(a*b). */
> + (simplify
> + (plus (logs:s @0) (logs:s @1))
> + (logs (mult @0 @1)))
> +
> + /* Simplify logN(a)-logN(b) into logN(a/b). */
> + (simplify
> + (minus (logs:s @0) (logs:s @1))
> + (logs (rdiv @0 @1)))))
>
> I'm OK with the extra guards added but I'm also not a IEEE FP expert
> here. In the previous review I did mention the extra constraints
> for specific sub-patterns IIRC.
>
> As a general note we might want to implement a HONOR_ERRNO (combined_fn).
> Probably a bad name, builtin_can_set_errno (combined_fn) might be better,
> for example above I'm not sure whether all of log(), log2() and log10()
> can set errno and how we generally should handle errno when
> -fno-math-errno isn't given and GCC wasn't configured specifically
> to target for example glibc. For glibc it might be nice if we could
> tell it we're not interested in errno from math functions and in this
> way convey -fno-math-errno to it for example via a crtfastmath like
> mechanism (or by calling alternate entry points).
>
> So, the patch is OK in case Joseph doesn't have any comments during
> this week.
Thanks, then I will commit it at the end of the day, if Joseph does not have further comments.
Best,
Jennifer
>
> Thanks,
> Richard.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 4641 bytes
Desc: not available
URL: <https://gcc.gnu.org/pipermail/gcc-patches/attachments/20241011/813a6b6a/attachment-0001.p7s>
More information about the Gcc-patches
mailing list