[PATCH][PR116569] match.pd: Check trunc_mod vector optab before folding.

Fri Sep 6 14:10:19 GMT 2024

> On 6 Sep 2024, at 16:00, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> On Fri, Sep 06, 2024 at 01:46:01PM +0000, Jennifer Schmitz wrote:
>> In the pattern X - (X / Y) * Y to X % Y, this patch guards the
>> simplification for vector types by a check for:
>> 1) Support of the mod optab for vectors OR
>> 2) Application during early gimple passes (using PROP_gimple_any).
>> This is to prevent reverting vectorization of modulo to div/mult/sub
>> if the target does not support vector mod optab, while still allowing
>> the simplification during early gimple passes (as tested, for example,
>> in gcc.dg/fold-minus-1.c).
>> 
>> The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
>> OK for mainline?
>> 
>> Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
>> 
>> gcc/
>>      PR tree-optimization/116569
>>      * generic-match-head.cc (optimize_early_gimple_p): Add inline
>>      function with mask for early gimple passes.
>>      * gimple-match-head.cc (optimize_early_gimple_p): Likewise.
>>      * match.pd: Guard simplification to trunc_mod with check for
>>      mod optab support.
>> 
>> gcc/testsuite/
>>      PR tree-optimization/116569
>>      * gcc.dg/torture/pr116569.c: New test.
> 
> This is certainly wrong.
> PROP_gimple_any is set already at the end of gimplification, so certainly
> doesn't include any other early gimple passes.
> And, not all statements are folded during gimplification, e.g. in OpenMP
> regions folding is postponed until the omp lowering pass and folded only
> there (at which point PROP_gimple_any is already set).
> 
> What exactly are you trying to ensure this optimization goes before?
> For non-VL vectors I guess vector lowering, but that is done far later
> and we already have a different predicate for that.
> For VL vectors, what transforms that if the user writes %?

There’s currently no way to write this in a generic VLA way: the SVE intrinsics for it would be opaque to GIMPLE, and the generic vector extension doesn’t support VLA for now.
The problem is the fold-minus-1.c test case, which wants to see the fold happen early on. I think that makes sense from a canonicalization point of view, but once the vectorizer has expanded a vector mod into div/mult/sub later on, we don’t want the fold to put it back together.
I agree PROP_gimple_any doesn’t look like the right thing. Is there a better check to use?
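
For reference, a minimal sketch of the identity the match.pd pattern relies on, in plain scalar C rather than GIMPLE. The function names (mod_expanded, mod_direct, mod_loop_expanded) are made up for illustration and are not from the patch; the loop only stands in for the kind of code the vectorizer might expand, and says nothing about what any particular target actually emits.

```c
#include <stdint.h>

/* Expanded form: x - (x / y) * y.  With C's truncating division this
   equals x % y whenever y != 0 and the division does not overflow
   (i.e. not INT32_MIN / -1).  */
static inline int32_t mod_expanded(int32_t x, int32_t y)
{
    return x - (x / y) * y;
}

/* Folded form: what the simplification X - (X / Y) * Y -> X % Y yields.  */
static inline int32_t mod_direct(int32_t x, int32_t y)
{
    return x % y;
}

/* Element-wise modulo written in the expanded div/mult/sub form.
   Assumption for illustration: on a target with vector div, mult and
   sub but no vector mod, this is the shape one would want to keep
   after vectorization instead of folding back to %.  */
void mod_loop_expanded(int32_t *restrict r, const int32_t *restrict a,
                       const int32_t *restrict b, int n)
{
    for (int i = 0; i < n; i++)
        r[i] = a[i] - (a[i] / b[i]) * b[i];
}
```

Both scalar forms agree for any nonzero divisor, including negative operands, which is why the fold is a valid canonicalization; the open question above is only about when it is profitable to apply it.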
Thanks,
Kyrill

> 
>        Jakub
>