This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[patch] [PR31699] fix wrong peeling for alignment in vectorizer


Hi,

It turns out that when we peel the first few iterations of a loop in order
to align a store in the loop, we don't correctly compute the number of
iterations to peel when the loop contains multiple data-types.
This is because the computation assumes that the vectorization factor is
equal to the number of elements in a vector, but that doesn't hold for all
the datarefs in the loop if their types are of different sizes (e.g. if we
have shorts and ints in the loop, then, for targets with vector size
16-bytes, the number of int elements in a vector is 4, while the
vectorization factor of the loop is 8). The fix is just to use
TYPE_VECTOR_SUBPARTS instead of VF as the number-of-elements-in-a-vector.

This patch fixes that, adds two testcases, and xfails vectorization in a
few existing tests which are sensitive to the order in which we consider
which stores to align by peeling. This deficiency was always there but is
only exposed now by this fix (I'll open a missed-optimization PR for
this).

One of the new tests requires int-to-float conversion, so I added a target
keyword for that, and updated a couple of existing tests to use that as
well.

Thanks to Uros for reporting the problem, and testing that the patch fixes
the PR.

The patch was bootstrapped with vectorization enabled and passed full
testing with no regressions on i386-linux.
Also bootstrapped with vectorization enabled on powerpc-linux, and tested
on the vectorized testcases.

ok for mainline?

thanks,
dorit

        PR tree-optimization/31699
        * tree-vect-analyze.c (vect_update_misalignment_for_peel): Remove
        wrong code.
        (vect_enhance_data_refs_alignment): Compute peel amount using
        TYPE_VECTOR_SUBPARTS instead of vf.
        * tree-vect-transform.c (vect_gen_niters_for_prolog_loop):
        Likewise.

        PR tree-optimization/31699
        * lib/target-supports.exp
        (check_effective_target_vect_intfloat_cvt): New.
        (check_effective_target_vect_floatint_cvt): New.
        * gcc.dg/vect/vect-floatint-conversion-1.c: Use new keyword instead
        of specific targets.
        * gcc.dg/vect/vect-intfloat-conversion-1.c: Likewise.
        * gcc.dg/vect/vect-multitypes-1.c: One less loop gets vectorized.
        * gcc.dg/vect/vect-multitypes-4.c: Likewise.
        * gcc.dg/vect/vect-iv-4.c: Likewise.
        * gcc.dg/vect/vect-multitypes-11.c: New.
        * gcc.dg/vect/pr31699.c: New.

(See attached file: pr31699.txt)

