[Bug target/80846] auto-vectorized AVX2 horizontal sum should narrow to 128b right away, to be more efficient for Ryzen and Intel

jakub at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Thu Jul 20 16:37:00 GMT 2017


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80846

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Author: jakub
Date: Thu Jul 20 16:36:18 2017
New Revision: 250397

URL: https://gcc.gnu.org/viewcvs?rev=250397&root=gcc&view=rev
Log:
        PR target/80846
        * config/i386/i386.c (ix86_expand_vector_init_general): Handle
        V2TImode and V4TImode.
        (ix86_expand_vector_extract): Likewise.
        * config/i386/sse.md (VMOVE): Enable V4TImode even for just
        TARGET_AVX512F, instead of only for TARGET_AVX512BW.
        (ssescalarmode): Handle V4TImode and V2TImode.
        (VEC_EXTRACT_MODE): Add V4TImode and V2TImode.
        (*vec_extractv2ti, *vec_extractv4ti): New insns.
        (VEXTRACTI128_MODE): New mode iterator.
        (splitter for *vec_extractv?ti first element): New.
        (VEC_INIT_MODE): New mode iterator.
        (vec_init<mode>): Consolidate 3 expanders into one using
        VEC_INIT_MODE mode iterator.

        * gcc.target/i386/avx-pr80846.c: New test.
        * gcc.target/i386/avx2-pr80846.c: New test.
        * gcc.target/i386/avx512f-pr80846.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/i386/avx-pr80846.c
    trunk/gcc/testsuite/gcc.target/i386/avx2-pr80846.c
    trunk/gcc/testsuite/gcc.target/i386/avx512f-pr80846.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/sse.md
    trunk/gcc/testsuite/ChangeLog
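
For context, the reduction order the PR title asks for can be modeled in
plain C. This is only an illustrative sketch, not the contents of the new
tests: the function name hsum8_narrow_first and the array-based lane model
are assumptions made here, and the instruction names in the comments only
indicate the kind of x86 operation each step would roughly correspond to.

```c
/* Hypothetical model of a horizontal sum over one 256-bit accumulator
   holding eight 32-bit lanes.  The point is the order of the steps: the
   very first step folds the high 128-bit half into the low half, so every
   later step works on 128 bits only.  */
static int
hsum8_narrow_first (const int v[8])
{
  int x[4];
  int i;

  /* Step 1: narrow 256b -> 128b right away
     (e.g. vextracti128 followed by a 128-bit vpaddd).  */
  for (i = 0; i < 4; i++)
    x[i] = v[i] + v[i + 4];

  /* Step 2: fold the upper 64 bits of the 128-bit value into the lower
     64 bits (e.g. a 128-bit shuffle plus vpaddd).  */
  x[0] += x[2];
  x[1] += x[3];

  /* Step 3: fold the remaining two lanes into lane 0.  */
  return x[0] + x[1];
}
```

With v = {1, 2, 3, 4, 5, 6, 7, 8}, step 1 yields {6, 8, 10, 12}, step 2
yields {16, 20} in the low two lanes, and step 3 returns 36.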

