[Bug target/89226] codegen for copying a 512-bit object fails to use avx instructions

hjl.tools at gmail dot com gcc-bugzilla@gcc.gnu.org
Thu Aug 5 12:31:36 GMT 2021


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89226

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|NEW                         |RESOLVED
   Target Milestone|---                         |12.0

--- Comment #9 from H.J. Lu <hjl.tools at gmail dot com> ---
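Fixed for GCC 12. With -O3 -march=haswell, both copy functions now use
256-bit AVX moves (two vmovdqa load/store pairs per 512-bit object):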

[hjl@gnu-skl-2 gcc]$ ./xgcc -B./ -S -O3 -march=haswell x.cc 
[hjl@gnu-skl-2 gcc]$ cat x.s
        .file   "x.cc"
        .text
        .p2align 4
        .globl  _Z5copy1RK9dumb_pairRS_
        .type   _Z5copy1RK9dumb_pairRS_, @function
_Z5copy1RK9dumb_pairRS_:
.LFB5664:
        .cfi_startproc
        vmovdqa (%rdi), %ymm15
        vmovdqa %ymm15, (%rsi)
        vmovdqa 32(%rdi), %ymm15
        vmovdqa %ymm15, 32(%rsi)
        vzeroupper
        ret
        .cfi_endproc
.LFE5664:
        .size   _Z5copy1RK9dumb_pairRS_, .-_Z5copy1RK9dumb_pairRS_
        .p2align 4
        .globl  _Z5copy2RK10smart_pairRS_
        .type   _Z5copy2RK10smart_pairRS_, @function
_Z5copy2RK10smart_pairRS_:
.LFB5670:
        .cfi_startproc
        vmovdqa (%rdi), %ymm0
        vmovdqa 32(%rdi), %ymm1
        vmovdqa %ymm0, (%rsi)
        vmovdqa %ymm1, 32(%rsi)
        vzeroupper
        ret
        .cfi_endproc
.LFE5670:
        .size   _Z5copy2RK10smart_pairRS_, .-_Z5copy2RK10smart_pairRS_
        .ident  "GCC: (GNU) 12.0.0 20210805 (experimental) [master revision f7aa81892eb:82bfff3e5fa:c16f21c7cf97ce48967e42d3b5d22ea169a9c2c8]"
        .section        .note.GNU-stack,"",@progbits
[hjl@gnu-skl-2 gcc]$
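
The contents of x.cc are not quoted in this comment. The mangled symbols
above demangle to copy1(dumb_pair const&, dumb_pair&) and
copy2(smart_pair const&, smart_pair&), so a minimal sketch of a comparable
test case might look like the following. The struct layouts, the void return
types, the function bodies, and the check_copies helper are all assumptions
made for illustration; they are not the reporter's code, and this sketch is
not claimed to reproduce the exact assembly shown above.

```cpp
// Hypothetical stand-in for x.cc; only the function names and parameter
// types are recoverable from the mangled symbols in the listing.
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumed layout: one 64-byte (512-bit) object, 32-byte aligned so that
// aligned 256-bit moves such as vmovdqa would be legal.
struct dumb_pair {
    alignas(32) std::uint64_t words[8];
};

// Assumed layout: the same 64 bytes, split into two 32-byte halves.
struct smart_pair {
    alignas(32) std::uint64_t lo[4];
    alignas(32) std::uint64_t hi[4];
};

static_assert(sizeof(dumb_pair) == 64, "dumb_pair should be 512 bits");
static_assert(sizeof(smart_pair) == 64, "smart_pair should be 512 bits");

// Plain aggregate assignment: the compiler decides how to move the bytes.
void copy1(const dumb_pair& from, dumb_pair& to) {
    to = from;
}

// Copy the halves explicitly: read both halves, then write both halves.
void copy2(const smart_pair& from, smart_pair& to) {
    smart_pair tmp;
    std::memcpy(tmp.lo, from.lo, sizeof tmp.lo);
    std::memcpy(tmp.hi, from.hi, sizeof tmp.hi);
    std::memcpy(to.lo, tmp.lo, sizeof to.lo);
    std::memcpy(to.hi, tmp.hi, sizeof to.hi);
}

// Hypothetical helper: confirms both functions copy all 64 bytes.
void check_copies() {
    dumb_pair a{}, b{};
    smart_pair c{}, d{};
    for (int i = 0; i < 8; ++i) a.words[i] = 100u + i;
    for (int i = 0; i < 4; ++i) { c.lo[i] = 1u + i; c.hi[i] = 50u + i; }
    copy1(a, b);
    copy2(c, d);
    assert(std::memcmp(&a, &b, sizeof a) == 0);
    assert(std::memcmp(&c, &d, sizeof c) == 0);
}
```

Compiling such a file with the command line shown above (-S -O3
-march=haswell) and inspecting the resulting .s file is one way to check
which vector width a given GCC build selects for each copy.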

*** This bug has been marked as a duplicate of bug 90773 ***

