[Bug target/96562] New: Rather poor assembly generated for copy-list-initialization in return statement.

maxim.yegorushkin at gmail dot com gcc-bugzilla@gcc.gnu.org
Mon Aug 10 22:04:33 GMT 2020


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96562

            Bug ID: 96562
           Summary: Rather poor assembly generated for
                    copy-list-initialization in return statement.
           Product: gcc
           Version: 10.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: maxim.yegorushkin at gmail dot com
  Target Milestone: ---

Rather poor assembly generated for trivial code.

The following code:

    template<class P, class SizeT>
    struct Span {
        P begin_;
        SizeT size_;
    };

    Span<char*, unsigned> f(char* p, char* q) {
        return {p, static_cast<unsigned>(q - p)};
    }
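
For anyone reproducing this locally, here is a minimal sketch of the same code as a standalone translation unit. The static_assert and the check_f helper are illustrative additions, not part of the original report. They only confirm that the aggregate is trivially copyable and that f returns the expected values:

```cpp
#include <cassert>
#include <type_traits>

template<class P, class SizeT>
struct Span {
    P begin_;
    SizeT size_;
};

// Same function as in the report: copy-list-initialization in the return statement.
Span<char*, unsigned> f(char* p, char* q) {
    return {p, static_cast<unsigned>(q - p)};
}

// Added for illustration: the aggregate is trivially copyable.
static_assert(std::is_trivially_copyable<Span<char*, unsigned>>::value,
              "Span<char*, unsigned> should be trivially copyable");

// Hypothetical helper (not in the report): checks what f computes.
void check_f() {
    char buf[8] = {};
    Span<char*, unsigned> s = f(buf, buf + 5);
    assert(s.begin_ == buf);  // begin_ is the first pointer
    assert(s.size_ == 5u);    // size_ is the pointer difference
}
```

Compiling this file with "-O3 -march=skylake -mtune=skylake -S" should be enough to inspect the code generated for f; no main function is needed for that.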

When compiled with gcc-6.1 through gcc-10.2 with options "-O3 -march=skylake
-mtune=skylake", it produces unexpectedly long and sub-optimal assembly code:

    f(unsigned char*, unsigned char*):
        mov     QWORD PTR [rsp-16], 0
        mov     QWORD PTR [rsp-24], rdi
        sub     rsi, rdi
        vmovdqa xmm1, XMMWORD PTR [rsp-24]
        vpinsrd xmm0, xmm1, esi, 2
        vmovdqa XMMWORD PTR [rsp-24], xmm0
        mov     rax, QWORD PTR [rsp-24]
        mov     rdx, QWORD PTR [rsp-16]
        ret

clang with the same options produces the expected assembly:

    f(unsigned char*, unsigned char*):
        mov     rdx, rsi
        mov     rax, rdi
        sub     edx, eax
        ret

Is there a way to make gcc produce the expected assembly, please?

https://gcc.godbolt.org/z/bacGW8

