This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/58112] New: Ineffective addressing mode used in loop.


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58112

            Bug ID: 58112
           Summary: Ineffective addressing mode used in loop.
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: neleai at seznam dot cz

Hi, for the following testcase, gcc -O3 generates the following loop:

        movq    %rsi, %r9
        subq    %rdx, %r9
        movq    %r9, %rdi
        movq    %r9, %rsi
        leaq    16(%r9), %r8
        addq    $32, %rdi
        addq    $48, %rsi
        .p2align 4,,10
        .p2align 3
.L14:
        movdqu  (%rdx,%r9), %xmm0
        addq    $64, %rdx
        movdqa  %xmm0, -64(%rdx)
        movdqu  -64(%rdx,%r8), %xmm0
        movdqa  %xmm0, -48(%rdx)
        movdqu  -64(%rdx,%rdi), %xmm0
        movdqa  %xmm0, -32(%rdx)
        movdqu  -64(%rdx,%rsi), %xmm0
        movdqa  %xmm0, -16(%rdx)
        cmpq    %rdx, %rcx
        jne     .L14
        rep; ret

This saves one addq $64, %rsi instruction per iteration. However, it occupies
four extra registers, and the indexed address calculations performed on each
iteration cost more, and produce bigger code, than the instruction saved.
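
A rough C model of the two loop shapes may make the trade-off easier to see.
This is only a sketch and not the testcase from this report: the names
copy_offset_form and copy_pointer_form are made up for illustration, a 16-byte
memcpy stands in for each movdqu/movdqa pair, n is assumed to be a multiple of
64, and src and dst are assumed to point into the same array so that src - dst
is well defined in C.

```c
#include <stddef.h>
#include <string.h>

enum { BLOCK = 16, STEP = 64 };

/* Hypothetical model of the loop gcc generated: only the destination
 * pointer advances, and every load address is formed as d + offset,
 * which ties up four registers for the precomputed offsets. */
static void copy_offset_form(unsigned char *dst, const unsigned char *src,
                             size_t n)
{
    ptrdiff_t off0 = src - dst;    /* %r9  */
    ptrdiff_t off1 = off0 + 16;    /* %r8  */
    ptrdiff_t off2 = off0 + 32;    /* %rdi */
    ptrdiff_t off3 = off0 + 48;    /* %rsi */
    unsigned char *end = dst + n;  /* %rcx */

    for (unsigned char *d = dst; d != end; d += STEP) {
        memcpy(d,      d + off0, BLOCK);
        memcpy(d + 16, d + off1, BLOCK);
        memcpy(d + 32, d + off2, BLOCK);
        memcpy(d + 48, d + off3, BLOCK);
    }
}

/* Hypothetical model of the alternative: both pointers advance, costing
 * one extra add per iteration, but each access is a plain
 * base + constant displacement and no offset registers are needed. */
static void copy_pointer_form(unsigned char *dst, const unsigned char *src,
                              size_t n)
{
    unsigned char *end = dst + n;

    for (; dst != end; dst += STEP, src += STEP) {
        memcpy(dst,      src,      BLOCK);
        memcpy(dst + 16, src + 16, BLOCK);
        memcpy(dst + 32, src + 32, BLOCK);
        memcpy(dst + 48, src + 48, BLOCK);
    }
}
```

Both functions copy the same bytes; they differ only in how the source
address is formed. Whether a compiler maps either one to the instruction
sequences discussed here depends on its version and flags.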

