Bug 114823 - Missed optimization of redundant loops
Summary: Missed optimization of redundant loops
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 14.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Keywords: missed-optimization
Reported: 2024-04-23 10:11 UTC by Yi
Modified: 2024-04-25 06:09 UTC
Last reconfirmed: 2024-04-23 00:00:00


Description Yi 2024-04-23 10:11:55 UTC
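Hello. In the reduced code below, the second loop is redundant: it stores to a[] the same values the first loop already stored, and b[] is not modified in between. GCC at -O3 still emits both loops, which looks like a missed optimization.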

https://godbolt.org/z/McqeYdnfY

int a[1024];
int b[1024];

void func()
{
    for (int i = 0; i < 1024; i+=1) {
        a[i] = b[i] * 2;
    }

    for (int i = 0; i < 1024; i+=1) {
        a[i] = b[i] * 2;
    }
}

GCC -O3:
func:
        xor     eax, eax
.L2:
        movdqa  xmm0, XMMWORD PTR b[rax]
        add     rax, 16
        pslld   xmm0, 1
        movaps  XMMWORD PTR a[rax-16], xmm0
        cmp     rax, 4096
        jne     .L2
        xor     eax, eax
.L3:
        movdqa  xmm0, XMMWORD PTR b[rax]
        add     rax, 16
        pslld   xmm0, 1
        movaps  XMMWORD PTR a[rax-16], xmm0
        cmp     rax, 4096
        jne     .L3
        ret

Expected code:
func:
        xor     eax, eax
.L2:
        movdqa  xmm0, XMMWORD PTR b[rax]
        add     rax, 16
        pslld   xmm0, 1
        movaps  XMMWORD PTR a[rax-16], xmm0
        cmp     rax, 4096
        jne     .L2
        ret

Thank you very much for your time and effort! We look forward to hearing from you.
Comment 1 Richard Biener 2024-04-23 15:53:15 UTC
Confirmed. Loop fusion would detect the redundancy.
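
For illustration only, here is a hand-written C sketch of why fusion would expose the redundancy. It is not the output of any GCC pass, and the names two_loops, fused, single_loop and check_variants are hypothetical. Fusing is legal in this testcase because a and b are distinct arrays and iteration i touches only a[i] and b[i]. Once both bodies sit in one iteration, the first store to a[i] is overwritten by an identical store, so ordinary redundancy or dead store elimination could drop one of them.

```c
#include <assert.h>
#include <string.h>

#define N 1024

int a[N];
int b[N];

/* The shape from the report: two identical loops back to back. */
static void two_loops(void)
{
    for (int i = 0; i < N; i += 1)
        a[i] = b[i] * 2;
    for (int i = 0; i < N; i += 1)
        a[i] = b[i] * 2;
}

/* Hand-fused version (assumption: what a fusion pass could produce). */
static void fused(void)
{
    for (int i = 0; i < N; i += 1) {
        a[i] = b[i] * 2;   /* overwritten below by the same value */
        a[i] = b[i] * 2;
    }
}

/* After removing the duplicate store: the expected single loop. */
static void single_loop(void)
{
    for (int i = 0; i < N; i += 1)
        a[i] = b[i] * 2;
}

/* Checks that all three variants leave the same contents in a[]. */
void check_variants(void)
{
    static int ref[N];

    for (int i = 0; i < N; i += 1)
        b[i] = i - N / 2;          /* small values, no overflow in b[i] * 2 */

    memset(a, 0, sizeof a);
    two_loops();
    memcpy(ref, a, sizeof a);

    memset(a, 0, sizeof a);
    fused();
    assert(memcmp(ref, a, sizeof a) == 0);

    memset(a, 0, sizeof a);
    single_loop();
    assert(memcmp(ref, a, sizeof a) == 0);
}
```

Calling check_variants() from a small main() confirms that the three forms are observably equivalent for this input. It says nothing about how GCC would or should implement the transformation.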