[Bug target/97891] [x86] Consider using registers on large initializations

Thu Nov 19 09:33:26 GMT 2020

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97891

--- Comment #5 from andysem at mail dot ru ---
Using a register is beneficial even for bytes and words if there are multiple
mov instructions. But all of the movs have to share a single reg0.

I'm not very knowledgeable about gcc internals, but would it be beneficial to
implement this on a higher level than instruction transformation? I.e. so that
instead of this:

    a = 0;
    b = 0;
    c = 0;

we have:

    // "any" represents a type compatible with any fundamental or enum type
    any reg0 = 0;
    a = reg0;
    b = reg0;
    c = reg0;

This way, reg0 would be kept in a single register, and the xorl instruction
that zeroes it could be subject to other tree optimizations.
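
To make the idea above concrete, here is a rough source-level sketch of the
two forms. It is only an illustration: the Fields struct and both function
names are hypothetical, and a 64-bit integer stands in for the "any" type,
which does not exist in C++. The compiler is free to generate identical code
for both functions; the sketch shows the shape of the transformation, not what
gcc does today.

```cpp
#include <cstdint>

// Hypothetical aggregate with members of different widths.
struct Fields
{
    std::uint8_t a;
    std::uint16_t b;
    std::uint32_t c;
    std::uint64_t d;
};

// Current form: every store has its own zero constant.
inline void clear_with_constants(Fields& f)
{
    f.a = 0;
    f.b = 0;
    f.c = 0;
    f.d = 0;
}

// Proposed form: one zero value, narrowed for each store.
// Assumption: uint64_t plays the role of the "any" type here.
inline void clear_with_shared_zero(Fields& f)
{
    const std::uint64_t reg0 = 0;
    f.a = static_cast<std::uint8_t>(reg0);
    f.b = static_cast<std::uint16_t>(reg0);
    f.c = static_cast<std::uint32_t>(reg0);
    f.d = reg0;
}
```

Both functions leave every member equal to zero, so the second form is purely
a change in how the constant is materialized.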

With tree-level optimization, another thing to note is the vectorizer. I know
gcc can sometimes merge adjacent initializations without padding into a single
larger initialization instruction. For example:

struct A
{
    long a1;
    long a2;

    A() :
        a1(0), a2(0)
    {
    }
};

void test(A* p, unsigned int count)
{
    for (unsigned int i = 0; i < count; ++i)
    {
        p[i] = A();
    }
}

test(A*, unsigned int):
        testl   %esi, %esi
        je      .L1
        leal    -1(%rsi), %eax
        pxor    %xmm0, %xmm0
        salq    $4, %rax
        leaq    16(%rdi,%rax), %rax
.L3:
        movups  %xmm0, (%rdi)
        addq    $16, %rdi
        cmpq    %rax, %rdi
        jne     .L3
.L1:
        ret

I would like this to still work.