This is the mail archive of the mailing list for the GCC project.


Re: [RFC] load/store widening question

On 02/19/2015 12:25 PM, Ramana Radhakrishnan wrote:
On Thu, Feb 19, 2015 at 9:17 AM, Marat Zakirov wrote:
Hi all!

During my investigation I found that GCC does not perform load/store
widening ( Could you
please confirm whether this is so? Are there any plans to implement it? I would
also like to know whether there is any need to do load/store widening in the
ASan phase alone, just to reduce the number of ASAN_CHECK calls.

Example from the bug:

$ cat t2.c

int a[2];
int b[2];

int main ()
{
   b[0] = a[0];
   b[1] = a[1];
   return 0;
}
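
For comparison, the widening being asked about can be written by hand. This is only a sketch, assuming int is 4 bytes so that each array occupies exactly 8 bytes; copy_widened is a hypothetical helper name, not something from the bug report. The two memcpy calls express one 8-byte load and one 8-byte store without violating aliasing rules, and compilers typically lower them to single 64-bit moves when optimizing.

```c
#include <stdint.h>
#include <string.h>

int a[2];
int b[2];

/* Assumption: int is 4 bytes, so each array is exactly one 8-byte unit. */
_Static_assert(sizeof(a) == sizeof(uint64_t), "int a[2] must be 8 bytes");

/* Hypothetical helper: copy both elements of a into b with a single
   8-byte load and a single 8-byte store instead of two 4-byte pairs. */
void copy_widened(void)
{
    uint64_t tmp;
    memcpy(&tmp, a, sizeof tmp);   /* one 8-byte load from a */
    memcpy(b, &tmp, sizeof tmp);   /* one 8-byte store to b  */
}
```

Calling copy_widened() leaves b[0] == a[0] and b[1] == a[1], the same result as the two separate assignments in t2.c.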

The answer is that it depends. GCC's SLP vectorizer can spot this generically
across ports, as in the example below.

AArch64 :

     adrp    x0, a    // 5    *movdi_aarch64/11    [length = 4]
     add    x0, x0, :lo12:a    // 6    add_losym_di    [length = 4]
     adrp    x1, b    // 8    *movdi_aarch64/11    [length = 4]
     add    x1, x1, :lo12:b    // 9    add_losym_di    [length = 4]
     ldr    d0, [x0]    // 7    *aarch64_simd_movv2si/1    [length = 4]
     mov    w0, 0    // 15    *movsi_aarch64/4    [length = 4]
     str    d0, [x1]    // 10    *aarch64_simd_movv2si/2    [length = 4]
     ret    // 40    simple_return    [length = 4]

Or, on AArch32 without NEON, the standard ldm / ldrd peepholes spot this.

     @ args = 0, pretend = 0, frame = 0
     @ frame_needed = 0, uses_anonymous_args = 0
     @ link register save eliminated.
     movw    r2, #:lower16:a
     movw    r3, #:lower16:b
     movt    r2, #:upper16:a
     movt    r3, #:upper16:b
     ldmia    r2, {r1, r2}
     mov    r0, #0
     stmia    r3, {r1, r2}
     bx    lr

It will be interesting to see if the number of checks can be reduced,
but I suspect you'll hit quite a few phase ordering issues and you'll
need to handle quite a few differences between ports to make this work sensibly.
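
To make the possible saving concrete, here is a toy model of the check count. It is a sketch under stated assumptions, not GCC's or libsanitizer's actual code: toy_shadow, toy_check, count_narrow and count_wide are hypothetical names, and a and b are modelled as 8-byte-aligned objects at offsets 0 and 8 of a small region. It follows the usual AddressSanitizer scheme of one shadow byte per 8-byte granule, where 0 means the whole granule is addressable, k in 1..7 means only the first k bytes are, and a negative value means none are.

```c
#include <stddef.h>
#include <stdint.h>

enum { GRANULE = 8, TOY_BYTES = 64 };

/* Toy shadow memory: one byte per 8-byte granule, all zero initially. */
static int8_t toy_shadow[TOY_BYTES / GRANULE];
static unsigned toy_check_count;

/* Returns 1 if the naturally aligned access [off, off + size) is allowed.
   Each call stands in for one inserted check. */
static int toy_check(size_t off, size_t size)
{
    int8_t s = toy_shadow[off / GRANULE];
    toy_check_count++;
    if (s == 0)
        return 1;                  /* whole granule addressable */
    if (s < 0)
        return 0;                  /* whole granule poisoned */
    return (off % GRANULE) + size <= (size_t)s;  /* partial granule */
}

/* Two 4-byte loads from a plus two 4-byte stores to b. */
static unsigned count_narrow(void)
{
    toy_check_count = 0;
    (void)toy_check(0, 4);
    (void)toy_check(4, 4);
    (void)toy_check(8, 4);
    (void)toy_check(12, 4);
    return toy_check_count;
}

/* One 8-byte load from a plus one 8-byte store to b. */
static unsigned count_wide(void)
{
    toy_check_count = 0;
    (void)toy_check(0, 8);
    (void)toy_check(8, 8);
    return toy_check_count;
}
```

In this model the narrow copy needs four checks and the widened copy two. The saving depends on the 8-byte alignment assumption: an unaligned 8-byte access could straddle two shadow granules, and a single shadow-byte lookup would no longer cover it.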


$ gcc t2.c -O3 -S

$ cat t2.s


         movl    a(%rip), %eax
         movl    %eax, b(%rip)
         movl    a+4(%rip), %eax
         movl    %eax, b+4(%rip)
         xorl    %eax, %eax

I would very much appreciate your answers and thoughts.


Thank you very much Ramana.
I would also like the x86 maintainers to explain why GCC on x86 didn't handle the given example.

