[Bug tree-optimization/78821] GCC7: Copying whole 32 bits structure field by field not optimised into copying whole 32 bits at once

Fri Dec 16 11:51:00 GMT 2016

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78821

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-12-16
                 CC|                            |rguenth at gcc dot gnu.org
          Component|c                           |tree-optimization
            Version|unknown                     |7.0
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.  We have several passes doing the necessary analyses but none with
the goal of eliminating this case.  The closest match is probably the bswap
pass, which has related missed-optimization bugs for bswap via memory.  The
store-merging pass OTOH has the "sink" analysis part -- identifying adjacent
stores.

Together the passes' analyses could handle this case (and bswap via memory).
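
For readers unfamiliar with the term, "bswap via memory" can be pictured
roughly as below.  This is only an illustrative sketch, not a testcase from
any of the related bugs; the function name swap_via_memory and the use of
temporary byte arrays are assumptions made for the example.  The four
adjacent byte stores into out[] are the kind of pattern the store-merging
"sink" analysis identifies, while recognising the byte permutation is the
bswap pass's side of the job.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical example: reverse the bytes of a 32-bit value by going
   through memory one byte at a time instead of using shifts and masks. */
uint32_t swap_via_memory(uint32_t x)
{
    unsigned char in[4], out[4];
    uint32_t r;

    memcpy(in, &x, sizeof in);   /* spill the value to memory */

    /* Four adjacent one-byte stores, each fed by a one-byte load. */
    out[0] = in[3];
    out[1] = in[2];
    out[2] = in[1];
    out[3] = in[0];

    memcpy(&r, out, sizeof r);   /* reload the permuted bytes */
    return r;
}
```

Ideally the whole body would collapse to a single byte-swap operation on x.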

This is another case that would probably benefit from moving load/store
analysis and dataflow to some common code.

Note that SLP vectorization on 32-bit with SSE disabled would handle the
case in this bug (another pass with some of the required analysis).  It's
not done at the moment because of a bug in alignment analysis (I thought
I had fixed that ...).  Really fixing that yields

fct:
.LFB0:
        .cfi_startproc
        movl    v, %eax
        movl    %eax, u
        ret
        .cfi_endproc
.LFE0:
        .size   fct, .-fct
        .p2align 4,,15
        .globl  fct2
        .type   fct2, @function
fct2:
.LFB1:
        .cfi_startproc
        movl    v, %eax
        movl    %eax, u
        ret

with -O3 -m32 -mno-sse (yeah, neither the BB vectorizer nor the backend
is very clever in the "vector" sizes it tries/allows).
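
For reference, source of roughly the following shape would correspond to the
two functions in the assembly above.  This is a sketch under assumptions,
not the reporter's testcase: the names fct, fct2, u and v come from the
assembly, but the struct name s32, its four one-byte fields, and which
function does which kind of copy are guesses made for illustration.

```c
/* Assumed layout: four one-byte fields, 32 bits in total. */
struct s32 {
    unsigned char a, b, c, d;
};

struct s32 u, v;

/* Field-by-field copy: four byte loads and four adjacent byte stores,
   which ideally become one 32-bit load and one 32-bit store. */
void fct(void)
{
    u.a = v.a;
    u.b = v.b;
    u.c = v.c;
    u.d = v.d;
}

/* Whole-struct copy, for comparison. */
void fct2(void)
{
    u = v;
}
```

With the optimization applied, both functions reduce to the same pair of
movl instructions, as in the output quoted above.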