struct u1
{
  float x;
  float y;
};

float foo(struct u1 u)
{
  return u.x + u.y;
}

compiles into

gcc-4.5 -O3 tgcc.c -c -o tgcc.o

   0x0000000000000000 <+0>:     movq   %xmm0,-0x20(%rsp)
   0x0000000000000006 <+6>:     mov    -0x20(%rsp),%rax
   0x000000000000000b <+11>:    mov    %eax,-0x14(%rsp)
   0x000000000000000f <+15>:    shr    $0x20,%rax
   0x0000000000000013 <+19>:    mov    %eax,-0x10(%rsp)
   0x0000000000000017 <+23>:    movss  -0x14(%rsp),%xmm0
   0x000000000000001d <+29>:    addss  -0x10(%rsp),%xmm0
   0x0000000000000023 <+35>:    retq

The instructions dealing with rax/eax can be elided if the movss and addss load from the correct stack locations, that is, directly from -0x20(%rsp) and -0x1c(%rsp), where the initial movq has already stored both floats.

A better sequence, avoiding memory, might be

   pshufd  $1, %xmm0, %xmm1
   addss   %xmm1, %xmm0
   retq
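For illustration, the register-only sequence suggested above can be sketched in C with SSE2 intrinsics. This is only a sketch under assumptions: `sum_lanes` is a hypothetical name, the struct is rebuilt into a vector with `_mm_setr_ps` rather than arriving in %xmm0 as the ABI would deliver it, and there is no guarantee that any given compiler turns it into exactly the pshufd/addss pair. On targets without SSE2 it falls back to the plain addition.

```c
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics; also pulls in the SSE ones */
#endif

struct u1 { float x; float y; };

/* Hypothetical helper, not part of the original test case. */
static float sum_lanes(struct u1 u)
{
#if defined(__SSE2__)
    /* Model the incoming register: x in lane 0, y in lane 1. */
    __m128 v = _mm_setr_ps(u.x, u.y, 0.0f, 0.0f);

    /* pshufd with immediate 1 copies lane 1 of the source into lane 0. */
    __m128 hi = _mm_castsi128_ps(
        _mm_shuffle_epi32(_mm_castps_si128(v), 1));

    /* addss: add only the low lanes, giving x + y in lane 0. */
    return _mm_cvtss_f32(_mm_add_ss(v, hi));
#else
    /* No SSE2 available: plain scalar addition. */
    return u.x + u.y;
#endif
}
```

Calling `sum_lanes((struct u1){1.5f, 2.25f})` yields the same value as the original `foo`; the point is only that neither lane needs a round trip through the stack.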
Confirmed.  We already expand in this funny way:

;; Generating RTL for gimple basic block 2

;; return D.2720_3;

(insn 6 5 7 t.c:8 (set (reg:SI 65)
        (subreg:SI (reg/v:DI 62 [ u ]) 0)) -1 (nil))

(insn 7 6 8 t.c:8 (parallel [
            (set (reg:DI 66)
                (ashiftrt:DI (reg/v:DI 62 [ u ])
                    (const_int 32 [0x20])))
            (clobber (reg:CC 17 flags))
        ]) -1 (nil))

(insn 8 7 9 t.c:8 (set (reg:SI 67)
        (subreg:SI (reg:DI 66) 0)) -1 (nil))

(insn 9 8 10 t.c:8 (set (reg:SF 63)
        (plus:SF (subreg:SF (reg:SI 65) 0)
            (subreg:SF (reg:SI 67) 0))) -1 (nil))

(insn 10 9 11 t.c:8 (set (reg:SF 61 [ <retval> ])
        (reg:SF 63)) -1 (nil))
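As a reading aid, the RTL above can be modelled roughly in C: the struct is treated as one 64-bit integer, split into two 32-bit halves, and each half is reinterpreted as a float. This is a sketch of the dump, not GCC's code; `foo_as_expanded` and its local names are hypothetical, and the mapping of x to the low half assumes a little-endian target (the sum is the same either way).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct u1 { float x; float y; };

/* Hypothetical model of insns 6-10; not taken from GCC. */
static float foo_as_expanded(struct u1 u)
{
    uint64_t whole;   /* plays the role of (reg/v:DI 62 [ u ]) */
    uint32_t lo, hi;  /* play the roles of (reg:SI 65) and (reg:SI 67) */
    float a, b;

    assert(sizeof u == sizeof whole);
    memcpy(&whole, &u, sizeof whole);

    lo = (uint32_t)whole;           /* insn 6: low SImode subreg */
    hi = (uint32_t)(whole >> 32);   /* insns 7-8: shift by 32, take low SI */

    memcpy(&a, &lo, sizeof a);      /* (subreg:SF (reg:SI 65) 0) */
    memcpy(&b, &hi, sizeof b);      /* (subreg:SF (reg:SI 67) 0) */

    return a + b;                   /* insn 9: plus:SF */
}
```

The model uses a logical shift where the dump has ashiftrt; only the low 32 bits of the shifted value are used, so the two agree here. Moving the value through integer registers is what later forces the stack traffic seen in the assembly.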
Reloads are happening, so this is really a target issue.
For what it's worth, this generates similarly terrible code on ARM with gcc 5.2.1. It also occurs if the struct members are ints:

struct u1 { int x, y; };

int foo(struct u1 u)
{
  return u.x + u.y;
}

% arm-none-eabi-gcc -O3 -S foo.c

foo:
        sub     sp, sp, #8
        add     r3, sp, #8
        stmdb   r3, {r0, r1}
        ldmia   sp, {r0, r3}
        add     r0, r0, r3
        add     sp, sp, #8
        bx      lr

% arm-none-eabi-gcc --version
arm-none-eabi-gcc (GNU Tools for ARM Embedded Processors) 5.2.1 20151202 (release) [ARM/embedded-5-branch revision 231848]