Bug 43640 - Struct with two floats generates poor code
Summary: Struct with two floats generates poor code
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: unknown
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks: argument, return
Reported: 2010-04-04 06:40 UTC by Steven Fuerst
Modified: 2021-08-16 01:16 UTC (History)

See Also:
Host: x86_64-linux
Target: x86_64-linux
Build: x86_64-linux
Known to work:
Known to fail:
Last reconfirmed: 2010-04-04 19:07:13


Description Steven Fuerst 2010-04-04 06:40:39 UTC
struct u1
{
	float x;
	float y;
};

float foo(struct u1 u)
{
	return u.x + u.y;
}

When compiled with
gcc-4.5 -O3 tgcc.c -c -o tgcc.o
this produces:

   0x0000000000000000 <+0>:     movq   %xmm0,-0x20(%rsp)
   0x0000000000000006 <+6>:     mov    -0x20(%rsp),%rax
   0x000000000000000b <+11>:    mov    %eax,-0x14(%rsp)
   0x000000000000000f <+15>:    shr    $0x20,%rax
   0x0000000000000013 <+19>:    mov    %eax,-0x10(%rsp)
   0x0000000000000017 <+23>:    movss  -0x14(%rsp),%xmm0
   0x000000000000001d <+29>:    addss  -0x10(%rsp),%xmm0
   0x0000000000000023 <+35>:    retq

The instructions dealing with rax/eax can be elided if the movss and addss load directly from the slot the movq already wrote, i.e. from -0x20(%rsp) for x and -0x1c(%rsp) for y.

A better sequence, avoiding memory, might be

pshufd $1, %xmm0, %xmm1
addss %xmm1, %xmm0
retq
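
For illustration, the register-only sequence can be approximated with SSE2 intrinsics. This is only a sketch, not what GCC emits: the function names sum_low_pair and foo_sse are made up for this example, and it assumes an x86 or x86-64 target where the two floats sit in the low 64 bits of an XMM register, as they do when struct u1 is passed in %xmm0.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; x86/x86-64 only */

struct u1 {
    float x;
    float y;
};

/* Hypothetical helper: sums the two floats held in lanes 0 and 1 of v. */
float sum_low_pair(__m128 v)
{
    /* pshufd with immediate 1: copy lane 1 (y) into lane 0 of a second register. */
    __m128 y = _mm_castsi128_ps(_mm_shuffle_epi32(_mm_castps_si128(v), 1));
    /* addss: add the low lanes only, then read the low lane back as a float. */
    return _mm_cvtss_f32(_mm_add_ss(v, y));
}

/* Hypothetical wrapper: builds the register layout by hand, then reuses the helper. */
float foo_sse(struct u1 u)
{
    /* Lanes are (x, y, 0, 0), matching the struct packed into the low 64 bits. */
    __m128 v = _mm_setr_ps(u.x, u.y, 0.0f, 0.0f);
    return sum_low_pair(v);
}
```

Calling foo_sse on a struct holding 1.5f and 2.25f should return the same sum as the plain u.x + u.y, without any integer-register or stack round trip in the helper itself.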
Comment 1 Richard Biener 2010-04-04 19:07:13 UTC
Confirmed.  We already expand in this funny way:

;; Generating RTL for gimple basic block 2

;; return D.2720_3;

(insn 6 5 7 t.c:8 (set (reg:SI 65)
        (subreg:SI (reg/v:DI 62 [ u ]) 0)) -1 (nil))

(insn 7 6 8 t.c:8 (parallel [
            (set (reg:DI 66)
                (ashiftrt:DI (reg/v:DI 62 [ u ])
                    (const_int 32 [0x20])))
            (clobber (reg:CC 17 flags))
        ]) -1 (nil))

(insn 8 7 9 t.c:8 (set (reg:SI 67)
        (subreg:SI (reg:DI 66) 0)) -1 (nil))

(insn 9 8 10 t.c:8 (set (reg:SF 63)
        (plus:SF (subreg:SF (reg:SI 65) 0)
            (subreg:SF (reg:SI 67) 0))) -1 (nil))

(insn 10 9 11 t.c:8 (set (reg:SF 61 [ <retval> ])
        (reg:SF 63)) -1 (nil))
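
Roughly, insns 6 to 9 correspond to the following C. This is a hand-written sketch to make the dump easier to read, not output from the compiler: the name foo_as_expanded is invented here, and it assumes a little-endian target where struct u1 is exactly 8 bytes with x in the low half. The RTL uses an arithmetic shift (ashiftrt) where the sketch uses an unsigned shift; only the low 32 bits of the result are used, and those are the same either way.

```c
#include <stdint.h>
#include <string.h>

struct u1 {
    float x;
    float y;
};

/* Hypothetical helper mirroring the expansion: the struct is handled as
   one 64-bit integer, split into two 32-bit halves, and each half is
   then reinterpreted as a float. */
float foo_as_expanded(struct u1 u)
{
    uint64_t bits;    /* plays the role of (reg:DI 62) */
    uint32_t lo, hi;  /* play the roles of (reg:SI 65) and (reg:SI 67) */
    float x, y;

    memcpy(&bits, &u, sizeof bits);
    lo = (uint32_t)bits;          /* insn 6: low 32-bit subreg of the DI value */
    hi = (uint32_t)(bits >> 32);  /* insns 7-8: shift right by 32, take the low subreg */
    memcpy(&x, &lo, sizeof x);    /* subreg:SF view of reg 65 */
    memcpy(&y, &hi, sizeof y);    /* subreg:SF view of reg 67 */
    return x + y;                 /* insn 9: plus:SF */
}
```

The detour through integer registers is what later turns into the stores and reloads through the stack in the final code.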
Comment 2 Andrew Pinski 2010-04-09 03:10:10 UTC
Reloads are happening so this is a target issue really.
Comment 3 Seth LaForge 2016-03-10 19:57:14 UTC
For what it's worth, this generates similarly terrible code on ARM with gcc 5.2.1. It also occurs if the struct members are ints:

struct u1 { int x, y; };
int foo(struct u1 u) { return u.x + u.y; }

% arm-none-eabi-gcc -O3 -S foo.c

foo:
	sub	sp, sp, #8
	add	r3, sp, #8
	stmdb	r3, {r0, r1}
	ldmia	sp, {r0, r3}
	add	r0, r0, r3
	add	sp, sp, #8
	bx	lr

% arm-none-eabi-gcc --version
arm-none-eabi-gcc (GNU Tools for ARM Embedded Processors) 5.2.1 20151202 (release) [ARM/embedded-5-branch revision 231848]