Bug 70763 - Use SSE for DImode load/store
Summary: Use SSE for DImode load/store
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 7.0
Importance: P3 enhancement
Target Milestone: 8.0
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2016-04-22 14:30 UTC by H.J. Lu
Modified: 2022-01-11 10:29 UTC

See Also:
Host:
Target: i386
Build:
Known to work:
Known to fail:
Last reconfirmed: 2016-04-27 00:00:00


Description H.J. Lu 2016-04-22 14:30:27 UTC
On i386, we should use SSE for DImode load/store:

[hjl@gnu-6 pr70155d]$ cat x1.i
extern long long a, b;

void
foo (void)
{
  a = b;
}
[hjl@gnu-6 pr70155d]$ cat x2.i
struct foo
{
  long long i;
}__attribute__ ((packed));

extern struct foo x, y;

void
foo (void)
{
  x = y;
}
[hjl@gnu-6 pr70155d]$ cat x5.i
extern long long a;

void
foo (void)
{
  a = 0;
}
[hjl@gnu-6 pr70155d]$ cat x6.i
extern long long a;

void
foo (void)
{
  a = -1;
}
[hjl@gnu-6 pr70155d]$ make x1.s x2.s x5.s x6.s
/export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/gcc/build-x86_64-linux/gcc/ -O2 -msse2 -m32 -fno-asynchronous-unwind-tables -S -o x1.s x1.i
/export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/gcc/build-x86_64-linux/gcc/ -O2 -msse2 -m32 -fno-asynchronous-unwind-tables -S -o x2.s x2.i
/export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/gcc/build-x86_64-linux/gcc/ -O2 -msse2 -m32 -fno-asynchronous-unwind-tables -S -o x5.s x5.i
/export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/gcc/build-x86_64-linux/gcc/ -O2 -msse2 -m32 -fno-asynchronous-unwind-tables -S -o x6.s x6.i
[hjl@gnu-6 pr70155d]$ cat  x1.s x2.s x5.s x6.s
	.file	"x1.i"
	.text
	.p2align 4,,15
	.globl	foo
	.type	foo, @function
foo:
	movl	b, %eax
	movl	b+4, %edx
	movl	%eax, a
	movl	%edx, a+4
	ret
	.size	foo, .-foo
	.ident	"GCC: (GNU) 7.0.0 20160422 (experimental)"
	.section	.note.GNU-stack,"",@progbits
	.file	"x2.i"
	.text
	.p2align 4,,15
	.globl	foo
	.type	foo, @function
foo:
	movl	y, %eax
	movl	y+4, %edx
	movl	%eax, x
	movl	%edx, x+4
	ret
	.size	foo, .-foo
	.ident	"GCC: (GNU) 7.0.0 20160422 (experimental)"
	.section	.note.GNU-stack,"",@progbits
	.file	"x5.i"
	.text
	.p2align 4,,15
	.globl	foo
	.type	foo, @function
foo:
	movl	$0, a
	movl	$0, a+4
	ret
	.size	foo, .-foo
	.ident	"GCC: (GNU) 7.0.0 20160422 (experimental)"
	.section	.note.GNU-stack,"",@progbits
	.file	"x6.i"
	.text
	.p2align 4,,15
	.globl	foo
	.type	foo, @function
foo:
	movl	$-1, a
	movl	$-1, a+4
	ret
	.size	foo, .-foo
	.ident	"GCC: (GNU) 7.0.0 20160422 (experimental)"
	.section	.note.GNU-stack,"",@progbits
[hjl@gnu-6 pr70155d]$ 

All four cases could use SSE load/store instead of pairs of 32-bit moves.
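
For illustration only, here is a hand-written sketch of what x1.i and x2.i are asking for, using SSE2 intrinsics from <emmintrin.h>. It is not compiler output. The function names copy_scalar and copy_packed are made up for this sketch, and the variables are defined here rather than declared extern so that the file links on its own. On a typical build with -msse2 each intrinsic is expected to become a single 64-bit movq, but the exact instruction selection is up to the compiler.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; build with -msse2 on i386 */

/* Defined here (not extern, as in the report) so the sketch is self-contained. */
long long a, b;

struct foo
{
  long long i;
}__attribute__ ((packed));

struct foo x, y;

/* x1.i done by hand: one 64-bit load and one 64-bit store
   instead of two 32-bit loads and two 32-bit stores. */
void
copy_scalar (void)
{
  __m128i v = _mm_loadl_epi64 ((const __m128i *) &b);
  _mm_storel_epi64 ((__m128i *) &a, v);
}

/* x2.i done by hand: these two intrinsics have no alignment
   requirement, so the packed struct is copied the same way. */
void
copy_packed (void)
{
  __m128i v = _mm_loadl_epi64 ((const __m128i *) &y);
  _mm_storel_epi64 ((__m128i *) &x, v);
}
```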
Comment 1 Andrew Pinski 2022-01-11 10:29:03 UTC
x5 and x6 still don't use SSE, but that is OK: with SSE it would take one instruction to create the 0/-1 and then one store, whereas right now it is just two stores.

So closing as fixed for GCC 8.
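
A rough sketch of the trade-off for x5.i and x6.i, again written by hand with SSE2 intrinsics rather than taken from compiler output. The names store_zero and store_minus_one are hypothetical, and a is defined locally so the example links on its own. Each function first materializes the constant in an SSE register and then performs one 64-bit store, which is the "one instruction plus one store" sequence weighed against the existing two 32-bit immediate stores.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; build with -msse2 on i386 */

/* Defined here (not extern, as in the report) so the sketch is self-contained. */
long long a;

/* x5.i via SSE: create an all-zero register, then store its low 64 bits. */
void
store_zero (void)
{
  __m128i zero = _mm_setzero_si128 ();
  _mm_storel_epi64 ((__m128i *) &a, zero);
}

/* x6.i via SSE: create an all-ones register, then store its low 64 bits. */
void
store_minus_one (void)
{
  __m128i ones = _mm_set1_epi32 (-1);
  _mm_storel_epi64 ((__m128i *) &a, ones);
}
```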