This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/48701] New: [missed optimization] GCC fails to use aliasing of ymm and xmm registers
- From: "kretz at kde dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Wed, 20 Apr 2011 13:26:02 +0000
- Subject: [Bug target/48701] New: [missed optimization] GCC fails to use aliasing of ymm and xmm registers
- Auto-submitted: auto-generated
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48701
Summary: [missed optimization] GCC fails to use aliasing of ymm and xmm registers
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: kretz@kde.org
The two functions in the attached test case demonstrate the problem. Because
each xmm register aliases the low 128 bits of the corresponding ymm register,
the intermediate stores/loads on the stack are unnecessary and should be
optimized away.
Current testStore output:
vmovdqa %xmm1,-0x30(%rsp)
vmovdqa %xmm0,-0x20(%rsp)
vmovdqa -0x30(%rsp),%ymm0
vmovdqa %ymm0,(<blackhole>)
should be either:
vinsertf128 $1,%xmm0,%ymm1,%ymm0
vmovdqa %ymm0,(<blackhole>)
or:
vmovdqa %xmm1,(<blackhole>)
vmovdqa %xmm0,0x10(<blackhole>)
depending on the target microarchitecture and accompanying code.
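
To make the testStore equivalence concrete, here is a minimal sketch in plain C
that models the registers as byte arrays. It is not the attached test case; the
types xmm_t and ymm_t and every function name are hypothetical, chosen only for
illustration. It checks that the stack round trip produces the same 256-bit
value as a single vinsertf128 into the high lane.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical models: a 128-bit xmm value and a 256-bit ymm value. */
typedef struct { uint8_t b[16]; } xmm_t;
typedef struct { uint8_t b[32]; } ymm_t;

/* Models the reported code: two 128-bit spills, then one 256-bit reload. */
static ymm_t combine_via_stack(xmm_t lo, xmm_t hi)
{
    uint8_t slot[32];
    ymm_t out;
    memcpy(slot, lo.b, 16);        /* vmovdqa %xmm1,-0x30(%rsp) */
    memcpy(slot + 16, hi.b, 16);   /* vmovdqa %xmm0,-0x20(%rsp) */
    memcpy(out.b, slot, 32);       /* vmovdqa -0x30(%rsp),%ymm0 */
    return out;
}

/* Models vinsertf128 $1: keep the low lane of src, replace the high lane. */
static ymm_t insertf128_hi(ymm_t src, xmm_t hi)
{
    ymm_t out = src;
    memcpy(out.b + 16, hi.b, 16);
    return out;
}

/* Returns 1 when both paths yield the same 256-bit value. */
int store_paths_agree(void)
{
    xmm_t lo, hi;
    ymm_t src;
    for (int i = 0; i < 16; i++) {
        lo.b[i] = (uint8_t)i;
        hi.b[i] = (uint8_t)(0x80 + i);
    }
    /* %ymm1 aliases %xmm1: its low lane is lo. The high lane is filled with
       an arbitrary pattern because the insert overwrites it anyway. */
    memset(src.b, 0xEE, sizeof src.b);
    memcpy(src.b, lo.b, 16);

    ymm_t a = combine_via_stack(lo, hi);
    ymm_t b = insertf128_hi(src, hi);
    return memcmp(a.b, b.b, sizeof a.b) == 0;
}
```

The model only captures data movement. It says nothing about which of the two
suggested sequences is faster on a given microarchitecture.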
Likewise, the current testLoad output is:
vmovdqa (<blackhole>),%ymm0
vmovdqa %ymm0,-0x30(%rsp)
vmovdqa -0x20(%rsp),%xmm1
vmovdqa -0x30(%rsp),%xmm0
and should be either:
vmovdqa (<blackhole>),%ymm0
vextractf128 $1,%ymm0,%xmm1
or:
vmovdqa (<blackhole>),%xmm0
vmovdqa 0x10(<blackhole>),%xmm1
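
The testLoad case can be sketched the same way in plain C. Again this is only a
model under stated assumptions, not the attached test case: xmm_t, ymm_t and
the function names are hypothetical. It checks that spilling the ymm value and
reloading its halves gives the same two 128-bit values as reading the low lane
through the register alias and extracting the high lane with vextractf128.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical models: a 128-bit xmm value and a 256-bit ymm value. */
typedef struct { uint8_t b[16]; } xmm_t;
typedef struct { uint8_t b[32]; } ymm_t;

/* Models the reported code: one 256-bit spill, then two 128-bit reloads. */
static void split_via_stack(ymm_t v, xmm_t *lo, xmm_t *hi)
{
    uint8_t slot[32];
    memcpy(slot, v.b, 32);          /* vmovdqa %ymm0,-0x30(%rsp) */
    memcpy(hi->b, slot + 16, 16);   /* vmovdqa -0x20(%rsp),%xmm1 */
    memcpy(lo->b, slot, 16);        /* vmovdqa -0x30(%rsp),%xmm0 */
}

/* Models the suggested code: no memory round trip. */
static void split_via_alias(ymm_t v, xmm_t *lo, xmm_t *hi)
{
    memcpy(lo->b, v.b, 16);         /* free: %xmm0 is the low lane of %ymm0 */
    memcpy(hi->b, v.b + 16, 16);    /* vextractf128 $1,%ymm0,%xmm1 */
}

/* Returns 1 when both paths yield the same pair of 128-bit values. */
int load_paths_agree(void)
{
    ymm_t v;
    xmm_t lo_a, hi_a, lo_b, hi_b;
    for (int i = 0; i < 32; i++)
        v.b[i] = (uint8_t)(3 * i + 1);

    split_via_stack(v, &lo_a, &hi_a);
    split_via_alias(v, &lo_b, &hi_b);
    return memcmp(lo_a.b, lo_b.b, 16) == 0
        && memcmp(hi_a.b, hi_b.b, 16) == 0;
}
```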