[Bug rtl-optimization/50567] New: Reload pass generates sub-optimal spill code for registers in presence of a vec_concat insn
siddhesh.poyarekar at gmail dot com
gcc-bugzilla@gcc.gnu.org
Thu Sep 29 15:25:00 GMT 2011
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50567
Bug #: 50567
Summary: Reload pass generates sub-optimal spill code for
registers in presence of a vec_concat insn
Classification: Unclassified
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: siddhesh.poyarekar@gmail.com
Reduced program:
typedef long long __m128i __attribute__ ((__vector_size__ (16)));

__m128i process(char *mem1, char *mem2)
{
    long long frag1, frag2;

    frag2 = frag1 = *((long long *) mem1);
    if (mem2 > mem1)
        frag2 = *((long long *) mem2);
    return (__m128i){frag2, frag1};
}
The generated code contains redundant spills introduced during the reload pass;
IRA does not spill anything:
process:
.LFB0:
        .cfi_startproc
        movq    (%rdi), %rax
        cmpq    %rsi, %rdi
        movq    %rax, %rdx
        jae     .L2
        movq    (%rsi), %rdx
.L2:
        movq    %rdx, -16(%rsp)     <== here onwards
        movq    -16(%rsp), %xmm1
        pinsrq  $1, %rax, %xmm1
        movdqa  %xmm1, %xmm0
        ret
This seems to happen because the pinsrq instruction (the vec_concat
implementation for x86_64) takes an SSE register as both its input and its
output operand. As a result, the reload pass generates spill code to move %rdx
to %xmm1 through the stack, as well as the move from %xmm1 to %xmm0.
Ideally, the code generated should look like this:
process:
.LFB0:
        .cfi_startproc
        movq    (%rdi), %rax
        cmpq    %rsi, %rdi
        movq    %rax, %rdx
        jae     .L2
        movq    (%rsi), %rdx
.L2:
        movq    %rdx, %xmm0
        pinsrq  $1, %rax, %xmm0
        ret
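
Both sequences above must return the same vector, so a small self-check can
help confirm that a change in the generated code does not alter the result.
The following is only a sketch: the type name v2di and the helpers lanes_are
and check_process are hypothetical and not part of the report, and the two
pointers are taken from one array so that the mem2 > mem1 comparison is well
defined.

```c
#include <assert.h>

/* Hypothetical name; the report calls this type __m128i. */
typedef long long v2di __attribute__ ((__vector_size__ (16)));

/* Same logic as the reduced program in the report. */
v2di process(char *mem1, char *mem2)
{
    long long frag1, frag2;

    frag2 = frag1 = *((long long *) mem1);
    if (mem2 > mem1)
        frag2 = *((long long *) mem2);
    return (v2di){frag2, frag1};
}

/* Hypothetical helper: returns 1 if both lanes hold the expected values. */
int lanes_are(v2di v, long long lane0, long long lane1)
{
    return v[0] == lane0 && v[1] == lane1;
}

/* Hypothetical self-check covering both sides of the branch. */
void check_process(void)
{
    long long buf[2] = { 11, 22 };
    char *lo = (char *) &buf[0];
    char *hi = (char *) &buf[1];

    /* mem2 > mem1: lane 0 is loaded from mem2, lane 1 from mem1. */
    assert(lanes_are(process(lo, hi), 22, 11));

    /* mem2 <= mem1: both lanes come from mem1. */
    assert(lanes_are(process(hi, lo), 22, 22));
}
```

Calling check_process() from a main function exercises both paths. The report
does not state which compiler options were used, so any optimization level or
SSE4.1 option needed to reproduce the pinsrq sequence is an assumption.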