This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug rtl-optimization/80425] New: Extra inter-unit register move with zero-extension
- From: "ubizjak at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 14 Apr 2017 06:41:30 +0000
- Auto-submitted: auto-generated
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80425
Bug ID: 80425
Summary: Extra inter-unit register move with zero-extension
Product: gcc
Version: 7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
The testcase is taken from PR80381:
--cut here--
#include <x86intrin.h>
__m512i
f1 (__m512i x, int a)
{
  return _mm512_srai_epi32 (x, a);
}
--cut here--
Compiling with -O2 -mavx512f -mtune=intel produces the following assembly:
f1:
        movl %edi, %edi # 8 *zero_extendsidi2/4 [length = 2]
        vmovq %rdi, %xmm1 # 21 *movdi_internal/20 [length = 6]
        vpsrad %xmm1, %zmm0, %zmm0 # 13 ashrv16si3/1 [length = 6]
        ret # 24 simple_return_internal [length = 1]
(insn 8) and (insn 21) could be merged into a single instruction:
        vmovd %edi, %xmm1 # 13 *zero_extendsidi2/10 [length = 6]
The register allocator somehow avoids zero-extending directly into an SSE
register in (insn 8) and instead generates an input reload (insn 21) for
(insn 13):
Inserting insn reload before:
21: r100:DI=r196:DI
...
Choosing alt 19 in insn 21: (0) ?*Yi (1) r {*movdi_internal}
RA could choose the same (?*Yi, r) alternative in (insn 12).
The REE pass also doesn't merge (insn 8) and (insn 21).
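The merge of (insn 8) and (insn 21) proposed above depends on a 32-bit
GPR-to-SSE move (movd) clearing every bit above bit 31 of the destination,
so that it yields the same register contents as a zero-extension followed
by a 64-bit move (movq). The sketch below is only an illustration of that
equivalence, not part of the original report. It assumes an x86-64 target
with SSE2, and the helper names count_via_movq, count_via_movd and
same_count are hypothetical. The instructions named in the comments are
what a compiler would typically emit; it is free to choose others.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: _mm_cvtsi32_si128, _mm_cvtsi64_si128 */
#include <string.h>

/* Two-step form, mirroring (insn 8) + (insn 21): zero-extend the 32-bit
   value in a general register, then move 64 bits into the SSE register.  */
static __m128i
count_via_movq (int a)
{
  unsigned long long wide = (unsigned int) a;   /* typically movl %edi, %edi */
  return _mm_cvtsi64_si128 ((long long) wide);  /* typically movq %rdi, %xmm */
}

/* One-step form: a 32-bit move that zeroes bits 32..127 of the result.  */
static __m128i
count_via_movd (int a)
{
  return _mm_cvtsi32_si128 (a);                 /* typically movd %edi, %xmm */
}

/* Return 1 if both forms produce identical 128-bit values for A.  */
static int
same_count (int a)
{
  __m128i x = count_via_movq (a);
  __m128i y = count_via_movd (a);
  return memcmp (&x, &y, sizeof x) == 0;
}
```

Calling same_count with a negative argument such as -1 is the interesting
case: a sign-extending move would set bits 32..63 and the comparison would
fail, while the zero-extending forms agree.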