This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug middle-end/55653] New: Unnecessary initialization of vector register
- From: "josh.m.conner at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Tue, 11 Dec 2012 19:34:25 +0000
- Subject: [Bug middle-end/55653] New: Unnecessary initialization of vector register
- Auto-submitted: auto-generated
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55653
Bug #: 55653
Summary: Unnecessary initialization of vector register
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: middle-end
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: josh.m.conner@gmail.com
When initializing all lanes of a vector register, I notice that the register is
first initialized to zero and then all lanes of the vector are independently
initialized, resulting in extra code.
Specifically, I'm looking at the aarch64 target, with the following source:
    void
    fmla_loop (double * restrict result, double * restrict mul1,
               double mul2, int size)
    {
      int i;

      for (i = 0; i < size; i++)
        result[i] = result[i] + mul1[i] * mul2;
    }
Compiled with:
aarch64-linux-gnu-gcc -std=c99 -O3 -ftree-vectorize -S -o test.s test.c
The resultant code to initialize a vector register with two instances of mul2
is:
        adr     x3, .LC0
        ld1     {v3.2d}, [x3]
        ins     v3.d[0], v0.d[0]
        ins     v3.d[1], v0.d[0]
        ...
    .LC0:
        .word   0
        .word   0
        .word   0
        .word   0
Here the first two instructions (which initialize the vector register) are
unnecessary, as is the space reserved for .LC0.
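
To illustrate why the zero load is dead, here is a minimal C sketch that models
the two-lane register as a plain struct. The type vec2d and all function names
are hypothetical stand-ins for illustration, not GCC internals or AArch64
intrinsics. It only shows that writing every lane makes the preceding clear
unobservable.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit register holding two doubles (v3.2d). */
typedef struct { double lane[2]; } vec2d;

/* Mirrors the emitted sequence: load zeros, then insert into each lane. */
static vec2d
dup_zero_then_insert (double x)
{
  vec2d v;
  memset (&v, 0, sizeof v);   /* models: adr x3, .LC0 ; ld1 {v3.2d}, [x3] */
  v.lane[0] = x;              /* models: ins v3.d[0], v0.d[0] */
  v.lane[1] = x;              /* models: ins v3.d[1], v0.d[0] */
  return v;
}

/* Same lane writes without the initial clear. */
static vec2d
dup_direct (double x)
{
  vec2d v;
  v.lane[0] = x;
  v.lane[1] = x;
  return v;
}

/* Returns 1 when both lanes match bit for bit. */
static int
same_lanes (vec2d a, vec2d b)
{
  return memcmp (a.lane, b.lane, sizeof a.lane) == 0;
}

/* Asserts that the clear has no observable effect for the value x. */
static void
check_equivalence (double x)
{
  assert (same_lanes (dup_zero_then_insert (x), dup_direct (x)));
}
```

Calling check_equivalence with any value of mul2 should pass, since both lanes
are overwritten before the register is read.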
Note that this initialization is performed here in store_constructor:
    /* Inform later passes that the old value is dead. */
    if (!cleared && !vector && REG_P (target))
      emit_move_insn (target, CONST0_RTX (GET_MODE (target)));
right after another check that tests whether the vector needs to be cleared out
(and determines that it doesn't).
Instead of the emit_move_insn, that code used to emit a clobber:
    emit_insn (gen_rtx_CLOBBER (VOIDmode, target));
but it was changed in r101169, with the comment:
"The expr.c change elides an extra move that's creeped in since we
changed clobbered values to get new registers in reload."
(see full checkin text here:
http://gcc.gnu.org/ml/gcc-patches/2005-06/msg01584.html)
It's not clear to me whether this can be changed back, or if later passes
should be recognizing this initialization as redundant, or whether we need a
new expand pattern to match vector fill (vector duplicate). At any rate, the
code is certainly not ideal as it stands.
Thanks!