This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/31334] Bad codegen for vector initializer with constants prop'd into a vector initializer
- From: "pinskia at gcc dot gnu dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 25 Mar 2007 07:26:11 -0000
- Subject: [Bug target/31334] Bad codegen for vector initializer with constants prop'd into a vector initializer
- References: <bug-31334-8585@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #3 from pinskia at gcc dot gnu dot org 2007-03-25 09:26 -------
Here is a patch which fixes the problem:
Index: ../../gcc/config/rs6000/rs6000.c
===================================================================
--- ../../gcc/config/rs6000/rs6000.c (revision 123180)
+++ ../../gcc/config/rs6000/rs6000.c (working copy)
@@ -2588,6 +2588,7 @@
   if (n_var == 0)
     {
+      rtx const_vec = gen_rtx_CONST_VECTOR (mode, XVEC (vals, 0));
       if (mode != V4SFmode && all_const_zero)
         {
           /* Zero register. */
@@ -2595,10 +2596,10 @@
                                   gen_rtx_XOR (mode, target, target)));
           return;
         }
-      else if (mode != V4SFmode && easy_vector_constant (vals, mode))
+      else if (mode != V4SFmode && easy_vector_constant (const_vec, mode))
         {
           /* Splat immediate. */
-          emit_insn (gen_rtx_SET (VOIDmode, target, vals));
+          emit_insn (gen_rtx_SET (VOIDmode, target, const_vec));
           return;
         }
       else if (all_same)
@@ -2606,7 +2607,7 @@
       else
         {
           /* Load from constant pool. */
-          emit_move_insn (target, gen_rtx_CONST_VECTOR (mode, XVEC (vals, 0)));
+          emit_move_insn (target, const_vec);
           return;
         }
     }
-------- CUT---------------
The problem is that vals arrives here as a parallel:V4SI, which is obviously
not an easy_vector_constant, so the patch builds a CONST_VECTOR from it and
checks that instead. Yes, we should most likely be getting a VECTOR_CST in the
first place; the reason we don't is most likely that the constant was prop'd
into the vector initializer and we never convert that vector initializer into a
VECTOR_CST (I had a patch once for doing that in fold but I never got around to
fully testing it).
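To make the PARALLEL versus CONST_VECTOR point concrete, here is a small standalone C model of the check. It is only a sketch and does not use GCC's real RTL types: `toy_vec`, `toy_easy_vector_constant`, `toy_as_const_vector` and `toy_demo` are hypothetical names invented for illustration. It assumes, as the comment above implies, that the predicate only accepts a constant-vector wrapper, and it reduces "easy" to a single case: all four elements equal and within the 5-bit signed immediate range of vspltisw. The real predicate accepts more forms than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for RTL codes; these are NOT GCC's real data structures.  */
enum toy_code { TOY_PARALLEL, TOY_CONST_VECTOR };

struct toy_vec
{
  enum toy_code code;   /* how the four elements are wrapped */
  int elt[4];           /* element values, all compile-time constants */
};

/* Simplified model of an "easy" constant: it must be wrapped as a constant
   vector, and every element must be the same value fitting the 5-bit signed
   immediate of vspltisw (-16..15).  This is an assumption made for the
   sketch; the real predicate accepts more forms.  */
static bool
toy_easy_vector_constant (const struct toy_vec *v)
{
  if (v->code != TOY_CONST_VECTOR)
    return false;
  for (size_t i = 1; i < 4; i++)
    if (v->elt[i] != v->elt[0])
      return false;
  return v->elt[0] >= -16 && v->elt[0] <= 15;
}

/* Mirror of the patch's idea: rewrap the same elements as a constant vector
   before asking the predicate.  */
static struct toy_vec
toy_as_const_vector (const struct toy_vec *vals)
{
  struct toy_vec cv = *vals;
  cv.code = TOY_CONST_VECTOR;
  return cv;
}

/* Usage: the PARALLEL is rejected, the rewrapped copy is accepted.  */
static void
toy_demo (void)
{
  struct toy_vec vals = { TOY_PARALLEL, { 1, 1, 1, 1 } };
  struct toy_vec const_vec = toy_as_const_vector (&vals);

  assert (!toy_easy_vector_constant (&vals));
  assert (toy_easy_vector_constant (&const_vec));
}
```

In this model the element values never change; only the wrapper does, which is why handing the predicate the rewrapped vector is enough to reach the splat-immediate path.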
Note the above patch was not bootstrapped or fully tested; it was only tried on
my testcase in comment #2 and the original testcase.
For the original testcase we get:
L2:
vadduwm v13,v13,v0
vadduwm v0,v0,v12
bdnz L2
for the innermost loop, which now looks good.
I am thinking we should not split this splat up until after reload anyway, so
that it can be pulled out of a loop.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31334