This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/31334] Bad codegen for vector initializer with constants prop'd into a vector initializer
- From: "pinskia at gcc dot gnu dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 25 Mar 2007 20:55:17 -0000
- Subject: [Bug target/31334] Bad codegen for vector initializer with constants prop'd into a vector initializer
- References: <bug-31334-8585@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #5 from pinskia at gcc dot gnu dot org 2007-03-25 22:55 -------
(In reply to comment #4)
> I do not believe the patch will help with the original missed optimization
> because the backend never sees a direct assignment from the CONSTRUCTOR -- it
> already is placed in memory. The example in comment #2 is different.
Did I miss something? For the original example, the back-end does see the
CONSTRUCTOR in rs6000_expand_vector_init, as a parallel with mode V4SI.
The expansion of vect_cst_.30 + {4, 4, 4, 4} calls rs6000_expand_vector_init
with vals being:
(parallel:V4SI ([ (const_int 4),(const_int 4),(const_int 4),(const_int 4) ]) )
So when we call easy_vector_constant with vals, it always returns false,
because vals is a parallel and not a const_vector. My patch changes this so
that we call easy_vector_constant with a const_vector instead; the loop above
that call has already proved every element is constant, since n_var is zero.
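To make the before/after difference concrete, here is a toy model in C. It is
only a sketch: the toy_* types and functions are hypothetical stand-ins, not
GCC's RTL structures or the real rs6000 code, and the toy "easy constant" test
only covers the vspltisw case (four equal words that fit its 5-bit signed
immediate, -16..15), whereas the real predicate accepts more forms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two RTL shapes discussed above.
   These are NOT GCC's real data structures. */
enum toy_code { TOY_PARALLEL, TOY_CONST_VECTOR };

struct toy_elt {
  bool is_const;   /* true for something like (const_int 4) */
  int32_t value;   /* only meaningful when is_const is true */
};

struct toy_vec {
  enum toy_code code;
  struct toy_elt elt[4];   /* V4SI: four 32-bit elements */
};

/* Toy "easy constant" test: rejects anything that is not a const_vector,
   then accepts four equal words that fit a 5-bit signed immediate. */
static bool toy_easy_vector_constant(const struct toy_vec *v)
{
  assert(v != NULL);
  if (v->code != TOY_CONST_VECTOR)
    return false;
  for (size_t i = 1; i < 4; i++)
    if (v->elt[i].value != v->elt[0].value)
      return false;
  return v->elt[0].value >= -16 && v->elt[0].value <= 15;
}

/* Counts non-constant elements, playing the role of n_var. */
static size_t toy_count_variable(const struct toy_vec *v)
{
  size_t n_var = 0;
  assert(v != NULL);
  for (size_t i = 0; i < 4; i++)
    if (!v->elt[i].is_const)
      n_var++;
  return n_var;
}

/* Before: the parallel itself is handed to the predicate, so the answer
   is always false even when every element is constant. */
static bool toy_uses_splat_before(const struct toy_vec *vals)
{
  return toy_count_variable(vals) == 0 && toy_easy_vector_constant(vals);
}

/* After: once n_var is zero, rebuild the same elements as a const_vector
   and ask the predicate about that instead. */
static bool toy_uses_splat_after(const struct toy_vec *vals)
{
  struct toy_vec cv;
  if (toy_count_variable(vals) != 0)
    return false;
  cv = *vals;
  cv.code = TOY_CONST_VECTOR;
  return toy_easy_vector_constant(&cv);
}
```

In this model a parallel of four (const_int 4) elements is rejected by
toy_uses_splat_before and accepted by toy_uses_splat_after, while {0, 1, 2, 3}
is rejected by both because its elements are not all equal.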
For the original testcase, the assembly now looks like this (on powerpc-darwin):
_main1:
        mfspr r0,256
        stw r0,-4(r1)
        oris r0,r0,0xc00c
        mtspr 256,r0
        lis r2,ha16(LC0)
        la r2,lo16(LC0)(r2)
        vspltisw v0,4
        lvx v1,0,r2
        li r2,23
        mtctr r2
        vor v12,v0,v0
        mr r0,r3
        vor v13,v1,v1
        vadduwm v0,v1,v0
L2:
        vadduwm v13,v13,v0
        vadduwm v0,v0,v12
        bdnz L2
        vsldoi v0,v13,v13,8
        addi r2,r1,-20
        lwz r12,-4(r1)
        vadduwm v0,v0,v13
        vsldoi v1,v0,v0,12
        vadduwm v1,v1,v0
        stvewx v1,0,r2
        lwz r3,-20(r1)
        add r3,r3,r0
        mtspr 256,r12
        blr
        .const
        .align 4
LC0:
        .long 0
        .long 1
        .long 2
        .long 3
Notice how the vspltisw is now outside the loop (which starts at L2).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31334