This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/31334] Bad codegen for vector initializer with constants prop'd into a vector initializer



------- Comment #5 from pinskia at gcc dot gnu dot org  2007-03-25 22:55 -------
(In reply to comment #4)
> I do not believe the patch will help with the original missed optimization
> because  the backend never sees a direct assignment from the CONSTRUCTOR -- it
> already is placed in memory.  The example in comment #2 is different.

Did I miss something? For the original example, the back-end does see the
CONSTRUCTOR: it reaches rs6000_expand_vector_init as a parallel with mode V4SI.
The expansion of vect_cst_.30 + {4, 4, 4, 4} calls rs6000_expand_vector_init
with vals being:
(parallel:V4SI  ([ (const_int 4),(const_int 4),(const_int 4),(const_int 4) ]) )

So when we call easy_vector_constant with vals, we always get false, because
vals is a parallel and not a const_vector.  My patch changes this so that we
call easy_vector_constant with a const_vector instead.  That is safe because
the loop above the call has already proved that every element is constant
(n_var is zero).
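
To illustrate the kind of test that decides whether a constant vector can be
built without a load, here is a minimal standalone sketch in C.  It is not GCC
code: splat_immediate_p is a hypothetical helper, and it models only one case
(all elements equal and within the 5-bit signed immediate range of vspltisw,
-16..15).  The real easy_vector_constant predicate covers more cases than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of GCC.  Returns true when all n elements
   of vals are the same value and that value fits the 5-bit signed
   immediate of vspltisw (-16..15).  On success the value is stored in
   *imm; on failure *imm is left untouched.  */
bool splat_immediate_p(const int *vals, size_t n, int *imm)
{
    assert(vals != NULL && imm != NULL && n > 0);

    /* Every element must match the first one.  */
    for (size_t i = 1; i < n; i++)
        if (vals[i] != vals[0])
            return false;

    /* The splatted value must fit the signed 5-bit immediate field.  */
    if (vals[0] < -16 || vals[0] > 15)
        return false;

    *imm = vals[0];
    return true;
}
```

Under this model, {4, 4, 4, 4} passes with an immediate of 4, while
{0, 1, 2, 3} fails and would still need to come from memory.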


For the original testcase, the assembly now looks like this (on powerpc-darwin):
_main1:
        mfspr r0,256
        stw r0,-4(r1)
        oris r0,r0,0xc00c
        mtspr 256,r0
        lis r2,ha16(LC0)
        la r2,lo16(LC0)(r2)
        vspltisw v0,4
        lvx v1,0,r2
        li r2,23
        mtctr r2
        vor v12,v0,v0
        mr r0,r3
        vor v13,v1,v1
        vadduwm v0,v1,v0
L2:
        vadduwm v13,v13,v0
        vadduwm v0,v0,v12
        bdnz L2
        vsldoi v0,v13,v13,8
        addi r2,r1,-20
        lwz r12,-4(r1)
        vadduwm v0,v0,v13
        vsldoi v1,v0,v0,12
        vadduwm v1,v1,v0
        stvewx v1,0,r2
        lwz r3,-20(r1)
        add r3,r3,r0
        mtspr 256,r12
        blr
        .const
        .align 4
LC0:
        .long   0
        .long   1
        .long   2
        .long   3


Notice that {4, 4, 4, 4} is now built with a single vspltisw, and that the
vspltisw sits outside the loop.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31334

