This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
RE: [PATCH] PING:fold two vector_cst in const_binop
> -----Original Message-----
> From: Richard Guenther [mailto:richard.guenther@gmail.com]
> Sent: 07 April 2009 14:00
> To: Bingfeng Mei
> Cc: gcc-patches
> Subject: Re: [PATCH] PING:fold two vector_cst in const_binop
>
> On Tue, Apr 7, 2009 at 2:19 PM, Bingfeng Mei
> <bmei@broadcom.com> wrote:
> > Hello, Richard,
> >
> > The following is the patch I submitted before, which was
> not appropriate for late stage of 4.4. I retested the patch
> on current trunk.
> >
> > --
> > GCC fails to fold two vector_cst in ccp1 pass into one for
> vector modes that are natively supported by target. For
> example, the following test will fail in powerpc target with
> -maltivec turned on (v4si is supported by altivec natively).
> Ironically, if the vector mode is not supported natively by
> target, the folding can be successful through scalar operations :-)
> >
> > /* { dg-do compile } */
> > /* { dg-options "-O2 -fdump-tree-ccp1" } */
> >
> > typedef unsigned int v4si __attribute__ ((vector_size (16)));
> >
> > v4si c;
> >
> > void foo()
> > {
> > ?v4si a = { 1, 2, 3, 4 };
> > ?v4si b = { 5, 6, 7, 8 };
> > ?c = a + b;
> > }
> >
> > /* { dg-final { scan-tree-dump-times "c =.* { 6, 8, 10, 12
> }" 1 "ccp1" } } */
> > /* { dg-final { cleanup-tree-dump "ccp1" } } */
> >
> >
> > GCC produces following intermediate code (.ccp1):
> >
> > ;; Function foo (foo)
> >
> > foo ()
> > {
> > ?v4si c.0;
> >
> > <bb 2>:
> > ?c.0_3 = { 1, 2, 3, 4 } + { 5, 6, 7, 8 };
> > ?c ={v} c.0_3;
> > ?return;
> >
> > }
> >
> > The following patch fixes this problem. With the patch, GCC
> produces following intermediate code (.ccp1)
> >
> >
> > ;; Function foo (foo)
> >
> > foo ()
> > {
> > <bb 2>:
> > ?c ={v} { 6, 8, 10, 12 };
> > ?return;
> >
> > }
> >
> > The patch was tested on x86_64-unknown-linux-gnu. I am not
> sure how to generalize above test case, since different
> targets support different vector modes. OK for mainline?
> >
> > Bingfeng Mei
> > Broadcom UK
> >
> >
> > 2009-04-07 Bingfeng Mei <bmei@broadcom.com>
> >
> > ? ? ? ?* fold-const.c (const_binop): Combine two VECTOR_CST
> under operation CODE to produce a new one
> >
> > Index: fold-const.c
> > ===================================================================
> > --- fold-const.c ? ? ? ?(revision 145659)
> > +++ fold-const.c ? ? ? ?(working copy)
> > @@ -1998,6 +1998,51 @@
> > ? ? ? ?return build_complex (type, real, imag);
> > ? ? }
> >
> > + ?if (TREE_CODE (arg1) == VECTOR_CST)
>
> && TREE_CODE (arg2) == VECTOR_CST)
>
> as if we are ever called for shifts the shift argument may be
> a scalar.
>
> > + ? ?{
> > + ? ? ?tree type = TREE_TYPE(arg1);
> > + ? ? ?int count = TYPE_VECTOR_SUBPARTS (type), i;
> > + ? ? ?bool fail = false;
> > + ? ? ?tree elements1, elements2, list = NULL_TREE;
> > + ? ? ?elements1 = TREE_VECTOR_CST_ELTS (arg1);
> > + ? ? ?elements2 = TREE_VECTOR_CST_ELTS (arg2);
> > +
> > + ? ? ?for (i = 0; i < count; i++)
> > + ? ? ? {
> > + ? ? ? ? ?tree elem1, elem2, elem;
> > +
> > + ? ? ? ? ?/* The trailing elements can be empty and should
> be treated as 0 */
> > + ? ? ? ? ?if(!elements1)
> > + ? ? ? ? ? ?elem1 = build_int_cst(TREE_TYPE(type), 0);
>
> Hmm, where is this documented? build_zero_vector fills in
> explicit zero members. Also using an integer type here is wrong
> for floating point vectors. Please instead bail out if this happens
> on either vector.
I don't know where it is documented. But it is how GCC is implemented currently, and I just try to preserve the same behaviour :-).
For example:
typedef unsigned int v4si __attribute__ ((vector_size (16)));
v4si bar()
{
v4si a = {1, 2};
return a;
}
will produces on PowerPC.
.file "tst2.c"
.gnu_attribute 4, 1
.gnu_attribute 8, 2
.gnu_attribute 12, 1
.section ".text"
.align 2
.globl bar
.type bar, @function
bar:
lis 9,.LC0@ha
la 9,.LC0@l(9)
lvx 2,0,9
blr
.size bar, .-bar
.section .rodata.cst16,"aM",@progbits,16
.align 4
.LC0:
.4byte 1
.4byte 2
.4byte 0
.4byte 0
.ident "GCC: (GNU) 4.5.0 20090407 (experimental) [trunk revision 143368]"
>
> > + ? ? ? ? ?else
> > + ? ? ? ? ? ?{
> > + ? ? ? ? ? ? ?elem1 = TREE_VALUE(elements1);
> > + ? ? ? ? ? ? ?elements1 = TREE_CHAIN (elements1);
> > + ? ? ? ? ? ?}
> > +
> > + ? ? ? ? ?if(!elements2)
> > + ? ? ? ? ? ?elem2 = build_int_cst(TREE_TYPE(type), 0);
> > + ? ? ? ? ?else
> > + ? ? ? ? ? ?{
> > + ? ? ? ? ? ? ?elem2 = TREE_VALUE(elements2);
> > + ? ? ? ? ? ? ?elements2 = TREE_CHAIN (elements2);
> > + ? ? ? ? ? ?}
> > +
> > + ? ? ? ? ?elem = const_binop (code, elem1, elem2, notrunc);
> > +
> > + ? ? ? ? ?/* It is possible that const_binop cannot handle
> the given
> > + ? ? ? ? ? ?code and return NULL_TREE */
> > + ? ? ? ? ?if(elem == NULL_TREE)
> > + ? ? ? ? ? ?{
> > + ? ? ? ? ? ? ?fail = true;
> > + ? ? ? ? ? ? ?break;
>
> Just return NULL_TREE directly in this case,
>
> > + ? ? ? ? ? ?}
> > +
> > + ? ? ? ? ?list = tree_cons (NULL_TREE, elem, list);
> > + ? ? ? }
> > + ? ? ?if(!fail)
>
> no need for the fail variable.
>
> > + ? ? ? ?return build_vector(type, nreverse(list));
>
> Instead of nreversing here build the element list in the correct
> order.
>
> Please re-sumbit with the suggested changes and updated
> testing.
>
> Thanks,
> Richard.
>
> > + ? ?}
> > ? return NULL_TREE;
> > ?}
> >
> >
>
>