This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Calling conventions for vector types on SPARC
- From: Falk Hueffner <falk dot hueffner at student dot uni-tuebingen dot de>
- To: "Paolo Bonzini" <bonzini at gnu dot org>
- Cc: <gcc-patches at gcc dot gnu dot org>, "Richard Henderson" <rth at redhat dot com>,"Eric Botcazou" <ebotcazou at libertysurf dot fr>
- Date: 21 Feb 2004 14:45:29 +0100
- Subject: Re: Calling conventions for vector types on SPARC
- References: <000701c3f87d$59d72be0$7adc1d97@philo>
"Paolo Bonzini" <bonzini@gnu.org> writes:
> I have a patch, for example, which implements parallelized add and
> subtraction. On lno branch it should be able to convert on all
> machines
>
> restrict char *a, *b, *c;
> for (i = 0; i < N; i++)
> a[i] = b[i] + c[i];
>
> to something like (after peeling the first N%4 iterations)
>
> restrict int *va = a, *vb = b, *vc = c;
> for (i = 0; i < N/4; i++)
> va[i] = ((vb[i] & 0x7f7f7f7f) + (vc[i] & 0x7f7f7f7f)) ^
> ((vb[i] ^ vc[i]) & 0x80808080);
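The word-wide addition quoted above can be checked in isolation. Below is a minimal sketch in C, assuming four 8-bit lanes packed into one uint32_t. The names add_bytes_swar, add_bytes_ref and check_add_bytes are made up for illustration and are not part of GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Add four packed 8-bit lanes held in one 32-bit word without letting
   a carry cross from one lane into the next.  */
static uint32_t
add_bytes_swar (uint32_t x, uint32_t y)
{
  /* Add only the low 7 bits of each lane.  Any carry ends up in bit 7
     of the same lane and cannot reach the neighbouring lane.  */
  uint32_t low = (x & 0x7f7f7f7fu) + (y & 0x7f7f7f7fu);

  /* The true bit 7 of each lane is x7 ^ y7 ^ carry-in; the carry-in is
     already sitting in bit 7 of LOW, so xor in the two operand bits.  */
  return low ^ ((x ^ y) & 0x80808080u);
}

/* Reference implementation: add lane by lane, wrapping modulo 256.  */
static uint32_t
add_bytes_ref (uint32_t x, uint32_t y)
{
  uint32_t r = 0;
  for (int i = 0; i < 4; i++)
    {
      uint32_t xs = (x >> (8 * i)) & 0xffu;
      uint32_t ys = (y >> (8 * i)) & 0xffu;
      r |= ((xs + ys) & 0xffu) << (8 * i);
    }
  return r;
}

/* Compare both versions on a stream of pseudo-random words.  */
static void
check_add_bytes (void)
{
  uint32_t x = 1, y = 2;
  for (int n = 0; n < 100000; n++)
    {
      x = x * 1664525u + 1013904223u;
      y = y * 22695477u + 1u;
      assert (add_bytes_swar (x, y) == add_bytes_ref (x, y));
    }
}
```

Calling check_add_bytes () from a test driver exercises the identity; it says nothing about whether the transformation is profitable on a given target.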
Here's my current version of this, which I never got around to testing
properly; maybe it is useful to you. (Note that the indentation is wrong
to keep the diff smaller. Oh, and I just noticed that HOST_WIDE_INT will
probably not always be large enough to hold the mask...)
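For subtraction the patch below sets the sign bits of the minuend instead of clearing them, so that a borrow is absorbed inside its own lane. Here is a sketch of the same sequence in plain C, again assuming four 8-bit lanes in a uint32_t. SIGNMASK and the function names are hypothetical; the mask leaves out the highest lane's sign bit, as the loop in the patch does, because a borrow out of the top lane is simply discarded.

```c
#include <assert.h>
#include <stdint.h>

/* Sign bits of every lane except the highest one (illustrative value
   for four 8-bit lanes).  */
#define SIGNMASK 0x00808080u

/* Subtract four packed 8-bit lanes, mirroring the MINUS path:
   signs = (a ^ ~b) & signmask; (a | signmask) - (b & ~signmask);
   then xor the two together.  */
static uint32_t
sub_bytes_swar (uint32_t a, uint32_t b)
{
  uint32_t signs = (a ^ ~b) & SIGNMASK;
  uint32_t hi = a | SIGNMASK;   /* Forced bit 7 absorbs any borrow.  */
  uint32_t lo = b & ~SIGNMASK;  /* Low 7 bits of the lower lanes.  */
  return (hi - lo) ^ signs;
}

/* Reference implementation: subtract lane by lane, modulo 256.  */
static uint32_t
sub_bytes_ref (uint32_t a, uint32_t b)
{
  uint32_t r = 0;
  for (int i = 0; i < 4; i++)
    {
      uint32_t as = (a >> (8 * i)) & 0xffu;
      uint32_t bs = (b >> (8 * i)) & 0xffu;
      r |= ((as - bs) & 0xffu) << (8 * i);
    }
  return r;
}

/* Compare both versions on a stream of pseudo-random words.  */
static void
check_sub_bytes (void)
{
  uint32_t a = 3, b = 5;
  for (int n = 0; n < 100000; n++)
    {
      a = a * 1664525u + 1013904223u;
      b = b * 22695477u + 1u;
      assert (sub_bytes_swar (a, b) == sub_bytes_ref (a, b));
    }
}
```

In each lower lane, bit 7 of hi - lo is the inverse of the borrow, which is why the fix-up term uses ~b rather than b.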
Falk
Index: gcc/optabs.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/optabs.c,v
retrieving revision 1.137.2.27.2.1
diff -u -p -r1.137.2.27.2.1 optabs.c
--- gcc/optabs.c 21 Jan 2004 01:10:36 -0000 1.137.2.27.2.1
+++ gcc/optabs.c 29 Jan 2004 14:01:13 -0000
@@ -1931,7 +1931,54 @@ expand_vector_binop (enum machine_mode m
if (!target)
target = gen_reg_rtx (mode);
- for (i = 0; i < elts; ++i)
+ if ((binoptab->code == PLUS || binoptab->code == MINUS)
+ && elts >= 4 && int_mode_for_mode (mode) != BLKmode)
+ {
+ /* Do full word add/sub and fix up spilling overflows. */
+ rtx signmask, inv_signmask, signs;
+ HOST_WIDE_INT m = 0;
+
+ tmode = int_mode_for_mode (mode);
+
+ /* Build mask for all sign bits except highest one. */
+ for (i = 0; i < elts - 1; ++i)
+ {
+ m <<= subbitsize;
+ m |= (HOST_WIDE_INT) 1 << (subbitsize - 1);
+ }
+ signmask = GEN_INT(m);
+ inv_signmask = GEN_INT(~m);
+
+ t = simplify_gen_subreg (tmode, target, mode, 0);
+ a = simplify_gen_subreg (tmode, op0, mode, 0);
+ b = simplify_gen_subreg (tmode, op1, mode, 0);
+
+ if (binoptab->code == MINUS) {
+ rtx complb = expand_unop (tmode, one_cmpl_optab, b, NULL_RTX,
+ true);
+ signs = expand_binop (tmode, xor_optab, a, complb, NULL_RTX,
+ true, methods);
+ } else {
+ signs = expand_binop (tmode, xor_optab, a, b, NULL_RTX,
+ true, methods);
+ }
+ signs = expand_binop (tmode, and_optab, signs, signmask, NULL_RTX,
+ true, methods);
+ a = expand_binop (tmode,
+ binoptab->code == PLUS ? and_optab : ior_optab,
+ a,
+ binoptab->code == PLUS ? inv_signmask : signmask,
+ NULL_RTX, true, methods);
+ b = expand_binop (tmode, and_optab, b, inv_signmask, NULL_RTX,
+ true, methods);
+ a = expand_binop (tmode,
+ binoptab->code == PLUS ? add_optab : sub_optab,
+ a, b, NULL_RTX, true, methods);
+ res = expand_binop (tmode, xor_optab, a, signs, t,
+ true, methods);
+ emit_move_insn (t, res);
+ }
+ else for (i = 0; i < elts; ++i)
{
/* If this is part of a register, and not the first item in the
word, we can't store using a SUBREG - that would clobber
Index: gcc/tree-vectorizer.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Attic/tree-vectorizer.c,v
retrieving revision 1.1.2.15
diff -u -p -r1.1.2.15 tree-vectorizer.c
--- gcc/tree-vectorizer.c 26 Jan 2004 17:59:17 -0000 1.1.2.15
+++ gcc/tree-vectorizer.c 29 Jan 2004 14:01:35 -0000
@@ -1444,7 +1447,10 @@ vect_is_supportable_binop (tree stmt)
vec_mode = TYPE_MODE (vectype);
- if (binoptab->handlers[(int) vec_mode].insn_code == CODE_FOR_nothing)
+ if (binoptab->handlers[(int) vec_mode].insn_code == CODE_FOR_nothing
+ /* These can be efficiently open-coded. */
+ && !((code == PLUS_EXPR || code == MINUS_EXPR)
+ && GET_MODE_NUNITS (vec_mode) >= 4))
{
if (tree_dump_file && (tree_dump_flags & TDF_DETAILS))
fprintf (tree_dump_file, "op not supported by target\n");