This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC/RFT] Tree-level lowering of generic vectors
- From: Richard Henderson <rth at redhat dot com>
- To: Paolo Bonzini <paolo dot bonzini at polimi dot it>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 15 Jul 2004 09:59:04 -0700
- Subject: Re: [RFC/RFT] Tree-level lowering of generic vectors
- References: <40F6760B.1020304@polimi.it>
On Thu, Jul 15, 2004 at 02:18:19PM +0200, Paolo Bonzini wrote:
> * c-typeck.c (build_binary_op): Do not use RDIV_EXPR for vectors.
Why not? Seems like this *should* be used for fp vectors.
> + if (!simple_cst_equal (TYPE_SIZE (type), TYPE_SIZE (TREE_TYPE (expr))))
Correct, or alternately tree_int_cst_equal.
> - && GET_MODE_SIZE (TYPE_MODE (type))
> - == GET_MODE_SIZE (TYPE_MODE (orig)))
> + && TREE_INT_CST_LOW (TYPE_SIZE (type))
> + == TREE_INT_CST_LOW (TYPE_SIZE (orig)))
Not correct. More occurrences.
Essentially every time you use TREE_INT_CST_LOW, you are wrong.
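To illustrate the objection: a constant that is stored as a low/high pair of host words can differ only in its high word, and a comparison made through the low word alone will not notice. The sketch below is standalone C, not GCC code. `struct wide_cst`, `low_word_equal`, `full_equal`, and `wide_cst_selftest` are hypothetical names invented for the illustration, and `uint64_t` is assumed as a stand-in for HOST_WIDE_INT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a two-word integer constant.  This is not
   GCC's INTEGER_CST, only an illustration of the low/high split.  */
struct wide_cst
{
  uint64_t low;
  uint64_t high;
};

/* Compares only the low word, as code built on TREE_INT_CST_LOW would.  */
static bool
low_word_equal (struct wide_cst a, struct wide_cst b)
{
  return a.low == b.low;
}

/* Compares the whole value, in the spirit of tree_int_cst_equal.  */
static bool
full_equal (struct wide_cst a, struct wide_cst b)
{
  return a.low == b.low && a.high == b.high;
}

/* Two constants that differ only in the high word.  */
static void
wide_cst_selftest (void)
{
  struct wide_cst small = { 256, 0 };
  struct wide_cst huge = { 256, 1 };

  assert (low_word_equal (small, huge));   /* False positive.  */
  assert (!full_equal (small, huge));      /* Correct answer.  */
}
```

The same reasoning applies to arithmetic on the extracted value, not only to equality tests.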
> + /* All powers of two <= 32 give a different result modulo 37. */
> + static tree vector_types[37];
You've not constrained vector types to contain <= 32 elements.
Nor should you, IMO. I see no reason you can't use the generic
type hashing in tree.c.
> + ones = TREE_INT_CST_LOW (TYPE_MAX_VALUE (type)) / max;
TYPE_MAX_VALUE is not guaranteed to fit in HOST_WIDE_INT.
You need to be doing operations on trees for all of these.
You might well get a lot of mileage from a
  low_bits = replicate_for_vec (type, max >> 1)

  tree
  replicate_for_vec (tree type, HOST_WIDE_INT val)
  {
    int width = TYPE_PRECISION (TREE_TYPE (type));
    int i, n = HOST_BITS_PER_WIDE_INT / width;
    HOST_WIDE_INT elt0, elt1, mask;
    tree ret;

    if (n > 1)
      {
        mask = ((HOST_WIDE_INT)1 << width) - 1;
        for (elt0 = i = 0; i < n; i++)
          elt0 = (elt0 << width) | (val & mask);
      }
    else if (n == 1)
      elt0 = val;
    else
      abort ();

    if (n == TYPE_VECTOR_SUBPARTS (type))
      elt1 = 0;
    else if (2*n == TYPE_VECTOR_SUBPARTS (type))
      elt1 = elt0;
    else
      abort ();

    ret = build_int (elt0, elt1);
    TREE_TYPE (ret) = type;
    return ret;
  }
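For anyone who wants to try the replication loop outside the compiler, here is a standalone sketch of the elt0 computation only. It assumes a 64-bit host word (`uint64_t` in place of HOST_WIDE_INT); `replicate_in_word` is a hypothetical name, and the tree construction is left out entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Replicate the low WIDTH bits of VAL into every WIDTH-bit lane of a
   64-bit word.  uint64_t stands in for HOST_WIDE_INT here, which is an
   assumption about the host, not something the compiler guarantees.  */
static uint64_t
replicate_in_word (unsigned width, uint64_t val)
{
  /* The lanes must tile the word exactly.  */
  assert (width > 0 && width <= 64 && 64 % width == 0);

  /* A single lane fills the word; shifting by 64 would be undefined.  */
  if (width == 64)
    return val;

  uint64_t mask = ((uint64_t) 1 << width) - 1;
  uint64_t word = 0;
  for (unsigned i = 0; i < 64 / width; i++)
    word = (word << width) | (val & mask);
  return word;
}
```

With 8-bit elements, `replicate_in_word (8, 0x7f)` gives the mask of all non-sign bits and `replicate_in_word (8, 0x80)` the mask of all sign bits.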
Which also brings up the point that, as written, vector elements
must fit in a HOST_WIDE_INT, and that vector types must fit in
2*HOST_WIDE_INT. Which means that if you run across
  typedef char v32qi __attribute__((vector_size(32)));
  v32qi x, y, z;
  x = y + z;
you'll need to decompose this to 4 v8qi, and then decompose *that*
to DImode arithmetic, as you're doing.
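As a rough picture of what the DImode step amounts to for one v8qi piece, the sketch below adds eight 8-bit lanes packed in a 64-bit word without letting carries cross lane boundaries. It is standalone C using the usual mask-and-xor technique, which I assume is close to what the patch emits but have not checked against it. The function names are hypothetical, and a v32qi addition would apply the word-mode function to each of its four words.

```c
#include <assert.h>
#include <stdint.h>

/* Add eight 8-bit lanes packed into 64-bit words.  The low seven bits of
   each lane are added with the top bit masked off, so no carry can leave
   a lane; the top bit of each lane is then fixed up with xor.  */
static uint64_t
add_v8qi_in_word (uint64_t a, uint64_t b)
{
  const uint64_t high = 0x8080808080808080ULL;  /* Top bit of each lane.  */
  const uint64_t low = ~high;                   /* Remaining seven bits.  */
  uint64_t sum_low = (a & low) + (b & low);

  return sum_low ^ ((a ^ b) & high);
}

/* Reference version: extract, add, and reinsert one lane at a time.  */
static uint64_t
add_v8qi_lanewise (uint64_t a, uint64_t b)
{
  uint64_t result = 0;

  for (int i = 0; i < 8; i++)
    {
      uint8_t lane = (uint8_t) ((a >> (8 * i)) + (b >> (8 * i)));
      result |= (uint64_t) lane << (8 * i);
    }
  return result;
}

/* Check that lane overflow wraps instead of spilling into a neighbour.  */
static void
add_v8qi_selftest (void)
{
  uint64_t a = 0x80807f7f00ff00ffULL;
  uint64_t b = 0x8001017f00010001ULL;

  assert (add_v8qi_in_word (a, b) == add_v8qi_lanewise (a, b));
  assert (add_v8qi_in_word (0xff, 0x01) == 0);
}
```

The two masks are exactly the kind of constant that replicate_for_vec is meant to build.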
> + int bits_per_part = TREE_INT_CST_LOW (TYPE_SIZE (innertype));
> + tree bit_field_ref_width = bitsize_int (bits_per_part);
This is a bit silly. Extracting the number in order to create
another tree?
> +  if (mode == VOIDmode
> +      && GET_MODE_CLASS (innermode) == MODE_INT)
> +    {
> +      /* For integers, try mapping it to a same-sized scalar mode. */
> +      mode = TYPE_MODE (innertype);
> +      for (; mode != VOIDmode ; mode = GET_MODE_WIDER_MODE (mode))
> +        if (GET_MODE_BITSIZE (mode) == nunits * GET_MODE_BITSIZE (innermode))
> +          break;
Use mode_for_size.
> /* For a VECTOR_TYPE, this is the number of sub-parts of the vector. */
> #define TYPE_VECTOR_SUBPARTS(VECTOR_TYPE) \
> - GET_MODE_NUNITS (VECTOR_TYPE_CHECK (VECTOR_TYPE)->type.mode)
> + TREE_INT_CST_LOW (TYPE_VECTOR_SUBPARTS_TREE (VECTOR_TYPE))
> +
> +/* For a VECTOR_TYPE, this is a tree holding the number of sub-parts
> + of the vector. */
> +#define TYPE_VECTOR_SUBPARTS_TREE(VECTOR_TYPE) \
> + (VECTOR_TYPE_CHECK (VECTOR_TYPE)->type.maxval)
I wonder if reusing TYPE_PRECISION would be a better idea.
r~