This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
[RFC/RFT] Tree-level lowering of generic vectors
- From: Paolo Bonzini <paolo dot bonzini at polimi dot it>
- To: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 15 Jul 2004 14:18:19 +0200
- Subject: [RFC/RFT] Tree-level lowering of generic vectors
This is the patch I mentioned earlier. It completes the overhaul
of generic vector support.
In addition to moving the lowering of generic vectors to the tree level,
this does the following:
- it moves the lowering of complex operations (and, at -O0, of vectors
too) to *just after* the CFG is built. I'd rather convert it to use
tsi's in a follow-up patch, since that is orthogonal to adding a new
kind of lowering. With optimization enabled, vectors are lowered later
so that the vectorizer can create generic vector types.
- it implements bit-twiddling tricks for addition, subtraction and
negation, as per the thread at
http://gcc.gnu.org/ml/gcc/2004-01/msg00078.html; they are used if at
least four vector elements fit in a word (that means QI vectors on
32-bit targets, and HI vectors too on 64-bit targets).
- it moves all vector modes to machine-dependent mode definition files,
as per http://gcc.gnu.org/ml/gcc-patches/2004-02/msg01804.html. Vector
types get an appropriate integer mode or BLKmode if they are not
supported in hardware.
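
The bit-twiddling tricks in the second item can be illustrated in plain C. What follows is only a sketch, not code from the patch: the patch emits the equivalent tree expressions and derives the masks from the element mode, whereas this example assumes four 8-bit lanes packed in a 32-bit word. The names swar_add8, swar_sub8, swar_neg8, lanewise and swar_self_check are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Masks for four 8-bit lanes in a 32-bit word (an assumption of this
   sketch; the patch computes them from the element mode).  */
#define LOW_BITS  0x7f7f7f7fu   /* low 7 bits of every lane */
#define HIGH_BITS 0x80808080u   /* top bit of every lane */

/* Lane-wise a + b: add the low 7 bits of each lane, so that no carry can
   cross a lane boundary, then fix up the top bit of each lane with XOR.  */
static uint32_t swar_add8(uint32_t a, uint32_t b)
{
    return ((a & LOW_BITS) + (b & LOW_BITS)) ^ ((a ^ b) & HIGH_BITS);
}

/* Lane-wise a - b: setting the top bit of each lane of A guarantees that
   no borrow can cross a lane boundary.  */
static uint32_t swar_sub8(uint32_t a, uint32_t b)
{
    return ((a | HIGH_BITS) - (b & LOW_BITS)) ^ ((a ^ ~b) & HIGH_BITS);
}

/* Lane-wise -b, i.e. the subtraction above with a == 0.  */
static uint32_t swar_neg8(uint32_t b)
{
    return (HIGH_BITS - (b & LOW_BITS)) ^ (~b & HIGH_BITS);
}

/* Reference implementation: process one lane at a time.
   OP is '+' or '-'.  */
static uint32_t lanewise(uint32_t a, uint32_t b, int op)
{
    uint32_t r = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t x = (a >> shift) & 0xffu;
        uint32_t y = (b >> shift) & 0xffu;
        uint32_t lane = (op == '+') ? x + y : x - y;
        r |= (lane & 0xffu) << shift;
    }
    return r;
}

/* Compare the word-at-a-time versions against the reference on a few
   sample values.  */
static void swar_self_check(void)
{
    static const uint32_t samples[] = {
        0x00000000u, 0x01010101u, 0x7f7f7f7fu, 0x80808080u,
        0xffffffffu, 0x01ff7f80u, 0xdeadbeefu, 0x12345678u
    };
    const unsigned n = sizeof samples / sizeof samples[0];

    for (unsigned i = 0; i < n; i++) {
        assert(swar_neg8(samples[i]) == lanewise(0, samples[i], '-'));
        for (unsigned j = 0; j < n; j++) {
            assert(swar_add8(samples[i], samples[j])
                   == lanewise(samples[i], samples[j], '+'));
            assert(swar_sub8(samples[i], samples[j])
                   == lanewise(samples[i], samples[j], '-'));
        }
    }
}
```

Calling swar_self_check () from a main function exercises all three operations. The point of the trick is that a whole word of small elements costs a handful of scalar operations instead of one extract/operate/insert sequence per element.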
This patch bootstrapped/regtested successfully on i686-pc-linux-gnu. I
also checked the code it produces on a target without vector support,
sparc-sun-solaris2.8 (just to check for endianness issues; I picked it
because that's an assembly flavor I know).
If anyone would like to test it on one of the affected targets,
especially PowerPC, that would be good; note that this patch depends on
the one at http://gcc.gnu.org/ml/gcc-patches/2004-07/msg01329.html.
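
For testing on other targets, a small program using the vector_size attribute could serve as a starting point. This is a hedged sketch and not part of the patch or of the GCC testsuite: the typedef and function names (v4uqi, v4si, generic_vector_self_check, and so on) are made up for illustration. It relies only on GCC's generic vector extension, and reads results back through memcpy so that it does not depend on vector subscripting.

```c
#include <assert.h>
#include <string.h>

/* Four unsigned 8-bit elements: small enough to fit in one word on a
   32-bit target.  */
typedef unsigned char v4uqi __attribute__ ((vector_size (4)));

/* Four 32-bit elements: wider than a word on most targets.  */
typedef int v4si __attribute__ ((vector_size (16)));

static v4uqi add_v4uqi (v4uqi a, v4uqi b) { return a + b; }
static v4uqi sub_v4uqi (v4uqi a, v4uqi b) { return a - b; }
static v4uqi neg_v4uqi (v4uqi a) { return -a; }
static v4uqi xor_v4uqi (v4uqi a, v4uqi b) { return a ^ b; }
static v4si mul_v4si (v4si a, v4si b) { return a * b; }

/* Run each operation once and compare the result element by element.
   Returns 0 when everything matches; any mismatch trips an assert.  */
static int generic_vector_self_check (void)
{
  v4uqi a = { 0x01, 0xff, 0x7f, 0x80 };
  v4uqi b = { 0x01, 0x01, 0x01, 0x01 };
  v4si c = { 1, 2, 3, 4 };
  v4si d = { 5, 6, 7, 8 };
  unsigned char out[4];
  int iout[4];
  v4uqi r;
  v4si s;

  r = add_v4uqi (a, b);
  memcpy (out, &r, sizeof out);
  assert (out[0] == 0x02 && out[1] == 0x00
          && out[2] == 0x80 && out[3] == 0x81);

  r = sub_v4uqi (a, b);
  memcpy (out, &r, sizeof out);
  assert (out[0] == 0x00 && out[1] == 0xfe
          && out[2] == 0x7e && out[3] == 0x7f);

  r = neg_v4uqi (a);
  memcpy (out, &r, sizeof out);
  assert (out[0] == 0xff && out[1] == 0x01
          && out[2] == 0x81 && out[3] == 0x80);

  r = xor_v4uqi (a, b);
  memcpy (out, &r, sizeof out);
  assert (out[0] == 0x00 && out[1] == 0xfe
          && out[2] == 0x7e && out[3] == 0x81);

  s = mul_v4si (c, d);
  memcpy (iout, &s, sizeof iout);
  assert (iout[0] == 5 && iout[1] == 12
          && iout[2] == 21 && iout[3] == 32);

  return 0;
}
```

Calling generic_vector_self_check () from main and building at both -O0 and -O2 should cover both places where the patch lowers vectors; the element values are chosen so that carries and borrows would leak between lanes if the word-at-a-time code were wrong.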
Paolo
2004-07-15 Paolo Bonzini <bonzini@gnu.org>
* Makefile.in (tree_complex.o): Update dependencies.
* c-common.c (handle_vector_size_attribute): Update for
vector types without corresponding vector modes.
* c-typeck.c (build_binary_op): Do not use RDIV_EXPR for vectors.
* convert.c (convert_to_integer): Use vector types' TYPE_SIZE.
* fold-const.c (fold_convert): Likewise.
* print-tree.c (print_node): Print nunits for vector types.
* tree-complex.c (build_word_mode_vector_type, tree_vec_extract,
do_unop, do_binop, do_plus_minus, do_negate,
expand_vector_piecewise, expand_vector_parallel,
expand_vector_addition, expand_vector_operations_1,
expand_vector_operations, tree_lower_operations,
pass_lower_vector_ssa, pass_pre_expand): New.
(expand_complex_operations, pass_lower_complex): Remove.
* tree-optimize.c (init_tree_optimization_passes): Adjust
pass ordering for changes in tree-complex.c.
* tree-pass.h: Declare new passes.
* tree.c (finish_vector_type): Remove.
(make_vector_type): New.
(build_vector_type_for_mode, build_vector_type): Rewritten.
* tree.def (VECTOR_TYPE): Document where the number of
subparts is stored.
* tree.h (TYPE_VECTOR_SUBPARTS): Use t->type.maxval field.
(TYPE_VECTOR_SUBPARTS_TREE): New.
(make_vector): Remove declaration.
* machmode.def: Remove vector modes.
* config/alpha/alpha-modes.def: Add required vector modes.
* config/arm/arm-modes.def: Likewise.
* config/frv/frv-modes.def: Likewise.
* config/i386/i386-modes.def: Likewise.
* config/rs6000/rs6000-modes.def: Likewise.
* config/sh/sh-modes.def: Likewise.
cp/ChangeLog:
2004-07-15 Paolo Bonzini <bonzini@gnu.org>
* typeck.c (build_binary_op): Do not use RDIV_EXPR for vectors.
Index: Makefile.in
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Makefile.in,v
retrieving revision 1.1327
diff -u -p -r1.1327 Makefile.in
--- Makefile.in 13 Jul 2004 16:43:30 -0000 1.1327
+++ Makefile.in 15 Jul 2004 10:22:43 -0000
@@ -1918,7 +1918,8 @@ tree-sra.o : tree-sra.c $(CONFIG_H) syst
langhooks.h tree-pass.h $(FLAGS_H) $(EXPR_H)
tree-complex.o : tree-complex.c $(CONFIG_H) system.h $(TREE_H) \
$(TM_H) $(TREE_FLOW_H) $(TREE_GIMPLE_H) tree-iterator.h tree-pass.h \
- $(FLAGS_H)
+ $(FLAGS_H) $(OPTABS_H) $(RTL_H) $(MACHMODE_H) $(EXPR_H) \
+ langhooks.h $(FLAGS_H)
df.o : df.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
insn-config.h $(RECOG_H) function.h $(REGS_H) alloc-pool.h hard-reg-set.h \
$(BASIC_BLOCK_H) $(DF_H)
Index: c-common.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/c-common.c,v
retrieving revision 1.529
diff -u -p -r1.529 c-common.c
--- c-common.c 11 Jul 2004 09:45:35 -0000 1.529
+++ c-common.c 15 Jul 2004 10:22:43 -0000
@@ -4699,7 +4699,7 @@ handle_vector_size_attribute (tree *node
bool *no_add_attrs)
{
unsigned HOST_WIDE_INT vecsize, nunits;
- enum machine_mode mode, orig_mode, new_mode;
+ enum machine_mode orig_mode;
tree type = *node, new_type, size;
*no_add_attrs = true;
@@ -4748,28 +4748,13 @@ handle_vector_size_attribute (tree *node
/* Calculate how many units fit in the vector. */
nunits = vecsize / tree_low_cst (TYPE_SIZE_UNIT (type), 1);
-
- /* Find a suitably sized vector. */
- new_mode = VOIDmode;
- for (mode = GET_CLASS_NARROWEST_MODE (GET_MODE_CLASS (orig_mode) == MODE_INT
- ? MODE_VECTOR_INT
- : MODE_VECTOR_FLOAT);
- mode != VOIDmode;
- mode = GET_MODE_WIDER_MODE (mode))
- if (vecsize == GET_MODE_SIZE (mode)
- && nunits == (unsigned HOST_WIDE_INT) GET_MODE_NUNITS (mode))
- {
- new_mode = mode;
- break;
- }
-
- if (new_mode == VOIDmode)
+ if (nunits & (nunits - 1))
{
- error ("no vector mode with the size and type specified could be found");
+ error ("number of components of the vector not a power of two");
return NULL_TREE;
}
- new_type = build_vector_type_for_mode (type, new_mode);
+ new_type = build_vector_type (type, nunits);
/* Build back pointers if needed. */
*node = reconstruct_complex_type (*node, new_type);
Index: c-typeck.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/c-typeck.c,v
retrieving revision 1.338
diff -u -p -r1.338 c-typeck.c
--- c-typeck.c 9 Jul 2004 23:20:30 -0000 1.338
+++ c-typeck.c 15 Jul 2004 10:22:43 -0000
@@ -6940,7 +6940,8 @@ build_binary_op (enum tree_code code, tr
&& (code1 == INTEGER_TYPE || code1 == REAL_TYPE
|| code1 == COMPLEX_TYPE || code1 == VECTOR_TYPE))
{
- if (!(code0 == INTEGER_TYPE && code1 == INTEGER_TYPE))
+ if (!((code0 == INTEGER_TYPE || code0 == VECTOR_TYPE)
+ && (code1 == INTEGER_TYPE || code1 == VECTOR_TYPE)))
resultcode = RDIV_EXPR;
else
/* Although it would be tempting to shorten always here, that
Index: convert.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/convert.c,v
retrieving revision 1.47
diff -u -p -r1.47 convert.c
--- convert.c 23 Jun 2004 20:42:45 -0000 1.47
+++ convert.c 15 Jul 2004 10:22:44 -0000
@@ -677,8 +677,7 @@ convert_to_integer (tree type, tree expr
TREE_TYPE (TREE_TYPE (expr)), expr)));
case VECTOR_TYPE:
- if (GET_MODE_SIZE (TYPE_MODE (type))
- != GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (expr))))
+ if (!simple_cst_equal (TYPE_SIZE (type), TYPE_SIZE (TREE_TYPE (expr))))
{
error ("can't convert between vector values of different size");
return error_mark_node;
Index: fold-const.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/fold-const.c,v
retrieving revision 1.424
diff -u -p -r1.424 fold-const.c
--- fold-const.c 11 Jul 2004 21:56:37 -0000 1.424
+++ fold-const.c 15 Jul 2004 10:22:44 -0000
@@ -1930,8 +1930,8 @@ fold_convert (tree type, tree arg)
return fold_convert (type, tem);
}
if (TREE_CODE (orig) == VECTOR_TYPE
- && GET_MODE_SIZE (TYPE_MODE (type))
- == GET_MODE_SIZE (TYPE_MODE (orig)))
+ && TREE_INT_CST_LOW (TYPE_SIZE (type))
+ == TREE_INT_CST_LOW (TYPE_SIZE (orig)))
return fold (build1 (NOP_EXPR, type, arg));
}
else if (TREE_CODE (type) == REAL_TYPE)
@@ -1990,12 +1990,12 @@ fold_convert (tree type, tree arg)
else if (TREE_CODE (type) == VECTOR_TYPE)
{
if ((INTEGRAL_TYPE_P (orig) || POINTER_TYPE_P (orig))
- && GET_MODE_SIZE (TYPE_MODE (type))
- == GET_MODE_SIZE (TYPE_MODE (orig)))
+ && TREE_INT_CST_LOW (TYPE_SIZE (type))
+ == TREE_INT_CST_LOW (TYPE_SIZE (orig)))
return fold (build1 (NOP_EXPR, type, arg));
if (TREE_CODE (orig) == VECTOR_TYPE
- && GET_MODE_SIZE (TYPE_MODE (type))
- == GET_MODE_SIZE (TYPE_MODE (orig)))
+ && TREE_INT_CST_LOW (TYPE_SIZE (type))
+ == TREE_INT_CST_LOW (TYPE_SIZE (orig)))
return fold (build1 (NOP_EXPR, type, arg));
}
else if (VOID_TYPE_P (type))
Index: machmode.def
===================================================================
RCS file: /cvs/gcc/gcc/gcc/machmode.def,v
retrieving revision 1.27
diff -u -p -r1.27 machmode.def
--- machmode.def 6 Nov 2003 08:38:50 -0000 1.27
+++ machmode.def 15 Jul 2004 10:22:44 -0000
@@ -186,36 +186,6 @@ CC_MODE (CC);
COMPLEX_MODES (INT);
COMPLEX_MODES (FLOAT);
-/* Vector modes. */
-VECTOR_MODES (INT, 2); /* V2QI */
-VECTOR_MODES (INT, 4); /* V4QI V2HI */
-VECTOR_MODES (INT, 8); /* V8QI V4HI V2SI */
-VECTOR_MODES (INT, 16); /* V16QI V8HI V4SI V2DI */
-/* VECTOR_MODES (INT, 32); V8SI V4DI */
-/* VECTOR_MODES (INT, 64); V8DI */
-
-VECTOR_MODE (INT, SI, 8)
-VECTOR_MODE (INT, DI, 4);
-VECTOR_MODE (INT, DI, 8);
-
-/* PPC uses this to distinguish between DImode passed in
- float registers and DImode passed in vector registers.
- It would be in rs6000-modes.def but it's referenced in
- c-common.c. FIXME. */
-
-VECTOR_MODE (INT, DI, 1);
-
-VECTOR_MODES (FLOAT, 4); /* V2HF */
-VECTOR_MODES (FLOAT, 8); /* V4HF V2SF */
-VECTOR_MODES (FLOAT, 16); /* V8HF V4SF V2DF */
-/* VECTOR_MODES (FLOAT, 32); V8SF V4DF */
-/* VECTOR_MODES (FLOAT, 64); V16SF V8DF */
-
-VECTOR_MODE (FLOAT, SF, 8);
-VECTOR_MODE (FLOAT, SF, 16);
-VECTOR_MODE (FLOAT, DF, 4);
-VECTOR_MODE (FLOAT, DF, 8);
-
/* The symbol Pmode stands for one of the above machine modes (usually SImode).
The tm.h file specifies which one. It is not a distinct mode. */
Index: print-tree.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/print-tree.c,v
retrieving revision 1.89
diff -u -p -r1.89 print-tree.c
--- print-tree.c 8 Jul 2004 07:45:44 -0000 1.89
+++ print-tree.c 15 Jul 2004 10:22:44 -0000
@@ -537,6 +537,8 @@ print_node (FILE *file, const char *pref
print_node (file, "values", TYPE_VALUES (node), indent + 4);
else if (TREE_CODE (node) == ARRAY_TYPE || TREE_CODE (node) == SET_TYPE)
print_node (file, "domain", TYPE_DOMAIN (node), indent + 4);
+ else if (TREE_CODE (node) == VECTOR_TYPE)
+ fprintf (file, " nunits %d", (int) TYPE_VECTOR_SUBPARTS (node));
else if (TREE_CODE (node) == RECORD_TYPE
|| TREE_CODE (node) == UNION_TYPE
|| TREE_CODE (node) == QUAL_UNION_TYPE)
Index: tree-complex.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-complex.c,v
retrieving revision 2.5
diff -u -p -r2.5 tree-complex.c
--- tree-complex.c 9 Jun 2004 15:07:01 -0000 2.5
+++ tree-complex.c 15 Jul 2004 10:22:44 -0000
@@ -1,4 +1,4 @@
-/* Lower complex operations to scalar operations.
+/* Lower complex number and vector operations to scalar operations.
Copyright (C) 2004 Free Software Foundation, Inc.
This file is part of GCC.
@@ -23,6 +23,12 @@ Software Foundation, 59 Temple Place - S
#include "coretypes.h"
#include "tree.h"
#include "tm.h"
+#include "rtl.h"
+#include "expr.h"
+#include "insn-codes.h"
+#include "optabs.h"
+#include "machmode.h"
+#include "langhooks.h"
#include "tree-flow.h"
#include "tree-gimple.h"
#include "tree-iterator.h"
@@ -517,17 +473,378 @@ expand_complex_operations_1 (block_stmt_
abort ();
}
}
+
+/* Return a suitable vector type made of NUNITS units each of mode
+ "word_mode" (the global variable). */
+static tree
+build_word_mode_vector_type (int nunits)
+{
+ /* All powers of two <= 32 give a different result modulo 37. */
+ static tree vector_types[37];
+ int hash = nunits % 37;
+
+ if (!vector_types[hash])
+ {
+ tree innertype = lang_hooks.types.type_for_mode (word_mode, 1);
+ vector_types[hash] = build_vector_type (innertype, nunits);
+ }
+
+ return vector_types[hash];
+}
+
+typedef tree (*elem_op_func) (block_stmt_iterator *,
+ tree, tree, tree, tree, tree, enum tree_code);
+
+static inline tree
+tree_vec_extract (block_stmt_iterator *bsi, tree type,
+ tree t, tree bitsize, tree bitpos)
+{
+ if (bitpos)
+ return gimplify_build3 (bsi, BIT_FIELD_REF, type, t, bitsize, bitpos);
+ else
+ return gimplify_build1 (bsi, VIEW_CONVERT_EXPR, type, t);
+}
+
+static tree
+do_unop (block_stmt_iterator *bsi, tree type, tree a,
+ tree b ATTRIBUTE_UNUSED, tree bitpos, tree bitsize,
+ enum tree_code code)
+{
+ a = tree_vec_extract (bsi, type, a, bitsize, bitpos);
+ return gimplify_build1 (bsi, code, type, a);
+}
+
+static tree
+do_binop (block_stmt_iterator *bsi, tree type, tree a, tree b,
+ tree bitpos, tree bitsize, enum tree_code code)
+{
+ a = tree_vec_extract (bsi, type, a, bitsize, bitpos);
+ b = tree_vec_extract (bsi, type, b, bitsize, bitpos);
+ return gimplify_build2 (bsi, code, type, a, b);
+}
+
+/* Expand vector addition to scalars. This should do bit twiddling
+ in order to increase parallelism:
+
+ a + b = (((int) a & 0x7f7f7f7f) + ((int) b & 0x7f7f7f7f)) ^
+ (a ^ b) & 0x80808080
+
+ a - b = (((int) a | 0x80808080) - ((int) b & 0x7f7f7f7f)) ^
+ (a ^ ~b) & 0x80808080
+
+ -b = (0x80808080 - ((int) b & 0x7f7f7f7f)) ^ (~b & 0x80808080)
+
+ This optimization should be done only if 4 vector items or more
+ fit into a word. */
+static tree
+do_plus_minus (block_stmt_iterator *bsi, tree type, tree a, tree b,
+ tree bitpos ATTRIBUTE_UNUSED, tree bitsize ATTRIBUTE_UNUSED,
+ enum tree_code code)
+{
+ tree innertype = TREE_TYPE (TREE_TYPE (a));
+ HOST_WIDE_INT max, ones, low_bits, high_bits;
+ tree low_bits_tree, high_bits_tree, a_low, b_low, result_low, signs;
+
+ max = GET_MODE_MASK (TYPE_MODE (innertype));
+ ones = TREE_INT_CST_LOW (TYPE_MAX_VALUE (type)) / max;
+ low_bits = ones * (max >> 1);
+ high_bits = TREE_INT_CST_LOW (TYPE_MAX_VALUE (type)) ^ low_bits;
+
+ low_bits_tree = build_int_2 (low_bits, 0);
+ high_bits_tree = build_int_2 (high_bits, 0);
+
+ a = tree_vec_extract (bsi, type, a, bitsize, bitpos);
+ b = tree_vec_extract (bsi, type, b, bitsize, bitpos);
+
+ signs = gimplify_build2 (bsi, BIT_XOR_EXPR, type, a, b);
+ b_low = gimplify_build2 (bsi, BIT_AND_EXPR, type, b, low_bits_tree);
+ if (code == PLUS_EXPR)
+ a_low = gimplify_build2 (bsi, BIT_AND_EXPR, type, a, low_bits_tree);
+ else
+ {
+ a_low = gimplify_build2 (bsi, BIT_IOR_EXPR, type, a, high_bits_tree);
+ signs = gimplify_build1 (bsi, BIT_NOT_EXPR, type, signs);
+ }
+
+ signs = gimplify_build2 (bsi, BIT_AND_EXPR, type, signs, high_bits_tree);
+ result_low = gimplify_build2 (bsi, code, type, a_low, b_low);
+ return gimplify_build2 (bsi, BIT_XOR_EXPR, type, result_low, signs);
+}
+
+static tree
+do_negate (block_stmt_iterator *bsi, tree type, tree b,
+ tree unused ATTRIBUTE_UNUSED, tree bitpos ATTRIBUTE_UNUSED,
+ tree bitsize ATTRIBUTE_UNUSED, enum tree_code code ATTRIBUTE_UNUSED)
+{
+ tree innertype = TREE_TYPE (TREE_TYPE (b));
+ HOST_WIDE_INT max, ones, low_bits, high_bits;
+ tree low_bits_tree, high_bits_tree, b_low, result_low, signs;
+
+ max = GET_MODE_MASK (TYPE_MODE (innertype));
+ ones = TREE_INT_CST_LOW (TYPE_MAX_VALUE (type)) / max;
+ low_bits = ones * (max >> 1);
+ high_bits = TREE_INT_CST_LOW (TYPE_MAX_VALUE (type)) ^ low_bits;
+
+ low_bits_tree = build_int_2 (low_bits, 0);
+ high_bits_tree = build_int_2 (high_bits, 0);
+
+ b = tree_vec_extract (bsi, type, b, bitsize, bitpos);
+
+ b_low = gimplify_build2 (bsi, BIT_AND_EXPR, type, b, low_bits_tree);
+ signs = gimplify_build1 (bsi, BIT_NOT_EXPR, type, b);
+ signs = gimplify_build2 (bsi, BIT_AND_EXPR, type, signs, high_bits_tree);
+ result_low = gimplify_build2 (bsi, MINUS_EXPR, type, high_bits_tree, b_low);
+ return gimplify_build2 (bsi, BIT_XOR_EXPR, type, result_low, signs);
+}
+
+/* Expand a vector operation to scalars, by using many operations
+ whose type is the vector type's inner type. */
+static tree
+expand_vector_piecewise (block_stmt_iterator *bsi, elem_op_func f, tree type,
+ tree a, tree b, enum tree_code code)
+{
+ tree head, *chain = &head;
+ tree innertype = TREE_TYPE (type);
+ int bits_per_part = TREE_INT_CST_LOW (TYPE_SIZE (innertype));
+ int nunits = TYPE_VECTOR_SUBPARTS (type);
+ tree bit_field_ref_width = bitsize_int (bits_per_part);
+ int i;
+
+ for (i = 0; i < nunits; i++)
+ {
+ tree index = bitsize_int (i * bits_per_part);
+ tree result = f (bsi, innertype, a, b, index, bit_field_ref_width, code);
+ *chain = tree_cons (NULL_TREE, result, NULL_TREE);
+ chain = &TREE_CHAIN (*chain);
+ }
+
+ return build1 (CONSTRUCTOR, type, head);
+}
+
+/* Expand a vector operation to scalars with the freedom to use
+ a scalar integer type, or to use a different size for the items
+ in the vector type. */
+static tree
+expand_vector_parallel (block_stmt_iterator *bsi, elem_op_func f, tree type,
+ tree a, tree b,
+ enum tree_code code)
+{
+ tree result, innertype;
+ enum machine_mode mode;
+ int n_words = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (type)) / UNITS_PER_WORD;
+
+ /* We have three strategies. If the type is already correct, just do
+ the operation an element at a time. Else, if the vector is wider than
+ one word, do it a word at a time; finally, if the vector is smaller
+ than one word, do it as a scalar. */
+ if (TYPE_MODE (TREE_TYPE (type)) == word_mode)
+ return expand_vector_piecewise (bsi, f, type, a, b, code);
+ else if (n_words > 1)
+ {
+ tree word_type = build_word_mode_vector_type (n_words);
+ result = expand_vector_piecewise (bsi, f, word_type, a, b, code);
+ result = gimplify_val (bsi, word_type, result);
+ }
+ else
+ {
+ /* Use a single scalar operation with a mode no wider than word_mode. */
+ mode = mode_for_size (TREE_INT_CST_LOW (TYPE_SIZE (type)), MODE_INT, 0);
+ innertype = lang_hooks.types.type_for_mode (mode, 1);
+ result = f (bsi, innertype, a, b, NULL_TREE, NULL_TREE, code);
+ }
+
+ return build1 (VIEW_CONVERT_EXPR, type, result);
+}
+
+/* Expand a vector operation to scalars; for integer types we can use
+ special bit twiddling tricks to do the sums a word at a time, using
+ function F_PARALLEL instead of F. These tricks are done only if
+ they can process at least four items, that is, only if the vector
+ holds at least four items and if a word can hold four items. */
+static tree
+expand_vector_addition (block_stmt_iterator *bsi,
+ elem_op_func f, elem_op_func f_parallel, tree type,
+ tree a, tree b, enum tree_code code)
+{
+ int parts_per_word = UNITS_PER_WORD
+ / TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (type)));
+
+ if (INTEGRAL_TYPE_P (TREE_TYPE (type))
+ && parts_per_word >= 4
+ && TYPE_VECTOR_SUBPARTS (type) >= 4)
+ return expand_vector_parallel (bsi, f_parallel, type, a, b, code);
+ else
+ return expand_vector_piecewise (bsi, f, type, a, b, code);
+}
+
+/* Process one statement. If we identify a vector operation, expand it. */
+
+static void
+expand_vector_operations_1 (block_stmt_iterator *bsi)
+{
+ tree stmt = bsi_stmt (*bsi);
+ tree *p_rhs, rhs, type, inner_type;
+ enum machine_mode mode;
+ enum tree_code code;
+
+ switch (TREE_CODE (stmt))
+ {
+ case RETURN_EXPR:
+ stmt = TREE_OPERAND (stmt, 0);
+ if (!stmt || TREE_CODE (stmt) != MODIFY_EXPR)
+ return;
+
+ /* FALLTHRU */
+
+ case MODIFY_EXPR:
+ p_rhs = &TREE_OPERAND (stmt, 1);
+ rhs = *p_rhs;
+ break;
+
+ default:
+ return;
+ }
+
+ type = TREE_TYPE (rhs);
+ if (TREE_CODE (type) != VECTOR_TYPE)
+ return;
+
+ code = TREE_CODE (rhs);
+ if (TREE_CODE_CLASS (code) != '1'
+ && TREE_CODE_CLASS (code) != '2')
+ return;
+
+ inner_type = TREE_TYPE (type);
+ mode = TYPE_MODE (type);
+
+ /* If the mode is not BLKmode or an integer mode, we might be able
+ to do the operation in hardware, so expand only if there is no
+ handler for the operation in the optab. */
+ if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT
+ || GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT)
+ {
+ switch (code)
+ {
+ case PLUS_EXPR:
+ if (add_optab->handlers[mode].insn_code != CODE_FOR_nothing)
+ return;
+ break;
+
+ case NEGATE_EXPR:
+ if (neg_optab->handlers[mode].insn_code != CODE_FOR_nothing)
+ return;
+
+ case MINUS_EXPR:
+ if (sub_optab->handlers[mode].insn_code != CODE_FOR_nothing)
+ return;
+ break;
+
+ case BIT_AND_EXPR:
+ if (and_optab->handlers[mode].insn_code != CODE_FOR_nothing)
+ return;
+ break;
+
+ case BIT_IOR_EXPR:
+ if (ior_optab->handlers[mode].insn_code != CODE_FOR_nothing)
+ return;
+ break;
+
+ case BIT_NOT_EXPR:
+ if (one_cmpl_optab->handlers[mode].insn_code != CODE_FOR_nothing)
+ return;
+
+ case BIT_XOR_EXPR:
+ if (xor_optab->handlers[mode].insn_code != CODE_FOR_nothing)
+ return;
+ break;
+
+ case MULT_EXPR:
+ if (smul_optab->handlers[mode].insn_code != CODE_FOR_nothing)
+ return;
+ break;
+
+ case RDIV_EXPR:
+ case TRUNC_DIV_EXPR:
+ case CEIL_DIV_EXPR:
+ case FLOOR_DIV_EXPR:
+ case ROUND_DIV_EXPR:
+ case EXACT_DIV_EXPR:
+ if (TYPE_UNSIGNED (inner_type)
+ ? udiv_optab->handlers[mode].insn_code != CODE_FOR_nothing
+ : sdiv_optab->handlers[mode].insn_code != CODE_FOR_nothing)
+ return;
+ break;
+
+ default:
+ break;
+ }
+ }
+
+ /* Now expand the operations, either in parallel or piecewise. */
+ switch (code)
+ {
+ case PLUS_EXPR:
+ case MINUS_EXPR:
+ *p_rhs = expand_vector_addition (bsi, do_binop, do_plus_minus, type,
+ TREE_OPERAND (rhs, 0),
+ TREE_OPERAND (rhs, 1), code);
+ break;
+
+ case NEGATE_EXPR:
+ *p_rhs = expand_vector_addition (bsi, do_unop, do_negate, type,
+ TREE_OPERAND (rhs, 0), NULL_TREE, code);
+ break;
+
+ case BIT_AND_EXPR:
+ case BIT_IOR_EXPR:
+ case BIT_XOR_EXPR:
+ *p_rhs = expand_vector_parallel (bsi, do_binop, type,
+ TREE_OPERAND (rhs, 0),
+ TREE_OPERAND (rhs, 1), code);
+ break;
+
+ case BIT_NOT_EXPR:
+ *p_rhs = expand_vector_parallel (bsi, do_unop, type,
+ TREE_OPERAND (rhs, 0), NULL_TREE, code);
+ break;
+
+ case CONVERT_EXPR:
+ abort ();
+
+ case VIEW_CONVERT_EXPR:
+ case NOP_EXPR:
+ break;
+
+ default:
+ if (TREE_CODE_CLASS (code) == '1')
+ *p_rhs = expand_vector_piecewise (bsi, do_unop, type,
+ TREE_OPERAND (rhs, 0),
+ NULL_TREE, code);
+ else
+ *p_rhs = expand_vector_piecewise (bsi, do_binop, type,
+ TREE_OPERAND (rhs, 0),
+ TREE_OPERAND (rhs, 1), code);
+ break;
+ }
+
+ modify_stmt (bsi_stmt (*bsi));
+}
+
+static void
+expand_vector_operations (void)
+{
+ block_stmt_iterator bsi;
+ basic_block bb;
-/* Main loop to process each statement. */
-/* ??? Could use dominator bits to propagate from complex_expr at the
- same time. This might reveal more constants, particularly in cases
- such as (complex = complex op scalar). This may not be relevant
- after SRA and subsequent cleanups. Proof of this would be if we
- verify that the code generated by expand_complex_div_wide is
- simplified properly to straight-line code. */
+ FOR_EACH_BB (bb)
+ {
+ for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
+ expand_vector_operations_1 (&bsi);
+ }
+}
static void
-expand_complex_operations (void)
+tree_lower_operations (void)
{
int old_last_basic_block = last_basic_block;
block_stmt_iterator bsi;
@@ -538,15 +855,24 @@ expand_complex_operations (void)
if (bb->index >= old_last_basic_block)
continue;
for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi))
- expand_complex_operations_1 (&bsi);
+ {
+ expand_complex_operations_1 (&bsi);
+
+ /* When optimizing, only lower complex arithmetic here. This is so
+ that the loop optimizers do not waste time on lowered vectors,
+ and more importantly they can create generic vector code. */
+ if (!optimize)
+ expand_vector_operations_1 (&bsi);
+ }
}
}
-struct tree_opt_pass pass_lower_complex =
+
+struct tree_opt_pass pass_lower_vector_ssa =
{
- "complex", /* name */
+ "vector", /* name */
NULL, /* gate */
- expand_complex_operations, /* execute */
+ expand_vector_operations, /* execute */
NULL, /* sub */
NULL, /* next */
0, /* static_pass_number */
@@ -555,7 +881,24 @@ struct tree_opt_pass pass_lower_complex
0, /* properties_provided */
0, /* properties_destroyed */
0, /* todo_flags_start */
- TODO_dump_func | TODO_rename_vars
+ TODO_dump_func | TODO_rename_vars /* todo_flags_finish */
| TODO_ggc_collect | TODO_verify_ssa
- | TODO_verify_stmts | TODO_verify_flow /* todo_flags_finish */
+ | TODO_verify_stmts | TODO_verify_flow
+};
+
+struct tree_opt_pass pass_pre_expand =
+{
+ "pre-expand", /* name */
+ 0, /* gate */
+ tree_lower_operations, /* execute */
+ NULL, /* sub */
+ NULL, /* next */
+ 0, /* static_pass_number */
+ 0, /* tv_id */
+ PROP_cfg, /* properties_required */
+ 0, /* properties_provided */
+ 0, /* properties_destroyed */
+ 0, /* todo_flags_start */
+ TODO_dump_func | TODO_ggc_collect
+ | TODO_verify_stmts /* todo_flags_finish */
};
Index: tree-optimize.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-optimize.c,v
retrieving revision 2.30
diff -u -p -r2.30 tree-optimize.c
--- tree-optimize.c 13 Jul 2004 20:39:09 -0000 2.30
+++ tree-optimize.c 15 Jul 2004 10:22:44 -0000
@@ -270,6 +270,7 @@ init_tree_optimization_passes (void)
NEXT_PASS (pass_lower_cf);
NEXT_PASS (pass_lower_eh);
NEXT_PASS (pass_build_cfg);
+ NEXT_PASS (pass_pre_expand);
NEXT_PASS (pass_tree_profile);
NEXT_PASS (pass_init_datastructures);
NEXT_PASS (pass_all_optimizations);
@@ -296,7 +297,6 @@ init_tree_optimization_passes (void)
NEXT_PASS (pass_tail_recursion);
NEXT_PASS (pass_ch);
NEXT_PASS (pass_profile);
- NEXT_PASS (pass_lower_complex);
NEXT_PASS (pass_sra);
NEXT_PASS (DUP_PASS (pass_rename_ssa_copies));
NEXT_PASS (DUP_PASS (pass_dominator));
@@ -314,6 +314,7 @@ init_tree_optimization_passes (void)
NEXT_PASS (pass_loop);
NEXT_PASS (DUP_PASS (pass_dominator));
NEXT_PASS (DUP_PASS (pass_redundant_phi));
+ NEXT_PASS (pass_lower_vector_ssa);
NEXT_PASS (pass_cd_dce);
NEXT_PASS (DUP_PASS (pass_dse));
NEXT_PASS (DUP_PASS (pass_forwprop));
Index: tree-pass.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-pass.h,v
retrieving revision 2.5
diff -u -p -r2.5 tree-pass.h
--- tree-pass.h 10 Jul 2004 04:57:54 -0000 2.5
+++ tree-pass.h 15 Jul 2004 10:22:44 -0000
@@ -120,7 +120,8 @@ extern struct tree_opt_pass pass_may_ali
extern struct tree_opt_pass pass_split_crit_edges;
extern struct tree_opt_pass pass_pre;
extern struct tree_opt_pass pass_profile;
-extern struct tree_opt_pass pass_lower_complex;
+extern struct tree_opt_pass pass_pre_expand;
+extern struct tree_opt_pass pass_lower_vector_ssa;
extern struct tree_opt_pass pass_fold_builtins;
extern struct tree_opt_pass pass_early_warn_uninitialized;
extern struct tree_opt_pass pass_late_warn_uninitialized;
Index: tree.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree.c,v
retrieving revision 1.396
diff -u -p -r1.396 tree.c
--- tree.c 13 Jul 2004 16:43:32 -0000 1.396
+++ tree.c 15 Jul 2004 10:22:45 -0000
@@ -114,7 +114,7 @@ static void set_type_quals (tree, int);
static int type_hash_eq (const void *, const void *);
static hashval_t type_hash_hash (const void *);
static void print_type_hash_statistics (void);
-static void finish_vector_type (tree);
+static tree make_vector_type (tree, int, enum machine_mode);
static int type_hash_marked_p (const void *);
static unsigned int type_hash_list (tree, hashval_t);
static unsigned int attribute_hash_list (tree, hashval_t);
@@ -5305,18 +5305,27 @@ tree_operand_check_failed (int idx, enum
}
#endif /* ENABLE_TREE_CHECKING */
-/* For a new vector type node T, build the information necessary for
- debugging output. */
+/* Create a new vector type node holding NUNITS units of type INNERTYPE,
+ and mapped to the machine mode MODE. Initialize its fields and build
+ the information necessary for debugging output. */
-static void
-finish_vector_type (tree t)
+static tree
+make_vector_type (tree innertype, int nunits, enum machine_mode mode)
{
- layout_type (t);
+ tree t = make_node (VECTOR_TYPE);
+ TREE_TYPE (t) = innertype;
+ TYPE_VECTOR_SUBPARTS_TREE (t) = build_int_2 (nunits, 0);
+ TYPE_MODE (t) = mode;
+ TYPE_UNSIGNED (t) = TYPE_UNSIGNED (innertype);
+ TYPE_SIZE_UNIT (t) = int_const_binop (MULT_EXPR, TYPE_SIZE_UNIT (innertype),
+ TYPE_VECTOR_SUBPARTS_TREE (t), 0);
+ TYPE_SIZE (t) = int_const_binop (MULT_EXPR, TYPE_SIZE (innertype),
+ TYPE_VECTOR_SUBPARTS_TREE (t), 0);
+ layout_type (t);
{
- tree index = build_int_2 (TYPE_VECTOR_SUBPARTS (t) - 1, 0);
- tree array = build_array_type (TREE_TYPE (t),
- build_index_type (index));
+ tree index = build_int_2 (nunits - 1, 0);
+ tree array = build_array_type (innertype, build_index_type (index));
tree rt = make_node (RECORD_TYPE);
TYPE_FIELDS (rt) = build_decl (FIELD_DECL, get_identifier ("f"), array);
@@ -5329,6 +5338,8 @@ finish_vector_type (tree t)
numbers equal. */
TYPE_UID (rt) = TYPE_UID (t);
}
+
+ return t;
}
static tree
@@ -5547,19 +5558,30 @@ reconstruct_complex_type (tree type, tre
return outer;
}
-/* Returns a vector tree node given a vector mode and inner type. */
+/* Returns a vector tree node given a mode (integer, vector, or BLKmode) and
+ the inner type. */
tree
build_vector_type_for_mode (tree innertype, enum machine_mode mode)
{
- tree t;
- t = make_node (VECTOR_TYPE);
- TREE_TYPE (t) = innertype;
- TYPE_MODE (t) = mode;
- finish_vector_type (t);
- return t;
+ int nunits;
+
+ if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT
+ || GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT)
+ nunits = GET_MODE_NUNITS (mode);
+ else
+ nunits = GET_MODE_BITSIZE (mode) / TREE_INT_CST_LOW (TYPE_SIZE (innertype));
+
+ /* Check that the mode is correct, and that (if it is a scalar mode)
+ there are no leftover bits. */
+ if (GET_MODE_BITSIZE (mode) !=
+ nunits * TREE_INT_CST_LOW (TYPE_SIZE (innertype)))
+ abort ();
+
+ return make_vector_type (innertype, nunits, mode);
}
-/* Similarly, but takes inner type and units. */
+/* Similarly, but takes the inner type and number of units, which must be
+ a power of two. */
tree
build_vector_type (tree innertype, int nunits)
@@ -5567,6 +5589,9 @@ build_vector_type (tree innertype, int n
enum machine_mode innermode = TYPE_MODE (innertype);
enum machine_mode mode;
+ if (nunits & (nunits - 1))
+ abort ();
+
if (GET_MODE_CLASS (innermode) == MODE_FLOAT)
mode = MIN_MODE_VECTOR_FLOAT;
else
@@ -5574,9 +5599,22 @@ build_vector_type (tree innertype, int n
for (; mode != VOIDmode ; mode = GET_MODE_WIDER_MODE (mode))
if (GET_MODE_NUNITS (mode) == nunits && GET_MODE_INNER (mode) == innermode)
- return build_vector_type_for_mode (innertype, mode);
+ break;
- return NULL_TREE;
+ if (mode == VOIDmode
+ && GET_MODE_CLASS (innermode) == MODE_INT)
+ {
+ /* For integers, try mapping it to a same-sized scalar mode. */
+ mode = TYPE_MODE (innertype);
+ for (; mode != VOIDmode ; mode = GET_MODE_WIDER_MODE (mode))
+ if (GET_MODE_BITSIZE (mode) == nunits * GET_MODE_BITSIZE (innermode))
+ break;
+ }
+
+ if (mode == VOIDmode)
+ mode = BLKmode;
+
+ return make_vector_type (innertype, nunits, mode);
}
/* Given an initializer INIT, return TRUE if INIT is zero or some
Index: tree.def
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree.def,v
retrieving revision 1.92
diff -u -p -r1.92 tree.def
--- tree.def 8 Jul 2004 07:45:46 -0000 1.92
+++ tree.def 15 Jul 2004 10:22:46 -0000
@@ -151,7 +151,8 @@ DEFTREECODE (REAL_TYPE, "real_type", 't'
DEFTREECODE (COMPLEX_TYPE, "complex_type", 't', 0)
/* Vector types. The TREE_TYPE field is the data type of the vector
- elements. */
+ elements. The TYPE_MAX_VALUE field is the number of subparts of
+ the vector. */
DEFTREECODE (VECTOR_TYPE, "vector_type", 't', 0)
/* C enums. The type node looks just like an INTEGER_TYPE node.
Index: tree.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree.h,v
retrieving revision 1.559
diff -u -p -r1.559 tree.h
--- tree.h 13 Jul 2004 16:43:32 -0000 1.559
+++ tree.h 15 Jul 2004 10:22:46 -0000
@@ -1505,7 +1505,12 @@ struct tree_block GTY(())
/* For a VECTOR_TYPE, this is the number of sub-parts of the vector. */
#define TYPE_VECTOR_SUBPARTS(VECTOR_TYPE) \
- GET_MODE_NUNITS (VECTOR_TYPE_CHECK (VECTOR_TYPE)->type.mode)
+ TREE_INT_CST_LOW (TYPE_VECTOR_SUBPARTS_TREE (VECTOR_TYPE))
+
+/* For a VECTOR_TYPE, this is a tree holding the number of sub-parts
+ of the vector. */
+#define TYPE_VECTOR_SUBPARTS_TREE(VECTOR_TYPE) \
+ (VECTOR_TYPE_CHECK (VECTOR_TYPE)->type.maxval)
/* Indicates that objects of this type must be initialized by calling a
function when they are created. */
@@ -3469,7 +3474,6 @@ extern void expand_function_start (tree)
extern void expand_pending_sizes (tree);
extern void recompute_tree_invarant_for_addr_expr (tree);
extern bool needs_to_live_in_memory (tree);
-extern tree make_vector (enum machine_mode, tree, int);
extern tree reconstruct_complex_type (tree, tree);
extern int real_onep (tree);
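
[Editorial illustration, not part of the patch.] The tree.h hunk above stops deriving TYPE_VECTOR_SUBPARTS from the machine mode and reads it from a field of the type instead. The likely reason, given that vector types may now carry an integer mode or BLKmode, is that such a mode no longer encodes the element count. The sketch below shows the difference with a hypothetical struct; the field names and the convention that a non-vector mode reports one unit are assumptions of this example, not GCC's data layout.

```c
/* Hypothetical, simplified stand-in for a vector type node.  */
struct toy_vector_type
{
  unsigned mode_nunits;  /* units reported by the type's machine mode
                            (assumed 1 when the mode is a scalar integer) */
  unsigned subparts;     /* element count stored on the type itself */
};

/* Old scheme: ask the mode.  Only right when the mode is a vector mode.  */
unsigned
toy_subparts_from_mode (const struct toy_vector_type *t)
{
  return t->mode_nunits;
}

/* New scheme: ask the type.  Independent of which mode was chosen.  */
unsigned
toy_subparts_from_type (const struct toy_vector_type *t)
{
  return t->subparts;
}
```

For a four-element byte vector that was given a 32-bit scalar integer mode, the mode-based query would report 1 under this toy convention, while the type-based query still reports 4.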
Index: config/alpha/alpha-modes.def
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/alpha/alpha-modes.def,v
retrieving revision 1.2
diff -u -p -r1.2 alpha-modes.def
--- config/alpha/alpha-modes.def 25 Dec 2003 15:17:34 -0000 1.2
+++ config/alpha/alpha-modes.def 15 Jul 2004 10:22:46 -0000
@@ -21,3 +21,8 @@ Boston, MA 02111-1307, USA. */
/* 128-bit floating point. This gets reset in alpha_override_options
if VAX float format is in use. */
FLOAT_MODE (TF, 16, ieee_quad_format);
+
+/* Vector modes. */
+VECTOR_MODES (INT, 8); /* V8QI V4HI V2SI */
+VECTOR_MODE (INT, QI, 4); /* V4QI */
+VECTOR_MODE (INT, QI, 2); /* V2QI */
Index: config/arm/arm-modes.def
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/arm/arm-modes.def,v
retrieving revision 1.5
diff -u -p -r1.5 arm-modes.def
--- config/arm/arm-modes.def 25 Dec 2003 15:17:36 -0000 1.5
+++ config/arm/arm-modes.def 15 Jul 2004 10:22:46 -0000
@@ -50,3 +50,10 @@ CC_MODE (CC_DGEU);
CC_MODE (CC_DGTU);
CC_MODE (CC_C);
CC_MODE (CC_N);
+
+/* Vector modes. */
+VECTOR_MODES (INT, 8); /* V8QI V4HI V2SI */
+VECTOR_MODES (INT, 16); /* V16QI V8HI V4SI V2DI */
+VECTOR_MODES (FLOAT, 8); /* V4HF V2SF */
+VECTOR_MODES (FLOAT, 16); /* V8HF V4SF V2DF */
+
Index: config/frv/frv-modes.def
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/frv/frv-modes.def,v
retrieving revision 1.3
diff -u -p -r1.3 frv-modes.def
--- config/frv/frv-modes.def 13 Oct 2003 21:16:26 -0000 1.3
+++ config/frv/frv-modes.def 15 Jul 2004 10:22:46 -0000
@@ -28,3 +28,6 @@ Boston, MA 02111-1307, USA. */
CC_MODE (CC_UNS);
CC_MODE (CC_FP);
CC_MODE (CC_CCR);
+
+VECTOR_MODE (INT, QI, 4); /* V4QI */
+VECTOR_MODE (INT, SI, 4); /* V4SI */
Index: config/i386/i386-modes.def
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/i386/i386-modes.def,v
retrieving revision 1.6
diff -u -p -r1.6 i386-modes.def
--- config/i386/i386-modes.def 30 Oct 2003 23:27:30 -0000 1.6
+++ config/i386/i386-modes.def 15 Jul 2004 10:22:46 -0000
@@ -60,3 +60,15 @@ CC_MODE (CCNO);
CC_MODE (CCZ);
CC_MODE (CCFP);
CC_MODE (CCFPU);
+
+/* Vector modes. */
+VECTOR_MODES (INT, 4); /* V4QI V2HI */
+VECTOR_MODES (INT, 8); /* V8QI V4HI V2SI */
+VECTOR_MODES (INT, 16); /* V16QI V8HI V4SI V2DI */
+VECTOR_MODES (FLOAT, 8); /* V4HF V2SF */
+VECTOR_MODES (FLOAT, 16); /* V8HF V4SF V2DF */
+VECTOR_MODE (INT, DI, 4); /* V4DI */
+VECTOR_MODE (INT, SI, 8); /* V8SI */
+
+/* The symbol Pmode stands for one of the above machine modes (usually SImode).
+ The tm.h file specifies which one. It is not a distinct mode. */
Index: config/rs6000/rs6000-modes.def
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000-modes.def,v
retrieving revision 1.4
diff -u -p -r1.4 rs6000-modes.def
--- config/rs6000/rs6000-modes.def 31 Dec 2003 00:25:51 -0000 1.4
+++ config/rs6000/rs6000-modes.def 15 Jul 2004 10:22:46 -0000
@@ -38,3 +38,10 @@ PARTIAL_INT_MODE (SI);
CC_MODE (CCUNS);
CC_MODE (CCFP);
CC_MODE (CCEQ);
+
+/* Vector modes. */
+VECTOR_MODES (INT, 8); /* V8QI V4HI V2SI */
+VECTOR_MODES (INT, 16); /* V16QI V8HI V4SI V2DI */
+VECTOR_MODE (INT, DI, 1); /* V1DI */
+VECTOR_MODES (FLOAT, 8); /* V4HF V2SF */
+VECTOR_MODES (FLOAT, 16); /* V8HF V4SF V2DF */
Index: config/sh/sh-modes.def
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/sh/sh-modes.def,v
retrieving revision 1.1
diff -u -p -r1.1 sh-modes.def
--- config/sh/sh-modes.def 13 Oct 2003 21:16:32 -0000 1.1
+++ config/sh/sh-modes.def 15 Jul 2004 10:22:46 -0000
@@ -21,3 +21,12 @@ Boston, MA 02111-1307, USA. */
/* The SH uses a partial integer mode to represent the FPSCR register. */
PARTIAL_INT_MODE (SI);
+/* Vector modes. */
+VECTOR_MODES (INT, 4); /* V4QI V2HI */
+VECTOR_MODES (INT, 8); /* V8QI V4HI V2SI */
+VECTOR_MODES (INT, 16); /* V16QI V8HI V4SI V2DI */
+VECTOR_MODES (FLOAT, 8); /* V4HF V2SF */
+VECTOR_MODES (FLOAT, 16); /* V8HF V4SF V2DF */
+VECTOR_MODE (INT, DI, 4); /* V4DI */
+VECTOR_MODE (INT, DI, 8); /* V8DI */
+VECTOR_MODE (FLOAT, SF, 16); /* V16SF */
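
[Editorial illustration, not part of the patch.] The comments beside each VECTOR_MODES line in the *-modes.def hunks above list the modes that a given byte width expands to, e.g. "VECTOR_MODES (INT, 8)" giving V8QI V4HI V2SI. The sketch below reproduces that naming rule as it appears from those comments: one mode per element size that divides the width and yields at least two elements. The element table and function names are hypothetical; this is not the code in genmodes.c.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical element-mode description: name and size in bytes.  */
struct toy_elem
{
  const char *name;
  unsigned bytes;
};

/* Assumed integer element modes for this example.  */
static const struct toy_elem toy_int_elems[] = {
  { "QI", 1 }, { "HI", 2 }, { "SI", 4 }, { "DI", 8 }, { "TI", 16 },
};

/* Write the space-separated vector mode names for WIDTH bytes into OUT
   (OUTSZ must be at least 1) and return how many names were written.  */
int
toy_int_vector_modes (unsigned width, char *out, size_t outsz)
{
  size_t i, used = 0;
  size_t n = sizeof toy_int_elems / sizeof toy_int_elems[0];
  int count = 0;

  out[0] = '\0';
  for (i = 0; i < n; i++)
    {
      unsigned bytes = toy_int_elems[i].bytes;
      unsigned nunits;
      int w;

      if (width % bytes != 0)
        continue;               /* element size must divide the width */
      nunits = width / bytes;
      if (nunits < 2)
        continue;               /* a one-element "vector" is skipped */

      w = snprintf (out + used, outsz - used, "%sV%u%s",
                    count ? " " : "", nunits, toy_int_elems[i].name);
      if (w < 0 || (size_t) w >= outsz - used)
        {
          out[used] = '\0';     /* drop a truncated name */
          break;
        }
      used += (size_t) w;
      count++;
    }
  return count;
}
```

With a 64-byte buffer, a width of 8 produces "V8QI V4HI V2SI" and a width of 16 produces "V16QI V8HI V4SI V2DI", matching the comments in the hunks. Single modes outside this pattern, such as V4DI or V16SF in the sh-modes.def hunk, are requested individually with VECTOR_MODE.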
Index: cp/typeck.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cp/typeck.c,v
retrieving revision 1.561
diff -u -p -r1.561 typeck.c
--- cp/typeck.c 12 Jul 2004 16:06:39 -0000 1.561
+++ cp/typeck.c 15 Jul 2004 10:22:47 -0000
@@ -2878,7 +2878,8 @@ build_binary_op (enum tree_code code, tr
else if (TREE_CODE (op1) == REAL_CST && real_zerop (op1))
warning ("division by zero in `%E / 0.'", op0);
- if (!(code0 == INTEGER_TYPE && code1 == INTEGER_TYPE))
+ if (!((code0 == INTEGER_TYPE || code0 == VECTOR_TYPE)
+ && (code1 == INTEGER_TYPE || code1 == VECTOR_TYPE)))
resultcode = RDIV_EXPR;
else
/* When dividing two signed integers, we have to promote to int.