This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [patch] lno branch merge -- vectorizer patch #1


On Thu, Jul 22, 2004 at 07:26:50PM +0300, Dorit Naishlos wrote:
> On powerpc-apple-darwin7.0.0 the vectorizer adds 148 passes, 38 expected
> failures, and 2 failures (due to a bug in altivec.md, not related to
> vectorization; a separate fix for that will follow).
> On i686-pc-linux-gnu the vectorizer adds 146 passes, 38 expected failures,
> and 4 failures (see PR16362).

I will require that these new tests be xfailed before the patch is
committed.

> int tree_nargs[] = {
> #include "tree.def"
> };

Redundant with TREE_CODE_LENGTH or first_rtl_op.

> vect_force_dr_alignment_p (struct data_reference *dr)
> {
>   tree ref = DR_REF (dr);
>   tree array_base = DR_BASE_NAME (dr);
> 
>   if ((TREE_CODE (ref) != ARRAY_REF)
>       || (TREE_CODE (TREE_TYPE (array_base)) != ARRAY_TYPE)
>       || (TREE_CODE (array_base) == VAR_DECL && DECL_EXTERNAL (array_base)))
>     return false;

If the variable is TREE_STATIC, then it's fairly easy to give it
any alignment you want.  Otherwise the data is on the stack, and
it's hard to give it alignment larger than STACK_BOUNDARY.

>   if (TREE_CODE (array_base) == COMPONENT_REF)

You should be recursing here.  The other problem is that you may
not be able to get the alignment that you need due to field offsets.
Consider:

	struct S {
	   int space;
	   int array[1000];
	};

The field "array" will always be offset 4 from the beginning of
the struct.  Now, you *would* be able to ensure that "array" is
offset 4 mod 16, such that you know that array[3] is aligned 16.

If that's something that this function is supposed to be able to
tell, the function block comment could use some improvement.

>       vect_var_name = (char *) xmalloc (strlen (name) + prefix_len + 1);
>       sprintf (vect_var_name, "%s%s", prefix, name);

Consider using "concat".

>       vect_var_name = (char *) xmalloc (prefix_len + 1);
>       sprintf (vect_var_name, "%s", prefix);

And not using allocating new data just to copy prefix.

>   if (TREE_INT_CST_HIGH (init) != 0 || TREE_INT_CST_HIGH (step) != 0)
>     abort ();

host_integerp?

>   /* Calculate the 'init' of the new index.
>      'array_first_index' (usually 0) is the TYPE_MIN_VALUE of the DOMAIN of
>      'dr' (if it has a DOMAIN).  */
>   init_val = TREE_INT_CST_LOW (init);
>   ok = vect_get_array_first_index (expr, &array_first_index);
> #ifdef ENABLE_CHECKING
>   if (!ok)
>     abort ();
> #endif
>   vec_init_val = (init_val - array_first_index)/vectorization_factor;
>   init = build_int_2 (vec_init_val, 0);

All of this should probably be done in tree arithmetic; int_const_binop
and such.  The things that are wrong here are:

  (1) array_first_index is "int".  At minimum should be HOST_WIDE_INT,
      but really should be a tree.

  (2) There *will* be arrays with non-constant first index.  Ada
      apparently creates them all the time.

  (3) No type set for init after build_int_2.  Using int_const_binop
      would get this all correct as well as handling double-word
      arithmetic.

  (4) vect_get_array_first_index is redundant with array_ref_low_bound.

>   step = integer_one_node;

You probably want size_one_node, to get the correct type.  Or keep
the type of the original index.  Changing unilaterally to "int" is
a bad idea.

>   /* Check if the alignment of the base of the data structure needs to be 
>      forced:  */ 
> 
>   if (vect_force_dr_alignment_p (dr))
>     {
>       tree decl = NULL_TREE;
>       if (TREE_CODE (array_base) == VAR_DECL)
> 	decl = array_base;
>       else if (TREE_CODE (array_base) == COMPONENT_REF)
> 	decl = TREE_OPERAND (array_base, 1);

You should have vect_force_dr_alignment_p return the decl rather
than perform the search again here.

> #ifdef ENABLE_CHECKING
>       if (!decl || (TREE_CODE (decl) != VAR_DECL && TREE_CODE (decl) != FIELD_DECL))
> 	abort ();
> #endif

Um, you absolutely *cannot* change the alignment of a field.  In the
first place, the type has already been layed out, so you changing the
alignment has no effect.  In the second place, even if the type had
not been layed out, changing the alignment changes the structure layout,
which changes the real type.

>           DECL_ALIGN (decl) = TYPE_ALIGN (vectype);

I believe you also need to set DECL_USER_ALIGN.

> vect_create_data_ref (tree stmt, block_stmt_iterator *bsi)
...
>   /* FORNOW: make sure the data reference is aligned.  */
>   vect_align_data_ref (stmt);

Why wouldn't you do this as part of the checks to verify that
the accesses are aligned when deciding whether or not to 
vectorize the loop in the first place?

>   /*** create: vectype *p;  ***/
>   ptr_type = build_pointer_type (vectype);
>   vect_ptr = vect_get_new_vect_var (ptr_type, vect_pointer_var, 
> 		get_name (addr_ref));
>   add_referenced_tmp_var (vect_ptr);

Perhaps you should be using make_rename_temp in vect_get_new_vect_var?

>   /* Get base address:  */
>   if (TREE_CODE (addr_ref) == VAR_DECL || TREE_CODE (addr_ref) == COMPONENT_REF)
>     base_addr = build1 (ADDR_EXPR,
>         build_pointer_type (TREE_TYPE (addr_ref)), addr_ref);
>   else if (TREE_CODE (addr_ref) == SSA_NAME)
>     base_addr = addr_ref;

Better as

	if (TREE_CODE (addr_ref) == SSA_NAME)
	  base_addr = addr_ref;
	else
	  base_addr = build_fold_addr_expr (addr_ref);

>   TREE_ADDRESSABLE (addr_ref) = 1;

Would be done by build_fold_addr_expr.  And is only valid on DECLs anyway.

>   vec_stmt = build1 (NOP_EXPR, ptr_type, base_addr);

fold_convert.

>   addr_expr = create_tmp_var (ptr_type, "addr");
>   add_referenced_tmp_var (addr_expr);

make_rename_temp

>   new_temp = force_gimple_operand (vec_stmt, &new_stmt, false, addr_expr);

What is this function?  I can't find it.

>   new_bb = bsi_insert_on_edge_immediate (pe, new_stmt);

This function doesn't exist anymore.  When was the last time the 
branch was merged?

>   array_type = build_array_type (vectype, 0);
>   TYPE_ALIGN (array_type) = TYPE_ALIGN (TREE_TYPE (addr_ref)); /* CHECKME */
>   new_base = build1 (INDIRECT_REF, array_type, TREE_OPERAND (vec_stmt, 0)); 

This is wrong.  You've got a type mismatch with the inside and outside
of the indirect_ref.  Use only build_fold_indirect_ref so that you're
sure that the types are ok.

>   data_ref = build4 (ARRAY_REF, vectype, new_base, idx, NULL_TREE, NULL_TREE);

Which brings us to the point that ARRAY_REF can only be applied to
array types, and you've got a pointer type.  You've got a choice:
either do the arithmetic yourself,

	T.0 = idx * TYPE_SIZE_UNIT (TREE_TYPE (ptr))
	T.1 = ptr + T.0
	*T.1

or cast to a pointer-to-array type in the first place.

>   TYPE_ALIAS_SET (TREE_TYPE (vec_dest)) =
>     TYPE_ALIAS_SET (TREE_TYPE (scalar_dest));

Why is this necessary?  We already handle vector types correctly
in the generic code.

>   /* nop_expr is expected only in case of a function argument.
>      (Otherwise - we expect a phi_node or a modify_expr).  */
>   if (TREE_CODE (def_stmt) == NOP_EXPR)
>     {
>       tree arg = TREE_OPERAND (def_stmt, 0);
>       if (TREE_CODE (arg) == INTEGER_CST || TREE_CODE (arg) == REAL_CST)
> 	return true;
>       if (vect_debug_details (NULL))
> 	{
> 	  fprintf (dump_file, "Unexpected form of NOP_EXPR: ");
> 	  print_generic_expr (dump_file, def_stmt, TDF_SLIM);
> 	}
>       return false;  
>     }

Actually, you should be expecting only IS_EMPTY_STMT.

>   switch (code)
>     {
>     case PLUS_EXPR:
>       optab = add_optab;

There's now a optab_for_tree_code function.

> vect_is_supportable_assignment (tree stmt)

I wonder about the long-term maintainability of pairs of functions
that must be kept in sync -- the "supportable" and the "transform".
I wonder if it wouldn't be better to have one function with a flag
that controls whether a transformation is made.  The function returns
true if it would succeed with the transformation.  When actually 
performing the transformation, it would be the caller that would
abort if the function returns false.

> vect_get_loop_niters (struct loop *loop, int *number_of_iterations)

At minimum HOST_WIDE_INT not int.  Preferably a tree.




r~


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]