This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [patch] lno branch merge -- vectorizer patch #1


thanks Richard.

I think I addressed all your comments - the new patch is attached below. It
passed bootstrap and the vectorizer test-cases. More testing (including on
i686-pc-linux) in progress.

> > On powerpc-apple-darwin7.0.0 the vectorizer adds 148 passes, 38
expected
> > failures, and 2 failures (due to a bug in altivec.md, not
> > related to
> > vectorization; a separate fix for that will follow).

with this fix: http://gcc.gnu.org/ml/gcc-patches/2004-08/msg00133.html
these 2 failures now pass

> > On i686-pc-linux-gnu the vectorizer adds 146 passes, 38 expected
> > failures, and 4 failures (see PR16362).
>
> I will require that these new tests be xfailed before the patch is
> committed.

These are test-cases vect-17,18.19.20.c; how do I mark a test-case as xfail
only for a subset of the targets this test is relevant for? alternatively I
could enable these tests only for ppc for now.

> > int tree_nargs[] = {
> > #include "tree.def"
> > };
>
> Redundant with TREE_CODE_LENGTH or first_rtl_op.

ok, thanks

> If the variable is TREE_STATIC, then it's fairly easy to give it
> any alignment you want.  Otherwise the data is on the stack, and
> it's hard to give it alignment larger than STACK_BOUNDARY.

Fixed; is this ok?:

/* Function vect_force_dr_alignment_p.

   Returns whether the alignment of a DECL can be forced to be aligned
   on ALIGNMENT bit boundary.  */

static bool
vect_can_force_dr_alignment_p (tree decl, unsigned int alignment)
{
  if (TREE_CODE (decl) != VAR_DECL)
    return false;

  if (DECL_EXTERNAL (decl))
    return false;

  if (TREE_STATIC (decl))
    return true;
  else
    return (alignment <= STACK_BOUNDARY);
}


> >   if (TREE_CODE (array_base) == COMPONENT_REF)
>
> You should be recursing here.

I wasn't recursing because I was allowing only certain forms of
COMPONENT_REFs to be processed. I extended that now.

> The other problem is that you may
> not be able to get the alignment that you need due to field offsets.

ok - this is now done in the function vect_get_base_decl_and_bit_offset.

> Consider using "concat".

done

> And not using allocating new data just to copy prefix.

sure

> host_integerp?

ok, thanks

> All of this should probably be done in tree arithmetic; int_const_binop
> and such.  The things that are wrong here are:

ok, switched to using tree arithmetic

> You probably want size_one_node, to get the correct type.  Or keep
> the type of the original index.  Changing unilaterally to "int" is
> a bad idea.

ok

> You should have vect_force_dr_alignment_p return the decl rather
> than perform the search again here.

this function no longer exists - replaced with
vect_get_base_decl_and_bit_offset (to return the decl and field offset) and
vect_can_force_dr_alignment_p (to check if the decl can be aligned).

> Um, you absolutely *cannot* change the alignment of a field.

thanks, fixed.

> I believe you also need to set DECL_USER_ALIGN.

ok

> Why wouldn't you do this as part of the checks to verify that
> the accesses are aligned when deciding whether or not to
> vectorize the loop in the first place?

because I was trying to not make any changes during the analysis phase
(i.e, before I decide to vectorize); but in this case I agree that the code
is cleaner if I force the alignment during the analysis, so this is what
I'm doing now (at the moment, this is the only transformation that takes
place during the analysis).

> Perhaps you should be using make_rename_temp in vect_get_new_vect_var?

I was trying to avoid using "bitmap_set_bit (vars_to_rename" wherever it's
trivial enough to assign an SSA_NAME to a newly created variable. I think I
prefer to leave it this way, if it's ok.

> Better as
>
>            if (TREE_CODE (addr_ref) == SSA_NAME)
>              base_addr = addr_ref;
>            else
>              base_addr = build_fold_addr_expr (addr_ref);

ok

> >   TREE_ADDRESSABLE (addr_ref) = 1;
>
> Would be done by build_fold_addr_expr.  And is only valid on DECLs
anyway.

ok

> >   vec_stmt = build1 (NOP_EXPR, ptr_type, base_addr);
>
> fold_convert.

ok

> >   addr_expr = create_tmp_var (ptr_type, "addr");
> >   add_referenced_tmp_var (addr_expr);
>
> make_rename_temp

not used for the reason above

> >   new_temp = force_gimple_operand ...
>
> What is this function?  I can't find it.

I'm not using it anymore.

> >   new_bb = bsi_insert_on_edge_immediate (pe, new_stmt);
>
> This function doesn't exist anymore

I'm not using it anymore

> This is wrong.  You've got a type mismatch with the inside and
> outside
> of the indirect_ref.  Use only build_fold_indirect_ref so that
> you're
> sure that the types are ok.

ok, thanks

> Which brings us to the point that ARRAY_REF can only be applied to
> array types, and you've got a pointer type.  You've got a choice:
> either do the arithmetic yourself,
> ...
> or cast to a pointer-to-array type in the first place.

I'm casting it to pointer-to-array type

> >   TYPE_ALIAS_SET (TREE_TYPE (vec_dest)) =
> >     TYPE_ALIAS_SET (TREE_TYPE (scalar_dest));
>
> Why is this necessary?  We already handle vector types correctly
> in the generic code.

removed.

> Actually, you should be expecting only IS_EMPTY_STMT.

ok

> There's now a optab_for_tree_code function.

ok

> I wonder about the long-term maintainability of pairs of functions
> that must be kept in sync -- the "supportable" and the
> "transform".
> I wonder if it wouldn't be better to have one function with a flag
> that controls whether a transformation is made.  The function
> returns
> true if it would succeed with the transformation.  When actually
> performing the transformation, it would be the caller that would
> abort if the function returns false.

agreed. I switched to using a single function now.

> > vect_get_loop_niters (struct loop *loop, int
> *number_of_iterations)
>
> At minimum HOST_WIDE_INT not int.  Preferably a tree.

For now HOST_WIDE_INT; with Olga's unknown-loop-bound support (coming up
soon) will be changed to a tree.

thanks

dorit

(See attached file: tree-vectorizer.h.Aug16)(See attached file:
tree-vectorizer.c.Aug16)(See attached file: diff.Aug16.2029)

Attachment: tree-vectorizer.h.Aug16
Description: Binary data

Attachment: tree-vectorizer.c.Aug16
Description: Binary data

Attachment: diff.Aug16.2029
Description: Binary data


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]