[PATCH][4.6] [2/2] Handle multiple vector sizes with AVX
Richard Guenther
rguenther@suse.de
Mon Mar 8 11:48:00 GMT 2010
On Sun, 7 Mar 2010, Ira Rosen wrote:
> Richard Guenther <rguenther@suse.de> wrote on 03/03/2010 03:54:15 PM:
>
> > This adjusts the vectorizer to handle multiple vector sizes as
> > supported by AVX.
> >
> > The main parts of this patch are
> >
> > 1) Separate analysis of datarefs and stmts from committing to
> > vector types
> >
> > 2) Fixup remaining calls to get_vectype_for_scalar_type to get
> > the correct vector type
> >
> > I mostly concentrated on loop vectorization and only made SLP
> > work as far as the testsuite or SPEC is concerned. Also the
> > pattern recognizer commits to vector types too early. Thus likely
> > I will have to dissect the analysis phase some more.
>
> Yes, I think, since pattern recognition checks for exact target support of
> the patterns, changing VF afterwards may be problematic.
>
> > Index: trunk/gcc/tree-vect-data-refs.c
> > ===================================================================
> > *** trunk.orig/gcc/tree-vect-data-refs.c 2010-03-02 17:57:08.000000000 +0100
> > --- trunk/gcc/tree-vect-data-refs.c 2010-03-02 18:05:48.000000000 +0100
> ...
> > *************** vect_analyze_data_ref_dependence (struct
> > *** 595,611 ****
> > if (vect_print_dump_info (REPORT_DR_DETAILS))
> > fprintf (vect_dump, "dependence distance = %d.", dist);
> >
> > ! /* Same loop iteration. */
> > ! if (dist % vectorization_factor == 0 && dra_size == drb_size)
> > {
> > /* Two references with distance zero have the same alignment. */
> > VEC_safe_push (dr_p, heap, STMT_VINFO_SAME_ALIGN_REFS
> > (stmtinfo_a), drb);
> > VEC_safe_push (dr_p, heap, STMT_VINFO_SAME_ALIGN_REFS
> > (stmtinfo_b), dra);
> > if (vect_print_dump_info (REPORT_ALIGNMENT))
> > fprintf (vect_dump, "accesses have the same alignment.");
> > if (vect_print_dump_info (REPORT_DR_DETAILS))
> > {
> > ! fprintf (vect_dump, "dependence distance modulo vf == 0 between ");
> > print_generic_expr (vect_dump, DR_REF (dra), TDF_SLIM);
> > fprintf (vect_dump, " and ");
> > print_generic_expr (vect_dump, DR_REF (drb), TDF_SLIM);
> > --- 596,616 ----
> > if (vect_print_dump_info (REPORT_DR_DETAILS))
> > fprintf (vect_dump, "dependence distance = %d.", dist);
> >
> > ! /* ??? Why was that dist % vectorization_factor == 0? Only for
> > ! dist == 0 we have to record a rw dependence? Alignment
>
> I think you are right, the code is incorrect and we need to record
> read-write dependence for dist == 0.
Ok, I'll split this out to a separate patch.
> > ! stuff is handled in vect_analyze_data_ref_group after we
> > ! determinded the final vectorization factor. */
> > ! if (dist == 0 && dra_size == drb_size)
> > {
> > /* Two references with distance zero have the same alignment. */
> > VEC_safe_push (dr_p, heap, STMT_VINFO_SAME_ALIGN_REFS
> > (stmtinfo_a), drb);
> > VEC_safe_push (dr_p, heap, STMT_VINFO_SAME_ALIGN_REFS
> > (stmtinfo_b), dra);
>
>
> Why do you collect same alignment pairs here (for the case of dist == 0),
> instead of collecting them with the rest in vect_analyze_data_ref_group?
>
> ...
Probably a leftover. I'll move it.
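To make the distinction in this hunk concrete, here is a minimal sketch of the two separate questions being discussed. It is not GCC code: the helper names and the plain int parameters are hypothetical. The first predicate is the dist == 0 case, the second mirrors the condition the pre-patch code tested, which can only be evaluated once the vectorization factor is fixed.

```c
#include <stdbool.h>

/* Hypothetical helper: both references are accessed in the same
   scalar iteration, i.e. the dependence distance is zero.  */
static bool
same_iteration_p (int dist)
{
  return dist == 0;
}

/* Hypothetical helper mirroring the pre-patch condition
   "dist % vectorization_factor == 0 && dra_size == drb_size".
   It assumes vf is already final and non-zero.  */
static bool
same_alignment_p (int dist, int vf, int size_a, int size_b)
{
  return dist % vf == 0 && size_a == size_b;
}
```

With vf = 4 and equal access sizes, a distance of 8 satisfies same_alignment_p but not same_iteration_p, which is why the two checks end up in different places.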
> >
> > + /* Function vect_analyze_data_ref_group.
> > +
> > + Update group and alignment relations according to the chosen
> > + vectorization factor. */
> > +
> > + static void
> > + vect_analyze_data_ref_group (struct data_dependence_relation *ddr,
> > + loop_vec_info loop_vinfo)
>
>
> I think that the name of this function is confusing. We use the term "group"
> for groups of strided accesses. Maybe vect_find_same_alignment_drs is
> better?
Ok.
>
> ...
>
> *************** vect_determine_vectorization_factor (loo
> ...
> > else
> > {
> > ! int stmt_desired_vf = 0;
> > !
> > ! gcc_assert (!is_pattern_stmt_p (stmt_info));
> > !
> > ! /* Iterate over all scalar types of the stmt operands and
> > ! determine the minimal and maximal vectorization factors. */
> > ! for (j = 0; j < gimple_num_ops (stmt); ++j)
> > ! {
> > ! tree op = gimple_op (stmt, j);
> > ! if (!op
> > ! || TYPE_P (op))
> > ! continue;
> > !
> > ! scalar_type = TREE_TYPE (op);
> > ! vectype = get_vectype_for_scalar_type_1 (scalar_type,
> > ! *min_vf, true);
> > ! if (!vectype
> > ! || (int)TYPE_VECTOR_SUBPARTS (vectype) > *max_vf)
> > ! {
> > ! if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
> > ! {
> > ! fprintf (vect_dump,
> > ! "not vectorized: unsupported data-type ");
> > ! print_generic_expr (vect_dump, scalar_type, TDF_SLIM);
> > ! }
> > ! return false;
> > ! }
> > !
> > ! nunits = TYPE_VECTOR_SUBPARTS (vectype);
> > ! if (nunits > *min_vf)
> > ! {
> > ! *min_vf = nunits;
> > ! if (vect_print_dump_info (REPORT_DETAILS))
> > ! fprintf (vect_dump, "increasing minimal vectorization "
> > ! "factor to %d\n", *min_vf);
> > ! }
> > !
> > ! vectype = get_vectype_for_scalar_type (scalar_type, *max_vf);
> > ! nunits = TYPE_VECTOR_SUBPARTS (vectype);
> > ! if (desired_vf == 0
> > ! || nunits > desired_vf)
> > ! {
> > ! desired_vf = nunits;
> > ! if (vect_print_dump_info (REPORT_DETAILS))
> > ! fprintf (vect_dump, "increasing desired vectorization "
> > ! "factor to %d\n", desired_vf);
> > ! }
> > ! if (stmt_desired_vf == 0
> > ! || nunits > stmt_desired_vf)
> > ! stmt_desired_vf = nunits;
>
> Looks like stmt_desired_vf is unused.
Indeed. Removed.
> > ! }
> > ! }
> > ! }
> > ! }
> > !
> > ! /* If we do not support the desired vectorization factor, adjust it. */
> > ! if (desired_vf > *max_vf)
> > ! desired_vf = *max_vf;
> > !
> > ! /* Restrict the vectorization factor to a known loop-trip count. */
> > ! if (LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
> > ! && ((int)LOOP_VINFO_INT_NITERS (loop_vinfo) < desired_vf))
> > ! {
> > ! desired_vf = LOOP_VINFO_INT_NITERS (loop_vinfo);
> > ! if (vect_print_dump_info (REPORT_DETAILS))
> > ! fprintf (vect_dump, "reducing desired vectorization factor to "
> > ! "knwon loop-trip count %d.", desired_vf);
>
> Typo: knwon -> known.
Fixed.
> > ! }
> > !
> > ! /* If we didn't have any interesting statements that shows what we
> > ! desire, use the maximal (??? for now minimal) vectorization factor. */
> > ! if (desired_vf == 0
> > ! || desired_vf < *min_vf)
> > ! desired_vf = *min_vf;
> > !
> > ! if (desired_vf == 0
> > ! || *min_vf > *max_vf)
> > ! {
> > ! if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
> > ! fprintf (vect_dump, "not vectorized: unsupported data-type");
> > ! return false;
> > ! }
> > !
> > ! /* TODO: Analyze cost. Decide if worth while to vectorize. */
> > ! if (vect_print_dump_info (REPORT_DETAILS))
> > ! fprintf (vect_dump, "vectorization factor = %d", desired_vf);
> > !
> > ! /* The vectorization factor is now determined. */
> > ! LOOP_VINFO_VECT_FACTOR (loop_vinfo) = desired_vf;
> > !
> > ! return true;
> > ! }
> > !
>
> It would be nice to have some general explanation in this function, like
> what desired_vf, min_vf, max_vf stand for and how they are computed...
I'll think of something.
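As a rough reading of the quoted hunk (a sketch, not the patch itself): min_vf is the largest element count forced by the smallest supported vector types, max_vf is the bound passed in from dependence analysis, and desired_vf is the largest element count seen when picking the preferred vector type. The function below restates only the final clamping steps of the hunk. Its name and flat parameter list are hypothetical stand-ins for the loop_vinfo accessors.

```c
#include <stdbool.h>

/* Hypothetical restatement of the tail of the quoted
   vect_determine_vectorization_factor hunk.  Returns false when no
   vectorization factor satisfies the constraints, otherwise stores
   the chosen factor in *vf.  */
static bool
choose_vf (int desired_vf, int min_vf, int max_vf,
           bool niters_known, int niters, int *vf)
{
  /* Do not exceed what the dependence analysis allows.  */
  if (desired_vf > max_vf)
    desired_vf = max_vf;

  /* Restrict the factor to a known loop-trip count.  */
  if (niters_known && niters < desired_vf)
    desired_vf = niters;

  /* Never go below the minimum the vector types require.  */
  if (desired_vf == 0 || desired_vf < min_vf)
    desired_vf = min_vf;

  /* No usable factor: nothing was found, or the bounds conflict.  */
  if (desired_vf == 0 || min_vf > max_vf)
    return false;

  *vf = desired_vf;
  return true;
}
```

For example, desired_vf = 8 with min_vf = 4 and max_vf = 4 yields 4, while min_vf = 4 with max_vf = 2 is rejected.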
> A clarification question: for AVX in case there are both ints and floats in
> the loop, vf will be 8 and all the int statements will have two copies,
> right?
If the dependence distance allows 8 then yes. Note that AVX does have
256-bit integer vectors (it can load and store them, do bit operations
on them and even do v8si <-> v8sf conversions), you just can't do
arithmetic on them ...
I was thinking of re-running the analysis phase with the vector size
forced to 128 bits if it failed...
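The "two copies" arithmetic from the question can be sketched as follows, assuming the number of vector statements per scalar statement is the vectorization factor divided by the element count of the vector type used for that statement. The helper name and its parameters are hypothetical, and the vector widths are just the 128-bit and 256-bit cases mentioned in this thread.

```c
/* Hypothetical helper: number of vector statements needed for one
   scalar statement when the loop is vectorized by VF and the
   statement uses vectors of VECTOR_BITS holding ELEM_BITS-wide
   elements.  Assumes VF is a multiple of the element count.  */
static int
copies_per_stmt (int vf, int vector_bits, int elem_bits)
{
  int nunits = vector_bits / elem_bits;
  return vf / nunits;
}
```

With vf = 8, a 32-bit float statement in a 256-bit vector needs one copy, while a 32-bit int statement restricted to 128-bit vectors needs two.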
> > *************** vect_analyze_loop (struct loop *loop)
> > *** 1410,1428 ****
> > return NULL;
> > }
> >
> > ! /* Analyze the alignment of the data-refs in the loop.
> > ! Fail if a data reference is found that cannot be vectorized. */
> >
> > ! ok = vect_analyze_data_refs_alignment (loop_vinfo, NULL);
> > ! if (!ok)
> > {
> > if (vect_print_dump_info (REPORT_DETAILS))
> > ! fprintf (vect_dump, "bad data alignment.");
> > destroy_loop_vec_info (loop_vinfo, true);
> > return NULL;
> > }
> >
> > ! ok = vect_determine_vectorization_factor (loop_vinfo);
> > if (!ok)
> > {
> > if (vect_print_dump_info (REPORT_DETAILS))
> > --- 1570,1593 ----
> > return NULL;
> > }
> >
> > ! /* Analyze data dependences between the data-refs in the loop
> > ! and the maximal possible vectorization factor.
> > ! FORNOW: fail at the first data dependence that we encounter. */
> >
> > ! ok = vect_analyze_data_ref_dependences (loop_vinfo, NULL, &max_vf);
> > ! if (!ok
> > ! || max_vf < min_vf)
> > {
> > if (vect_print_dump_info (REPORT_DETAILS))
> > ! fprintf (vect_dump, "bad data dependence.");
> > destroy_loop_vec_info (loop_vinfo, true);
> > return NULL;
> > }
> > + if (vect_print_dump_info (REPORT_DETAILS))
> > + fprintf (vect_dump, "maximum vectorization factor %i.", max_vf);
> >
> > ! /* Determine the vectorization factor. */
> > ! ok = vect_determine_vectorization_factor (loop_vinfo, &min_vf, &max_vf);
>
> Is there a reason to pass min_vf and max_vf by reference? (They are not
> used after this call).
No. I'll clean up those bits.
Thanks,
Richard.