This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [patch] (4.1 project list) vectorizer alignment improvements
- From: Dorit Naishlos <DORIT at il dot ibm dot com>
- To: Richard Henderson <rth at redhat dot com>
- Cc: gcc-patches at gcc dot gnu dot org, Ira Rosen <IRAR at il dot ibm dot com>, Keith Besaw <kbesaw at us dot ibm dot com>
- Date: Wed, 1 Jun 2005 23:46:50 +0300
- Subject: Re: [patch] (4.1 project list) vectorizer alignment improvements
Richard Henderson <rth@redhat.com> wrote on 01/06/2005 22:25:18:
> On Mon, May 30, 2005 at 07:16:35PM +0300, Dorit Naishlos wrote:
> > ! if (dist % vectorization_factor == 0)
> > {
> > ! /* Two references with distance zero have the same alignment. */
> > ! VEC_safe_push (dr_p, heap, STMT_VINFO_SAME_ALIGN_REFS (stmtinfo_a), drb);
> > ! VEC_safe_push (dr_p, heap, STMT_VINFO_SAME_ALIGN_REFS (stmtinfo_b), dra);
> > ! if (vect_print_dump_info (REPORT_ALIGNMENT, LOOP_LOC (loop_vinfo)))
> ----
> > + STMT_VINFO_SAME_ALIGN_REFS (vinfo_for_stmt (DR_STMT (dr0)));
> > + for (i = 0; VEC_iterate (dr_p, same_align_drs, i, dr); i++)
> > + {
> > + DR_MISALIGNMENT (dr) = 0;
>
> So we find two references, and see that they're offset from one another
> by a multiple of the vector size. Fine. Presumably this later forcing
> of the alignment to zero comes after loop peeling to align one of the
> references, and we're remembering that the other references shared the
> same relative alignment.
>
> What I don't understand is how this works except in the special case of
> only one pair found. In which case you don't need a queue like this.
>
We have a same_align VEC for each dataref in the loop, so for each dataref
x in the loop we can record all the other datarefs that have the same
alignment as x. (Right now the same_align_refs VEC is per stmt, and each
stmt is limited to one dataref; if/when we want to support multiple
datarefs per stmt, we'll need to move this VEC from the stmt-info struct
to the dataref struct.)
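A rough sketch of the idea in plain C, for illustration only. This is not
the patch's code: `struct dataref`, `record_same_align`,
`mark_aligned_after_peel` and the fixed-size array are hypothetical
stand-ins for the real dataref struct and the VEC. Each dataref keeps its
own list of same-aligned datarefs, and once we peel to align one dataref we
walk only that dataref's list.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SAME_ALIGN 8   /* hypothetical fixed capacity, sketch only */

/* Simplified stand-in for a data reference (not GCC's real struct).  */
struct dataref
{
  const char *name;
  int misalignment;        /* -1 = unknown, 0 = aligned */
  struct dataref *same_align[MAX_SAME_ALIGN];
  size_t n_same_align;
};

/* Record that A and B have the same alignment, in both directions,
   mirroring the two pushes in the quoted hunk.  */
static void
record_same_align (struct dataref *a, struct dataref *b)
{
  assert (a->n_same_align < MAX_SAME_ALIGN);
  assert (b->n_same_align < MAX_SAME_ALIGN);
  a->same_align[a->n_same_align++] = b;
  b->same_align[b->n_same_align++] = a;
}

/* After peeling to align DR0, DR0 and every dataref recorded in DR0's
   own list are aligned.  Datarefs in other lists are left untouched.  */
static void
mark_aligned_after_peel (struct dataref *dr0)
{
  size_t i;

  dr0->misalignment = 0;
  for (i = 0; i < dr0->n_same_align; i++)
    dr0->same_align[i]->misalignment = 0;
}
```

Because the list hangs off each dataref rather than being one global queue,
any number of unrelated same-alignment groups can coexist in the loop.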
> I'm thinking of something like
>
> for (i = 0; i < N; ++i) {
> a[i] += a[i+4];
> b[i+1] += b[i+5];
> }
>
> or something like that. The point being that a[i] and a[i+4] are 4
> units apart, as are b[i+1] and b[i+5]. But a[i] and b[i+1] are not
> co-aligned, which would seem to break the bulk processing that you're
> doing here.
>
No, because none of the accesses to array 'b' will be recorded as having
the same alignment as any of the accesses to array 'a'. Currently we only
record same-alignment when we have a dependence-distance between accesses,
and we have a dependence-distance only between accesses to the same
array/object. So with this scheme we're currently a bit limited. Say we
have two arrays, 'd' and 'c', whose alignment is known to be the same;
then in this loop:
for (i = 0; i < N; ++i) {
d[i] += c[i+4];
}
we potentially have enough info to determine that all accesses have the
same alignment, but since we don't have a dependence-distance for
(d[i],c[i+4]), we don't record these two accesses as having
same-alignment.
Back to your example:
for (i = 0; i < N; ++i) {
a[i] = a[i] + a[i+4];
b[i+1] = b[i+1] + b[i+5];
}
We have 6 DRs in the loop. Say we don't know the initial alignments of
'a' and 'b'. Each DR will have 2 other DRs recorded in its same_align VEC,
as follows:
1. a[i](store): a[i](load), a[i+4]
2. a[i](load): a[i](store), a[i+4]
3. a[i+4](load): a[i](store), a[i](load)
4. b[i+1](store): b[i+1](load), b[i+5]
5. b[i+1](load): b[i+1](store), b[i+5]
6. b[i+5](load): b[i+1](store), b[i+1](load)
So later on, if we peel to align any of the accesses to 'a', we know that
the other two accesses in its same_align VEC will also be aligned. We know
nothing about the alignment of the accesses to 'b', and will have to
generate unaligned accesses for them.
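The table above can be reproduced with a small standalone C sketch. The
assumptions are mine, not taken from the patch: a vectorization factor of
4, a toy `struct dr` that stores only the array name and the constant
offset, and the hypothetical helpers `same_align_p` and `same_align_refs`.
Two DRs count as same-aligned only if they access the same array (so a
dependence distance exists) and that distance is a multiple of the
vectorization factor.

```c
#include <assert.h>
#include <stdlib.h>

#define N_DRS 6
#define VF 4               /* assumed vectorization factor */

/* Toy data reference: which array, constant offset from i, load/store.  */
struct dr
{
  char array;
  int offset;
  int is_store;
};

/* The six DRs of the example loop, in the same order as the list.  */
static const struct dr drs[N_DRS] = {
  { 'a', 0, 1 },           /* 1. a[i]   (store) */
  { 'a', 0, 0 },           /* 2. a[i]   (load)  */
  { 'a', 4, 0 },           /* 3. a[i+4] (load)  */
  { 'b', 1, 1 },           /* 4. b[i+1] (store) */
  { 'b', 1, 0 },           /* 5. b[i+1] (load)  */
  { 'b', 5, 0 },           /* 6. b[i+5] (load)  */
};

/* Nonzero if X and Y would be recorded as same-aligned: distinct DRs on
   the same array whose distance is a multiple of VF.  */
static int
same_align_p (const struct dr *x, const struct dr *y)
{
  if (x == y || x->array != y->array)
    return 0;
  return abs (x->offset - y->offset) % VF == 0;
}

/* Store in OUT the indices of all DRs same-aligned with drs[IDX] and
   return how many there are.  */
static int
same_align_refs (int idx, int out[N_DRS])
{
  int j, n = 0;

  assert (idx >= 0 && idx < N_DRS);
  for (j = 0; j < N_DRS; j++)
    if (same_align_p (&drs[idx], &drs[j]))
      out[n++] = j;
  return n;
}
```

Calling same_align_refs for each index gives two entries per DR, and no
entry for an access to 'a' ever names an access to 'b', which is why
peeling for 'a' tells us nothing about 'b'.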
dorit
>
> r~