This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Duplicating loops and virtual phis
- From: Steve Ellcey <sellcey at cavium dot com>
- To: Richard Biener <richard dot guenther at gmail dot com>, gcc at gcc dot gnu dot org
- Date: Mon, 15 May 2017 09:56:53 -0700
- Subject: Re: Duplicating loops and virtual phis
- References: <201705122042.v4CKgYk1028704@sellcey-dt.caveonetworks.com> <0BB8C390-74E8-45EA-A4FF-438B53197254@gmail.com>
- Reply-to: sellcey at cavium dot com
On Sat, 2017-05-13 at 08:18 +0200, Richard Biener wrote:
> On May 12, 2017 10:42:34 PM GMT+02:00, Steve Ellcey <sellcey@cavium.com> wrote:
> >
> > (Short version of this email: is there a way to recalculate/rebuild
> > virtual phi nodes after modifying the CFG?)
> >
> > I have a question about duplicating loops and virtual phi nodes.
> > I am trying to implement the following optimization as a pass:
> >
> > Transform:
> >
> > for (i = 0; i < n; i++) {
> >   A[i] = A[i] + B[i];
> >   C[i] = C[i-1] + D[i];
> > }
> >
> > Into:
> >
> > if (noalias between A&B, A&C, A&D) {
> >   for (i = 0; i < n; i++)
> >     A[i] = A[i] + B[i];
> >   for (i = 0; i < n; i++)
> >     C[i] = C[i-1] + D[i];
> > } else {
> >   for (i = 0; i < n; i++) {
> >     A[i] = A[i] + B[i];
> >     C[i] = C[i-1] + D[i];
> >   }
> > }
> >
> > Right now the vectorizer sees that 'C[i] = C[i-1] + D[i];' cannot be
> > vectorized so it gives up and does not vectorize the loop. If we split
> > up the loop into two loops then the vector add with A[i] could be
> > vectorized even if the one with C[i] could not.
> Loop distribution does this transform but it doesn't know about
> versioning for unknown dependences.
>
Yes, I looked at loop distribution, but it only works with global
arrays. It does not handle pointer arguments, where it doesn't know the
size of the array being pointed at, and I would like it to work with
those too. If I call a function with two or more integer pointers, and
I have a loop that accesses them with offsets between 0 and N, where N
is loop invariant, then I should have enough information (at runtime)
to determine whether any memory accesses through the pointers overlap,
and therefore whether or not I can distribute the loop.
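
A minimal sketch in C of the kind of runtime overlap check described
above, written by hand at the source level. This is only an
illustration, not what any GCC pass emits: the names ranges_disjoint
and distribute_if_disjoint are hypothetical, the loops start at i = 1
so that C[i-1] stays in bounds, and the check is deliberately
conservative (it requires both written arrays, A and C, to be disjoint
from every other array over the whole [0, n) range).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if the n-element int ranges starting at
   p and q do not overlap.  Assumes a flat address space, so that
   comparing the pointers as integers is meaningful.  */
static bool ranges_disjoint(const int *p, const int *q, size_t n)
{
    assert(n <= SIZE_MAX / sizeof(int));   /* length must not overflow */
    uintptr_t a = (uintptr_t)p;
    uintptr_t b = (uintptr_t)q;
    uintptr_t len = (uintptr_t)(n * sizeof(int));
    return a + len <= b || b + len <= a;
}

/* Hypothetical illustration of the versioned loop: two separate loops
   when the runtime check passes, the original fused loop otherwise.  */
void distribute_if_disjoint(int *A, const int *B, int *C, const int *D,
                            size_t n)
{
    size_t i;

    if (ranges_disjoint(A, B, n) && ranges_disjoint(A, C, n)
        && ranges_disjoint(A, D, n) && ranges_disjoint(C, B, n)
        && ranges_disjoint(C, D, n)) {
        /* No overlap: this loop has no loop-carried dependence.  */
        for (i = 1; i < n; i++)
            A[i] = A[i] + B[i];
        /* This loop keeps its dependence on C[i-1].  */
        for (i = 1; i < n; i++)
            C[i] = C[i-1] + D[i];
    } else {
        /* Possible overlap: keep the original statement order.  */
        for (i = 1; i < n; i++) {
            A[i] = A[i] + B[i];
            C[i] = C[i-1] + D[i];
        }
    }
}
```

Because every bound depends only on the incoming pointers and on n,
the whole check can be evaluated once, before either loop runs.
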

The loop splitting code seemed like a better template since it already
knows how to split a loop based on a runtime-determined condition. That
part seems to be working for me. It is when I try to
distribute/duplicate one of those loops (under the unaliased condition)
that I run into the problem with virtual PHIs.

Steve Ellcey
sellcey@cavium.com