This is the mail archive of the
fortran@gcc.gnu.org
mailing list for the GNU Fortran project.
Re: [Fortran] Support inverted execution masks in WHEREs
- From: Roger Sayle <roger at eyesopen dot com>
- To: Richard Guenther <richard dot guenther at gmail dot com>
- Cc: fortran at gcc dot gnu dot org, <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 22 Feb 2006 06:27:32 -0700 (MST)
- Subject: Re: [Fortran] Support inverted execution masks in WHEREs
On Wed, 22 Feb 2006, Richard Guenther wrote:
> On 2/22/06, Roger Sayle <roger@eyesopen.com> wrote:
> >   cmask = malloc(...);
> >   for (i=0; i<...; i++)
> >     cmask[i] = cond(i);
> >   for (i=0; i<...; i++)
> >     if (cmask[i])
> >       stmt1(i);
> >   for (i=0; i<...; i++)
> >     if (!cmask[i])
> >       stmt2(i);
> >   free(cmask);
>
> I wonder why we use multiple loops here - at least fusing the last two ones
> would be more cache friendly for use of the mask. Or does fusing the loops
> create worse code somehow?
In the general case, three loops are required because the Fortran
standards allow the side effects of cond, stmt1 and stmt2 to affect each
other, and in that case require the behaviour to match the three loops above.
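To make the dependency problem concrete, here is a minimal sketch in plain C.
It is not what gfortran emits; the bodies of stmt1 and stmt2, the array size N
and the function names are all made up for illustration. stmt2 reads an element
that stmt1 may write at a different index, so fusing the last two loops changes
what stmt2 observes.

```c
#include <assert.h>
#include <stdlib.h>

enum { N = 4 };  /* hypothetical array extent */

/* Hypothetical stmt1: overwrite a[i] where the mask is true. */
static void stmt1(int *a, int i) { a[i] = 10; }

/* Hypothetical stmt2: read the mirrored element of a, which stmt1 may write. */
static void stmt2(const int *a, int *b, int i) { b[i] = a[N - 1 - i]; }

/* General scheme: mask loop, then all of stmt1, then all of stmt2. */
static void where_three_loops(int *a, int *b)
{
    int i;
    char *cmask = malloc(N);
    assert(cmask != NULL);
    for (i = 0; i < N; i++)
        cmask[i] = a[i] > 0;
    for (i = 0; i < N; i++)
        if (cmask[i])
            stmt1(a, i);
    for (i = 0; i < N; i++)
        if (!cmask[i])
            stmt2(a, b, i);
    free(cmask);
}

/* Last two loops fused: stmt2(i) only sees stmt1(j) for j < i. */
static void where_fused(int *a, int *b)
{
    int i;
    char *cmask = malloc(N);
    assert(cmask != NULL);
    for (i = 0; i < N; i++)
        cmask[i] = a[i] > 0;
    for (i = 0; i < N; i++) {
        if (cmask[i])
            stmt1(a, i);
        else
            stmt2(a, b, i);
    }
    free(cmask);
}
```

Starting from a = {1, -1, 1, -1}, the three-loop version stores the updated
a[2] into b[1], while the fused version stores the old value, because stmt1(2)
has not run yet when stmt2(1) executes.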
We also already optimize the special case of no dependencies between
cond, stmt1 and stmt2 (where stmt1 and stmt2 are single assignments)
http://gcc.gnu.org/ml/gcc-patches/2006-02/msg00316.html
However, your suggestion does reveal a potential intermediate case
that could be improved: loops such as those above where "cond" is
affected by either stmt1 or stmt2, i.e. we need a temporary mask,
but where stmt1 and stmt2 don't affect each other, i.e. are independent.
In this case we could potentially generate better code like:
  cmask = malloc(...);
  for (i=0; i<...; i++)
    cmask[i] = cond(i);
  for (i=0; i<...; i++)
    if (cmask[i])
      stmt1(i);
    else
      stmt2(i);
  free(cmask);
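As a rough illustration of this intermediate case, the sketch below uses a
made-up cond that reads a mirrored element of an array that stmt1 writes, so
the temporary mask is still needed, while stmt1 and stmt2 touch different
arrays and are independent of each other. The names LEN, masked_three_loops
and masked_fused, and the statement bodies, are assumptions for the example,
not gfortran code. Under those assumptions the fused if/else form gives the
same result as the three-loop form.

```c
#include <assert.h>
#include <stdlib.h>

enum { LEN = 4 };  /* hypothetical array extent */

/* Hypothetical cond: depends on an element that stmt1 may overwrite. */
static int cond(const int *a, int i) { return a[LEN - 1 - i] > 0; }

/* Reference: mask loop, then stmt1 loop, then stmt2 loop. */
static void masked_three_loops(int *a, int *b)
{
    int i;
    char *cmask = malloc(LEN);
    assert(cmask != NULL);
    for (i = 0; i < LEN; i++)
        cmask[i] = (char) cond(a, i);
    for (i = 0; i < LEN; i++)
        if (cmask[i])
            a[i] = -1;          /* stmt1: writes only a */
    for (i = 0; i < LEN; i++)
        if (!cmask[i])
            b[i] = 1;           /* stmt2: writes only b, reads nothing */
    free(cmask);
}

/* Mask kept, but stmt1 and stmt2 share a single loop. */
static void masked_fused(int *a, int *b)
{
    int i;
    char *cmask = malloc(LEN);
    assert(cmask != NULL);
    for (i = 0; i < LEN; i++)
        cmask[i] = (char) cond(a, i);
    for (i = 0; i < LEN; i++) {
        if (cmask[i])
            a[i] = -1;          /* stmt1 */
        else
            b[i] = 1;           /* stmt2 */
    }
    free(cmask);
}
```

Dropping the mask and calling cond(a, i) inside the fused loop would not be
safe here: by the time i reaches the last element, a[0] has already been
overwritten, so the condition would evaluate differently.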
Grrr! I'll investigate a new gfc_trans_where_4 to catch this case,
though I'd expect this shape of dependency to be relatively rare.
In the longer term, the tree-ssa optimizers may be able to implement
loop-fusion, to combine consecutive loops into one when profitable.
Roger