[PATCH] predcom: Fix invalid store-store commoning [PR93434]

Richard Biener richard.guenther@gmail.com
Tue Jan 28 10:34:00 GMT 2020


On Tue, Jan 28, 2020 at 10:49 AM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Richard Sandiford <richard.sandiford@arm.com> writes:
> > predcom has the following code to stop one rogue load from
> > interfering with other store-load opportunities:
> >
> >       /* If A is read and B write or vice versa and there is unsuitable
> >        dependence, instead of merging both components into a component
> >        that will certainly not pass suitable_component_p, just put the
> >        read into bad component, perhaps at least the write together with
> >        all the other data refs in it's component will be optimizable.  */
> >
> > But when store-store commoning was added later, this had the effect
> > of ignoring loads that occur between two candidate stores.
> >
> > There is code further up to handle loads and stores with unknown
> > dependences:
> >
> >       /* Don't do store elimination if there is any unknown dependence for
> >        any store data reference.  */
> >       if ((DR_IS_WRITE (dra) || DR_IS_WRITE (drb))
> >         && (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know
> >             || DDR_NUM_DIST_VECTS (ddr) == 0))
> >       eliminate_store_p = false;
> >
> > But the store-load code above skips loads for *known* dependences
> > if (a) the load has already been marked "bad" or (b) the data-ref
> > machinery knows the dependence distance, but determine_offsets
> > can't handle the combination.
> >
> > (a) happens to be the problem in the testcase, but a different
> > sequence could have given (b) instead.  We have writes to individual
> > fields of a structure and reads from the whole structure.  Since
> > determine_offsets requires the types to be the same, it returns false
> > for each such read/write combination.
> >
> > This patch records which components have had loads removed and
> > prevents store-store commoning for them.  It's a bit too pessimistic,
> > since there shouldn't be a problem if a "bad" load dominates all stores
> > in a component.  But (a) we can't AFAIK use pcom_stmt_dominates_stmt_p
> > here and (b) the handling for that case would probably need to be
> > removed again if we handled more exotic cases in future.
>
> Forgot to add:
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK for master
> and eventually for backports?

OK for master and backports.
Richard.

> Thanks,
> Richard
>
> >
> > 2020-01-28  Richard Sandiford  <richard.sandiford@arm.com>
> >
> > gcc/
> >       PR tree-optimization/93434
> >       * tree-predcom.c (split_data_refs_to_components): Record which
> >       components have had aliasing loads removed.  Prevent store-store
> >       commoning for all such components.
> >
> > gcc/testsuite/
> >       PR tree-optimization/93434
> >       * gcc.c-torture/execute/pr93434.c: New test.
> > ---
> >  gcc/testsuite/gcc.c-torture/execute/pr93434.c | 36 +++++++++++++++++++
> >  gcc/tree-predcom.c                            | 24 +++++++++++--
> >  2 files changed, 58 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr93434.c
> >
> > diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c
> > index 047c7fa9583..d2dcfe7f42d 100644
> > --- a/gcc/tree-predcom.c
> > +++ b/gcc/tree-predcom.c
> > @@ -767,6 +767,7 @@ split_data_refs_to_components (class loop *loop,
> >    /* Don't do store elimination if loop has multiple exit edges.  */
> >    bool eliminate_store_p = single_exit (loop) != NULL;
> >    basic_block last_always_executed = last_always_executed_block (loop);
> > +  auto_bitmap no_store_store_comps;
> >
> >    FOR_EACH_VEC_ELT (datarefs, i, dr)
> >      {
> > @@ -838,9 +839,13 @@ split_data_refs_to_components (class loop *loop,
> >        else if (DR_IS_READ (dra) && ib != bad)
> >       {
> >         if (ia == bad)
> > -         continue;
> > +         {
> > +           bitmap_set_bit (no_store_store_comps, ib);
> > +           continue;
> > +         }
> >         else if (!determine_offset (dra, drb, &dummy_off))
> >           {
> > +           bitmap_set_bit (no_store_store_comps, ib);
> >             merge_comps (comp_father, comp_size, bad, ia);
> >             continue;
> >           }
> > @@ -848,9 +853,13 @@ split_data_refs_to_components (class loop *loop,
> >        else if (DR_IS_READ (drb) && ia != bad)
> >       {
> >         if (ib == bad)
> > -         continue;
> > +         {
> > +           bitmap_set_bit (no_store_store_comps, ia);
> > +           continue;
> > +         }
> >         else if (!determine_offset (dra, drb, &dummy_off))
> >           {
> > +           bitmap_set_bit (no_store_store_comps, ia);
> >             merge_comps (comp_father, comp_size, bad, ib);
> >             continue;
> >           }
> > @@ -908,6 +917,17 @@ split_data_refs_to_components (class loop *loop,
> >        comp->refs.quick_push (dataref);
> >      }
> >
> > +  if (eliminate_store_p)
> > +    {
> > +      bitmap_iterator bi;
> > +      EXECUTE_IF_SET_IN_BITMAP (no_store_store_comps, 0, ia, bi)
> > +     {
> > +       ca = component_of (comp_father, ia);
> > +       if (ca != bad)
> > +         comps[ca]->eliminate_store_p = false;
> > +     }
> > +    }
> > +
> >    for (i = 0; i < n; i++)
> >      {
> >        comp = comps[i];
> > diff --git a/gcc/testsuite/gcc.c-torture/execute/pr93434.c b/gcc/testsuite/gcc.c-torture/execute/pr93434.c
> > new file mode 100644
> > index 00000000000..e786252794b
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.c-torture/execute/pr93434.c
> > @@ -0,0 +1,36 @@
> > +typedef struct creal_T {
> > +  double re;
> > +  double im;
> > +} creal_T;
> > +
> > +#define N 16
> > +int main() {
> > +  int k;
> > +  int i;
> > +  int j;
> > +  creal_T t2[N];
> > +  double inval;
> > +
> > +  inval = 1.0;
> > +  for (j = 0; j < N; ++j) {
> > +    t2[j].re = 0;
> > +    t2[j].im = 0;
> > +  }
> > +
> > +  for (j = 0; j < N/4; j++) {
> > +    i = j * 4;
> > +    t2[i].re = inval;
> > +    t2[i].im = inval;
> > +    k = i + 3;
> > +    t2[k].re = inval;
> > +    t2[k].im = inval;
> > +    t2[i] = t2[k];
> > +    t2[k].re = inval;
> > +  }
> > +
> > +  for (i = 0; i < 2; ++i)
> > +    if (t2[i].re != !i || t2[i].im != !i)
> > +      __builtin_abort ();
> > +
> > +  return 0;
> > +}



More information about the Gcc-patches mailing list