[PATCH] predcom: Fix invalid store-store commoning [PR93434]
Richard Biener
richard.guenther@gmail.com
Tue Jan 28 10:34:00 GMT 2020
On Tue, Jan 28, 2020 at 10:49 AM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Richard Sandiford <richard.sandiford@arm.com> writes:
> > predcom has the following code to stop one rogue load from
> > interfering with other store-load opportunities:
> >
> > /* If A is read and B write or vice versa and there is unsuitable
> > dependence, instead of merging both components into a component
> > that will certainly not pass suitable_component_p, just put the
> > read into bad component, perhaps at least the write together with
> > all the other data refs in it's component will be optimizable. */
> >
> > But when store-store commoning was added later, this had the effect
> > of ignoring loads that occur between two candidate stores.
> >
> > There is code further up to handle loads and stores with unknown
> > dependences:
> >
> > /* Don't do store elimination if there is any unknown dependence for
> > any store data reference. */
> > if ((DR_IS_WRITE (dra) || DR_IS_WRITE (drb))
> > && (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know
> > || DDR_NUM_DIST_VECTS (ddr) == 0))
> > eliminate_store_p = false;
> >
> > But the store-load code above skips loads for *known* dependences
> > if (a) the load has already been marked "bad" or (b) the data-ref
> > machinery knows the dependence distance, but determine_offsets
> > can't handle the combination.
> >
> > (a) happens to be the problem in the testcase, but a different
> > sequence could have given (b) instead. We have writes to individual
> > fields of a structure and reads from the whole structure. Since
> > determine_offsets requires the types to be the same, it returns false
> > for each such read/write combination.
> >
> > This patch records which components have had loads removed and
> > prevents store-store commoning for them. It's a bit too pessimistic,
> > since there shouldn't be a problem if a "bad" load dominates all stores
> > in a component. But (a) we can't AFAIK use pcom_stmt_dominates_stmt_p
> > here and (b) the handling for that case would probably need to be
> > removed again if we handled more exotic cases in future.
>
> Forgot to add:
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK for master
> and eventually for backports?
OK for master and backports.
Richard.
> Thanks,
> Richard
>
> >
> > 2020-01-28 Richard Sandiford <richard.sandiford@arm.com>
> >
> > gcc/
> > PR tree-optimization/93434
> > * tree-predcom.c (split_data_refs_to_components): Record which
> > components have had aliasing loads removed. Prevent store-store
> > commoning for all such components.
> >
> > gcc/testsuite/
> > PR tree-optimization/93434
> > * gcc.c-torture/execute/pr93434.c: New test.
> > ---
> > gcc/testsuite/gcc.c-torture/execute/pr93434.c | 36 +++++++++++++++++++
> > gcc/tree-predcom.c | 24 +++++++++++--
> > 2 files changed, 58 insertions(+), 2 deletions(-)
> > create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr93434.c
> >
> > diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c
> > index 047c7fa9583..d2dcfe7f42d 100644
> > --- a/gcc/tree-predcom.c
> > +++ b/gcc/tree-predcom.c
> > @@ -767,6 +767,7 @@ split_data_refs_to_components (class loop *loop,
> > /* Don't do store elimination if loop has multiple exit edges. */
> > bool eliminate_store_p = single_exit (loop) != NULL;
> > basic_block last_always_executed = last_always_executed_block (loop);
> > + auto_bitmap no_store_store_comps;
> >
> > FOR_EACH_VEC_ELT (datarefs, i, dr)
> > {
> > @@ -838,9 +839,13 @@ split_data_refs_to_components (class loop *loop,
> > else if (DR_IS_READ (dra) && ib != bad)
> > {
> > if (ia == bad)
> > - continue;
> > + {
> > + bitmap_set_bit (no_store_store_comps, ib);
> > + continue;
> > + }
> > else if (!determine_offset (dra, drb, &dummy_off))
> > {
> > + bitmap_set_bit (no_store_store_comps, ib);
> > merge_comps (comp_father, comp_size, bad, ia);
> > continue;
> > }
> > @@ -848,9 +853,13 @@ split_data_refs_to_components (class loop *loop,
> > else if (DR_IS_READ (drb) && ia != bad)
> > {
> > if (ib == bad)
> > - continue;
> > + {
> > + bitmap_set_bit (no_store_store_comps, ia);
> > + continue;
> > + }
> > else if (!determine_offset (dra, drb, &dummy_off))
> > {
> > + bitmap_set_bit (no_store_store_comps, ia);
> > merge_comps (comp_father, comp_size, bad, ib);
> > continue;
> > }
> > @@ -908,6 +917,17 @@ split_data_refs_to_components (class loop *loop,
> > comp->refs.quick_push (dataref);
> > }
> >
> > + if (eliminate_store_p)
> > + {
> > + bitmap_iterator bi;
> > + EXECUTE_IF_SET_IN_BITMAP (no_store_store_comps, 0, ia, bi)
> > + {
> > + ca = component_of (comp_father, ia);
> > + if (ca != bad)
> > + comps[ca]->eliminate_store_p = false;
> > + }
> > + }
> > +
> > for (i = 0; i < n; i++)
> > {
> > comp = comps[i];
> > diff --git a/gcc/testsuite/gcc.c-torture/execute/pr93434.c b/gcc/testsuite/gcc.c-torture/execute/pr93434.c
> > new file mode 100644
> > index 00000000000..e786252794b
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.c-torture/execute/pr93434.c
> > @@ -0,0 +1,36 @@
> > +typedef struct creal_T {
> > + double re;
> > + double im;
> > +} creal_T;
> > +
> > +#define N 16
> > +int main() {
> > + int k;
> > + int i;
> > + int j;
> > + creal_T t2[N];
> > + double inval;
> > +
> > + inval = 1.0;
> > + for (j = 0; j < N; ++j) {
> > + t2[j].re = 0;
> > + t2[j].im = 0;
> > + }
> > +
> > + for (j = 0; j < N/4; j++) {
> > + i = j * 4;
> > + t2[i].re = inval;
> > + t2[i].im = inval;
> > + k = i + 3;
> > + t2[k].re = inval;
> > + t2[k].im = inval;
> > + t2[i] = t2[k];
> > + t2[k].re = inval;
> > + }
> > +
> > + for (i = 0; i < 2; ++i)
> > + if (t2[i].re != !i || t2[i].im != !i)
> > + __builtin_abort ();
> > +
> > + return 0;
> > +}
More information about the Gcc-patches
mailing list