[PATCH]middle-end cse: Make sure duplicate elements are not entered into the equivalence set [PR103404]

Tamar Christina Tamar.Christina@arm.com
Mon Nov 29 09:16:27 GMT 2021


> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: Monday, November 29, 2021 9:02 AM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; jlaw@tachyum.com;
> Richard Sandiford <Richard.Sandiford@arm.com>
> Subject: Re: [PATCH]middle-end cse: Make sure duplicate elements are not
> entered into the equivalence set [PR103404]
> 
> On Mon, 29 Nov 2021, Tamar Christina wrote:
> 
> > Hi All,
> >
> > CSE uses equivalence classes to keep track of expressions that all
> > have the same values at the current point in the program.
> >
> > Normal equivalences through SETs only insert and perform lookups in
> > this set but equivalence determined from comparisons, e.g.
> >
> > (insn 46 44 47 7 (set (reg:CCZ 17 flags)
> >         (compare:CCZ (reg:SI 105 [ iD.2893 ])
> >             (const_int 0 [0]))) "cse.c":18:22 7 {*cmpsi_ccno_1}
> >      (expr_list:REG_DEAD (reg:SI 105 [ iD.2893 ])
> >         (nil)))
> >
> > creates the equivalence EQ on (reg:SI 105 [ iD.2893 ]) and (const_int 0 [0]).
> >
> > This causes a merge to happen between the two equivalence sets denoted
> > by (const_int 0 [0]) and (reg:SI 105 [ iD.2893 ]) respectively.
> >
> > The operation happens through merge_equiv_classes however this
> > function has an invariant that the classes to be merge not contain any
> > duplicates.  This is because it frees entries before merging.
> >
> > The given testcase when using the supplied flags trigger an ICE due to
> > the equivalence set being
> >
> > (rr) p dump_class (class1)
> > Equivalence chain for (reg:SI 105 [ iD.2893 ]):
> > (reg:SI 105 [ iD.2893 ])
> > $3 = void
> >
> > (rr) p dump_class (class2)
> > Equivalence chain for (const_int 0 [0]):
> > (const_int 0 [0])
> > (reg:SI 97 [ _10 ])
> > (reg:SI 97 [ _10 ])
> > $4 = void
> >
> > This happens because the original INSN being recorded is
> >
> > (insn 18 17 24 2 (set (subreg:V1SI (reg:SI 97 [ _10 ]) 0)
> >         (const_vector:V1SI [
> >                 (const_int 0 [0])
> >             ])) "cse.c":11:9 1363 {*movv1si_internal}
> >      (expr_list:REG_UNUSED (reg:SI 97 [ _10 ])
> >         (nil)))
> >
> > and we end up generating two equivalences. the first one is simply
> > that reg:SI 97 is 0.  The second one is that 0 can be extracted from
> > the V1SI, so subreg (subreg:V1SI (reg:SI 97) 0) 0 == 0.  This nested
> > subreg gets folded away to just reg:SI 97 and we re-insert the same
> equivalence.
> >
> > This patch changes it so that once we figure out the bucket to insert
> > into we check if the equivalence set already contains the entry and if
> > so just return the existing entry and exit.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
> > and no regressions.
> >
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > 	PR rtl-optimization/103404
> > 	* cse.c (insert_with_costs): Check if item exists already before
> adding
> > 	a new entry in the equivalence class.
> >
> > gcc/testsuite/ChangeLog:
> >
> > 	PR rtl-optimization/103404
> > 	* gcc.target/i386/pr103404.c: New test.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/cse.c b/gcc/cse.c
> > index
> >
> c1c7d0ca27b73c4b944b4719f95fece74e0358d5..08295246c594109e947276051c
> 67
> > 76e4cabca4ec 100644
> > --- a/gcc/cse.c
> > +++ b/gcc/cse.c
> > @@ -1537,6 +1537,17 @@ insert_with_costs (rtx x, struct table_elt *classp,
> unsigned int hash,
> >    if (REG_P (x) && REGNO (x) < FIRST_PSEUDO_REGISTER)
> >      add_to_hard_reg_set (&hard_regs_in_table, GET_MODE (x), REGNO
> > (x));
> >
> > +  /* We cannot allow a duplicate to be entered into the equivalence sets
> > +     and so we should perform a check before we do any allocations or
> > +     change the buckets.  */
> > +  if (classp)
> > +    {
> > +      struct table_elt *p;
> > +      for (p = classp; p; p = p->next_same_value)
> > +	if (exp_equiv_p (p->exp, x, 1, false))
> > +	  return p;
> 
> not really a review, leaving that to who approved the original change, but
> these things always look bad - this linear list walk makes insert_with_costs
> quadratic.  Is there any mitigation (like limiting the number of entries?), is
> that already an existing problem elsewhere in CSE?

Hmm I believe insert_with_costs is already quadratic as it walks the list after allocations
in a similar manner. 

The problem is that the function assumes it can't ever fail, so it starts off by allocating
memory and modifying the table itself.  There are also two ways it can allocate memory,
so by the time it starts walking the classp itself and we notice any duplicates we did quite
a lot of work already and can't easily undo all of it. Ideally the function would identify
the insertion point before it does any changes.  But there are multiple ways to determine
where to insert.

But.. I think I can use the result of the first walk to seed the second walk which becomes O(1).
That way there's no additional work being done.  I'll update the patch.

Regards,
Tamar
> 
> > +    }
> > +
> >    /* Put an element for X into the right hash bucket.  */
> >
> >    elt = free_element_chain;
> > diff --git a/gcc/testsuite/gcc.target/i386/pr103404.c
> > b/gcc/testsuite/gcc.target/i386/pr103404.c
> > new file mode 100644
> > index
> >
> 0000000000000000000000000000000000000000..66f33645301db09503fc0977fd
> 0f
> > 061a19e56ea5
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr103404.c
> > @@ -0,0 +1,32 @@
> > +/* { dg-do compile } */
> > +/* { dg-additional-options "-Og -fcse-follow-jumps -fno-dce
> > +-fno-early-inlining -fgcse -fharden-conditional-branches
> > +-frerun-cse-after-loop -fno-tree-ccp -mavx5124fmaps -std=c99 -w" } */
> > +
> > +typedef unsigned __attribute__((__vector_size__ (4))) U; typedef
> > +unsigned __attribute__((__vector_size__ (16))) V; typedef unsigned
> > +__attribute__((__vector_size__ (64))) W;
> > +
> > +int x, y;
> > +
> > +V v;
> > +W w;
> > +
> > +inline
> > +int bar (U a)
> > +{
> > +  a |= x;
> > +  W k =
> > +    __builtin_shufflevector (v, 5 / a,
> > +			     2, 4, 0, 2, 4, 1, 0, 1,
> > +			     1, 2, 1, 3, 0, 4, 4, 0);
> > +  w = k;
> > +  y = 0;
> > +}
> > +
> > +int
> > +foo ()
> > +{
> > +  bar ((U){0xffffffff});
> > +  for (unsigned i; i < sizeof (foo);)
> > +    ;
> > +}
> > +
> >
> >
> >
> 
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> Nuernberg, Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)


More information about the Gcc-patches mailing list