[PATCH]middle-end slp: fix accidental resource re-use of slp_tree (PR99220)
Tamar Christina
Tamar.Christina@arm.com
Wed Feb 24 14:08:11 GMT 2021
Hi Richi,
This is an updated patch with your suggestion.
Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
PR tree-optimization/99220
* tree-vect-slp.c (optimize_load_redistribution_1): Remove
node from cache when it's about to be deleted.
gcc/testsuite/ChangeLog:
PR tree-optimization/99220
* g++.dg/vect/pr99220.cc: New test.
The 02/24/2021 08:52, Richard Biener wrote:
> On Tue, 23 Feb 2021, Tamar Christina wrote:
>
> > Hi Richi,
> >
> > The attached testcase shows a bug where two nodes end up with the same pointer.
> > During the loop that analyzes all the instances
> > in optimize_load_redistribution_1 we do
> >
> > if (value)
> > {
> > SLP_TREE_REF_COUNT (value)++;
> > SLP_TREE_CHILDREN (root)[i] = value;
> > vect_free_slp_tree (node);
> > }
> >
> > when doing a replacement. When this is done and the refcount for the node
> > reaches 0, the node is removed, which allows the libc to return the pointer
> > again in the next call to new, which it does..
> >
> > First instance
> >
> > note: node 0x5325f48 (max_nunits=1, refcnt=2)
> > note: op: VEC_PERM_EXPR
> > note: { }
> > note: lane permutation { 0[0] 1[1] 0[2] 1[3] }
> > note: children 0x5325db0 0x5325200
> >
> > Second instance
> >
> > note: node 0x5325f48 (max_nunits=1, refcnt=1)
> > note: op: VEC_PERM_EXPR
> > note: { }
> > note: lane permutation { 0[0] 1[1] }
> > note: children 0x53255b8 0x5325530
> >
> > This will end up with the illegal construction of
> >
> > note: node 0x53258e8 (max_nunits=2, refcnt=2)
> > note: op template: slp_patt_57 = .COMPLEX_MUL (_16, _16);
> > note: stmt 0 _16 = _14 - _15;
> > note: stmt 1 _23 = _17 + _22;
> > note: children 0x53257d8 0x5325d28
> > note: node 0x53257d8 (max_nunits=2, refcnt=3)
> > note: op template: l$b_4 = MEM[(const struct a &)_3].b;
> > note: stmt 0 l$b_4 = MEM[(const struct a &)_3].b;
> > note: stmt 1 l$c_5 = MEM[(const struct a &)_3].c;
> > note: load permutation { 0 1 }
> > note: node 0x5325d28 (max_nunits=2, refcnt=8)
> > note: op template: l$b_4 = MEM[(const struct a &)_3].b;
> > note: stmt 0 l$b_4 = MEM[(const struct a &)_3].b;
> > note: stmt 1 l$c_5 = MEM[(const struct a &)_3].c;
> > note: stmt 2 l$b_4 = MEM[(const struct a &)_3].b;
> > note: stmt 3 l$c_5 = MEM[(const struct a &)_3].c;
> > note: load permutation { 0 1 0 1 }
> >
> > To prevent this my initial thought was to add the temporary VEC_PERM_EXPR nodes
> > to the bst_map cache and increase their refcnt one more. However since bst_map
> > is gated on scalar statements and these nodes have none we can't do that.
> >
> > Instead I realized that load_map is really only a visited list at the top level.
> > So instead of returning the reference, we should return NULL.
> >
> > What this means is that it will no replacement was found at that level. This is
> > fine since these VEC_PERM_EXPR are single use. So while the any other node is
> > an indication to use the cache, VEC_PERM_EXPR are an indication to avoid it.
>
> I don't understand really. Waiting for the other patch to be pushed so
> I can eventually have a look, but see below.
>
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > PR tree-optimization/99220
> > * tree-vect-slp.c (optimize_load_redistribution_1): Don't use
> > VEC_PERM_EXPR in cache.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR tree-optimization/99220
> > * g++.dg/vect/pr99220.cc: New test.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/testsuite/g++.dg/vect/pr99220.cc b/gcc/testsuite/g++.dg/vect/pr99220.cc
> > new file mode 100755
> > index 0000000000000000000000000000000000000000..ff3058832b742414202a8ada0a9dafc72c9a54aa
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/vect/pr99220.cc
> > @@ -0,0 +1,29 @@
> > +/* { dg-do compile } */
> > +/* { dg-additional-options "-w -O3 -march=armv8.3-a" { target { aarch64*-*-* } } } */
> > +
> > +class a {
> > + float b;
> > + float c;
> > +
> > +public:
> > + a(float d, float e) : b(d), c(e) {}
> > + a operator+(a d) { return a(b + d.b, c + d.c); }
> > + a operator-(a d) { return a(b - d.b, c - d.c); }
> > + a operator*(a d) { return a(b * b - c * c, b * c + c * d.b); }
> > +};
> > +long f;
> > +a *g;
> > +class {
> > + a *h;
> > + long i;
> > + a *j;
> > +
> > +public:
> > + void k() {
> > + a l = h[0], m = g[i], n = l * g[1], o = l * j[8];
> > + g[i] = m + n;
> > + g[i + 1] = m - n;
> > + j[f] = o;
> > + }
> > +} p;
> > +main() { p.k(); }
> > diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
> > index 605873714a5cafaaf822f61f1f769f96b3876694..e631463be8fc5b2de355e674a9c96665beb9516c 100644
> > --- a/gcc/tree-vect-slp.c
> > +++ b/gcc/tree-vect-slp.c
> > @@ -2292,7 +2292,12 @@ optimize_load_redistribution_1 (scalar_stmts_to_slp_tree_map_t *bst_map,
> > slp_tree root)
> > {
> > if (slp_tree *leader = load_map->get (root))
> > - return *leader;
> > + {
> > + if (SLP_TREE_CODE (root) == VEC_PERM_EXPR)
> > + return NULL;
>
> But this will then only optimize the first occurance. Wouldn't it be
> better to increase the refcount at
>
> load_map->put (root, node);
>
> and walk load_map at the end, releasing refs owned by it like we do
> for bst_map?
>
> > + else
> > + return *leader;
> > + }
> >
> > load_map->put (root, NULL);
> >
> >
> >
> >
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
--
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rb14228.patch
Type: text/x-diff
Size: 1908 bytes
Desc: not available
URL: <https://gcc.gnu.org/pipermail/gcc-patches/attachments/20210224/cc0fab48/attachment-0001.bin>
More information about the Gcc-patches
mailing list