Bug 82899 - *this in constructors could not alias with reference input parameters of the same type
Summary: *this in constructors could not alias with reference input parameters of the ...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: c++ (show other bugs)
Version: 8.0
: P3 normal
Target Milestone: 9.0
Assignee: Not yet assigned to anyone
URL:
Keywords: alias, missed-optimization
Depends on:
Blocks:
 
Reported: 2017-11-08 11:00 UTC by Antony Polukhin
Modified: 2018-05-18 22:25 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2017-11-08 00:00:00


Attachments
Untested patch (1.14 KB, patch)
2018-05-10 19:43 UTC, Marc Glisse
Details | Diff
Successful patch (800 bytes, patch)
2018-05-11 21:20 UTC, Marc Glisse
Details | Diff
untested middle-end patch (1.32 KB, patch)
2018-05-12 12:16 UTC, Marc Glisse
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Antony Polukhin 2017-11-08 11:00:45 UTC
Following code 
struct test {
    char data[2] = {1, 2};

    test(const test& v);
    ~test(){} // non trivial destructor!
};
test::test(const test& v)
    : data{ v.data[0], v.data[0] }
{}


Produces the following assembly:

test::test(test const&):
        movzx   eax, BYTE PTR [rsi]
        mov     BYTE PTR [rdi], al
        movzx   eax, BYTE PTR [rsi]   <=== This is not required
        mov     BYTE PTR [rdi+1], al
        ret


`*this` could not alias with `v` because `v` is already constructed while *this is still being constructed. Any attempt to make `*this` alias `v` is a UB:

#1
test t; // let's assume that `test` is default constructible
new (&t) test(t); // UB: destructor was not called

#2
std::aligned_storage_t<test> t;
new (&t) test(reinterpret_cast<test&>(t)); // UB: test was not constructed on t

#3 (doubtful)
test t; // let's assume that `test` has trivial destructor
// t.~test(); // users may skip this call
new (&t) test(t); // extremely wired


Please, either assume that *this never alias input parameter of the same type OR
assume that *this never alias input parameter of the same type if the destructor is not trivial.

This aliasing issue affects performance of all the copy and move constructors.
Comment 1 Richard Biener 2017-11-08 12:43:37 UTC
Would it be valid to make 'this' __restrict?  Is it valid to do

struct X { int i; };
struct Y : X { Y(Y *); int k; };

Y::Y(Y *x)
{
  k = x->i;
}

int main()
{
  Y y(&y);
}

?  That is, access members of the base class which should be initialized already?
This would mean using __restrict isn't valid.
Comment 2 Antony Polukhin 2017-11-08 13:25:33 UTC
Looks like [class.ctor] paragraph 14 covers this case:

"During the construction of an object, if the value of the object or any of its subobjects is accessed through
a glvalue that is not obtained, directly or indirectly, from the constructor’s this pointer, the value of the
object or subobject thus obtained is unspecified."
Comment 3 Antony Polukhin 2017-11-08 13:28:21 UTC
BTW, Clang warns on code like Y y(y)

warning: variable 'y' is uninitialized when used within its own initialization [-Wuninitialized]
  Y y(y);
    ~ ^

GCC may add same warning
Comment 4 Richard Biener 2017-11-08 13:31:13 UTC
(In reply to Antony Polukhin from comment #2)
> Looks like [class.ctor] paragraph 14 covers this case:
> 
> "During the construction of an object, if the value of the object or any of
> its subobjects is accessed through
> a glvalue that is not obtained, directly or indirectly, from the
> constructor’s this pointer, the value of the
> object or subobject thus obtained is unspecified."

Yeah, sounds like covering this case.  Thus we can make 'this' restrict in
constructors (and possibly assignment operators if self-assignment is forbidden).

It would be "restrict" in the GCC middle-end sense, not the C specification sense though, barring some extra clarification on "not obtained, directly or indirectly, from the constructor's this pointer".  We conservatively propagate
what can point to a restrict parameters object.

Thus, C++ FE issue.
Comment 5 Richard Biener 2017-11-08 13:31:36 UTC
(In reply to Antony Polukhin from comment #3)
> BTW, Clang warns on code like Y y(y)
> 
> warning: variable 'y' is uninitialized when used within its own
> initialization [-Wuninitialized]
>   Y y(y);
>     ~ ^
> 
> GCC may add same warning

Can you open a bugreport?
Comment 6 Antony Polukhin 2017-11-08 13:44:15 UTC
Done: Bug 82900
Comment 7 Martin Sebor 2017-11-08 16:20:45 UTC
See also bug 81009 for a related optimization opportunity (this one having to do with const objects/members).

A ctor declaration cannot be qualified but G++ does seem top allow the __restrict qualifier on a ctor definition outside the class with the expected effect.

Explicitly annotating the argument of the copy ctor with __restrict keyword is valid and has the expected effect:

$ cat t.C && gcc -O2 -S -fdump-tree-optimized=/dev/stdout t.C
struct test {
    char data[2] = {1, 2};

    test(const test& v);
    ~test(){} // non trivial destructor!
};

test::test(const test& __restrict v) /* __restrict here works too */
    : data{ v.data[0], v.data[0] }
{}

;; Function test::test (_ZN4testC2ERKS_, funcdef_no=4, decl_uid=2331, cgraph_uid=4, symbol_order=4)

test::test (struct test * const restrict this, const struct test & v)
{
  char _1;

  <bb 2> [local count: 10000]:
  _1 = *v_5(D).data[0];
  *this_3(D).data[0] = _1;
  *this_3(D).data[1] = _1;
  return;

}
Comment 8 Antony Polukhin 2018-05-03 17:47:34 UTC
(In reply to Richard Biener from comment #4)
> (In reply to Antony Polukhin from comment #2)
> > Looks like [class.ctor] paragraph 14 covers this case:
> > 
> > "During the construction of an object, if the value of the object or any of
> > its subobjects is accessed through
> > a glvalue that is not obtained, directly or indirectly, from the
> > constructor’s this pointer, the value of the
> > object or subobject thus obtained is unspecified."
> 
> Yeah, sounds like covering this case.  Thus we can make 'this' restrict in
> constructors (and possibly assignment operators if self-assignment is
> forbidden).

Self assignment is tricky and is OK to alias in most cases. It could be restricted at some point after the `this != &rhs` check (as proposed in Bug 82918). 

I'd rather start by "restricting this" for copy and move constructors, leaving assignment as is.
Comment 9 Antony Polukhin 2018-05-10 12:31:42 UTC
There's an identical issue for clang: https://bugs.llvm.org/show_bug.cgi?id=37329

During review of that issue Richard Smith noted that the solution could be made more generic by adding `__restrict` for `this` for any constructor (not just copy and move constructors).

Does the violation of noalias in GCC could be treated as unspecified behavior or is it undefined?
Comment 10 Marc Glisse 2018-05-10 15:11:20 UTC
This seems fixed in 8.1 (at least we don't generate the extra mov anymore), can you check?
Comment 11 Antony Polukhin 2018-05-10 17:03:18 UTC
Seems perfect https://godbolt.org/g/GX3GQd
The mov is not generated for any constructor and the following code:

extern struct A a;
struct A {
  int m, n;
  A(const A &v);
};

A::A(const A &v) : m(v.m), n((a.m = 1, v.m)) {}

Is not optimized to "A::A(int, const A &v) : m(v.m), n(v.m) { a.m = 1; }" (which is a mistake).


Are there some tests to make sure that the `mov` won't appear again?
Comment 12 Antony Polukhin 2018-05-10 17:39:43 UTC
(In reply to Marc Glisse from comment #10)
> This seems fixed in 8.1 (at least we don't generate the extra mov anymore),
> can you check?

Actually it still does not work for subobjects. For example https://godbolt.org/g/zPha3U

Code 

struct array {
    int d[2];
};

struct test {
    array data1;
    array data2;

    test(const array& t);
};

test::test(const array& t)
    : data1{t}
    , data2{t}
{}

produces assembly

test::test(array const&):
  mov rax, QWORD PTR [rsi]
  mov QWORD PTR [rdi], rax
  mov rax, QWORD PTR [rsi]   <== Not required. Could not alias
  mov QWORD PTR [rdi+8], rax
  ret

[class.ctor] paragraph 14 also covers this case:

"During the construction of an object, if the value of the object *or any of its subobjects* is accessed through a glvalue that is not obtained, directly or indirectly, from the constructor’s this pointer, the value of the object or subobject thus obtained is unspecified."

Looks like not only `this` should be marked with __restrict, but also all the subobjects of the type.
Comment 13 Marc Glisse 2018-05-10 18:16:15 UTC
Explicitly marking the constructor with __restrict lets it optimize also the testcase in comment #12. I have no idea what was changed in gcc-8 that helped the original testcase, but it wasn't equivalent to marking constructors with __restrict :-(
Comment 14 Marc Glisse 2018-05-10 18:47:10 UTC
(In reply to Marc Glisse from comment #13)
> I have no idea what was changed in gcc-8 that
> helped the original testcase,

(optimization happens in FRE1)
It could be an optimization that says that either the objects don't alias and we are writing _1, or they are the same object and we are writing _1, so just write _1 without caring about aliasing. This was definitely discussed, but I didn't remember that it had been committed (bug 80617 is still open).

Anyway, if you want to experiment, the function build_this_parm (in gcc/cp/decl.c) seems like a good starting point.
Comment 15 Marc Glisse 2018-05-10 19:43:35 UTC
Created attachment 44112 [details]
Untested patch

Something like this, but I haven't tested, and other calls to build_this_parm need auditing to check if "being a constructor" is set before.
Comment 16 Marc Glisse 2018-05-11 19:05:46 UTC
(patch should use 'fn && DECL_CONSTRUCTOR_P (fn)' since fn can be NULL)

As I was half expecting, messing with the types that directly doesn't work. It means 'this' has type T*restrict, and if I try for instance to bind it with a T*const& reference, gcc is unwilling to discard the restrict qualifier.

I wonder if modifying the declarations later (say during gimplification) to mark them with restrict would work.

I'd rather avoid teaching the middle-end a second way that parameters can be __restrict (being the first argument of a C++ constructor), better if this can all be done in the front-end.
Comment 17 Marc Glisse 2018-05-11 21:20:41 UTC
Created attachment 44120 [details]
Successful patch

Changing the type during gimplification seems to work, it passes the testsuite.
Comment 18 rguenther@suse.de 2018-05-12 11:26:10 UTC
On May 11, 2018 11:20:41 PM GMT+02:00, "glisse at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org> wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82899
>
>Marc Glisse <glisse at gcc dot gnu.org> changed:
>
>           What    |Removed                     |Added
>----------------------------------------------------------------------------
>  Attachment #44112 [details]|0                           |1
>        is obsolete|                            |
>
>--- Comment #17 from Marc Glisse <glisse at gcc dot gnu.org> ---
>Created attachment 44120 [details]
>  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44120&action=edit
>Successful patch
>
>Changing the type during gimplification seems to work, it passes the
>testsuite.

I suppose this changes debug information? I think adjusting the only user in tree-ssa-structalias.c is better.
Comment 19 Marc Glisse 2018-05-12 12:13:44 UTC
(In reply to rguenther@suse.de from comment #18)
> I suppose this changes debug information?

Yes. Probably not so bad, but indeed better if we can avoid it.

> I think adjusting the only user in tree-ssa-structalias.c is better.

I was trying to avoid adding more language-specific stuff to the middle-end, especially since I wasn't sure if constructors were also used by other languages, but I see the bit is now called cxx_constructor and tested with DECL_CXX_CONSTRUCTOR_P so it should be safe.
Comment 20 Marc Glisse 2018-05-12 12:16:24 UTC
Created attachment 44122 [details]
untested middle-end patch

This works on the testcase, I need to add a comment and run it through the testsuite.
Comment 21 rguenther@suse.de 2018-05-14 07:31:11 UTC
On Sat, 12 May 2018, glisse at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82899
> 
> --- Comment #20 from Marc Glisse <glisse at gcc dot gnu.org> ---
> Created attachment 44122 [details]
>   --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44122&action=edit
> untested middle-end patch
> 
> This works on the testcase, I need to add a comment and run it through the
> testsuite.

Looks good.  Note that in the strict C semantic thing __restrict on
this isn't valid as the following is valid C++:

Foo() __restrict
{
  Foo *x = this;
  x->a = 1;
  this->a = 2;
...
}

but for C restrict you'd have two pointers with same provenance here.

The middle-end handles this situation conservatively and thus the
middle-end approach of effectively handling it as restrict is fine.
Comment 22 Marc Glisse 2018-05-14 10:17:07 UTC
(In reply to rguenther@suse.de from comment #21)
> Note that in the strict C semantic thing __restrict on
> this isn't valid as the following is valid C++:
> 
> Foo() __restrict
> {
>   Foo *x = this;
>   x->a = 1;
>   this->a = 2;
> ...
> }
> 
> but for C restrict you'd have two pointers with same provenance here.


How is that different from the C

void Foo_const(Foo*const restrict t)
{
  Foo *x = t;
  x->a = 1;
  t->a = 2;
}

, which seems valid to me?
Comment 23 rguenther@suse.de 2018-05-14 10:31:25 UTC
On Mon, 14 May 2018, glisse at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82899
> 
> --- Comment #22 from Marc Glisse <glisse at gcc dot gnu.org> ---
> (In reply to rguenther@suse.de from comment #21)
> > Note that in the strict C semantic thing __restrict on
> > this isn't valid as the following is valid C++:
> > 
> > Foo() __restrict
> > {
> >   Foo *x = this;
> >   x->a = 1;
> >   this->a = 2;
> > ...
> > }
> > 
> > but for C restrict you'd have two pointers with same provenance here.
> 
> 
> How is that different from the C
> 
> void Foo_const(Foo*const restrict t)
> {
>   Foo *x = t;
>   x->a = 1;
>   t->a = 2;
> }
> 
> , which seems valid to me?

it accesses the object pointed-to by the restrict qualified pointer
via an lvalue that is not based on it in my reading of C11 6.7.3.1/3
(though "some" sequence point sounds like that being x = t makes it
valid based on).  Otherwise for restrict to be usable by a compiler
it would need to perform conservative pointer propagation.  Like

void Foo(Foo **x, Foo * restrict t)
{
  *x = t;
  t->a = 1;
  (*x)->a = 2;
}

or with Foo *baz(Foo *);

 Foo *x = baz(t);
 t->a = 1;
 x->a = 2;

if x and t point to the same object my reading of the standard
says thats undefined (GCC of course treats it conservatively instead).
Comment 24 Marc Glisse 2018-05-18 22:21:52 UTC
Author: glisse
Date: Fri May 18 22:21:20 2018
New Revision: 260383

URL: https://gcc.gnu.org/viewcvs?rev=260383&root=gcc&view=rev
Log:
Aliasing 'this' in a C++ constructor

2018-05-18  Marc Glisse  <marc.glisse@inria.fr>

	PR c++/82899
gcc/
	* tree-ssa-structalias.c (create_variable_info_for_1): Extra argument.
	(intra_create_variable_infos): Handle C++ constructors.

gcc/testsuite/
	* g++.dg/pr82899.C: New testcase.


Added:
    trunk/gcc/testsuite/g++.dg/pr82899.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-structalias.c
Comment 25 Marc Glisse 2018-05-18 22:25:15 UTC
.