29286 – [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

Bug 29286 - [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

Summary: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type a...

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	libstdc++ (show other bugs)
Version:	4.2.0

Importance:	P1 normal
Target Milestone:	4.3.0
Assignee:	Not yet assigned to anyone

URL:
Keywords:	alias, wrong-code

Depends on:
Blocks:	33407
	Show dependency tree / graph

Reported:	2006-09-29 14:13 UTC by Richard Biener
Modified:	2023-12-30 03:16 UTC (History)
CC List:	12 users (show)

See Also:
Host:
Target:
Build:
Known to work:	3.4.0 4.3.0
Known to fail:	4.0.0 4.1.0 4.1.1 4.1.2 4.2.0
Last reconfirmed:	2006-09-29 15:49:15

Attachments
Patch (5.49 KB, patch) 2007-05-02 16:57 UTC, Ian Lance Taylor	Details \| Diff
Patch (550 bytes, patch) 2007-05-03 20:33 UTC, Ian Lance Taylor	Details \| Diff
Patch (1.88 KB, patch) 2007-05-04 07:37 UTC, Ian Lance Taylor	Details \| Diff
Patch (7.91 KB, patch) 2007-05-12 00:22 UTC, Ian Lance Taylor	Details \| Diff
Patch (7.71 KB, patch) 2007-05-14 04:45 UTC, Ian Lance Taylor	Details \| Diff
Patch (7.78 KB, patch) 2007-05-14 18:14 UTC, Ian Lance Taylor	Details \| Diff
patch to perserve store ordering in loop load/store motion (1.00 KB, patch) 2007-05-23 20:46 UTC, Richard Biener	Details \| Diff
Patch (7.22 KB, patch) 2007-05-25 23:21 UTC, Ian Lance Taylor	Details \| Diff
Patch (8.27 KB, patch) 2007-05-30 23:18 UTC, Ian Lance Taylor	Details \| Diff
Patch (9.09 KB, patch) 2007-06-08 07:49 UTC, Ian Lance Taylor	Details \| Diff
Show Obsolete (6) View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Richard Biener 2006-09-29 14:13:58 UTC

Placement new is specified to end life-time of the object starting life-time of a new one.  The current implementation messes up with aliasing though, so stores are re-ordered:

#include <new>
struct Foo { long i; };
struct Bar { void *p; };
long foo(int n)
{
  Foo *f = new Foo;
  f->i = 1;
  for (int i=0; i<n; ++i)
    {
      Bar *b = new (f) Bar;
      b->p = 0;
      f = new (f) Foo;
      f->i = i;
    }
  return f->i;
}

Produces .099t.optimized:

long int foo(int) (n)
{
  struct Foo * f;
  void * D.2530;

<bb 2>:
  D.2530 = operator new (8);
  f = (struct Foo *) D.2530;
  f->i = 1;
  if (n > 0) goto <L7>; else goto <L2>;

<L7>:;
  f->i = (long int) ((unsigned int) n - 1);
  ((struct Bar *) f)->p = 0B;

<L2>:;
  return f->i;

}

where the stores to Bar::p and Foo:i are reordered.  Until .147r.bbro
everything looks like above on RTL level, too:

...
;; Start of basic block 3, registers live: 1 [dx] 3 [bx] 7 [sp]
(note:HI 21 19 22 3 [bb 3] NOTE_INSN_BASIC_BLOCK)

(note:HI 22 21 48 3 NOTE_INSN_DELETED)

(insn 48 22 24 3 (set (reg:DI 0 ax [63])
        (zero_extend:DI (subreg:SI (plus:DI (reg:DI 3 bx [orig:61 n ] [61])
                    (const_int -1 [0xffffffffffffffff])) 0))) 195 {*lea_1_zext} (nil)
    (expr_list:REG_DEAD (reg:DI 3 bx [orig:61 n ] [61])
        (nil)))

(insn:HI 24 48 25 3 (set (mem/s:DI (reg/v/f:DI 1 dx [orig:58 f ] [58]) [3 <variable>.i+0 S8 A64])
        (reg:DI 0 ax [63])) 81 {*movdi_1_rex64} (insn_list:REG_DEP_TRUE 23 (nil))
    (expr_list:REG_DEAD (reg:DI 0 ax [63])
        (nil)))

(insn:HI 25 24 26 3 (set (mem/s/f:DI (reg/v/f:DI 1 dx [orig:58 f ] [58]) [5 <variable>.p+0 S8 A64])
        (const_int 0 [0x0])) 81 {*movdi_1_rex64} (nil)
    (nil))
;; End of basic block 3, registers live:
 1 [dx] 7 [sp]
...

but now (unfortunately) scheduling reorders them back (on x86_64), so the
assembly looks good again:

_Z3fooi:
.LFB15:
        pushq   %rbx
.LCFI0:
        movl    %edi, %ebx
        movl    $8, %edi
        call    _Znwm
        testl   %ebx, %ebx
        movq    %rax, %rdx
        movq    $1, (%rax)
        jle     .L2
        leal    -1(%rbx), %eax
        movq    $0, (%rdx)
        movq    %rax, (%rdx)
.L2:
        popq    %rbx
        movq    (%rdx), %rax
        ret

But with -O2 -fno-schedule-insns2 we finally get wrong-code for this:

_Z3fooi:
.LFB15:
        pushq   %rbx
.LCFI0:
        movl    %edi, %ebx
        movl    $8, %edi
        call    _Znwm
        movq    %rax, %rdx
        movq    $1, (%rax)
        testl   %ebx, %ebx
        jle     .L2
        leal    -1(%rbx), %eax
        movq    %rax, (%rdx)
        movq    $0, (%rdx)
.L2:
        movq    (%rdx), %rax
        popq    %rbx
        ret

Comment 1 Richard Biener 2006-09-29 14:15:26 UTC

The root of the problem is "wrong" alias information:

<bb 2>:
  #   SMT.6_27 = V_MAY_DEF <SMT.6_25>;
  #   SMT.7_28 = V_MAY_DEF <SMT.7_26>;
  D.2530_3 = operator new (8);
  f_4 = (struct Foo *) D.2530_3;
  #   SMT.6_29 = V_MAY_DEF <SMT.6_27>;
  f_4->i = 1;
  i_5 = 0;
  goto <bb 4> (<L1>);

<L0>:;
  D.2542_7 = 8;
  __p_8 = f_1;
  D.2544_9 = __p_8;
  D.2544_10 = D.2544_9;
  D.2535_11 = D.2544_10;
  b_12 = (struct Bar *) D.2535_11;
  #   SMT.7_30 = V_MAY_DEF <SMT.7_24>;
  b_12->p = 0B;
  D.2545_13 = 8;
  __p_14 = f_1;
  D.2547_15 = __p_14;
  D.2547_16 = D.2547_15;
  D.2536_17 = D.2547_16;
  f_18 = (struct Foo *) D.2536_17;
  D.2537_19 = (long int) i_2;
  #   SMT.6_31 = V_MAY_DEF <SMT.6_23>;
  f_18->i = D.2537_19;
  i_20 = i_2 + 1;

  # SMT.7_24 = PHI <SMT.7_28(2), SMT.7_30(3)>;
  # SMT.6_23 = PHI <SMT.6_29(2), SMT.6_31(3)>;
  # i_2 = PHI <i_5(2), i_20(3)>;
  # f_1 = PHI <f_4(2), f_18(3)>;
<L1>:;
  if (i_2 < n_6) goto <L0>; else goto <L2>;

<L2>:;
  #   VUSE <SMT.6_23>;
  D.2538_21 = f_1->i;
  return D.2538_21;

the stores to b_12->p and f_18->i appear to be independent.  But there is
an implicit barrier between them from the std::new.

Comment 2 Richard Biener 2006-09-29 14:17:39 UTC

4.1 get's it correct by luck.

Comment 3 Paolo Carlini 2006-09-29 14:29:56 UTC

Hi Richard. For sure, I'm missing many details here, having to do with alias analysis, but I'm puzzled anyway ;) I mean, the current libsupc++ "implementation" of placement new is:

// Default placement versions of operator new.
inline void* operator new(std::size_t, void* __p) throw() { return __p; }
inline void* operator new[](std::size_t, void* __p) throw() { return __p; }

Can you explain a bit how a fix would look like?

Comment 4 Richard Biener 2006-09-29 14:38:21 UTC

One way to paper over the problem is to move std::new out-of-line :(  Otherwise I cannot see how we can fix this in libsupc++ without gcc help.  Basically we somehow need to insert (at least) a memory barrier here, like with

inline void* operator new(std::size_t, void* __p) throw() { 
  __asm__ volatile ("" : : : "memory");
  return __p; 
}

but that might pessimize code too much.

Comment 5 Paolo Carlini 2006-09-29 14:52:49 UTC

(In reply to comment #4)
> One way to paper over the problem is to move std::new out-of-line :(  Otherwise
> I cannot see how we can fix this in libsupc++ without gcc help.  Basically we
> somehow need to insert (at least) a memory barrier here, like with
> 
> inline void* operator new(std::size_t, void* __p) throw() { 
>   __asm__ volatile ("" : : : "memory");
>   return __p; 
> }
> 
> but that might pessimize code too much.

In case, memory barriers are available, and already used in guard.cc, but indeed, naively, something seems wrong with such a pure-library approach: a memory barrier needed for single thread code?!?!

Comment 6 Andrew Pinski 2006-09-29 15:49:15 UTC

Actually I think it should be:
inline void* operator new(std::size_t, void* __p) throw() { 
void *p1;
  __asm__  ("" : "=r"(p1): "0"(__p): "memory");
  return __p; 
}

Which is less pessimize.

Note the reason why you need this memory barrier is because the C++ standard says inplacement news change the dynamic type.

Comment 7 Andrew Pinski 2006-09-29 15:51:21 UTC

Note here is a slightly better testcase as now we know that Foo and Bar are the same size:
#include <new>
struct Foo { void *i; };
struct Bar { void *p; };
long foo(int n, int *t)
{
  Foo *f = new Foo;
  f->i = t;
  for (int i=0; i<n; ++i)
    {
      Bar *b = new (f) Bar;
      b->p = 0;
      f = new (f) Foo;
      f->i = t;
    }
  return f->i;
}

Comment 8 Paolo Carlini 2006-09-29 16:08:25 UTC

Let's suppose for a moment we actually try to fix this issue in the library: is adding a barrier conforming to the letter of 18.4.1.3/2-3, 5-6, that is:

  Returns: ptr.
  Notes: Intentionally performs no other action.

Comment 9 Andrew Pinski 2006-09-29 16:19:19 UTC

(In reply to comment #8)
> Let's suppose for a moment we actually try to fix this issue in the library: is
> adding a barrier conforming to the letter of 18.4.1.3/2-3, 5-6, that is:
> 
>   Returns: ptr.
>   Notes: Intentionally performs no other action.

There has to be some communutation between the library and the compiler to tell it that the memory location is being reused as defined by 3.8 of the standard.

Comment 10 Paolo Carlini 2006-09-29 16:23:47 UTC

(In reply to comment #9)
> There has to be some communutation between the library and the compiler to tell
> it that the memory location is being reused as defined by 3.8 of the standard. 

Yes. But considering that the placement forms cannot be replaced by an user program, isn't conceptually cleaner for the C++ front-end to simply recognize those forms and do the right thing?

Comment 11 Richard Biener 2006-09-29 16:24:53 UTC

I don't really know.  But for this issue I would like to introduce middle-end no-op builtins __builtin_clobber (void*) clobber the pointer argument alias set
like we do for malloc/free (and other calls):

int foo(void)
{
  int i, *p, *q;
  p = malloc(4);
  *p = 0;
  free(p);
  q = malloc(4);
  *q = 1;
  i = *q;
  free(q);
  return i;
}

<bb 2>:
  #   HEAP.4_10 = V_MAY_DEF <HEAP.4_8>;
  #   HEAP.5_11 = V_MAY_DEF <HEAP.5_9>;
  D.1530_1 = malloc (4);
  p_2 = (int *) D.1530_1;
  #   HEAP.4_12 = V_MAY_DEF <HEAP.4_10>;
  *p_2 = 0;
  #   HEAP.4_13 = V_MAY_DEF <HEAP.4_12>;
  #   HEAP.5_14 = V_MAY_DEF <HEAP.5_11>;
  free (p_2);
  #   HEAP.4_15 = V_MAY_DEF <HEAP.4_13>;
  #   HEAP.5_16 = V_MAY_DEF <HEAP.5_14>;
  D.1531_3 = malloc (4);
  q_4 = (int *) D.1531_3;
  #   HEAP.5_17 = V_MAY_DEF <HEAP.5_16>;
  *q_4 = 1;
  #   VUSE <HEAP.5_17>;
  i_5 = *q_4;
  #   HEAP.4_18 = V_MAY_DEF <HEAP.4_15>;
  #   HEAP.5_19 = V_MAY_DEF <HEAP.5_17>;
  free (q_4);
  D.1532_6 = i_5;
  return D.1532_6;


or even just a way to add a VUSE of the pointer alias sets, or even
void * __builtin_new_object (void *p) which would do

  # VUSE <p>
  # HEAP.n = V_MAY_DEF <HEAP.n>
  p = __builtin_new_object (p)

i.e. associate a new alias tag with the return value.

Can you benchmark the workaround in the library?

Comment 12 Richard Biener 2006-09-29 16:25:40 UTC

Yes, maybe we can do something in the C++ frontend.

Comment 13 Andrew Pinski 2006-09-29 16:58:07 UTC

And here is a testcase that fails also for 4.0.0 and self contained also:
#include <new>
int foo(int n, int *f, long *f1)
{
  long t = 0;
  *f = 1;
  long *fp = new(f) long;
  *fp = 1;
  for (int i=0; i<n; ++i)
    {
      t += *fp;
      int *b = new (f) int;
      *b = i + *f1;
      fp = new (f) long;
      *fp = t*t;
    }
  t+=*fp;
  return t+*f1;
}

extern "C" void abort ();

int main(void)
{
  if (sizeof(int) != sizeof(long)
    return 0;
  int *a = new int;
  long *b = new long;
  if (foo(2, a, b) != 6)
    abort ();
  return 0;
}

Comment 14 Andrew Pinski 2006-09-29 17:01:36 UTC

(In reply to comment #13)
> And here is a testcase that fails also for 4.0.0 and self contained also:
replace main with:
int main(void)
{
  if (sizeof(int) != sizeof(long))
    return 0;
  int *a = new int;
  long *b = new long;
  *a = 0;
  *b = 0;
  if (foo(2, a, b) != 6)
    abort ();
  return 0;
}

----
This is what you get when thinking about 2 steps ahead of yourself.

Comment 15 Mark Mitchell 2006-10-01 18:39:38 UTC

I cannot see any plausible way to argue that this is a library bug.  Implementation of placement new should not need barriers or other compiler intrinsics.  It's twisted to argue that the C++ standard itself would prevent implementation of placement new without non-standard facilities, since the people that wrote the standard all certainly expected the implementation to be just:

  void *operator new(size_t, void *p) {
    return p;
  }

Variations on that definition also occur in various user-defined new operators, including those for class-scoped operators.  I don't think we can reasonably ask people to change this code, which works with every other C++ compiler.  Therefore, the only possible conclusion is that either the compiler is buggy or that the examples are invalid.  Unfortunately, I don't think Andrew's example is invalid.

There are two possible approaches to fixing the compiler: either (a) the C++ front end should mark all new operators as special, in some way that the middle end then uses to avoid this over-optimization, or (b) relax the aliasing rules used by the middle end.

I think the right solution is (b).  Fundamentally, I'm not sure that the object models in C and C++ are sufficiently well-defined to permit the kind 
of optimization that we are discussing.  I also think that people use functions not named "operator new" to do the kinds of things being discussed (i.e., to allocate blocks of memory) and that C and C++ are intended to support that usage.  Certainly, in C, people can't use "operator new".  So, I'd probably suggest that we relax the aliasing rules, at least for heap-allocated objects.

As an example of the kind of problems the standard poses, [basic.life] says:

"The  lifetime  of  an object is a runtime property of the object.  The
  lifetime of an object of type T begins when:

  --storage with the proper alignment and size for type T  is  obtained,
    and

  --if  T is a class type with a non-trivial constructor (_class.ctor_),
    the constructor call has completed.

  The lifetime of an object of type T ends when:

  --if T is a class type with a non-trivial  destructor  (_class.dtor_),
    the destructor call starts, or

  --the storage which the object occupies is reused or released."

So, by that definition, it might seem that:

  int i = 3;
  *((float *)(&i)) = 7.0;

is valid; we're reusing the "int" memory, as a "float", and it has the right size and alignment (on most systems).  But, that would effectively nullify the [basic.lval] language that suggests that you can't access an object through a type other than its dynamic type -- or, at least, it would limit such access to writes.  I don't think anyone would expect the dynamic type of "i" to be "float"  in this example, even if someone had written:

  new (&i) float;

Surely, the dynamic type of something explicitly declared "int" must be "int".  But, for something with dynamic storage duration, perhaps we should assume that any write to the storage, with a type other than the one that it already has, forces it to possibly alias things of both types.

Comment 16 Mike Stump 2006-10-02 09:32:31 UTC

The dynamic type of the object at &i is indeed float and has the value 7.0 (iff align and sizes work out).  We permitted this so that can can do:

  class C { ... };
  char buf[1024];
  new (&buf[n]) C

and have the dynamic type of the storage be C.  You can know the dynamic type is C, by doing a virtual dispatch, or by asking for the name of the type with rtti.

Likewise, you can do:

  new (&i) float(7.0)

with the same effects.  The dynamic type is float, and the value is 7.0, and this is in all ways the same as:

  *(float*)&i = 7.0

Thinking about this some more, it would appear that it is unsafe to reorder writes of otherwise non-conflicting types past each other as type based analysis alone isn't enough to ensure they don't conflict.  It might be reasonable to add the C rules that the dynamic type _must_ match the static (declared) type, if there is one for types other than pointer or pointer to member the the language standard and to clarify the dynamic type of an object when a bitwise copy of the object is made to better match the C rules.

Reads should be fine as they are required to be the same type as the write (or character).  In practice, if the optimizer can't see the placement new, it can't reorder things so this is safe; if it can see it, then it should do value based analysis in addition to type analysis and conclude they conflict (or can conflict) and still avoid doing the wrong thing.

I do agree, that for now, this isn't a bug in the library.  It could be made a bug in the library by requring a placement new operator as the only way in which storage can be reused as a different dynamic type.

Comment 17 Mark Mitchell 2006-10-02 17:48:05 UTC

I disagree with Mike's analysis.  I certainly understand the goals of placement new, but I don't think that anyone intended this code:

  int i;
  *(float *)&i = 7.0;
  
to be valid.  I don't even think people intended this variant:

  int i;
  new (&i) float(7.0);

to be valid.  Note that a consequence of allowing this is that:

  int f() {
   int i = 3;
   new (&i) float(7.0);
   return i;
  }

is undefined -- not because of the overwriting of i, but because the return statement reads memory of the wrong type.  Mike's position is that accessing a variable using its static type is not always safe; I find this bizarre.

My personal opinion is that the authors of the standard just didn't think of all the possible interactions of the various aspects.  I believe placement new was intended to handle only arrays of characters, consistent with the special dispensation given to character pointers.

However, I don't think my differences with Mike can be resolved without going directly to the standards committee.  Fortunately, from the compiler point of view, I'm saying "punt" and Mike is saying "punt harder".  So, it's probably just a question of in how many cases we decide to disable type-based alias analysis.

Comment 18 Mike Stump 2006-10-02 19:28:12 UTC

What is your position based upon?  Mine is based upon having been in the room when we decided what the C rules probably were, what the C++ rules could be and the up side and down side of each choice and where we decided what our intent was going to be and lots of word smithing sessions as well.  You can see some of the depth with which we discussed it in our formulation of the wording in the standard:

6 Similarly,  before the lifetime of an object has started but after the
  storage which the object will occupy has been allocated or, after  the
  lifetime  of  an  object  has  ended  and before the storage which the
  object occupied is reused or released, any lvalue which refers to  the
  original  object may be used but only in limited ways.  Such an lvalue
  refers to allocated  storage  (_basic.stc.dynamic.deallocation_),  and
  using the properties of the lvalue which do not depend on its value is
  well-defined.  If  an  lvalue-to-rvalue  conversion  (_conv.lval_)  is
  applied  to such an lvalue, the program has undefined behavior; if the
  original object will be or was of a non-POD class  type,  the  program
  has undefined behavior if:

  --the lvalue is used to access a non-static data member or call a non-
    static member function of the object, or

  --the lvalue is implicitly converted (_conv.ptr_) to a reference to  a
    base class type, or

  --the   lvalue   is   used   as   the   operand   of   a   static_cast
    (_expr.static.cast_) (except when the conversion  is  ultimately  to
    char& or unsigned char&), or

  --the   lvalue   is   used   as   the   operand   of   a  dynamic_cast
    (_expr.dynamic.cast_) or as the operand of typeid.

7 If, after the lifetime of an object has ended and before  the  storage
  which  the object occupied is reused or released, a new object is cre-
  ated at the storage location which the  original  object  occupied,  a
  pointer that pointed to the original object, a reference that referred
  to the original object, or the name of the original object will  auto-
  matically  refer  to  the new object and, once the lifetime of the new
  object has started, can be used to manipulate the new object, if:

  --the storage for the new object exactly overlays the storage location
    which the original object occupied, and

  --the  new object is of the same type as the original object (ignoring
    the top-level cv-qualifiers), and

  --the original object was a most derived  object  (_intro.object_)  of
    type  T  and the new object is a most derived object of type T (that
    is, they are not base class subobjects).  [Example:
      struct C {
              int i;
              void f();
              const C& operator=( const C& );
      };
      const C& C::operator=( const C& other)
      {
              if ( this != &other )
              {
                      this->~C();             // lifetime of *this ends
                      new (this) C(other);    // new object of type C created
                      f();                    // well-defined
              }
              return *this;
      }
      C c1;
      C c2;
      c1 = c2;                        // well-defined
      c1.f();                         // well-defined; c1 refers to a new object of type C
     --end example]

8 If a program ends the lifetime of an object  of  type  T  with  static
  (_basic.stc.static_)  or automatic (_basic.stc.auto_) storage duration
  and if T has a non-trivial destructor,35) the program must ensure that
  an object of the original type occupies  that  same  storage  location
  when  the implicit destructor call takes place; otherwise the behavior
  of the program is undefined.  This is true even if the block is exited
  with an exception.

We explcitly did consider overlaying one type with a completely different type and did so intend for the standard to allow that.  Further, our intent of saying that automatics and objects with static duration could be so changed provided certain conditions were met was meant no only to state that this could be done, but to state the exact sitations in which it was defined to do so.

> Mike's position is that accessing a variable using its static type is not always safe; I find this bizarre.

Welcome to C++, let me quote the standard:

8 If a program ends the lifetime of an object  of  type  T  with  static
  (_basic.stc.static_)  or automatic (_basic.stc.auto_) storage duration
  and if T has a non-trivial destructor,35) the program must ensure that
  an object of the original type occupies  that  same  storage  location
  when  the implicit destructor call takes place; otherwise the behavior
  of the program is undefined.

If we didn't mean exactly what we wrote, why on earth would we write it?  Hint, in this, we meant exactlty what was written.  I was there, in every single meeting where we talked about it.  I know what our intent was, and I know what the standard says.  At no time does the standard says it is bizarre.  It prescribe what you, the user _must_ do, if it is to work.  You have to swap the type back to a valid type, before the system has a chance to notice that the type is wrong, for it to work, and then, only if there is a non-trivial dtor.  A trivial dtor doesn't even need to do that, as stated.  You can imagine it doesn't state

Comment 19 Ian Lance Taylor 2006-10-03 16:03:17 UTC

Mike suggests: "it would appear that it is unsafe to reorder writes of otherwise non-conflicting types past each other as type based analysis alone isn't enough to ensure they don't conflict."

That would be bad in the general case.  It would not be as bad as prohibiting the reorder of reads and writes, but it would be bad.  Let's find a way such that we do not have to do that.

Fortunately I believe that in a correct program we only have a problem when we can actually see the placement new (can any disprove that)?  I personally don't have a problem with saying that placement new is special.  When placement new is used, it has to move the pointer into alias set 0.

With regard to Mark's comment #15, the problem here is not heap allocation, as I see it.  The C standard explains how to use heap allocation: use a char* pointer.  The problem is doing something akin to heap allocation without using a char* pointer.

Comment 20 Mark Mitchell 2006-10-03 16:13:32 UTC

Subject: Re:  [4.0/4.1/4.2 Regression] placement new
 does not change the dynamic type as it should

ian at airs dot com wrote:

> Fortunately I believe that in a correct program we only have a problem when we
> can actually see the placement new (can any disprove that)?  I personally don't
> have a problem with saying that placement new is special.  When placement new
> is used, it has to move the pointer into alias set 0.

What about in C, as opposed to C++?

I agree that the obvious special cases are access through char*, arrays 
of characters, and placement new.  But, I'm afraid that there are lots 
of other allocation functions out there that are not spelled "operator 
new", and requiring decoration for them isn't something that (as far as 
I know) other compilers require.

Comment 21 Ian Lance Taylor 2006-10-03 23:44:06 UTC

In C a general allocation function should work with a char array.  A specific allocation function should use a union.  I don't think there are many existing exceptions to these guidelines.

I think that code like that in PR 29272, which casts Node* to void* to Foo*, is an unusual case.  And I believe we can handle that code correctly because of the use of memcpy.  And if the code didn't use memcpy it would be clearly noncomformant--which isn't to say that we should deliberately break it, but we don't need to try extra hard to make it work.

So I don't see a serious problem in C either.  Am I missing something/

Comment 22 Mark Mitchell 2006-10-04 05:39:35 UTC

Subject: Re:  [4.0/4.1/4.2 Regression] placement new
 does not change the dynamic type as it should

ian at airs dot com wrote:
> ------- Comment #21 from ian at airs dot com  2006-10-03 23:44 -------
> In C a general allocation function should work with a char array.  A specific
> allocation function should use a union.  I don't think there are many existing
> exceptions to these guidelines.

> So I don't see a serious problem in C either.  Am I missing something/

I think there are two remaining issues:

   int i;
   *(float*)(&i) = 7.0;

IIUC, Mike's position is that this is valid -- and that, in fact, after 
this point "i" can no longer be accessed as an int.  Do you agree?  Or 
do you think that this is (should be?) undefined behavior?

I think you are saying that:

   int i;
   new (&i) float;

is valid, and that after this point accessing "i" as an int is invalid. 
  Is that correct?  If so, I assume that you conider the following C 
code valid as well:

   int i;
   float f;
   memcpy (&i, &f, sizeof (i));

?

Also, in C++, given:

   extern void f(int *);
   void g() {
     int i;
     f(&i);
     /* HERE */
   }

do you believe that at the point marked HERE the dynamic type of "i" is 
indeterminate?  I think that if we allow the examples above, we have to 
assume that something may have reset the dynamic type of "i" here, 
meaning that we cannot assume that future stores through "float *" in 
the function do not alias "i".

Comment 23 Richard Biener 2006-10-09 08:22:20 UTC

One point to remember is that C does not allow re-using of storage with a different type (which is what PR29272 is about and why that testcase is invalid).

The storage type is either the declared one or the one assigned by virtue of the
first assignment (or memcpy).  So,

 int i;
 float f;
 memcpy(&f, &i, sizeof(f));

is valid, it doesn't change fs dynamic type but assigns it the bit-pattern
of i.

What the C++ standard seems to imply is that the storage type of a bunch of
memory (or an object with automatic storage) changes on "assignment".  So, indeed for C++ re-ordering writes is not allowed, and escaping pointers
must be assumed to change dynamic type.

So for C++ and 4.2 the best solution looks like to disable type based aliasing
completely.  For C I'm not sure the behavior the standard mandates is very
appealing - at least changing dynamic type via the use of memcpy should be
allowed as a GCC extension (like we allow type-punning via the union trick).

Comment 24 Gabriel Dos Reis 2006-10-24 06:53:23 UTC

Subject: Re:  [4.0/4.1/4.2 Regression] placement new does not change the dynamic type as it should

"mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes:

| ------- Comment #22 from mark at codesourcery dot com  2006-10-04 05:39 -------
| Subject: Re:  [4.0/4.1/4.2 Regression] placement new
|  does not change the dynamic type as it should
| 
| ian at airs dot com wrote:
| > ------- Comment #21 from ian at airs dot com  2006-10-03 23:44 -------
| > In C a general allocation function should work with a char array.  A specific
| > allocation function should use a union.  I don't think there are many existing
| > exceptions to these guidelines.
| 
| > So I don't see a serious problem in C either.  Am I missing something/
| 
| I think there are two remaining issues:
| 
|    int i;
|    *(float*)(&i) = 7.0;
| 
| IIUC, Mike's position is that this is valid -- and that, in fact, after 
| this point "i" can no longer be accessed as an int.  Do you agree?  

I don't see how that code is supported by the paragraphs quoted by Mike.
There was a recent discussion (at most two weeks ago) on the C++
standard reflector -core about similar topic.  I do agree with Mark
that the proper way to resolve this is to go to the C++ committee.
It would be incredible that C++ was less type-strict than C.

-- Gaby

Comment 25 Daniel Berlin 2006-12-29 20:28:23 UTC

So the solution for 4.3 here, from what i understand, seems pretty easy at the tree level, no matter what reading we wish to give this.

If we add a placement_new_expr, and not try to revisit our interpretation of the standard, we can just DTRT and fix placement new. This would be best for optimizations, and IMHO, for users.

Otherwise, we are at the point in 4.3 (In the past month this has changed, it's not really possible to backport this stuff to 4.2 very easily) where we have two types of variables from an aliasing perspective:

SSA variables
Global/static variables

For both, we know if they are *actually* dereferenced (and not just escaping to a call, and thus "possibly dereferenced").

SSA variables never escape, and all casting will have been done in a statement that leads to the SSA variable. The SSA variable will only appear on the LHS once The only time they are currently pruned using TBAA information is at the dereference site (IE b_3 = *foo_5). This seems right to me, as it should have the correct dynamic type at this point, and anywhere else we prune using TBAA info without an explicit dereference is a bug. Under the reading that stores with different types are legal and start new object lifetimes, we would have to change the pruning to only prune at *load* dereference sites.

Under this reading, we would never be able to reorder stores based on TBAA because they are never illegal under the reading that they start new object lifetimes.

For globals/statics ("bare variables"), we currently will prune even if they are only "maybe-dereferenced". I think this is a bug under our current interpretation, and we should fix it.
Because bare variables may appear on the LHS multiple times, we have a couple options, depending on the readings.

Note that any of these require IPA and in the case of true globals, whole-program mode, in order to *ever be able to reorder a store/load based on TBAA*.

We could gather the types that occur, and prune at load dereference sites assuming it is one of these, *if the variable does not escape the module*.

Most statics do escape through library calls that we could add attributes to, but don't escape otherwise.
So without attributing, you will lose roughly 90-100% of TBAA enabled store/load reordering of globals and statics.
With attributing, you will probably lose 10-20% of TBAA enabled *load* reordering for statics, 100% of TBAA enabled reordering for globals (unless you promote them), and 100% of TBAA enabled store reordering.
We could disable TBAA entirely for globals and statics, and you will just lose 100% everywhere :)

TBAA accounts for about 7-25% of alias disambiguations in the real world for C and C++ (combined). This is according to both papers published by Intel about ICC, papers published by MS about VC, papers published by IBM about XLC, and testing of Google's codebase with and without -fno-strict-aliasing (with the caveat that we know our codebase is not entirely strict aliasing safe)

That said, it is wildly variant from program to program as to how effective it is, and pretty unpredictable just from staring at code.

I'll leave the decision about what we want to do up to someone else, i'm just explaining the aliasing options we have.

Comment 26 Mark Mitchell 2007-01-01 00:41:43 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

dberlin at gcc dot gnu dot org wrote:

> If we add a placement_new_expr, and not try to revisit our interpretation of
> the standard, we can just DTRT and fix placement new. This would be best for
> optimizations, and IMHO, for users.

I agree that treating placement new specially makes sense.  The first
argument to a placement new operator could be considered to have an
unspecified dynamic type on entrance to the operator, while the return
value has the dynamic type specified by the operator.  (So that the
pointer returned by "new (x) int" has type "int *".)

I'm not sure that placement_new_expr is the best way to accomplish this,
but, maybe it is.  Another possibility would be to define an attribute
or attributes to specify the dynamic type of arguments and return types,
and then have the C++ front end annotate all placement new operators
with those attributes.

I don't think that the other approach you sketch is very attractive
because it sounds like we're going to lose a lot of our TBAA opportunities.

Comment 27 Daniel Berlin 2007-01-02 03:01:34 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

On 1 Jan 2007 00:41:44 -0000, mark at codesourcery dot com
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #26 from mark at codesourcery dot com  2007-01-01 00:41 -------
> Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
>  new does not change the dynamic type as it should
>
> dberlin at gcc dot gnu dot org wrote:
>
> > If we add a placement_new_expr, and not try to revisit our interpretation of
> > the standard, we can just DTRT and fix placement new. This would be best for
> > optimizations, and IMHO, for users.
>
> I agree that treating placement new specially makes sense.  The first
> argument to a placement new operator could be considered to have an
> unspecified dynamic type on entrance to the operator, while the return
> value has the dynamic type specified by the operator.  (So that the
> pointer returned by "new (x) int" has type "int *".)

Right.

>
> I'm not sure that placement_new_expr is the best way to accomplish this,
> but, maybe it is.  Another possibility would be to define an attribute
> or attributes to specify the dynamic type of arguments and return types,
> and then have the C++ front end annotate all placement new operators
> with those attributes.
It would be nice if we could transform those attributes on
gimplification to something like an
an "alias preserving cast" (or something of that nature) that states
that the  cast is type unioning for alias purposes (IE that the
possible types of the result for TBAA/etc purposes is the union of the
type of the cast and the type of the cast's operand)..
Not a fully fleshed out idea, just something that popped into my head.

Comment 28 Mark Mitchell 2007-01-02 03:24:30 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

dberlin at dberlin dot org wrote:

> It would be nice if we could transform those attributes on
> gimplification to something like an
> an "alias preserving cast" (or something of that nature) that states
> that the  cast is type unioning for alias purposes (IE that the
> possible types of the result for TBAA/etc purposes is the union of the
> type of the cast and the type of the cast's operand)..

That sounds reasonable to me.

Comment 29 Ian Lance Taylor 2007-05-02 16:57:22 UTC

Created attachment 13497 [details]
Patch

Here is one approach which fixes the test case.  This introduces a new tree code, ALIASING_CONVERT_EXPR.  It is conveyed into RTL via a flag on REGS: REG_ALIAS_ALL.   I didn't try to really union the alias sets, I just said that the result of placement new can alias anything.  This patch is essentially untested.

I'm not very happy with this approach because it doesn't fail safe: it's too easy to lose the special aliasing, and then the problem appears again, but only with a more complicated test case.  A safer approach might be to change the type returned by placement new and mark it as TYPE_REF_CAN_ALIAS_ALL, but then I'm worried about type conversion and type comparison problems, since it isn't actually a different type.

Comment 30 Andrew Pinski 2007-05-02 17:10:49 UTC

Only one suggestion for the patch, instead of:
-  if (is_gimple_cast (t) || TREE_CODE (t) == VIEW_CONVERT_EXPR)
+  if (is_gimple_cast (t) || TREE_CODE (t) == VIEW_CONVERT_EXPR
+      || TREE_CODE (t) == ALIASING_CONVERT_EXPR)

Shouldn't ALIASING_CONVERT_EXPR be considered a gimple cast?  Yes you might need to change some places which use is_gimple_cast but it seems like a better idea to treat it as a cast rather than anything else.

Comment 31 Andrew Pinski 2007-05-02 17:14:41 UTC

Also should we fold the case where ALIASING_CONVERT_EXPR and the argument type are the same.  Also I think this right now causes a regression on the tunk interacting with the patch which fixed PR 31146.

Comment 32 Mark Mitchell 2007-05-02 21:41:00 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

ian at airs dot com wrote:

> Here is one approach which fixes the test case.  This introduces a new tree
> code, ALIASING_CONVERT_EXPR.  It is conveyed into RTL via a flag on REGS:
> REG_ALIAS_ALL.   I didn't try to really union the alias sets, I just said that
> the result of placement new can alias anything.  This patch is essentially
> untested.

I think this is a reasonable approach  I agree that it will require care
to make sure this is threaded through the compiler in all places (for
example, we may need variants of STRIP_NOPs that do/don't strip it), but
anything else is going to be too pessimistic.

I think that creating a separate type, with TYPE_REF_CAN_ALIAS_ALL set,
is worse: it's not the type of the expression that matters, but the
action of the expression itself.  It's the act of placement-newing that
destroys type information; after that, the type that you have is what
you expect.

Comment 33 Ian Lance Taylor 2007-05-03 20:33:41 UTC

Created attachment 13504 [details]
Patch

Here is a much simpler patch which also fixes the problem.  This simply inserts a memory clobber before each placement new.  This is safe, but is not as efficient as possible.  How much do we care about optimizing placement new?  How often is placement new used in a way which we could in practice optimize anyhow?

For the test case in this PR, the penalty is pretty severe: gcc can no longer eliminate the loop.  But this code is obviously artificial in that the loop serves no purpose.

Comment 34 Paolo Carlini 2007-05-03 21:45:23 UTC

First blush, I'm not enthusiastic about this last proposal. Seems to me just a concealed way to do what Richard suggested at the beginning of this thread. As such, I'm still finding it, in spirit at least, against the letter of 18.4.1.3/1-3, according to which placement new returns the ptr, and "*intentionally* (emphasis mine) performs no other action".

Comment 35 Andrew Pinski 2007-05-03 21:48:47 UTC

(In reply to comment #34)
> "*intentionally* (emphasis mine) performs no other action".

There is no other action taken, it is just a barrior, not even really an action in my mind.

Comment 36 Richard Biener 2007-05-03 21:59:26 UTC

Well, the patch in comment #33 really does what I did with the patch to libstdc++.
If we take the first patch (from comment #29) and change its effects on the alias machinery so that for

  q = ALIAS_CONVERT_EXPR (p)

we have two VDEFs on qs and ps alias tags like

  # SMT.1_2 = VDEF <SMT.1_1>
  # SMT.2_2 = VDEF <SMT.2_1>
  q = ALIAS_CONVERT_EXPR (p)

(and optimize q = ALIAS_CONVERT_EXPR (p) to q = p if the types of q and p
are the same, as pinskia noted), then we are safe at the tree level (the
ALIAS_CONVERT_EXPR represents a load/store barrier for qs and pq type).

At expansion time we'd need to do the same, which would be emitting clobbers
for mem expressions representing dereferences of q/p.

I believe we cannot really do better here.

Comment 37 Paolo Carlini 2007-05-03 22:24:58 UTC

By the way, would it be possible to have a look at the assembly of competitors? I'm under the impression that a quick look would more than enough to understand whether we are optimizing "enough"...

Comment 38 Mark Mitchell 2007-05-04 03:47:14 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

rguenth at gcc dot gnu dot org wrote:
> ------- Comment #36 from rguenth at gcc dot gnu dot org  2007-05-03 21:59 -------
> Well, the patch in comment #33 really does what I did with the patch to
> libstdc++.

I think that Ian's second patch (the heavy hammer of a memory clobber,
generated by the front end) is also reasonable.  It's more heavy-handed
than Ian's earlier solution, but certainly less intrusive and easier to
get right.

I think that either of these solutions will make user code work as
expected, which is the most important consideration.  I guess my
inclination would be to go with the heavy hammer, until/unless we start
finding real code that's seriously impacted.

Comment 39 Wolfgang Bangerth 2007-05-04 04:54:31 UTC

From my perspective as a user, I wouldn't find a memory clobber
in front of every placement new all that terrible. People don't
do that in tight loops, and even if they did all that would happen
is that the clobber presents a barrier across which no memory
accesses can be moved and across which no dead stores etc can
be eliminated. Presumably with every placement new/delete, there
comes some housekeeping of available memory, so there will be
a significant amount of code before and after, and so a single
barrier isn't going to have the effect of slowing down a code by
any significant factor.

This appears to me to be a case where one could overengineer a solution.
I would just go with the memory clobber and see whether we get bug
reports of people complaining about missed optimizations :-)

W.

Comment 40 Ian Lance Taylor 2007-05-04 07:37:14 UTC

Created attachment 13506 [details]
Patch

Here is yet another patch, an improvement over the previous one.  It finally dawned on me that we can use an asm to represent the exact memory operations we care about, namely a use of the memory accessed via the old type and a def of the memory accessed via the new type.  This way we don't mess with alias sets at all per se, we just put down a barrier for exactly the memory addresses that matter (of course the effect may be to VUSE and VDEF SMT tags, but the aliasing machinery takes care of that for us).

This patch still needs some more testing--in particular I need to make sure it does the right thing for templates--but it passes bootstrap and the g++ testsuite.

Richard, could you test it on your C++ test case(s)?  Thanks.

Comment 41 Paolo Carlini 2007-05-04 09:41:34 UTC

In case it wasn't clear already, many thanks Ian and everyone for fixing this annoying issue.

Comment 42 Richard Biener 2007-05-04 11:45:23 UTC

I put it on the tester.  A testcase that does regress (simplified from tramp3d):

template <class T, int D>
class Vector
{
public:
   Vector()
   {
     for (int i = 0; i < D; ++i)
        new (&x_m[i]) T();
   }
   T& operator[](int i) { return x_m[i]; }

private:
   T x_m[D];
};

void foo(Vector<double, 3> *m)
{
  Vector<double, 3> v;
  v[0] = 1.0;
  v[1] = 2.0;
  v[3] = 3.0;
  *m = v;
}

where we can no longer optimize the redundant stores to v:

<bb 2>:
  v.x_m[0] = 0.0;
  __asm__ __volatile__("":"=m" v.x_m[0]:"m" v.x_m[0]);
  v.x_m[1] = 0.0;
  __asm__ __volatile__("":"=m" v.x_m[1]:"m" v.x_m[1]);
  v.x_m[2] = 0.0;
  __asm__ __volatile__("":"=m" v.x_m[2]:"m" v.x_m[2]);
  v.x_m[0] = 1.0e+0;
  v.x_m[1] = 2.0e+0;
  v.x_m[3] = 3.0e+0;
  *m = v;
  return;

vs.

<bb 2>:
  v.x_m[2] = 0.0;
  v.x_m[0] = 1.0e+0;
  v.x_m[1] = 2.0e+0;
  v.x_m[3] = 3.0e+0;
  *m = v;
  return;

and assembly:

_Z3fooP6VectorIdLi3EE:
.LFB17:
        xorl    %r9d, %r9d
        movq    %r9, -40(%rsp)
        movq    %r9, -32(%rsp)
        movq    %r9, -24(%rsp)
        movabsq $4607182418800017408, %r8
        movabsq $4611686018427387904, %rsi
        movq    -24(%rsp), %rax
        movq    %r8, -40(%rsp)
        movq    %rsi, -32(%rsp)
        movq    -40(%rsp), %rcx
        movq    -32(%rsp), %rdx
        movq    %rax, 16(%rdi)
        movq    %rcx, (%rdi)
        movq    %rdx, 8(%rdi)
        ret

vs.

_Z3fooP6VectorIdLi3EE:
.LFB17:
        movabsq $4607182418800017408, %r8
        movabsq $4611686018427387904, %rsi
        movq    $0, -24(%rsp)
        movq    %r8, -40(%rsp)
        movq    %rsi, -32(%rsp)
        movq    -40(%rsp), %rcx
        movq    -32(%rsp), %rdx
        movq    -24(%rsp), %rax
        movq    %rcx, (%rdi)
        movq    %rdx, 8(%rdi)
        movq    %rax, 16(%rdi)
        ret

this happens in hot tramp3d loops and is a quite common idiom for initializing
storage.  To fix this we need to avoid creating the asm if the type of the
original storage is the same as the other.

Comment 43 Paolo Carlini 2007-05-04 12:03:52 UTC

(In reply to comment #42)
> this happens in hot tramp3d loops and is a quite common idiom for initializing
> storage.  To fix this we need to avoid creating the asm if the type of the
> original storage is the same as the other.

Many thanks Richard. Indeed, I find this already mentioned optimization a very sensible thing to add.

(By the way, someone should tell the tramp3d developers that using placement new for a POD type like double isn't a very smart idea ;) The only tricky point for default inizialization is whether using memset or a plain loop)

Comment 44 Paolo Carlini 2007-05-04 12:07:59 UTC

(In reply to comment #43)
> (By the way, someone should tell the tramp3d developers that using placement
> new for a POD type like double isn't a very smart idea ;) The only tricky point
> for default inizialization is whether using memset or a plain loop)

Forgot to add a detail that may be of interest, however: C++03 *fails* to classify as POD some very common and simple types which are morally PODs:

  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2172.html

Thus let's not assume users to be too good here: sometimes, in generic code, they may end-up calling placement new also for very simple types.

Comment 45 Richard Biener 2007-05-04 13:03:04 UTC

Of course in the tramp3d case the code is written in such a generic way that it supports non-POD data members...  (And yes, "upstream" has this optimized, but
tramp3d-v4.cpp is about having a gcc testcase, not perfectly user-written code ;))

Comment 46 Richard Biener 2007-05-04 15:31:04 UTC

http://www.suse.de/~gcctest/c++bench/tramp3d/  2nd graph from the top :(

boost ICEs:
/gcc/spec/sb-vangelis-head-64/boost_1_33_1/boost/wave/util/unput_queue_iterator.hpp:221: warning: suggest parentheses around && within ||
../cpp.cpp: In member function 'typename boost::spirit::parser_result<boost::spirit::sequence<A, B>, ScannerT>::type boost::spirit::sequence<A, B>::parse(const ScannerT&) const [with ScannerT = boost::spirit::scanner<const char*, boost::spirit::scanner_policies<boost::spirit::skipper_iteration_policy<boost::spirit::iteration_policy>, boost::spirit::match_policy, boost::spirit::action_policy> >, A = boost::spirit::sequence<boost::spirit::action<boost::spirit::symbols<int, char, boost::spirit::impl::tst<int, char> >, boost::spirit::ref_value_actor<int, boost::spirit::assign_action> >, boost::spirit::action<boost::spirit::int_parser<int, 10, 1u, -0x00000000000000001>, boost::spirit::ref_value_actor<int, boost::spirit::assign_action> > >, B = boost::spirit::action<boost::spirit::int_parser<int, 10, 1u, -0x00000000000000001>, boost::spirit::ref_value_actor<int, boost::spirit::assign_action> >]':
../cpp.cpp:715: internal compiler error: in get_indirect_ref_operands, at tree-ssa-operands.c:1704
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.

the run is not yet finished (currently mico is running).

Comment 47 Ian Lance Taylor 2007-05-04 16:54:46 UTC

Unfortunately there is no obvious way to avoid creating the asm if the types are the same.  The reason is that the dynamic types are different, and we don't know the dynamic type.

Look at your original test case:

      Bar *b = new (f) Bar;
      b->p = 0;
      f = new (f) Foo;

Here f has type Foo*.  So if we test whether the types are the same, we conclude that we don't need any blockage in the last statement quoted above.  But that misses the fact that although the static type of *f is Foo, the dynamic type of *f is Bar.

Any ideas?

Comment 48 Richard Biener 2007-05-04 16:57:30 UTC

Doh, you are right.  So we're back to doing it in the aliasing machinery and with
a new tree-code I think.  On IRC Danny said he "can do it" ;)

Comment 49 Ian Lance Taylor 2007-05-12 00:22:12 UTC

Created attachment 13543 [details]
Patch

Like sands through the hourglass, so are the patches to PR 29286.

This version creates a new tree code.  At the RTL level the aliasing is represented by a list of merged alias sets.  I think this is the best one so far.  Any comments?

Comment 50 Daniel Berlin 2007-05-12 01:36:07 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

On 11 May 2007 23:22:14 -0000, ian at airs dot com
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #49 from ian at airs dot com  2007-05-12 00:22 -------
> Created an attachment (id=13543)
>  --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13543&action=view)
> Patch
>
> Like sands through the hourglass, so are the patches to PR 29286.
>
> This version creates a new tree code.  At the RTL level the aliasing is
> represented by a list of merged alias sets.  I think this is the best one so
> far.  Any comments?
>

I like this a lot, and it looks like it should do exactly the right thing.

Comment 51 Richard Biener 2007-05-12 10:47:52 UTC

This looks indeed very promising.  I'll give it a spin on tramp3d.  The only
thing I noticed is that

+    case CHANGE_DYNAMIC_TYPE_EXPR:
+      /* We can access the location with the given type.  We don't
+	 track pointers in RTL, so we handle this by assuming that any
+	 reference to one type can alias any reference to the other
+	 type across the whole function.  No code is generated.  */
+      {
+	tree type = CHANGE_DYNAMIC_TYPE_NEW_TYPE (exp);
+	tree ptr = CHANGE_DYNAMIC_TYPE_LOCATION (exp);	
+	tree ptr_type = TREE_TYPE (ptr);
+
+	gcc_assert (TREE_CODE (type) == POINTER_TYPE);
+	gcc_assert (TREE_CODE (ptr_type) == POINTER_TYPE);
+	merge_alias_sets_in_function (cfun,
+				      get_alias_set (TREE_TYPE (type)),
+				      get_alias_set (TREE_TYPE (ptr_type)));
+	return const0_rtx;
+      }

doesn't seem to handle the case the original pointed to type is not the old
dynamic type (which you rightfully were concerned about in comment #47).  So
it looks like (in absence of points-to information) we need to make the
new dynamic type alias everything?

Comment 52 Richard Biener 2007-05-12 12:11:02 UTC

But I get

Program received signal SIGSEGV, Segmentation fault.
0x00000000006e646f in find_assert_locations (bb=0x2b56e8984480)
    at ../../trunk/gcc/tree-vrp.c:3193
3193      if (flag_delete_null_pointer_checks && POINTER_TYPE_P (TREE_TYPE (op)))
(gdb) call debug_bb (bb)
;; basic block 3, loop depth 1, count 0
;; prev block 2, next block 4
;; pred:       6 [91.0%]  (true,exec)
;; succ:       4 [85.0%]  (true,exec) 5 [15.0%]  (false,exec)
<bb 3>:
D.926012_14 = D.926053_23;
<<<change_dynamic_type (struct Loc *) D.926018_15)>>>
this_16 = __cur_1;
if (__cur_1 != 0B)
  goto <bb 4>;
else
  goto <bb 5> (<L9>);


compiling tramp3d.  Supposedly because we walk all USEs in the stmt and
find (struct Loc *) which is not exactly a "use".

Comment 53 Daniel Berlin 2007-05-12 14:29:02 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

On 12 May 2007 11:11:03 -0000, rguenth at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #52 from rguenth at gcc dot gnu dot org  2007-05-12 12:11 -------
> But I get
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x00000000006e646f in find_assert_locations (bb=0x2b56e8984480)
>     at ../../trunk/gcc/tree-vrp.c:3193
> 3193      if (flag_delete_null_pointer_checks && POINTER_TYPE_P (TREE_TYPE
> (op)))
> (gdb) call debug_bb (bb)
> ;; basic block 3, loop depth 1, count 0
> ;; prev block 2, next block 4
> ;; pred:       6 [91.0%]  (true,exec)
> ;; succ:       4 [85.0%]  (true,exec) 5 [15.0%]  (false,exec)
> <bb 3>:
> D.926012_14 = D.926053_23;
> <<<change_dynamic_type (struct Loc *) D.926018_15)>>>
> this_16 = __cur_1;
> if (__cur_1 != 0B)
>   goto <bb 4>;
> else
>   goto <bb 5> (<L9>);
>
>
> compiling tramp3d.  Supposedly because we walk all USEs in the stmt and
> find (struct Loc *) which is not exactly a "use".

Probably need to teach tree-ssa-operands that only one of the operands
is really a use.
>
>
> --
>
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29286
>
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug, or are watching someone who is.
>

Comment 54 Ian Lance Taylor 2007-05-12 19:41:43 UTC

Regarding comment #51: I think the code is OK.  What it does is, at the RTL level, make the two types alias each other for the entire function.  This is different from what was happening on comment #47; that patch was making the two types alias at that specific point in the program.  At the tree level we want to alias specific pointers, and the new patch retains that.  It would be nice to do that at the RTL level also, but we unfortunately have no mechanism for that.  So people who use placement new with different types, neither of which is char*, may see some optimization regression in some cases, because the scheduler will be more constrained than is really required.  I don't see a way to avoid that.  I also don't think it's very important.

Comment 55 Ian Lance Taylor 2007-05-12 19:43:54 UTC

Regarding comment #52: do you have a reasonably sized test case?  I didn't see any problems when I ran the C++ testsuite.

I think tree-ssa-operands.c does the right thing:

    case CHANGE_DYNAMIC_TYPE_EXPR:
      get_expr_operands (stmt, &CHANGE_DYNAMIC_TYPE_LOCATION (expr), opf_use);
      return;

But there may well be something I misunderstand.

Comment 56 Richard Biener 2007-05-12 22:48:35 UTC

I'm trying to reduce it.  The problem seems to be somewhere else (-O -fstrict-aliasing):

/space/rguenther/tramp3d/install/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../include/c++/4.3.0/bits/stl_uninitialized.h: In function '_ForwardIterator std::__uninitialized_copy_aux(_InputIterator, _InputIterator, _ForwardIterator) [with _InputIterator = __gnu_cxx::__normal_iterator<const Loc<1>*, std::vector<Loc<1>, std::allocator<Loc<1> > > >, _ForwardIterator = Loc<1>*]':
/space/rguenther/tramp3d/install/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../include/c++/4.3.0/bits/stl_uninitialized.h:71: error: SSA name in freelist but still referenced
D.927916_15

/space/rguenther/tramp3d/install/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../include/c++/4.3.0/bits/stl_uninitialized.h:71: internal compiler error: verify_stmts failed
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.

Comment 57 Richard Biener 2007-05-13 11:04:05 UTC

It's become a libstdc++ test, but anyway:  -O -fstrict-aliasing

namespace Pooma {
   typedef int Context_t;
   namespace Arch {
   }
   inline Context_t context() {
  }
   inline int contexts() {
  }
 }
template<class DomT, class T, class NewDom1T> struct DomainTraitsScalar {
  };
template<class T> struct DomainTraits : public DomainTraitsScalar<T, T, T> {
  };
template<int Dim> class Grid;
template<class DT> class DomainBase {
  };
template<int Dim, class DT> class Domain : public DomainBase<DT> {
  };
#include <vector>
template<> class Grid<1> : public Domain<1, DomainTraits<Grid<1> > > {
  };
namespace Pooma {
 class PatchSizeSyncer {
    typedef Grid<1> Grid_t;
    PatchSizeSyncer(int contextKey, Grid_t &localGrid);
    int myContext_m;
    int numContexts_m;
    int localKey_m;
    Grid_t localGrid_m;
    typedef std::pair<int,Grid_t *> Elem_t;
    std::vector<Elem_t> gridList_m;
  };
 }
namespace Pooma {
 PatchSizeSyncer::PatchSizeSyncer(int contextKey, Grid_t &localGrid)   : myContext_m(Pooma::context()),     numContexts_m(Pooma::contexts()),     localKey_m(contextKey),     localGrid_m(localGrid) {
    if (myContext_m == 0) gridList_m.reserve(numContexts_m);
  }
 }


/space/rguenther/tramp3d/install/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../include/c++/4.3.0/bits/stl_uninitialized.h: In function '_ForwardIterator std::__uninitialized_copy_aux(_InputIterator, _InputIterator, _ForwardIterator) [with _InputIterator = std::pair<int, Grid<1>*>*, _ForwardIterator = std::pair<int, Grid<1>*>*]':
/space/rguenther/tramp3d/install/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../include/c++/4.3.0/bits/stl_uninitialized.h:71: error: SSA name in freelist but still referenced
D.8564_13

/space/rguenther/tramp3d/install/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../include/c++/4.3.0/bits/stl_uninitialized.h:71: internal compiler error: verify_stmts failed
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.

Comment 58 Ian Lance Taylor 2007-05-14 04:45:16 UTC

Created attachment 13551 [details]
Patch

That test case turns out to be an oversight in my patch to tree-ssa-dce.c.  Here is the new patch against mainline, which has a one line change to tree-ssa-dce.c compared to the old patch.

Comment 59 Daniel Berlin 2007-05-14 05:00:09 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

On 14 May 2007 03:45:25 -0000, ian at airs dot com
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #58 from ian at airs dot com  2007-05-14 04:45 -------
> Created an attachment (id=13551)
>  --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13551&action=view)
> Patch
>
> That test case turns out to be an oversight in my patch to tree-ssa-dce.c.
> Here is the new patch against mainline, which has a one line change to
> tree-ssa-dce.c compared to the old patch.

I believe your tree-ssa-dce part is wrong.

I see no reason for them to be marked always necessary.  They are only
necessary if the result is used, passed somewhere else, etc.

These should be properly marked in the regular course of DCE.

>
>
> --
>
> ian at airs dot com changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>   Attachment #13543 [details]|0                           |1
>         is obsolete|                            |
>
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29286
>
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug, or are watching someone who is.
>

Comment 60 Richard Biener 2007-05-14 09:25:21 UTC

But it doesn't have a result, does it?  Given that, I wonder how moving stmts
across it is prevented?

Comment 61 Daniel Berlin 2007-05-14 12:44:31 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

On 14 May 2007 08:25:27 -0000, rguenth at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #60 from rguenth at gcc dot gnu dot org  2007-05-14 09:25 -------
> But it doesn't have a result, does it?  Given that, I wonder how moving stmts
> across it is prevented?

Okay, so then it needs an LHS that defines a new SSA name, otherwise,
we'll end up with dead ones everywhere, and they will keep other dead
code alive.

Comment 62 Richard Biener 2007-05-14 16:20:11 UTC

We ICE with

template <class T, class T2, int D>
class Vector
{
public:
   Vector()
   {
     for (int i = 0; i < D; ++i)
        new (&x_m[i]) T2();
   }
   T2& operator[](int i) { return *reinterpret_cast<T2*>(&x_m[i]); }

private:
   T x_m[D];
};

void foo(Vector<double, long, 3> *m)
{
  Vector<double, long, 3> v;
  v[0] = 1;
  v[1] = 2;
  v[2] = 3;
  *m = v;
}

(adjust the storage type to be of same size for double/long):

Program received signal SIGSEGV, Segmentation fault.
0x0000000000b11a5e in unmodifiable_var_p (var=0x0)
    at /space/rguenther/src/svn/trunk/gcc/tree-flow-inline.h:1598
1598      if (TREE_CODE (var) == SSA_NAME)

#0  0x0000000000b11a5e in unmodifiable_var_p (var=0x0)
    at /space/rguenther/src/svn/trunk/gcc/tree-flow-inline.h:1598
#1  0x0000000000b194c3 in may_alias_p (ptr=0x2b537d358210, mem_alias_set=2, 
    var=0x2b537d359e00, var_alias_set=3, alias_set_only=0 '\0')
    at /space/rguenther/src/svn/trunk/gcc/tree-ssa-alias.c:2685
#2  0x0000000000b17f2b in add_flow_insensitive_var_aliases (ai=0x14d7890, 
    pvar=0x2b537d358210, set=2, tag=0x2b537d356ba0)
    at /space/rguenther/src/svn/trunk/gcc/tree-ssa-alias.c:2190
#3  0x0000000000b184d3 in compute_change_dynamic_type_aliasing (
    vcdt=0x2b537d357a00, vai=0x14d7890)
    at /space/rguenther/src/svn/trunk/gcc/tree-ssa-alias.c:2242
#4  0x000000000098a387 in pointer_set_traverse (pset=0x14e4e10, 
    fn=0xb17f84 <compute_change_dynamic_type_aliasing>, data=0x14d7890)
    at /space/rguenther/src/svn/trunk/gcc/pointer-set.c:181
#5  0x0000000000b18719 in compute_flow_insensitive_aliasing (ai=0x14d7890)
    at /space/rguenther/src/svn/trunk/gcc/tree-ssa-alias.c:2295
#6  0x0000000000b15b17 in compute_may_aliases ()

where in may_alias_p the ptr we passed does not have a symbol memory tag
associated.  (but var is SFT.3)

Comment 63 Daniel Berlin 2007-05-14 16:38:25 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

On 14 May 2007 15:20:14 -0000, rguenth at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #62 from rguenth at gcc dot gnu dot org  2007-05-14 16:20 -------
> We ICE with
>
> template <class T, class T2, int D>
> class Vector
> {
> public:
>    Vector()
>    {
>      for (int i = 0; i < D; ++i)
>         new (&x_m[i]) T2();
>    }
>    T2& operator[](int i) { return *reinterpret_cast<T2*>(&x_m[i]); }
>
> private:
>    T x_m[D];
> };
>
> void foo(Vector<double, long, 3> *m)
> {
>   Vector<double, long, 3> v;
>   v[0] = 1;
>   v[1] = 2;
>   v[2] = 3;
>   *m = v;
> }
>
> (adjust the storage type to be of same size for double/long):
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x0000000000b11a5e in unmodifiable_var_p (var=0x0)
>     at /space/rguenther/src/svn/trunk/gcc/tree-flow-inline.h:1598
> 1598      if (TREE_CODE (var) == SSA_NAME)
>
> #0  0x0000000000b11a5e in unmodifiable_var_p (var=0x0)
>     at /space/rguenther/src/svn/trunk/gcc/tree-flow-inline.h:1598
> #1  0x0000000000b194c3 in may_alias_p (ptr=0x2b537d358210, mem_alias_set=2,
>     var=0x2b537d359e00, var_alias_set=3, alias_set_only=0 '\0')
>     at /space/rguenther/src/svn/trunk/gcc/tree-ssa-alias.c:2685
> #2  0x0000000000b17f2b in add_flow_insensitive_var_aliases (ai=0x14d7890,
>     pvar=0x2b537d358210, set=2, tag=0x2b537d356ba0)
>     at /space/rguenther/src/svn/trunk/gcc/tree-ssa-alias.c:2190
> #3  0x0000000000b184d3 in compute_change_dynamic_type_aliasing (
>     vcdt=0x2b537d357a00, vai=0x14d7890)
>     at /space/rguenther/src/svn/trunk/gcc/tree-ssa-alias.c:2242
> #4  0x000000000098a387 in pointer_set_traverse (pset=0x14e4e10,
>     fn=0xb17f84 <compute_change_dynamic_type_aliasing>, data=0x14d7890)
>     at /space/rguenther/src/svn/trunk/gcc/pointer-set.c:181
> #5  0x0000000000b18719 in compute_flow_insensitive_aliasing (ai=0x14d7890)
>     at /space/rguenther/src/svn/trunk/gcc/tree-ssa-alias.c:2295
> #6  0x0000000000b15b17 in compute_may_aliases ()
>
> where in may_alias_p the ptr we passed does not have a symbol memory tag
> associated.  (but var is SFT.3)

unmodifiable_var_p is not allowed to alias anything, so we don't
create SMT's for them.

Comment 64 Ian Lance Taylor 2007-05-14 17:42:13 UTC

Re: comment #59: CHANGE_DYNAMIC_TYPE_EXPR does not have a result.  I set it up so that it is essentially volatile, and represents an additional type aliasing at that point in the program.

One of the earlier patches set it up as a cast.  But it was hard to convince myself that it really did the right thing.  A lot of code in the compiler strips casts for one reason or another.  One would have to examine each of those places and make sure that it did not strip the cast inappropriately.  Or, conversely, that it did strip the cast when it should.  This would tend to introduce a bit of code in many different parts of the compiler, all to handle a very unusual case.

Generating a statement with no result seemed more reliable to me.  It's true that there is a downside: it may cause a placement new to be preserved inappropriately.  But that seems like an unlikely case to me.  I doubt that very much code does a placement new which changes types and then does not use the resulting memory.  I think it's OK to misoptimize such code slightly.

Comment 65 Ian Lance Taylor 2007-05-14 18:14:09 UTC

Created attachment 13553 [details]
Patch

Thanks for the new test case.  Here is a new patch.  The change compared to the previous patch is just that we check whether the variable has a memory tag (i.e., the variable has been written) before calling add_flow_insensitive_var_aliases, rather than after.

Comment 66 Mark Mitchell 2007-05-14 22:25:17 UTC

Will not be fixed in 4.2.0; retargeting at 4.2.1.

Comment 67 Ian Lance Taylor 2007-05-14 22:35:57 UTC

The patch in comment #65 passes bootstrap and the gcc and libstdc++-v3 testsuites on i686-pc-linux-gnu.

Comment 68 Daniel Berlin 2007-05-14 22:39:56 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

On 14 May 2007 21:35:58 -0000, ian at airs dot com
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #67 from ian at airs dot com  2007-05-14 22:35 -------
> The patch in comment #65 passes bootstrap and the gcc and libstdc++-v3
> testsuites on i686-pc-linux-gnu.
>

Ian has made a convincing argument that optimizing placement new just
ain't gonna matter here, so i'm fine with his patch to DCE :)

Comment 69 Ian Lance Taylor 2007-05-16 06:33:08 UTC

Unless somebody feels strongly otherwise, I propose that we only fix this in mainline (i.e., 4.3).  Any fix is going to be delicate, this bug is unusual and difficult to trigger, and there is a source code workaround.

Comment 70 Richard Biener 2007-05-16 10:57:22 UTC

The last patch looks good sofar.

Comment 71 Richard Biener 2007-05-16 13:05:58 UTC

But it doesn't fix the POOMA miscompilations if I change its Pool allocator to
use placement new instead of this

  // Release a block to the pool.
  inline void free(void *b)
    {
      // Record a free.
      outstandingAllocs_m -= 1;

      // Cast the pointer to the right type.
      Link *p = (Link*)b;

      // Make it point to the current head of the free list.
      p->next_m = head_m;

      // Make the next one the new head of the list.
      // We can't do head_m = p->next_m since p will soon be treated
      // as something other than a Link.  By doing this assignment
      // with memcpy, we ensure that p->next_m will be read before
      // it is clobbered.
      memcpy(&p->next_m, &head_m, sizeof(head_m));

      // Make it the head of the free list.
      head_m = p;
    }

memcpy hack.  I'll try to investigate...

Comment 72 Richard Biener 2007-05-16 14:49:08 UTC

inline void* operator new(unsigned long, void* __p) throw() { return __p; }

extern void bar();

float foo(double *p)
{
  long *l = (long *)p;
  *l = 1;
  float *f = new (p) float;
  *f = 0.0;
  bar ();
  return *f;
}

with -O2 I see

<bb 2>:
  l_2 = (long int *) p_1(D);
  # SMT.4_12 = VDEF <SMT.4_11(D)>
  *l_2 = 1;
  <<<change_dynamic_type (float *) p_1(D))>>>
  __p_8 = p_1(D);
  f_5 = (float *) __p_8;
  # SMT.5_14 = VDEF <SMT.5_13(D)>
  *f_5 = 0.0;
  # SMT.4_15 = VDEF <SMT.4_12>
  # SMT.5_16 = VDEF <SMT.5_14>
  bar ();
  # VUSE <SMT.5_16>
  D.2462_6 = *f_5;
  return D.2462_6;

which looks wrong - the store to f_5 needs to alias *l_2?

Comment 73 Richard Biener 2007-05-16 15:02:09 UTC

For the following testcase, loop store motion re-orders the writes to *l and *f
at -O2.  (Note as is, the testcase is only valid for n == 1)

inline void* operator new(unsigned long, void* __p) throw() { return __p; }

extern void bar();

long foo(double *p, int n)
{
  long *f;
  for (int i=0; i<n; ++i)
  {
    int *l = (int *)p;
    *l = 0;
    f = new (p) long;
    *f = -1;
  }
  bar ();
  return *f;
}

and we end up with

<bb 2>:
  if (n > 0)
    goto <bb 3>;
  else
    goto <bb 4>;

<bb 3>:
  f = (long int *) p;
  <<<change_dynamic_type (long int *) p)>>>
  # SMT.4_27 = VDEF <SMT.4_19(D)>
  *f = -1;
  # SMT.5_35 = VDEF <SMT.5_20(D)>
  *(int *) p = 0;

<bb 4>:
  # SMT.4_23 = VDEF <SMT.4_32>
  # SMT.5_24 = VDEF <SMT.5_34>
  bar ();
  return *f;

which is wrong.  Runtime testcase that aborts:

inline void* operator new(unsigned long, void* __p) throw() { return __p; }

void __attribute__((noinline)) bar() {};

long __attribute__((noinline)) foo(double *p, int n)
{
  long *f;
  for (int i=0; i<n; ++i)
  {
    int *l = (int *)p;
    *l = 0;
    f = new (p) long;
    *f = -1;
  }
  bar ();
  return *f;
}

extern "C" void abort(void);
int main()
{
  union {
    int i;
    long l;
  } u;
  if (foo((double *)&u, 1) != -1)
    abort ();
  return 0;
}

Comment 74 Richard Biener 2007-05-16 15:36:44 UTC

Note that another solution to this whole problem is to record conflicts for
all stored-to types that we cannot disambiguate by points-to analysis.  This
also solves the reading of the standard that allows

  int i = 1;
  (float *)&i = 0.0;

at RTL expansion time we need to do the same apart from that we don't have
points-to information available.

The above supports the notion of "re-using memory changes its dynamic type".
Which of course gets a little interesting for aggregates that are initialized
piece-wise (not that this problem isn't present now).

Comment 75 Richard Biener 2007-05-16 17:08:36 UTC

In the chance of endlessly repeating myself here (;)) - what comment #73 boils down to is that we may not prune alias sets of stores based on TBAA results.
In fact, we need to realize that C++ object lifetime rules need to be modeled
somehow - it's those rules that count after all, not some magic side-effects
of placement new.

Of course I realize that without TBAA pruning we have no alias information left
on the RTL side for stores (so stores effectively have alias set zero).

Comment 76 Ian Lance Taylor 2007-05-16 18:01:52 UTC

I don't believe that the C++ standards writers really meant to eliminate TBAA.  And that is the inevitable consequence of the dynamic memory type approach if you are allowed to change the dynamic type in a function.

If we really want to adopt that approach, then the simple fix is for the C++ frontend to set flag_strict_aliasing to 0.  But I don't think we want to do that.

Comment 77 Mark Mitchell 2007-05-17 00:41:47 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

ian at airs dot com wrote:

> I don't believe that the C++ standards writers really meant to eliminate TBAA. 
> And that is the inevitable consequence of the dynamic memory type approach if
> you are allowed to change the dynamic type in a function.

I agree with Ian.

I think there is good evidence that the authors of the standard intended
C++ to be *more* typesafe than C, and we should read the standard in
that way.  It's unfortunate that neither the C or C++ standards is very
precise about various aspects of the memory model, but that is what it
is.  I think trying to read the standard to divine the answers to these
questions is essentially futile; this is a situation where we should
accept that the standard doesn't say, and do our best to balance
performance with existing practice and expectations.

I don't fully understand the point of Comment #73.  I thought the whole
point of this series of patches was to make the compiler understand that
memory returned by placement new could alias other memory, or to
otherwise introduce a barrier that would prevent the compiler from
reordering accesses across the call to operator new.  If that's the
case, why does the post-patch compiler still think that the writes to
"f" and "l" don't alias?

Comment 78 Ian Lance Taylor 2007-05-18 07:14:51 UTC

The test case in comment #73 is just a standard aliasing violation.  You are casting a double* to an int* and writing to it both ways.  As I argued in comment #76, the only way this is going to work is if we eliminate TBAA in C++.

Comment 79 Ian Lance Taylor 2007-05-18 07:25:02 UTC

The test case in comment #71 doesn't use placement new either.

This PR was originally about placement new.  I think there is general agreement thta placement new needs to avoid aliasing problem.  I don't think there is general agreement that arbitrary type casting needs to avoid aliasing problems.
Have you found any problems with my most recent patch and placement new?

Comment 80 Mark Mitchell 2007-05-18 07:26:42 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

ian at airs dot com wrote:
> ------- Comment #78 from ian at airs dot com  2007-05-18 07:14 -------
> The test case in comment #73 is just a standard aliasing violation.  You are
> casting a double* to an int* and writing to it both ways.

I'm confused.  The double-ness looks irrelevant to me; it could just as
well be "void *".  The only actual accesses to the memory are through an
"int *" pointer and a "long *" pointer, and there's a placement new
between the two.  I thought the whole point of these patches was to
allow placement new to change the type in exactly this way?

Comment 81 Richard Biener 2007-05-18 09:45:51 UTC

Yes, both testcases are valid and are using placement new.  Note the loop
is only to confuse the optimizers enough to re-order the stores and produce
a miscompilation.  Note the loop runs exactly once, and in essence we are doing

  int *p = XXX; /* integer memory */
  *p = 0;
  long *q = new (p) long;
  *q = -1;

and the compiler is re-ordering the stores which is wrong.  *p and *q need to
alias because of the placement new.

Comment 82 Richard Biener 2007-05-18 09:47:17 UTC

Oh, and the double-ness is to show that at RTL expansion we actually unify alias
sets of long and double, not long and int, which is wrong again.  See my
comment #51.

Comment 83 gdr@cs.tamu.edu 2007-05-18 14:26:03 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"ian at airs dot com" <gcc-bugzilla@gcc.gnu.org> writes:

| The test case in comment #71 doesn't use placement new either.
| 
| This PR was originally about placement new.

Yes.

|  I think there is general agreement
| thta placement new needs to avoid aliasing problem.

Yes.

The intent of placement new is to specify a new dynamic type, and end
the lifetime of previous object.  This is type safe, beause the "old"
object no longer exists.

|  I don't think there is
| general agreement that arbitrary type casting needs to avoid aliasing problems.

agree.

-- Gaby

Comment 84 gdr@cs.tamu.edu 2007-05-18 14:30:18 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"rguenth at gcc dot gnu dot org" <gcc-bugzilla@gcc.gnu.org> writes:

| ------- Comment #81 from rguenth at gcc dot gnu dot org  2007-05-18 09:45 -------
| Yes, both testcases are valid and are using placement new.  Note the loop
| is only to confuse the optimizers enough to re-order the stores and produce
| a miscompilation.  Note the loop runs exactly once, and in essence we are doing
| 
|   int *p = XXX; /* integer memory */
|   *p = 0;
|   long *q = new (p) long;

This is OK as long as int and long have same alignment, sizeof, etc.
For example, that code is invalid on 64-bit platform where int has
alignment and the storage is not large enough to meet long's
requirement.  This is the case only when the "operator new" is not
overloaded, but is the "standard one".  The exact conditions are
so tricky to explicit that it would just be OK for us to accept it.

-- Gaby

Comment 85 Richard Biener 2007-05-18 14:46:10 UTC

To avoid confusion about what testcase we speak, let's talk about the one
I replicated below.  I believe this one is valid as the storage is a union
instantiated in main() and a pointer to it is passed to foo().  It doesn't
matter the argument pointer type is double*, as we access the memory first
as int* (which is ok and starts lifetime of int typed memory) and then we
EOL that object and start lifetime of a long via placement new (alignment
is ok, as is size).

So, anyone disagrees that the below testcase is valid?


inline void* operator new(unsigned long, void* __p) throw() { return __p; }

void __attribute__((noinline)) bar() {};

long __attribute__((noinline)) foo(double *p, int n)
{
  long *f;
  for (int i=0; i<n; ++i)
  {
    int *l = (int *)p;
    *l = 0;
    f = new (p) long;
    *f = -1;
  }
  bar ();
  return *f;
}

extern "C" void abort(void);
int main()
{
  union {
    int i;
    long l;
  } u;
  if (foo((double *)&u, 1) != -1)
    abort ();
  return 0;
}

Comment 86 Ian Lance Taylor 2007-05-18 17:24:44 UTC

Re comment #80, comment #81, comment #82.  My patch handles the placement new in comment #73 to indicate an alias between double and long.  The mis-ordered code is actually aliasing int and long.  That aliasing is due to a cast to (int*).  There is no placement new in that cast.  There is no reason for the compiler to avoid exchanging the store via int* and the store via long*.  Why should it?  If you omitted the placement new, this would be a standard aliasing violation between int and double.  With the placement new, it's still an aliasing violation between int and long.  The placement new doesn't excuse you from all aliasing violations: it just excuses you from aliasing violations involving the actual types in question.

If you want to make a principled argument that placement new excuses you from all aliasing violations, then again the result is that in a function which uses placement new we should disable TBAA for that function.  That would be OK with me and easy to implement.

Comment 87 Ian Lance Taylor 2007-05-18 17:27:41 UTC

I'm not sure what to make of comment #84.  We don't determine aliasing by alignment or size.  We determine it by type.  We don't currently treat int and long as aliasing each other even if they happen to have the same alignment and size.  I believe this is correct according to the C standard but I am less familiar with the C++ standard.  We could change the aliasing machinery in that way for C++ if it seems to be appropriate, but I would prefer to take that to a different PR.

Comment 88 Ian Lance Taylor 2007-05-18 17:35:43 UTC

Regarding comment #85, this again relies on the notion of dynamic type of a memory location such that the only possible end result is to eliminate TBAA when compiling C++.  The only thing I can say about this sort of test case is that I agree that one can make an argument from the standard that this type of code is valid.  However, as I've said before, e.g., in comment #76, I don't think that anybody really wants that end result.

Any example that inherently relies on a type cast is going to lead you straight to eliminating TBAA.  And note that your example is clearly invalid in C.  Adding the placement new doesn't really have anything to do with whether it is valid or invalid in C++.

Can you show me a test case which my patch gets wrong which does not involve type casting outside of placement new itself?

Comment 89 Mark Mitchell 2007-05-18 17:44:08 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

ian at airs dot com wrote:
> ------- Comment #86 from ian at airs dot com  2007-05-18 17:24 -------
> Re comment #80, comment #81, comment #82.  My patch handles the placement new
> in comment #73 to indicate an alias between double and long.  The mis-ordered
> code is actually aliasing int and long.  That aliasing is due to a cast to
> (int*).  There is no placement new in that cast.  There is no reason for the
> compiler to avoid exchanging the store via int* and the store via long*.  Why
> should it? 

I don't think the fact that "p" is a "double *" is relevant; it could
just as well be "void *".  This kind of code is unambiguously valid:

  void f(double *p) { (int*)p = 3; }
  void g() {
    int i;
    f((double *)&i);
  }

Pedantically, the alignment of "double" has to be no stricter than "int"
for this to be valid, but since we define pointer conversion as a no-op,
it's always valid in GNU C.

This is why I liked your earlier patch that made placement new a memory
barrier.  I think the right handling of placement new is basically to
say "everything we know about types is forgotten here".  One can limit
that to things to which the operand might point, but points-to analysis
(should not) tell you that "p" and "l" cannot point at the same place,
since we have no idea what "p" points at.

I don't think this is equivalent to totally disabling TBAA in C++.

Comment 90 Ian Lance Taylor 2007-05-18 18:38:31 UTC

I agree that this is valid:

void f(double *p) { *(int*)p = 3; }
void g() {
  int i;
  f((double *)&i);
}

But I don't think that is the question at hand.  The variable is always being accessed in the same type, which is also the type of its declaration.  The question at hand is this:

void f(double* p) { *(int*)p = 3; long *l = new (p) long; *l = 4; }
void g() { int i; f((double *)&i); }

And the specific question is whether we are permitted to interchange the assignments to *p and *l.

Let's consider this:

void f(double* p) { *(int*)p = 3; long *l = (long*)p; *l = 4; }

Is that valid?  Is the compiler permitted to interchange the assignments to *p and *l?  Consider that, as in comment #73, p might actually point to a union of int and long.  Does that fact that that union might exist somewhere else make this test case valid?  Presumably it does not.  Presumably this is invalid.

So if that is not valid, and the placement new case is valid, then what is the essential difference between the cases?  The variable is being accessed via two different types.  Why is that OK?

You're right that don't have to abandon TBAA to make this work, that we can make it work by turning placement new into a memory barrier.  But then you have to address comment #42.  That approach will cause a performance regression for natural valid code.  The question then becomes whether we are willing to pay that performance regression for normal code in order to support this sort of weird code.

Can anybody see a way through this maze?

Comment 91 Paolo Carlini 2007-05-18 18:46:27 UTC

(In reply to comment #90)
> Can anybody see a way through this maze?

Humbly, I'd like to suggest again that we have a look to the assembly produced by other compilers. I remember that some GCC contributors have access to compilers such as ICC, XLC++, Sun, etc. and I don't think we should strive at all costs for perfect optimization, if that means keeping busy our best contributors for months and risking producing wrong code. We can always return to the issue later, as an enhancement.

Comment 92 Andrew Pinski 2007-05-18 18:55:31 UTC

> So if that is not valid, and the placement new case is valid, then what is the
> essential difference between the cases?  The variable is being accessed via two
> different types.  Why is that OK?
> void f(double* p) { *(int*)p = 3; long *l = new (p) long; *l = 4; }
> void g() { int i; f((double *)&i); }

Because the memory that p was pointing to, stops being an int once a placement new happens.  For F, it goes:
> void f(double* p)
> {
p points to an variable that is an int
> *(int*)p = 3
access the int via an int.
> long *l = new (p) long;
The memory is no longer an int, it has become a long, it cannot be accessed as an int no longer as that would be undefined.
> *l = 4; 
Access the memory as a long and since the type is a long, this is well defined.
>}

Now after this function returns the variable can only be accessed as a long (well and via a character type).

Hopefully this explains why this is valid.  Now should we do anything about it, I don't know because how often does this happen in real life, I don't know.

Comment 93 Mark Mitchell 2007-05-18 19:01:34 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

ian at airs dot com wrote:

> void f(double* p) { *(int*)p = 3; long *l = new (p) long; *l = 4; }
> void g() { int i; f((double *)&i); }
> 
> And the specific question is whether we are permitted to interchange the
> assignments to *p and *l.

I do not think we are.

> void f(double* p) { *(int*)p = 3; long *l = (long*)p; *l = 4; }
> 
> Is that valid?  Is the compiler permitted to interchange the assignments to *p
> and *l?  Consider that, as in comment #73, p might actually point to a union of
> int and long.  Does that fact that that union might exist somewhere else make
> this test case valid?  Presumably it does not.  Presumably this is invalid.

Agreed; this case is invalid.

> So if that is not valid, and the placement new case is valid, then what is the
> essential difference between the cases?  The variable is being accessed via two
> different types.  Why is that OK?

Because placement new changes the type of storage, in the same way that
using ordinary ("delete") and then using (ordinary) "new" (but getting
back the same memory pointer) does.  The placement "new" operator is
special.

> You're right that don't have to abandon TBAA to make this work, that we can
> make it work by turning placement new into a memory barrier.  But then you have
> to address comment #42.  That approach will cause a performance regression for
> natural valid code.  The question then becomes whether we are willing to pay
> that performance regression for normal code in order to support this sort of
> weird code.

I am willing to accept that performance regression.  I don't consider
that code "normal"; many C++ performance libraries now provide a way to
produce an uninitialized container, precisely to avoid default
construction.  POOMA could use that technique.

It would of course be better (though, in my opinion, not essential) to
have a more gentle barrier.  If we could tell the compiler to forget the
type of anything that the argument to placement-new might point to, but
not to assume that arbitrary weirdness has occurred, then the compiler
could still eliminate the redundant stores.

In other words, in Comment #42, the problem is that the volatile asm
tells the compiler that not only must the stores/loads not be reordered
across the barrier, but that stores before the barrier must actually
occur because their may be some arbitrary action at the barrier that
depends upon the values written.  If we had a barrier that says just
that the operations may not be reordered across the barrier -- but does
not say that the operations before the barrier are side-effecting --
then we could still eliminate them as redundant.

Comment 94 Ian Lance Taylor 2007-05-18 19:03:35 UTC

I tried the original test case with icc, and it gets the right result.  The assignment b->p = 0 is discarded.

Unfortunately I'm not sure what this actually tells us.

Comment 95 Richard Biener 2007-05-18 20:55:06 UTC

But construction/initialization of uninitalized memory in <memory> happens with
placement new!  So we're back to square one.  What this PR initially was about
is a fixed type memory allocator in C++ which needs to change memory from
allocated type T to free-space-managing-structure S at deallocation time and
the other way around at allocation time.  We absolutely _have_ to handle
this case correct.  And we need to optimize the <memory> routines that use
placement new, because they resemble patterns used in libraries like POOMA
or Boost.

Comment 96 Paolo Carlini 2007-05-18 21:12:06 UTC

(In reply to comment #95)
> But construction/initialization of uninitalized memory in <memory> happens with
> placement new!

Placement new is used only for non-POD types, to be clear.

Comment 97 Mark Mitchell 2007-05-18 21:17:37 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

rguenth at gcc dot gnu dot org wrote:

> But construction/initialization of uninitalized memory in <memory> happens with
> placement new!  So we're back to square one.  What this PR initially was about
> is a fixed type memory allocator in C++ which needs to change memory from
> allocated type T to free-space-managing-structure S at deallocation time and
> the other way around at allocation time.  We absolutely _have_ to handle
> this case correct.  And we need to optimize the <memory> routines that use
> placement new, because they resemble patterns used in libraries like POOMA
> or Boost.

First and foremeost, we have to generate correct code.  If that means
the memory barrier solution, for now, then so be it.

Comment 98 Paolo Carlini 2007-05-18 21:27:39 UTC

(In reply to comment #97)
> First and foremeost, we have to generate correct code.  If that means
> the memory barrier solution, for now, then so be it.

Yes, but I'm a little worried myself not by <memory> but by containers like <list>, <map> and <set>, where we call allocator::contruct, which uses placement new, to "install" the value in the allocated node. In that case, unfortunately, per 20.4.1.1/12, we can't optimize it for POD types :( At minimum we should benchmark a bit...

Comment 99 Paolo Carlini 2007-05-18 21:37:59 UTC

To complete my reasoning: in case we end-up with some sort of very bad pessimization of placement new, probably we'll have to adjust such containers to not call allocator::contruct at all when the default allocator is involved && a POD type is contructed, similarly to std::vector and std::deque for the uninitialized functions.

Comment 100 gdr@cs.tamu.edu 2007-05-18 22:04:21 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"ian at airs dot com" <gcc-bugzilla@gcc.gnu.org> writes:

| I'm not sure what to make of comment #84.  We don't determine aliasing by
| alignment or size.  We determine it by type.  We don't currently treat int and
| long as aliasing each other even if they happen to have the same alignment and
| size.

That is GOOD.  :-)

| I believe this is correct according to the C standard but I am less
| familiar with the C++ standard.

It is also correct semantics with respect to C++.

C++ goes further (I don't have C standard handy to check).  The memory
pointed to by p must have the "right" alignment requirements etc.

So if you grab memory into p and it does not have the right alignment,
then the new placement yields undefined behaviour.  That is not a
property we can check statically (until we get the alignment proposal
into C++).

| We could change the aliasing machinery in that
| way for C++ if it seems to be appropriate, but I would prefer to take that to a
| different PR.

What I was trying to say was that C++ provides the same "dynamic type"
(non-)aliasing guarantees, and goes even further.  

Did I manage to confuse you again?

-- Gaby

Comment 101 gdr@cs.tamu.edu 2007-05-18 22:12:40 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"ian at airs dot com" <gcc-bugzilla@gcc.gnu.org> writes:

| But I don't think that is the question at hand.  The variable is always being
| accessed in the same type, which is also the type of its declaration.  The
| question at hand is this:
| 
| void f(double* p) { *(int*)p = 3; long *l = new (p) long; *l = 4; }
| void g() { int i; f((double *)&i); }
| 
| And the specific question is whether we are permitted to interchange the
| assignments to *p and *l.

After the placement new, the lifetime of *p is ended, and the lifetime
of *l starts there.  I don't think that leaves room for exchanging the
stores to *p and *l.

| Let's consider this:
| 
| void f(double* p) { *(int*)p = 3; long *l = (long*)p; *l = 4; }
| 
| Is that valid?

If p is pointing to a union object that contain both int and long,
yes, it is valid.  Otherwisem, it is not.

|  Is the compiler permitted to interchange the assignments to *p
| and *l?

It is valid if we have a union object, otherwise, no.

-- Gaby

Comment 102 gdr@cs.tamu.edu 2007-05-18 22:15:44 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"pcarlini at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes:

| ------- Comment #96 from pcarlini at suse dot de  2007-05-18 21:12 -------
| (In reply to comment #95)
| > But construction/initialization of uninitalized memory in <memory> happens with
| > placement new!
| 
| Placement new is used only for non-POD types, to be clear.

yes, but it is also valid for POD types.

-- Gaby

Comment 103 gdr@cs.tamu.edu 2007-05-18 22:16:59 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"pcarlini at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes:

| ------- Comment #98 from pcarlini at suse dot de  2007-05-18 21:27 -------
| (In reply to comment #97)
| > First and foremeost, we have to generate correct code.  If that means
| > the memory barrier solution, for now, then so be it.
| 
| Yes, but I'm a little worried myself not by <memory> but by containers like
| <list>, <map> and <set>, where we call allocator::contruct, which uses
| placement new, to "install" the value in the allocated node. In that case,
| unfortunately, per 20.4.1.1/12, we can't optimize it for POD types :( At
| minimum we should benchmark a bit...

If the element type is a POD, we should use assignment, not placement new.

-- Gaby

Comment 104 Paolo Carlini 2007-05-18 22:44:02 UTC

(In reply to comment #103)
> If the element type is a POD, we should use assignment, not placement new.

Agreed, in principle. But before adding a load of templates and dispatching, let's wait a bit for the outcome of this issue and benchmark carefully: the performance difference may be minor, given that we are dynamically allocating memory for the new node.

Comment 105 Richard Biener 2007-05-22 10:50:37 UTC

Let me again do a step back and look at the problem from another view ;)

--

C++ aliasing imposes additional restrictions on transformations we
are allowed to do to memory references compared to C type-based
aliasing rules.  
Consider two memory references A and B which we need to know whether 
we can exchange them.

 - If we can prove that both memory references are
   to non-overlapping memory regions we always can exchange them (PTA
   can provide this knowledge).

 - We can always re-order two loads.

TBAA gives us another source of information to disambiguate the
two references.  So, if TBAA says the two references do not conflict then

 - we can hoist loads across stores.

     *double = 1.0;
     x = *int;

   if *int aliases *double the program is invalid.

 - we can sink stores across loads.  (this is just the opposite view
   of the above)

 - we _cannot_ re-order stores.

 - we _cannot_ sink loads across stores.

     x = *int;
     *double = 1.0;

   the store to double may change the dynamic type of what *int
   points to.

 - we _cannot_ hoist stores across loads.  (opposite view of the above)

Note that all the interesting stuff (hoisting loads and sinking stores)
is not affected by the stricter C++ aliasing rules.

The caveat is, that these rules do not map to our representation of
aliasing (VOPs) nor to type-based queries of the alias oracle.

--

So the proposal is to impose these additional restrictions ontop of
our alias representation and fixup passes that do not honour them.
One of them is loop load/store motion which messes up store ordering,
another is scheduling.

If anyone can come up with a clever way to encode the extra restrictions
into our IL be my guest (I wrapped my brain around this for some days now,
and only a may_reorder_accesses (A, B) style oracle can handle this, but
not VOPs in SSA form or something similar - we'd need a DU chain
and another set of VOPs but that looks way too costly)

Comment 106 Mark Mitchell 2007-05-22 16:04:50 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

rguenth at gcc dot gnu dot org wrote:

>  - we _cannot_ sink loads across stores.
> 
>      x = *int;
>      *double = 1.0;
> 
>    the store to double may change the dynamic type of what *int
>    points to.

To be clear, you mean something like this, right:

  int i;
  int *ip = &i;
  double *dp = (double *)&i;
  int x;
  x = *ip;
  *dp = 1.0;

?

I think that considering this code valid, and, therefore, forbidding the
interchange of the last two statements, requires a perverse reading of
the standard.  Placement new allows you to change the dynamic type of
storage; I don't think that just writing through a pointer does.  A key
goal of C++ relative to C was better type-safety.  The placement new
operator provides a facility for explicitly controlling object lifetime,
for programmers that need this.

Before we do anything to support the case above, we should have a
crystal-clear ruling from the committee that says this is valid.
Otherwise, this is exactly the kind of optimization that TBAA was
designed to perform.

For history, the reason I implemented TBAA in GCC was that the SGI
MIPSPro C/C++ compiler did these kinds of optimizations ten years ago,
and I was trying to catch us up when looking at POOMA performance on
IRIX.  G++ has had the freedom to interchange those stores for a long
time, and I believe it should continue to have that choice.

Comment 107 rguenther@suse.de 2007-05-22 16:20:36 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

On Tue, 22 May 2007, mark at codesourcery dot com wrote:

> rguenth at gcc dot gnu dot org wrote:
> 
> >  - we _cannot_ sink loads across stores.
> > 
> >      x = *int;
> >      *double = 1.0;
> > 
> >    the store to double may change the dynamic type of what *int
> >    points to.
> 
> To be clear, you mean something like this, right:
> 
>   int i;
>   int *ip = &i;
>   double *dp = (double *)&i;
>   int x;
>   x = *ip;
>   *dp = 1.0;
> 
> ?
> 
> I think that considering this code valid, and, therefore, forbidding the
> interchange of the last two statements, requires a perverse reading of
> the standard.  Placement new allows you to change the dynamic type of
> storage; I don't think that just writing through a pointer does.  A key
> goal of C++ relative to C was better type-safety.  The placement new
> operator provides a facility for explicitly controlling object lifetime,
> for programmers that need this.

I just wanted to point out that even with a "perverse" reading of the
C++ standard we don't loose the most interesting memory operations
(respectively load hoisting and store sinking).

> Before we do anything to support the case above, we should have a
> crystal-clear ruling from the committee that says this is valid.
> Otherwise, this is exactly the kind of optimization that TBAA was
> designed to perform.

TBAA still performs the useful cases of load hoisting and store sinking.
The "perverse" reading of the standard is not making programs invalid
that are valid as of the strict TBAA interpretation - it just implements
(IMHO) a cheap way of making placement new work.

> For history, the reason I implemented TBAA in GCC was that the SGI
> MIPSPro C/C++ compiler did these kinds of optimizations ten years ago,
> and I was trying to catch us up when looking at POOMA performance on
> IRIX.  G++ has had the freedom to interchange those stores for a long
> time, and I believe it should continue to have that choice.

So, what kind of useful transformations rely on hoisting stores or
sinking loads?

Richard.

Comment 108 gdr@cs.tamu.edu 2007-05-22 16:55:01 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes:

| Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
|  new does not change the dynamic type as it should
| 
| rguenth at gcc dot gnu dot org wrote:
| 
| >  - we _cannot_ sink loads across stores.
| > 
| >      x = *int;
| >      *double = 1.0;
| > 
| >    the store to double may change the dynamic type of what *int
| >    points to.
| 
| To be clear, you mean something like this, right:
| 
|   int i;
|   int *ip = &i;
|   double *dp = (double *)&i;
|   int x;
|   x = *ip;
|   *dp = 1.0;
| 
| ?
| 
| I think that considering this code valid, and, therefore, forbidding the
| interchange of the last two statements, requires a perverse reading of
| the standard.

I'm not sure Richard is suggesting that -- I believe, we all agree
that the above is invalid.  It introduces an assumption that was not
present in Richard's previous message (Richard might want to make
explicit his assumptions). Namely, that we do know the definition of
of the object  

    int i;

therefore we know that we can not possibly change its dynamic type.

Consider the following instead

   // tu-1.C
   void f(int* p) {
      *p = 90;
      // ...
      *(double *) p = 8.3748;
   };

Is the above code invalid, independent of context?   I don't think
you can find a wording in the standard that says it is invalid.
Indeed, consider this:

   // tu-2.C
   void f(int*);
   void g() {
      union {
        int i;
        double d;
      } t;

     t.i = 42;
     f(&t);
     cout << t.d << endl;
   }

I believe we can all agree the definition of g is valid.

-- Gaby

Comment 109 Mark Mitchell 2007-05-22 17:19:33 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

gdr at cs dot tamu dot edu wrote:

> Consider the following instead
> 
>    // tu-1.C
>    void f(int* p) {
>       *p = 90;
>       // ...
>       *(double *) p = 8.3748;
>    };
> 
> Is the above code invalid, independent of context?   I don't think
> you can find a wording in the standard that says it is invalid.

IMO, the standard is just not clear with respect to aliasing.  We cannot
rely upon it, except as a guide.  As I've said throughout this thread,
it doesn't make sense to try to do "close reading" of the standard for
aliasing issues because it just wasn't written with those issues in
mind, just as there are all of the famous memory model issues in C.

In any case, I consider the code above invalid.

> Indeed, consider this:
> 
>    // tu-2.C
>    void f(int*);
>    void g() {
>       union {
>         int i;
>         double d;
>       } t;
> 
>      t.i = 42;
>      f(&t);
>      cout << t.d << endl;
>    }
> 
> I believe we can all agree the definition of g is valid.

No, I do not.  And GCC historically has not; you are only allowed to use
the union for type-punning if the accesses are through the union
directly.  That was the decision we made a long time ago regarding TBAA,
and it even appears in the manual; the -fstrict-aliasing documentation says:

"The practice of reading from a different union member than the one most
recently written to (called ``type-punning'') is common.  Even with
@option{-fstrict-aliasing}, type-punning is allowed, provided the memory
is accessed through the union type.  So, the code above will work as
expected.  However, this code might not:
@smallexample
int f() @{
  a_union t;
  int* ip;
  t.d = 3.0;
  ip = &t.i;
  return *ip;
@}
@end smallexample"

The point here is that the compiler is allowed to decide that t.d does
not alias "*ip" because the latter is not a direct access through the union.

Comment 110 gdr@cs.tamu.edu 2007-05-22 17:25:20 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes:

| > Indeed, consider this:
| > 
| >    // tu-2.C
| >    void f(int*);
| >    void g() {
| >       union {
| >         int i;
| >         double d;
| >       } t;
| > 
| >      t.i = 42;
| >      f(&t);
| >      cout << t.d << endl;
| >    }
| > 
| > I believe we can all agree the definition of g is valid.
| 
| No, I do not.  And GCC historically has not; you are only allowed to use
| the union for type-punning if the accesses are through the union
| directly. 

I am not talking of the GCC's historical behaviour here, but what the
standard actually says.  For the object "t", above the last write was
to the double field, therefore the read is well-defined.

-- Gaby

Comment 111 Mark Mitchell 2007-05-22 17:37:49 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

gdr at cs dot tamu dot edu wrote:

> | No, I do not.  And GCC historically has not; you are only allowed to use
> | the union for type-punning if the accesses are through the union
> | directly. 
> 
> I am not talking of the GCC's historical behaviour here, but what the
> standard actually says.  For the object "t", above the last write was
> to the double field, therefore the read is well-defined.

Suffice it to say that I disagree.  I'm not debating that you can read
the standard that way.  But, I don't think the standard contemplated
these issues in sufficient detail to make it useful in this respect.

Pragmatically, I don't think that we should change GCC, after years of
people using it with the current rules, to make it generate inferior
code -- without clear guidance from the standards committee.  IMO, that
needs to go beyond a reading of the current standard; there needs to be
a clear expression from the committee that, indeed, the compiler cannot
use TBAA in the way that GCC has historically used it.

I'm all for bringing G++ into better conformance with the standard, and
agree that correctness is more important than optimization, but I don't
believe that the standard was written with these considerations in mind,
so I don't think it can be relied upon in this respect.

Comment 112 gdr@cs.tamu.edu 2007-05-22 17:46:32 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes:

|                         But, I don't think the standard contemplated
| these issues in sufficient detail to make it useful in this respect.

The issues has been raised on the -core reflector.

-- Gaby

Comment 113 Mark Mitchell 2007-05-22 17:54:20 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

gdr at cs dot tamu dot edu wrote:
> ------- Comment #112 from gdr at cs dot tamu dot edu  2007-05-22 17:46 -------
> Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the
> dynamic type as it should
> 
> "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes:
> 
> |                         But, I don't think the standard contemplated
> | these issues in sufficient detail to make it useful in this respect.
> 
> The issues has been raised on the -core reflector.

So that I understand, do you mean that they have been raised in the
past, and settled, or that you've just raised them now?

Thanks,

Comment 114 gdr@cs.tamu.edu 2007-05-22 18:01:35 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes:

| Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
|  new does not change the dynamic type as it should
| 
| gdr at cs dot tamu dot edu wrote:
| > ------- Comment #112 from gdr at cs dot tamu dot edu  2007-05-22 17:46 -------
| > Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the
| > dynamic type as it should
| > 
| > "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes:
| > 
| > |                         But, I don't think the standard contemplated
| > | these issues in sufficient detail to make it useful in this respect.
| > 
| > The issues has been raised on the -core reflector.
| 
| So that I understand, do you mean that they have been raised in the
| past, and settled, or that you've just raised them now?

I just raised it, mainly because I do not believe your conclusions are
right. 

The type information you can get from write and read  are not just
those appearing lexically in a scope.  In fact, the semantics is defined
in terms of the run time read and write access.  
That is what makes "C a strongly typed and weakly check language" (DMR).

This whole issue does not mean you have to abandon TBAA.  It means you
have be careful in how you use the information gained from TBAA.  In
particular, many conclusions are OK when you have the definition of
the objects at hand.

-- Gaby

Comment 115 Daniel Berlin 2007-05-22 18:10:21 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

On 22 May 2007 16:54:24 -0000, mark at codesourcery dot com
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #113 from mark at codesourcery dot com  2007-05-22 17:54 -------
> Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
>  new does not change the dynamic type as it should
>
> gdr at cs dot tamu dot edu wrote:
> > ------- Comment #112 from gdr at cs dot tamu dot edu  2007-05-22 17:46 -------
> > Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the
> > dynamic type as it should
> >
> > "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes:
> >
> > |                         But, I don't think the standard contemplated
> > | these issues in sufficient detail to make it useful in this respect.
> >
> > The issues has been raised on the -core reflector.
>
> So that I understand, do you mean that they have been raised in the
> past, and settled, or that you've just raised them now?
>
And if you've raised them now, can you please try to ensure that the
discussion covers what to do if the union'ing happens outside the
current translation unit?

IE What to do about cases where
1. One union member is written
2. We are being passed the address of a union member at the same offset
3. It then it is casted to a pointer to the type of the union member from 1.
4. It is stored/loaded

It is literally impossible to know whether the pointers being passed
into a function are from unions or not, since the union'ing may occur
outside of our translation unit.
This is true unless it is a static function whose callers do not pass
in pointers that are ever acquired externally.

If you take a strict reading of the standard that union'ing makes
everything okay, even when you can't see their may have been a union,
then you simply cannot perform TBAA except for completely local
variables, or when you are dealing with static function arguments that
meet the parameters above, or when you can get a guarantee you have
the whole program.  The ideal resolution here would be that all
accesses through a union must be must be visibly accessing through a
union type.  This is one of the proposed resolutions to the similar
problem in C (there were also proposals that said something about TBAA
changing at  function boundaries, but these are ridiculous given the
fact that it does not specify what happens if you inline, etc, so that
the function boundaries aren't where they were originally).

Comment 116 Daniel Berlin 2007-05-22 18:13:27 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

On 22 May 2007 17:01:40 -0000, gdr at cs dot tamu dot edu
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #114 from gdr at cs dot tamu dot edu  2007-05-22 18:01 -------
> Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the
> dynamic type as it should
>
> "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes:
>
> | Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
> |  new does not change the dynamic type as it should
> |
> | gdr at cs dot tamu dot edu wrote:
> | > ------- Comment #112 from gdr at cs dot tamu dot edu  2007-05-22 17:46
> -------
> | > Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change
> the
> | > dynamic type as it should
> | >
> | > "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes:
> | >
> | > |                         But, I don't think the standard contemplated
> | > | these issues in sufficient detail to make it useful in this respect.
> | >
> | > The issues has been raised on the -core reflector.
> |
> | So that I understand, do you mean that they have been raised in the
> | past, and settled, or that you've just raised them now?
>
> I just raised it, mainly because I do not believe your conclusions are
> right.
>
> The type information you can get from write and read  are not just
> those appearing lexically in a scope.  In fact, the semantics is defined
> in terms of the run time read and write access.
> That is what makes "C a strongly typed and weakly check language" (DMR).
>
> This whole issue does not mean you have to abandon TBAA.  It means you
> have be careful in how you use the information gained from TBAA.  In
> particular, many conclusions are OK when you have the definition of
> the objects at hand.

Uh, you do more or less have to abandon TBAA for pointer arguments
unless you can account for every single caller of your function, and
ensure that none of them are passing you pointers external to what
your compiler can see.  This case is *extremely rare* outside of the
user giving us a whole program guarantee.

TBAA's main use is in fact, in disambiguating pointer arguments.  If
you take that away, it becomes pretty much useless.
It's guarantees about local objects were never interesting.

Comment 117 gdr@cs.tamu.edu 2007-05-22 18:19:33 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"dberlin at dberlin dot org" <gcc-bugzilla@gcc.gnu.org> writes:

[...]

| And if you've raised them now, can you please try to ensure that the
| discussion covers what to do if the union'ing happens outside the
| current translation unit?
| 
| IE What to do about cases where
| 1. One union member is written
| 2. We are being passed the address of a union member at the same offset
| 3. It then it is casted to a pointer to the type of the union member from 1.
| 4. It is stored/loaded


Thanks for reminding all those points.  I'll ensure that I make those
points stand in subsequence messages.


-- Gaby

Comment 118 Mark Mitchell 2007-05-22 18:34:34 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

gdr at cs dot tamu dot edu wrote:

> Thanks for reminding all those points.  I'll ensure that I make those
> points stand in subsequence messages.

I think it's also worth pointing out to the committee that the more
aggressive aliasing rules (in which only access directly through a union
is allowed) have been GNU C/C++ practice for a long time.  I would guess
that we made this change around the year 2000.  So, there's a large body
of code that conforms to the requirements of the aggressive
interpretation.

Comment 119 gdr@cs.tamu.edu 2007-05-22 18:40:49 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes:

| Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
|  new does not change the dynamic type as it should
| 
| gdr at cs dot tamu dot edu wrote:
| 
| > Thanks for reminding all those points.  I'll ensure that I make those
| > points stand in subsequence messages.
| 
| I think it's also worth pointing out to the committee that the more
| aggressive aliasing rules (in which only access directly through a union
| is allowed) have been GNU C/C++ practice for a long time.

Yes, I will.  

My guess of what happened was that C++98, which was based on C90, did
not pick up the new rules for C99; and the 2003 revision focused
mainly on DRs so missed opportunity to incorporate those rules.

The meeting at Toronto is supposedly the last meeting for freasing
features.  I'll make sure this issue gets attention -- I already sent
a message to the reflector; I'll also write a paper along with the
above suggestions and recommendations.

|  I would guess
| that we made this change around the year 2000.  So, there's a large body
| of code that conforms to the requirements of the aggressive
| interpretation.

Yes; those programs will continue to be conformant.

-- Gaby

Comment 120 Mark Mitchell 2007-05-22 18:55:14 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

gdr at cs dot tamu dot edu wrote:

> |  I would guess
> | that we made this change around the year 2000.  So, there's a large body
> | of code that conforms to the requirements of the aggressive
> | interpretation.
> 
> Yes; those programs will continue to be conformant.

Indeed -- but that's not the point I was trying to make.  The point is
that your changes would force G++ to generate inferior code -- for a
codebase that already works with the more aggressive interpretation.

In other words, from a GNU "product marketing" point of view, rather
than from a C++ language standards point of view, the change you want to
make is just going to hurt our users, who have made their code work with
the aggressive interpretation.  It's only going to help people moving to
G++ from other compilers that do not use the aggressive interpretation.
 And, we already have -fno-strict-aliasing for those folks.

I'm not trying to tell you not to make the argument that you think is
best for C++ as a language; obviously, that's your right.  I'm just
pointing out that the change you want to make probably isn't going to
help users who are already working with G++.

And, although I don't have the time/energy that you seem to have to work
on these standards issues, I do plan to oppose your interpretation on
the reflector.

Comment 121 gdr@cs.tamu.edu 2007-05-22 20:12:39 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes:

[...]

| And, although I don't have the time/energy that you seem to have to work
| on these standards issues, I do plan to oppose your interpretation on
| the reflector.

I believe you'll have hard time proving that my interpretation is
INcorrect according to the *current* standard (C++98, C++03), and
under obvious reading.  

Since, I'm not suggesting a change to the current standsard to match
my interpretation -- since it already does -- and the only change I'll
be proposing is to make C++0x match C99 rules, if you're going to oppose my
proposed changes, then you are going to oppose a change that allows
TBAA in the form currently implemented in GCC.  
I'm not sure, that really is what you wanted.

-- Gaby

Comment 122 Mike Stump 2007-05-22 20:41:47 UTC

When the standard was originally written, I do think we may have missed out on some finer points of the C object model, mainly to do with restrictions on what one is not permitted to do stemming from the declared type.  The intent was to very closely match the C object model, but formalize it better and add just enough to do subtypes and dynamic types and so on.   I think it is reasonable to leave it as strict as C is for C style code so that it can be optimized well.  I think it is reasonable to push the tighening language into the standard, as that is what we meant and our wording was meant to be as tight as C.

  The effective type of an object for an access to its stored value is the declared type of the object, if any.

Comment 123 gdr@cs.tamu.edu 2007-05-22 20:53:14 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"mrs at apple dot com" <gcc-bugzilla@gcc.gnu.org> writes:

| I think it is reasonable to push the tighening language into the
| standard, as that is what we meant and our wording was meant to be
| as tight as  C.

As C99, I guess.

I'm making the distinction between what the C++ standard currently
says, and what we would like it to say to benefit from some
optimizations with not too heroic effort.  That is why I would like a
core issue, and a resolution to change the current wording so that access
to union mambers are spelled out clearly.


| The effective type of an object for an access to its stored value is the
| declared type of the object, if any.

for C++, we need also to take into account placement-new (the original
subject reported here).  

-- Gaby

Comment 124 rguenther@suse.de 2007-05-23 09:35:04 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

On Tue, 22 May 2007, dberlin at dberlin dot org wrote:

> ------- Comment #116 from dberlin at gcc dot gnu dot org  2007-05-22 18:13 -------
> Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the
> dynamic type as it should
> 
> On 22 May 2007 17:01:40 -0000, gdr at cs dot tamu dot edu
> <gcc-bugzilla@gcc.gnu.org> wrote:
> >
> >
> > ------- Comment #114 from gdr at cs dot tamu dot edu  2007-05-22 18:01 -------
> > Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the
> > dynamic type as it should
> >
> > "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes:
> >
> > | Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
> > |  new does not change the dynamic type as it should
> > |
> > | gdr at cs dot tamu dot edu wrote:
> > | > ------- Comment #112 from gdr at cs dot tamu dot edu  2007-05-22 17:46
> > -------
> > | > Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change
> > the
> > | > dynamic type as it should
> > | >
> > | > "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes:
> > | >
> > | > |                         But, I don't think the standard contemplated
> > | > | these issues in sufficient detail to make it useful in this respect.
> > | >
> > | > The issues has been raised on the -core reflector.
> > |
> > | So that I understand, do you mean that they have been raised in the
> > | past, and settled, or that you've just raised them now?
> >
> > I just raised it, mainly because I do not believe your conclusions are
> > right.
> >
> > The type information you can get from write and read  are not just
> > those appearing lexically in a scope.  In fact, the semantics is defined
> > in terms of the run time read and write access.
> > That is what makes "C a strongly typed and weakly check language" (DMR).
> >
> > This whole issue does not mean you have to abandon TBAA.  It means you
> > have be careful in how you use the information gained from TBAA.  In
> > particular, many conclusions are OK when you have the definition of
> > the objects at hand.
> 
> Uh, you do more or less have to abandon TBAA for pointer arguments
> unless you can account for every single caller of your function, and
> ensure that none of them are passing you pointers external to what
> your compiler can see.  This case is *extremely rare* outside of the
> user giving us a whole program guarantee.
> 
> TBAA's main use is in fact, in disambiguating pointer arguments.  If
> you take that away, it becomes pretty much useless.
> It's guarantees about local objects were never interesting.

But you can still perform hoisting loads of incoming pointer arguments
and sinking stores to incoming pointer arguments.  Please read comment 
#105 and come up with a testcase where we wouldn't be allowed to do
a useful transformation we do now.  So I believe making placement new
work with our current scheme will severely pessimize placement new
users, but if we slightly change rules for everyone we'll be all happy.

Richard.

Comment 125 gdr@cs.tamu.edu 2007-05-23 14:22:29 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes:

[...]

| But you can still perform hoisting loads of incoming pointer arguments
| and sinking stores to incoming pointer arguments.  Please read comment 
| #105 and come up with a testcase where we wouldn't be allowed to do
| a useful transformation we do now.  So I believe making placement new
| work with our current scheme will severely pessimize placement new
| users, but if we slightly change rules for everyone we'll be all happy.

Update.

The only comment I have so far on the -core reflector is to the effect
that the reading that the program I posted earlier violates NO
aliasing rule.  I'll follow with the proposal to bring the rules in
line with recent C99 rules.

-- Gaby

Comment 126 rguenther@suse.de 2007-05-23 14:43:04 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

On Wed, 23 May 2007, gdr at cs dot tamu dot edu wrote:

> ------- Comment #125 from gdr at cs dot tamu dot edu  2007-05-23 14:22 -------
> Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the
> dynamic type as it should
> 
> "rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes:
> 
> [...]
> 
> | But you can still perform hoisting loads of incoming pointer arguments
> | and sinking stores to incoming pointer arguments.  Please read comment 
> | #105 and come up with a testcase where we wouldn't be allowed to do
> | a useful transformation we do now.  So I believe making placement new
> | work with our current scheme will severely pessimize placement new
> | users, but if we slightly change rules for everyone we'll be all happy.
> 
> Update.
> 
> The only comment I have so far on the -core reflector is to the effect
> that the reading that the program I posted earlier violates NO
> aliasing rule.  I'll follow with the proposal to bring the rules in
> line with recent C99 rules.

Note that it is important to retain the capability to implement
memory allocators in C++ that are allowed to re-use memory for different
typed objects.  Which is not possible with C99 rules.

Richard.

Comment 127 Ian Lance Taylor 2007-05-23 15:23:39 UTC

Re comment #105.

The case where TBAA is most useful is on a deeply pipelined in-order processor with multiple function units and a high latency memory cache.  One example I've worked on is an embedded VLIW processor with vector instructions.  TBAA is of relatively little interest on an out-of-order processor.

It's interesting to talk about PTA when you see the actual objects and you see how the pointers are taken.  But in practice many real functions simply take pointers to arrays in memory.  PTA can say nothing useful about those pointers, since they could point to anything.  TBAA can say a lot.

Losing the ability to sinks loads across stores and vice-versa means putting additional constraints on the scheduler which makes it harder to exploit the multiple function units.  Conversely, it constrains the register allocator by tending to extend register lifetimes.

Also, in your list of cases, you missed
    x = *int;
    *double = 1.0;
    x = *int;
With TBAA we can eliminate the second load as a duplicate.  With your proposed aliasing change we can not.

I don't understand why you argue in comment #124 that our current scheme will esverely penalize placement new.  On mainline there is no penalty for placement new.  So you must be referring to one of the patches.  But I don't think we've agreed on any of them at the moment.  And I think we see the outlines of a successful patch: make placement new return a pointer which effectively aliases everything.  That will permit us to reorder loads and eliminate dead stores.  It won't permit us to arbitrarily re-order loads and stores, but I'm skeptical that that will count as a severe penalty.

Comment 128 rguenther@suse.de 2007-05-23 15:37:57 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

On Wed, 23 May 2007, ian at airs dot com wrote:

> ------- Comment #127 from ian at airs dot com  2007-05-23 15:23 -------
> Re comment #105.
> 
> The case where TBAA is most useful is on a deeply pipelined in-order processor
> with multiple function units and a high latency memory cache.  One example I've
> worked on is an embedded VLIW processor with vector instructions.  TBAA is of
> relatively little interest on an out-of-order processor.
> 
> It's interesting to talk about PTA when you see the actual objects and you see
> how the pointers are taken.  But in practice many real functions simply take
> pointers to arrays in memory.  PTA can say nothing useful about those pointers,
> since they could point to anything.  TBAA can say a lot.
> 
> Losing the ability to sinks loads across stores and vice-versa means putting
> additional constraints on the scheduler which makes it harder to exploit the
> multiple function units.  Conversely, it constrains the register allocator by
> tending to extend register lifetimes.

Sure, scheduling is one of the places we sink loads or hoist stores.

> 
> Also, in your list of cases, you missed
>     x = *int;
>     *double = 1.0;
>     x = *int;
> With TBAA we can eliminate the second load as a duplicate.  With your proposed
> aliasing change we can not.

We still can.  We can hoist the second load before the store and so make
both loads load the same value.  The fact that there is a second load
of int after the store to double disambiguates the two memory locations.

> I don't understand why you argue in comment #124 that our current scheme will
> esverely penalize placement new.  On mainline there is no penalty for placement
> new.  So you must be referring to one of the patches.

Yes, I was refering to patches that make changes to forbid sinking a load
across a store.  With our current IL this will also forbid hoisting a
loop-invariant load - which is the key to good performance on tramp3d.

> But I don't think we've
> agreed on any of them at the moment.  And I think we see the outlines of a
> successful patch: make placement new return a pointer which effectively aliases
> everything.  That will permit us to reorder loads and eliminate dead stores. 
> It won't permit us to arbitrarily re-order loads and stores, but I'm skeptical
> that that will count as a severe penalty.

But...

void foo(int *p)
{
  float *f = (float *)p;
  new (p) float;
  *f = 1.0;
}

there is no "new pointer" from placement new.  All placement new does is
(magically) change the dynamic pointed-to type.

Or would you argue the above is invalid?  It is hard to make
non-trivial cases work and not impose a full memory barrier at the
point of the placement new.  In the above case you would need to
make the alias sets of float conflict with everything in which case
only trivial cases will allow to DSE stores to float or hoist loads
of floats.

As I said - with our current IL design I see no nice way to express
the constraints placement new sets without imposing more constraints
than necessary.  (This is also true for my proposed changes -- those
constraints would be implicit in the IL, just optimization passes
would not need to look for PLACEMENT_NEW_EXPRs but simply follow
rules if they do optimizations on memory operations)

Maybe to find a compromise how about making PLACEMENT_NEW_EXPR
effective only after RTL expansion?  So, during tree optimization
completely ignore it (from the alias IL representation perspective)
and follow the changed rules I proposed and after RTL expansion
make placement new effects explicit, like by merging target type
alias sets with alias set zero.

Richard.

Comment 129 gdr@cs.tamu.edu 2007-05-23 15:59:44 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes:

| Note that it is important to retain the capability to implement
| memory allocators in C++ that are allowed to re-use memory for different
| typed objects.

Yes, the C++ committee will never allow a rule that bans
platcement-new and memory allocators :-)

-- Gaby

Comment 130 Ian Lance Taylor 2007-05-23 16:43:19 UTC

In this example

void foo(int *p)
{
  float *f = (float *)p;
  new (p) float;
  *f = 1.0;
}

the pointer is p.  In fact the relevant pointer is always the argument to placement new, and every pointer which PTA can associate with it.

We may simply have an impasse here.  You have a set of rules which will change the compiler to support placement new while giving better results for your code.  I believe that your change will penalize the code I used to work with.

Comment 131 rguenther@suse.de 2007-05-23 16:54:43 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

On Wed, 23 May 2007, ian at airs dot com wrote:

> ------- Comment #130 from ian at airs dot com  2007-05-23 16:43 -------
> In this example
> 
> void foo(int *p)
> {
>   float *f = (float *)p;
>   new (p) float;
>   *f = 1.0;
> }
> 
> the pointer is p.  In fact the relevant pointer is always the argument to
> placement new, and every pointer which PTA can associate with it.

I don't read that into the semantics of placement new ;)  Placement
new doesn't care about the pointer used to refer to the memory it
operates on.

> We may simply have an impasse here.  You have a set of rules which will change
> the compiler to support placement new while giving better results for your
> code.  I believe that your change will penalize the code I used to work with.

Ok, fair enough.  I'll try to teach load-store-motion for loops to not
re-order the inserted stores compared to the order on the loop exit path.
This is the only transformation on the tree-level I came across that 
violates my proposed semantics.

Richard.

Comment 132 Daniel Berlin 2007-05-23 19:03:12 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

On 23 May 2007 08:35:12 -0000, rguenther at suse dot de
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #124 from rguenther at suse dot de  2007-05-23 09:35 -------
> Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
>  new does not change the dynamic type as it should
>
> On Tue, 22 May 2007, dberlin at dberlin dot org wrote:
>
> > ------- Comment #116 from dberlin at gcc dot gnu dot org  2007-05-22 18:13 -------
> > Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the
> > dynamic type as it should
> >
> > On 22 May 2007 17:01:40 -0000, gdr at cs dot tamu dot edu
> > <gcc-bugzilla@gcc.gnu.org> wrote:
> > >
> > >
> > > ------- Comment #114 from gdr at cs dot tamu dot edu  2007-05-22 18:01 -------
> > > Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the
> > > dynamic type as it should
> > >
> > > "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes:
> > >
> > > | Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
> > > |  new does not change the dynamic type as it should
> > > |
> > > | gdr at cs dot tamu dot edu wrote:
> > > | > ------- Comment #112 from gdr at cs dot tamu dot edu  2007-05-22 17:46
> > > -------
> > > | > Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change
> > > the
> > > | > dynamic type as it should
> > > | >
> > > | > "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes:
> > > | >
> > > | > |                         But, I don't think the standard contemplated
> > > | > | these issues in sufficient detail to make it useful in this respect.
> > > | >
> > > | > The issues has been raised on the -core reflector.
> > > |
> > > | So that I understand, do you mean that they have been raised in the
> > > | past, and settled, or that you've just raised them now?
> > >
> > > I just raised it, mainly because I do not believe your conclusions are
> > > right.
> > >
> > > The type information you can get from write and read  are not just
> > > those appearing lexically in a scope.  In fact, the semantics is defined
> > > in terms of the run time read and write access.
> > > That is what makes "C a strongly typed and weakly check language" (DMR).
> > >
> > > This whole issue does not mean you have to abandon TBAA.  It means you
> > > have be careful in how you use the information gained from TBAA.  In
> > > particular, many conclusions are OK when you have the definition of
> > > the objects at hand.
> >
> > Uh, you do more or less have to abandon TBAA for pointer arguments
> > unless you can account for every single caller of your function, and
> > ensure that none of them are passing you pointers external to what
> > your compiler can see.  This case is *extremely rare* outside of the
> > user giving us a whole program guarantee.
> >
> > TBAA's main use is in fact, in disambiguating pointer arguments.  If
> > you take that away, it becomes pretty much useless.
> > It's guarantees about local objects were never interesting.
>
> But you can still perform hoisting loads of incoming pointer arguments
> and sinking stores to incoming pointer arguments.  Please read comment
> #105 and come up with a testcase where we wouldn't be allowed to do
> a useful transformation we do now.  So I believe making placement new
> work with our current scheme will severely pessimize placement new
> users, but if we slightly change rules for everyone we'll be all happy.
>
Have fun encoding this into the IL :)
If TBAA can't say things about particular load/store pairs, without
having to do a lot of work to see what is in between them, it's not
going to be useful to us.

Comment 133 Mark Mitchell 2007-05-23 19:43:48 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

ian at airs dot com wrote:

> The case where TBAA is most useful is on a deeply pipelined in-order processor
> with multiple function units and a high latency memory cache.  One example I've
> worked on is an embedded VLIW processor with vector instructions.  TBAA is of
> relatively little interest on an out-of-order processor.

The original motivating case for me was stuff like:

  void f (int *a, double *d) {
    for (int i = 1; i < N; ++i) {
      a[i] += i;
      d[i] = d[i-1] * a[i];
    }
  }

That's not the right code, but the point is that TBAA can allow us to
avoid reloading d[i-1] from one iteration to the next, despite the store
to a[i].  That reduces memory access and instruction count.  Ordinary
PTA does not allow this.

Of course, Gaby's memory model doesn't allow this optimization either;
we have to worry that "a" and "d" are both in some union somewhere.
That's why Gaby's model is so bad from an optimization point of view; it
makes the compiler assume a worst-case situation, even though that
worst-case situation almost never actually happens.

I'm not an expert on CPU models, so I'm not sure how out-of-order vs.
in-order might matter here.

> And I think we see the outlines of a
> successful patch: make placement new return a pointer which effectively aliases
> everything.  That will permit us to reorder loads and eliminate dead stores. 
> It won't permit us to arbitrarily re-order loads and stores, but I'm skeptical
> that that will count as a severe penalty.

That's exactly what I think.

Comment 134 Richard Biener 2007-05-23 19:54:24 UTC

But using a union for type-punning is a gcc extension (and of course the extension
is only for access through the union), so with strict C99/C++ semantics we can avoid reloading d[i-1] even if a and d were in the same union because the code would then be invalid.  So the union case is a non-issue here (it was only used to
make available enough properly aligned storage for the particular testcase).

Comment 135 Richard Biener 2007-05-23 19:56:54 UTC

Re comment #132 -- you cannot encode this into the IL :/  And I don't propose to
do so.  And I say any working and optimal (from optimization perspective) variant
of a fix for this PR has the same problem.

Comment 136 Mark Mitchell 2007-05-23 20:10:14 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

rguenth at gcc dot gnu dot org wrote:
> ------- Comment #134 from rguenth at gcc dot gnu dot org  2007-05-23 19:54 -------
> But using a union for type-punning is a gcc extension (and of course the
> extension
> is only for access through the union), so with strict C99/C++ semantics we can
> avoid reloading d[i-1] even if a and d were in the same union because the code
> would then be invalid.  

Gaby's claim, as I understand it, is that writing to a union member,
even through a pointer, rather than directly through the union, is
valid, and activates that part of the union.  So, it is not a GCC
extension.  For code like:

  a[i] = i;
  d[i] = d[i-1] + a[i];

I guess you can argue that a[i] does not alias d[i-1], even in Gaby's
model, because a[i] is written to right before the access to d[i-1].
But, you don't know that a[m] doesn't alias d[n] for arbitrary m and n.
 So, it's easy to create variations on the case I posted that can't be
optimized, if you agree to Gaby's model.

> So the union case is a non-issue here (it was only used to
> make available enough properly aligned storage for the particular testcase).

I agree that union case *should* be a non-issue in this context, where
we were discussing how to fix placement new, but Gaby has made it one
because he is claiming that placement new is not the only way to change
the dynamic type of memory.  Gaby's claim is that given an arbitrary
pointer "p", saying:

 (int *)p = 3;

is the same as saying:

 *(new (p) int) = 3;

That makes life for the optimizers much, much harder.

Comment 137 Richard Biener 2007-05-23 20:46:22 UTC

Created attachment 13607 [details]
patch to perserve store ordering in loop load/store motion

<quote>
Gaby's claim is that given an arbitrary
pointer "p", saying:

 (int *)p = 3;

is the same as saying:

 *(new (p) int) = 3;

That makes life for the optimizers much, much harder.
</quote>

I say so as well (that those are the same), but I don't agree that this
makes life for optimizers much harder.

Anyway, this is a proposed patch for loop load/store motion that fixes the
last posted testcase.  Scheduling issues remain.

Comment 138 Richard Biener 2007-05-23 20:57:27 UTC

Note the patch is not 100% correct as we need to preserve store ordering on the
path to all exit edges which may be different for each exit.

Comment 139 jsm-csl@polyomino.org.uk 2007-05-23 21:00:38 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

On Wed, 23 May 2007, mark at codesourcery dot com wrote:

> Of course, Gaby's memory model doesn't allow this optimization either;
> we have to worry that "a" and "d" are both in some union somewhere.

DR#236 <http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_236.htm> was 
what eventually said for C that you don't need to worry about that; I'd 
think the aim should be to get C++ to agree with that ruling.

DR#283, to appear in TC3, is what specifies type punning through unions.

Comment 140 Mark Mitchell 2007-05-23 21:07:09 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

rguenth at gcc dot gnu dot org wrote:

> <quote>
> Gaby's claim is that given an arbitrary
> pointer "p", saying:
> 
>  (int *)p = 3;
> 
> is the same as saying:
> 
>  *(new (p) int) = 3;
> 
> That makes life for the optimizers much, much harder.
> </quote>
> 
> I say so as well (that those are the same), but I don't agree that this
> makes life for optimizers much harder.

Placement new is rare; assignments to pointers are everywhere.

Note that the first case does not need to use an explicit cast.  In a
function:

  void f(int *p) {
    *p = 3;
  }

under Gaby's interpretation, we cannot be sure that "p" points to an
"int" before this function, so we can't be sure the write to "*p"
doesn't clobber memory of other types.  TBAA is one of the few ways to
disambiguate pointers in the absence of whole-program optimization, and
this model really undermines TBAA.

Frankly, I'm surprised that you are taking this position.  This is a
change to the compiler that can only hurt high-performance C++
applications, which is an area I know you care about.  I know that
you're unhappy about how Ian's patches might hurt programs that use
placement-new in an inner loop, but this model will impose the same
penalties on programs that write to pointers in an inner loop.

Comment 141 Mark Mitchell 2007-05-23 21:13:02 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

joseph at codesourcery dot com wrote:

> DR#236 <http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_236.htm> was 
> what eventually said for C that you don't need to worry about that; I'd 
> think the aim should be to get C++ to agree with that ruling.

Thank you for the pointer.  That seems directly on point, and makes C99
match the existing GCC practice: we don't need to worry that pointers
might point to unions.

Gaby, would you please forward that to the C++ reflector, so that the
reflector has that information as well?  They should be aware that the
model you're proposing is at odds with C99.

Thanks,

Comment 142 rguenther@suse.de 2007-05-23 21:14:56 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

On Wed, 23 May 2007, mark at codesourcery dot com wrote:

> ------- Comment #140 from mark at codesourcery dot com  2007-05-23 21:07 -------
> Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
>  new does not change the dynamic type as it should
> 
> rguenth at gcc dot gnu dot org wrote:
> 
> > <quote>
> > Gaby's claim is that given an arbitrary
> > pointer "p", saying:
> > 
> >  (int *)p = 3;
> > 
> > is the same as saying:
> > 
> >  *(new (p) int) = 3;
> > 
> > That makes life for the optimizers much, much harder.
> > </quote>
> > 
> > I say so as well (that those are the same), but I don't agree that this
> > makes life for optimizers much harder.
> 
> Placement new is rare; assignments to pointers are everywhere.
> 
> Note that the first case does not need to use an explicit cast.  In a
> function:
> 
>   void f(int *p) {
>     *p = 3;
>   }
> 
> under Gaby's interpretation, we cannot be sure that "p" points to an
> "int" before this function, so we can't be sure the write to "*p"
> doesn't clobber memory of other types.  TBAA is one of the few ways to
> disambiguate pointers in the absence of whole-program optimization, and
> this model really undermines TBAA.

In C inside the function f we can indeed be sure *p points to an "int".
Now what matters is, that even in C for

double g(int *p, double *d)
{
  f(p);
  return *d;
}

we cannot be sure (in absence of whole-program optimization or the
body of f available) that the call to f does not clobber *d through
the value of the pointer p.  As it can look like

void f(int *p)
{
  double *d = p;
  *d = 1.0;
}

> Frankly, I'm surprised that you are taking this position.  This is a
> change to the compiler that can only hurt high-performance C++
> applications, which is an area I know you care about.  I know that
> you're unhappy about how Ian's patches might hurt programs that use
> placement-new in an inner loop, but this model will impose the same
> penalties on programs that write to pointers in an inner loop.

No it won't.  Or at least - I belive so - unless I see a patch that
manages to implement placement new with the same minor restrictions.

If you discount scheduling on in-order machines, what would be an
optimization that can be no longer done with Gabys and my scheme?
I believe there are none.  Also other compilers manage to not
miscompile in the face of placement new but still optimize loops
with them.

Richard.

Comment 143 Mark Mitchell 2007-05-23 21:27:05 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

rguenther at suse dot de wrote:
>
>>   void f(int *p) {
>>     *p = 3;
>>   }
>>
>> under Gaby's interpretation, we cannot be sure that "p" points to an
>> "int" before this function, so we can't be sure the write to "*p"
>> doesn't clobber memory of other types.  TBAA is one of the few ways to
>> disambiguate pointers in the absence of whole-program optimization, and
>> this model really undermines TBAA.
> 
> In C inside the function f we can indeed be sure *p points to an "int".

Not before the assignment to "p".  In:

  void f(int *p, double *q) {
    double d = *q;
    *p = 3;
    return d;
  }

your interpretation does not allow moving the load of "*q" after the
store to "*p".  That's clearly limiting the freedom of the optimizer.

Now, we can argue about how much that matters -- but it's inarguable
that it's a limitation.

> If you discount scheduling on in-order machines, what would be an
> optimization that can be no longer done with Gabys and my scheme?
> I believe there are none.  Also other compilers manage to not
> miscompile in the face of placement new but still optimize loops
> with them.

I'm lost.

What does Gaby's model have to do with placement new?

We're all agreed that (a) placement new can change the dynamic type of
memory, (b) therefore GCC currently has a bug, (c) we want the fix to
have as little optimization impact as possible.

Gaby's model says that we know less about dynamic types than we
presently think we do, because there might be a union out there
somewhere.  (Fortunately, as Joseph points out, C99 has already answered
this question.  Surely we can agree that making C99 and C++ different in
this respect is a bad idea.)

If "*p = 3" changes the dynamic type of "*p", that just means we know
even less.  The less we know, the less optimization we can do.  Making
"*p = 3" change the dynamic type of "*p" can't possibly help us
implement placement new more efficiently.  Whatever conservative
assumptions we want to make about "*p = 3" we could make about "new (p)
int" instead.

If you have a patch that fixes the placement new problem, making us
generate correct code, and with minimal/no impact on optimization,
that's great!  But, that can't possibly, in and of itself, be a reason
to change the rules we're using for TBAA.

Comment 144 rguenther@suse.de 2007-05-23 21:48:03 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

On Wed, 23 May 2007, mark at codesourcery dot com wrote:

> rguenther at suse dot de wrote:
> >
> >>   void f(int *p) {
> >>     *p = 3;
> >>   }
> >>
> >> under Gaby's interpretation, we cannot be sure that "p" points to an
> >> "int" before this function, so we can't be sure the write to "*p"
> >> doesn't clobber memory of other types.  TBAA is one of the few ways to
> >> disambiguate pointers in the absence of whole-program optimization, and
> >> this model really undermines TBAA.
> > 
> > In C inside the function f we can indeed be sure *p points to an "int".
> 
> Not before the assignment to "p".  In:
> 
>   void f(int *p, double *q) {
>     double d = *q;
>     *p = 3;
>     return d;
>   }
> 
> your interpretation does not allow moving the load of "*q" after the
> store to "*p".  That's clearly limiting the freedom of the optimizer.

Yes, it's limiting the freedom of optimizers.

> Now, we can argue about how much that matters -- but it's inarguable
> that it's a limitation.
> 
> > If you discount scheduling on in-order machines, what would be an
> > optimization that can be no longer done with Gabys and my scheme?
> > I believe there are none.  Also other compilers manage to not
> > miscompile in the face of placement new but still optimize loops
> > with them.
> 
> I'm lost.
> 
> What does Gaby's model have to do with placement new?

Only so much that we seem to agree on the semantics of placement new.
Gaby extends this semantics to any store, so

  *p = X;

is equivalent to

  *(new (p) __typeof__ *p) = X;

to which semantics we thus can agree (not to whether those two should
be the same, mandated by the standard or liked by some of us or not).

> Gaby's model says that we know less about dynamic types than we
> presently think we do, because there might be a union out there
> somewhere.  (Fortunately, as Joseph points out, C99 has already answered
> this question.  Surely we can agree that making C99 and C++ different in
> this respect is a bad idea.)

I don't think dragging in unions helps us here ;)  Maybe Gaby can clarify
if and how unions relate here, but I didn't percieve any previous comment
as making implicit unions relevant here apart from what GCC and
apperantly C99 agree to.

> If "*p = 3" changes the dynamic type of "*p", that just means we know
> even less.  The less we know, the less optimization we can do.

True.

> Making "*p = 3" change the dynamic type of "*p" can't possibly help us
> implement placement new more efficiently.

I disagree here.  Making "*p = 3" change the dynamic type of "*p" will
make the placement new issue moot - the current library implementation
is fine then and we don't need any new explicit or implicit side-effects
of it.

> Whatever conservative
> assumptions we want to make about "*p = 3" we could make about "new (p)
> int" instead.

True.  I say making them about "*p = 3" is way easier as we are changing
semantics of memory operations and *p = 3 is one, but placement new is 
not.

> If you have a patch that fixes the placement new problem, making us
> generate correct code, and with minimal/no impact on optimization,
> that's great!  But, that can't possibly, in and of itself, be a reason
> to change the rules we're using for TBAA.

Well, it depends if you read it as changing TBAA rules.  Does preserving
store ordering in loop load/store motion change TBAA rules?  Not in
itself - but clearly changed TBAA rules would force us to.  Same for
limiting scheduling of memory operations.  Btw, the necessary patch
may be as simple as

Index: alias.c
===================================================================
--- alias.c     (revision 124998)
+++ alias.c     (working copy)
@@ -2284,7 +2284,8 @@ write_dependence_p (rtx mem, rtx x, int 
       || MEM_ALIAS_SET (mem) == ALIAS_SET_MEMORY_BARRIER)
     return 1;

-  if (DIFFERENT_ALIAS_SETS_P (x, mem))
+  /* We cannot rely on alias set differences for C++ aliasing semantics.  
*/
+  if (0 && DIFFERENT_ALIAS_SETS_P (x, mem))
     return 0;

   /* A read from read-only memory can't conflict with read-write memory.  */

but as I'm currently lacking a testcase that fails due to scheduling
I'm not 100% sure.

Richard.

Comment 145 Daniel Berlin 2007-05-23 22:02:06 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

On 23 May 2007 18:57:03 -0000, rguenth at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #135 from rguenth at gcc dot gnu dot org  2007-05-23 19:56 -------
> Re comment #132 -- you cannot encode this into the IL :/  And I don't propose
> to
> do so.  And I say any working and optimal (from optimization perspective)
> variant
> of a fix for this PR has the same problem.
So instead you are going to change every single optimization to
preserve things like store ordering, even though the underlying
aliasing information says it's okay to reorder (because it's broken in
the presence of these restrictions)?
No thanks!
This would be a massive rathole to go down in the end, and completely
at odds with the idea that our middle end IL is language independent,
and like *C*.
C99 has a clear proposed resolution to DR 236, and it says you need to
visibly use a union (which takes care of the other issue mentioned
here)
If placement new needs to mean a memory barrier to get things right
for placement new/C++, it needs to mean a memory barrier to get things
right for now.
Changing language independent optimizations to preserve something
magical in the face of wrong aliasing information is not the way to
fix this PR.

I would support turning off TBAA over that solution, given the choice.

Comment 146 Mark Mitchell 2007-05-23 22:13:26 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

rguenther at suse dot de wrote:

> Only so much that we seem to agree on the semantics of placement new.
> Gaby extends this semantics to any store, so
> 
>   *p = X;
> 
> is equivalent to
> 
>   *(new (p) __typeof__ *p) = X;
> 
> to which semantics we thus can agree (not to whether those two should
> be the same, mandated by the standard or liked by some of us or not).

I think I understand.  Let me just restate this, to make sure:

(a) Gaby's model makes the first assignment above equivalent to the second
(b) Thus, in Gaby's model, if we solve either case, we solve both.

I agree with that statement.  (I don't like the model -- but I agree
with the logic.)

>> Making "*p = 3" change the dynamic type of "*p" can't possibly help us
>> implement placement new more efficiently.
> 
> I disagree here.  Making "*p = 3" change the dynamic type of "*p" will
> make the placement new issue moot - the current library implementation
> is fine then and we don't need any new explicit or implicit side-effects
> of it.
> 
>> Whatever conservative
>> assumptions we want to make about "*p = 3" we could make about "new (p)
>> int" instead.
> 
> True.  I say making them about "*p = 3" is way easier as we are changing
> semantics of memory operations and *p = 3 is one, but placement new is 
> not.

I think I understand what you're saying here too; again, I'll restate to
make sure:

(a) In the model where "*p = 3" changes the dynamic type of memory, we
don't need to do anything special to handle placement new.
(b) It's relatively easy to implement support for "*p = 3" changing the
dynamic type of memory.
(c) Therefore, it's relatively easy to fix our placement new problem.

I agree with those statements too.

However, I don't like this approach because I believe it will result in
inferior code.  I think that you're looking at the proposed placement
new patches, then looking at what they do to a particular codebase,
which happens to use placement-new in an inner loop, and becoming
unhappy with the patches.  I suspect that the changes required to
support the "*p = 3" model, while perhaps better for that case, will be
worse for many others.

I can't prove that.  But, I did implement TBAA after looking at what
other compilers did, specifically to improve performance of (ironically)
POOMA.  So, I'm afraid that you're going to find that if we allow memory
writes to change the type of memory, that we will get worse performance.

That's why I'm much more comfortable with a change that only affects
placement new.  At least, if placement new is slow, we can tell users
not to use it in inner loops.  If using pointers are slow, there's
nothing we can do.

Comment 147 gdr@cs.tamu.edu 2007-05-23 23:42:11 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes:

| Of course, Gaby's memory model doesn't allow this optimization either;
| we have to worry that "a" and "d" are both in some union somewhere.
| That's why Gaby's model is so bad from an optimization point of view; it

A meta comment:  I would highly appreciate if you stopped calling it
"Gaby's model". 

It is the *current C++ standard object model*.  If you don't like it; 
that is fine.  But, you don't have to like it.  (I don't).  But,
please don't call it "Gaby's model".  The fact that you don't like it
does not suddenly make it not the standard model.

-- Gaby

Comment 148 gdr@cs.tamu.edu 2007-05-23 23:50:36 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes:

| Gaby's claim, as I understand it, is that writing to a union member,
| even through a pointer, rather than directly through the union, is
| valid, and activates that part of the union.

Yes; that is what the current C++ standard says.  

C99 tried to fix things with the notion of effective type, but that
notion does not fully address the issue and it is not part of C++98
and C++03.  What I'm doing is making sure that we all know where
(potential existing codes) we are starting from and make sure that we
do understand the implications of the changes we would like to make to
make optimizers easier.  I spent some of my time this afternoon
going through this with the original inventor of C++ and trying to see
whether we have choice to make things less hard.  

It is clear to us that, *if* we have a choice -- note *if* we have a
choice -- when the aliasing through union should be made appearant;
but for the moment, that is hard to formulate correctly  and simply
(with the constraint of not breaking the existing object model
badly). 

-- Gaby

Comment 149 gdr@cs.tamu.edu 2007-05-23 23:56:27 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes:

| ------- Comment #140 from mark at codesourcery dot com  2007-05-23 21:07 -------
| Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
|  new does not change the dynamic type as it should
| 
| rguenth at gcc dot gnu dot org wrote:
| 
| > <quote>
| > Gaby's claim is that given an arbitrary
| > pointer "p", saying:
| > 
| >  (int *)p = 3;
| > 
| > is the same as saying:
| > 
| >  *(new (p) int) = 3;
| > 
| > That makes life for the optimizers much, much harder.
| > </quote>
| > 
| > I say so as well (that those are the same), but I don't agree that this
| > makes life for optimizers much harder.
| 
| Placement new is rare; assignments to pointers are everywhere.

Naked placement new may be rare; but, placement new is general are not
rare because of the STL-style containers and algorithms.

| 
| Note that the first case does not need to use an explicit cast.  In a
| function:
| 
|   void f(int *p) {
|     *p = 3;
|   }
| 
| under Gaby's interpretation, we cannot be sure that "p" points to an
| "int" before this function, so we can't be sure the write to "*p"
| doesn't clobber memory of other types.

Note that is is a problem only with PODs -- because only those can
appear in unions.  That does not help much, but it is a distinction
you have to make when you're considering what the standard says.

-- Gaby

Comment 150 gdr@cs.tamu.edu 2007-05-23 23:57:56 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes:

| Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
|  new does not change the dynamic type as it should
| 
| joseph at codesourcery dot com wrote:
| 
| > DR#236 <http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_236.htm> was 
| > what eventually said for C that you don't need to worry about that; I'd 
| > think the aim should be to get C++ to agree with that ruling.
| 
| Thank you for the pointer.  That seems directly on point, and makes C99
| match the existing GCC practice: we don't need to worry that pointers
| might point to unions.
| 
| Gaby, would you please forward that to the C++ reflector, so that the
| reflector has that information as well?  They should be aware that the
| model you're proposing is at odds with C99.
| 

Yes, I'll right now.

Since I brought upt this issue, Nick MacLaren sent me a note he had
circulated about these very issues on C99.  It is also being discussed
on the -core reflector.

-- Gaby

Comment 151 gdr@cs.tamu.edu 2007-05-24 00:58:28 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes:

[...]

| > Gaby's model says that we know less about dynamic types than we
| > presently think we do, because there might be a union out there
| > somewhere.  (Fortunately, as Joseph points out, C99 has already answered
| > this question.  Surely we can agree that making C99 and C++ different in
| > this respect is a bad idea.)
| 
| I don't think dragging in unions helps us here ;)  Maybe Gaby can clarify
| if and how unions relate here, but I didn't percieve any previous comment
| as making implicit unions relevant here apart from what GCC and
| apperantly C99 agree to.

I believe we all agree that placement new changes the dynamic type.


I brought in the union example to point of a fundamental problem with
this issue.  I have been following the discussion without saying much,
until I realized that the interpretation Mark was offering is a
redefinition of the C++ object model that conflicts with the current
standard text. That was the point of the union example.  In the
example 

    void f(int* p, double* q) {
        *p = 42;
        *q = 3.12;
    }

All we know is that after the store to *p, the object there will have
type int (if it did not already have one).  Similarly, for the store
to *q, the object there will have type double.  Can the stores be
rearranged?  Under the current C++ rules (which were inherited from
C90, and not C99) "yes" if we know that the objects are distinct.
Can we infer the disjoinctness from the types?  "Not always" under
current C++ rules for union, and in this specific case, the answer is 
"no".

I never said I liked that model.  I was merely pointing out that
people where on the slope of redefining the object model.

I spent the afternoon trying to see how C++ can move forward.
The "effective types" of C99 has its own sets of incompleteness and
inconsistency problems that Nick MacLaren has brought to my attention
since I raised the issue on the C++ -core reflector.

-- Gaby

Comment 152 gdr@cs.tamu.edu 2007-05-24 01:06:51 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes:

[...]

| However, I don't like this approach because I believe it will result in
| inferior code.

I believe (without offering data, sorry) it *may* yield inferior code.
However, my original point was to make sure that we understood your
interpretation was a change to the C++ standard object model.

Given the consequences, I think it was appropriate to raise the issue
to the C++ committee.  So far, the only response I got was the
sentiment that it is a change in semantics. Given C99's move, it is
essential for C++ to get to a better state of the object model.
However, C99's notion of "effective type" is not without its own set of
problems (already in C99).  The details can be found in Nick
MacLaren's paper.

-- Gaby

Comment 153 rguenther@suse.de 2007-05-24 09:03:39 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

On Thu, 23 May 2007, gdr at cs dot tamu dot edu wrote:

> ------- Comment #151 from gdr at cs dot tamu dot edu  2007-05-24 00:58 -------
> Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the
> dynamic type as it should
> 
> "rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes:
> 
> [...]
> 
> | > Gaby's model says that we know less about dynamic types than we
> | > presently think we do, because there might be a union out there
> | > somewhere.  (Fortunately, as Joseph points out, C99 has already answered
> | > this question.  Surely we can agree that making C99 and C++ different in
> | > this respect is a bad idea.)
> | 
> | I don't think dragging in unions helps us here ;)  Maybe Gaby can clarify
> | if and how unions relate here, but I didn't percieve any previous comment
> | as making implicit unions relevant here apart from what GCC and
> | apperantly C99 agree to.
> 
> I believe we all agree that placement new changes the dynamic type.
> 
> 
> I brought in the union example to point of a fundamental problem with
> this issue.  I have been following the discussion without saying much,
> until I realized that the interpretation Mark was offering is a
> redefinition of the C++ object model that conflicts with the current
> standard text. That was the point of the union example.  In the
> example 
> 
>     void f(int* p, double* q) {
>         *p = 42;
>         *q = 3.12;
>     }
> 
> All we know is that after the store to *p, the object there will have
> type int (if it did not already have one).  Similarly, for the store
> to *q, the object there will have type double.  Can the stores be
> rearranged?  Under the current C++ rules (which were inherited from
> C90, and not C99) "yes" if we know that the objects are distinct.
> Can we infer the disjoinctness from the types?  "Not always" under
> current C++ rules for union, and in this specific case, the answer is 
> "no".

Right, current C++ rules forbid exchanging the stores.  But I read it
as it is because the stores may start lifetime of a different
dynamically typed object on the same memory location which p and q
may point to.  Whether there is a union "placed" at this memory
location is irrelevant.  I read the standard as if the above is
equivalent to

  *(new (p) int) = 42;
  *(new (q) double) = 3.12;

which as long as we cannot prove that p does not point to the
same memory location as q means that we cannot reorder the stores.

Richard.

Comment 154 rguenther@suse.de 2007-05-24 10:07:30 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

On Wed, 23 May 2007, mark at codesourcery dot com wrote:

> Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
>  new does not change the dynamic type as it should
> 
> rguenther at suse dot de wrote:
> 
> >> Whatever conservative
> >> assumptions we want to make about "*p = 3" we could make about "new (p)
> >> int" instead.
> > 
> > True.  I say making them about "*p = 3" is way easier as we are changing
> > semantics of memory operations and *p = 3 is one, but placement new is 
> > not.
> 
> I think I understand what you're saying here too; again, I'll restate to
> make sure:
> 
> (a) In the model where "*p = 3" changes the dynamic type of memory, we
> don't need to do anything special to handle placement new.
> (b) It's relatively easy to implement support for "*p = 3" changing the
> dynamic type of memory.
> (c) Therefore, it's relatively easy to fix our placement new problem.
> 
> I agree with those statements too.
> 
> However, I don't like this approach because I believe it will result in
> inferior code.  I think that you're looking at the proposed placement
> new patches, then looking at what they do to a particular codebase,
> which happens to use placement-new in an inner loop, and becoming
> unhappy with the patches.  I suspect that the changes required to
> support the "*p = 3" model, while perhaps better for that case, will be
> worse for many others.
> 
> I can't prove that.  But, I did implement TBAA after looking at what
> other compilers did, specifically to improve performance of (ironically)
> POOMA.  So, I'm afraid that you're going to find that if we allow memory
> writes to change the type of memory, that we will get worse performance.

So I did some benchmarking with my two proposed patches (which I believe
will fix all current issues - that I know of) on IA64 which should be
sensitive to the more restrictive scheduling.  [You can look yourself
at the results starting from http://www.suse.de/~gcctest/, the run
without the patches is from late May23th the run with the patch from
early May24th]

I'll try to summarize only here.

 - SPECfpu2000 shows both winners and loosers, the net change is
   slightly positive for -O3 and slightly negative for -O3 -funroll-loops
   (slightly means +3 points vs. -4 points)

 - SPECint2000 shows more variance in single tests, crafty winning for 
   peak, parser loosing for base, eon winning overall, likewise bzip.
   Overall we get a drop in SPECint of about 3 points (0.3%)

 - Polyhedron is mostly the same, aermod regressing 1.5%, induct
   improving 1%.

 - The set of C++ benchmarks (including TraMP-3d, DLV, Boost wave and 
   others) show no changes.

Of course this doesn't prove anything - it only hints that it might be
not as bad as you thought.  And no, I don't have a Power6 machine to
test on yet.

Richard.

Comment 155 rguenther@suse.de 2007-05-24 10:11:05 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

On Thu, 24 May 2007, rguenther at suse dot de wrote:

> So I did some benchmarking with my two proposed patches (which I believe
> will fix all current issues - that I know of) on IA64 which should be
> sensitive to the more restrictive scheduling.  [You can look yourself
> at the results starting from http://www.suse.de/~gcctest/, the run
> without the patches is from late May23th the run with the patch from
> early May24th]
> 
> I'll try to summarize only here.
> 
>  - SPECfpu2000 shows both winners and loosers, the net change is
>    slightly positive for -O3 and slightly negative for -O3 -funroll-loops
>    (slightly means +3 points vs. -4 points)
> 
>  - SPECint2000 shows more variance in single tests, crafty winning for 
>    peak, parser loosing for base, eon winning overall, likewise bzip.
>    Overall we get a drop in SPECint of about 3 points (0.3%)
> 
>  - Polyhedron is mostly the same, aermod regressing 1.5%, induct
>    improving 1%.
> 
>  - The set of C++ benchmarks (including TraMP-3d, DLV, Boost wave and 
>    others) show no changes.
> 
> Of course this doesn't prove anything - it only hints that it might be
> not as bad as you thought.  And no, I don't have a Power6 machine to
> test on yet.

Btw, in case you are curious - on 
http://www.suse.de/~gcctest/c++bench/tramp3d/split-run.html you can see
the effects of the asm() memory barrier as proposed by one of Ians patches
and the effect of forcing -fno-strict-aliasing on (the two spikes in
runtime, the later one is -fno-strict-aliasing, the earlier one is the 
asm()).  Both approaches disable moving loop invariant loads which is
one thing you retain with my proposed slightly changed TBAA rules.

Richard.

Comment 156 gdr@cs.tamu.edu 2007-05-24 10:29:05 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes:

[...]

| > I brought in the union example to point of a fundamental problem with
| > this issue.  I have been following the discussion without saying much,
| > until I realized that the interpretation Mark was offering is a
| > redefinition of the C++ object model that conflicts with the current
| > standard text. That was the point of the union example.  In the
| > example 
| > 
| >     void f(int* p, double* q) {
| >         *p = 42;
| >         *q = 3.12;
| >     }
| > 
| > All we know is that after the store to *p, the object there will have
| > type int (if it did not already have one).  Similarly, for the store
| > to *q, the object there will have type double.  Can the stores be
| > rearranged?  Under the current C++ rules (which were inherited from
| > C90, and not C99) "yes" if we know that the objects are distinct.
| > Can we infer the disjoinctness from the types?  "Not always" under
| > current C++ rules for union, and in this specific case, the answer is 
| > "no".
| 
| Right, current C++ rules forbid exchanging the stores.  But I read it
| as it is because the stores may start lifetime of a different
| dynamically typed object on the same memory location which p and q
| may point to.

Which is equivalent to having the union, in all aspects.

Now, if I understand your argument below correctly, you are saying
that even if we fixed the union rules for C++, we may still be facing
the same problem because assignment to POD objects does not just mean
that the object was there before, but that we are actually starting a
new one.  Do I understand your argument correctly?

|  Whether there is a union "placed" at this memory
| location is irrelevant.  I read the standard as if the above is
| equivalent to
| 
|   *(new (p) int) = 42;
|   *(new (q) double) = 3.12;
| 
| which as long as we cannot prove that p does not point to the
| same memory location as q means that we cannot reorder the stores.

-- Gaby

Comment 157 rguenther@suse.de 2007-05-24 10:33:19 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

On Thu, 24 May 2007, gdr at cs dot tamu dot edu wrote:

> 
> 
> ------- Comment #156 from gdr at cs dot tamu dot edu  2007-05-24 10:29 -------
> Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the
> dynamic type as it should
> 
> "rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes:
> 
> [...]
> 
> | > I brought in the union example to point of a fundamental problem with
> | > this issue.  I have been following the discussion without saying much,
> | > until I realized that the interpretation Mark was offering is a
> | > redefinition of the C++ object model that conflicts with the current
> | > standard text. That was the point of the union example.  In the
> | > example 
> | > 
> | >     void f(int* p, double* q) {
> | >         *p = 42;
> | >         *q = 3.12;
> | >     }
> | > 
> | > All we know is that after the store to *p, the object there will have
> | > type int (if it did not already have one).  Similarly, for the store
> | > to *q, the object there will have type double.  Can the stores be
> | > rearranged?  Under the current C++ rules (which were inherited from
> | > C90, and not C99) "yes" if we know that the objects are distinct.
> | > Can we infer the disjoinctness from the types?  "Not always" under
> | > current C++ rules for union, and in this specific case, the answer is 
> | > "no".
> | 
> | Right, current C++ rules forbid exchanging the stores.  But I read it
> | as it is because the stores may start lifetime of a different
> | dynamically typed object on the same memory location which p and q
> | may point to.
> 
> Which is equivalent to having the union, in all aspects.
> 
> Now, if I understand your argument below correctly, you are saying
> that even if we fixed the union rules for C++, we may still be facing
> the same problem because assignment to POD objects does not just mean
> that the object was there before, but that we are actually starting a
> new one.  Do I understand your argument correctly?

Yes.  For non-PODs I'm not sure if it is allowed to start object
lifetime via assignment rather than only by placement new.

Richard.

> |  Whether there is a union "placed" at this memory
> | location is irrelevant.  I read the standard as if the above is
> | equivalent to
> | 
> |   *(new (p) int) = 42;
> |   *(new (q) double) = 3.12;
> | 
> | which as long as we cannot prove that p does not point to the
> | same memory location as q means that we cannot reorder the stores.

Comment 158 gdr@cs.tamu.edu 2007-05-24 10:47:12 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

"rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes:

[...]

| > Now, if I understand your argument below correctly, you are saying
| > that even if we fixed the union rules for C++, we may still be facing
| > the same problem because assignment to POD objects does not just mean
| > that the object was there before, but that we are actually starting a
| > new one.  Do I understand your argument correctly?
| 
| Yes. 

Then, I'm not sure I agree with that.  I'll raise it as part of the
union problem.

| For non-PODs I'm not sure if it is allowed to start object
| lifetime via assignment rather than only by placement new.

Assignment for non-POD is equivalent to calling a member function for
an object, which means that the object already existed.  So, no you
cannot start lifetime for object of non-POD type by assignment.

-- Gaby

Comment 159 Ian Lance Taylor 2007-05-25 23:21:51 UTC

Created attachment 13613 [details]
Patch

This version of the patch is based on some code from Daniel Berlin.  Here we determine which pointers can potentially be associated with a placement new.  We disable TBAA specifically for those pointers.

I'm not 100% confident that this patch always does the right thing.  But it does work for the test cases I tried.

Richi, I'd be interested in results you get from your performance tests.

Comment 160 rguenther@suse.de 2007-05-27 14:57:20 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

On Sat, 25 May 2007, ian at airs dot com wrote:

> ------- Comment #159 from ian at airs dot com  2007-05-25 23:21 -------
> Created an attachment (id=13613)
>  --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13613&action=view)
> Patch
> 
> This version of the patch is based on some code from Daniel Berlin.  Here we
> determine which pointers can potentially be associated with a placement new. 
> We disable TBAA specifically for those pointers.
> 
> I'm not 100% confident that this patch always does the right thing.  But it
> does work for the test cases I tried.
> 
> Richi, I'd be interested in results you get from your performance tests.

Please - this patch doesn't fix FreePOOMA nor does it fix the runtime
testcase I posted here before.  We still have

<bb 3>:
  l_7 = (int *) p_6(D);
  # SMT.31_21 = VDEF <SMT.31_18>
  *l_7 = 0;
  <<<change_dynamic_type (long int *) p_6(D))>>>
  __p_14 = p_6(D);
  f_10 = (long int *) __p_14;
  # SMT.30_22 = VDEF <SMT.30_17>
  *f_10 = -1;

the store to *f_10 needs to alias the store to *l_7.  Can you just
add the runtime testcase to your patches and make sure at least that
one is working?

Thanks,
Richard.

Comment 161 rguenther@suse.de 2007-05-28 11:14:18 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

Btw, it could save us many bugs (or bug reports) if if PTA says must-alias
we'd trust it, instead of using TBAA to say "but it shouldn't".  Like for
the (invalid)

int foo(void *p)
{
  double *q = p;
  int *r = p;
  *q = 1.0;
  return *r;
}

where we correctly figure that

Points-to sets
q_2 = same as p
r_3 = same as p

but still we get (with -fstrict-aliasing)

<bb 2>:
  q_2 = (double *) p_1(D);
  r_3 = (int *) p_1(D);
  # SMT.4_7 = VDEF <SMT.4_6(D)>
  *q_2 = 1.0e+0;
  # VUSE <SMT.5_8(D)>
  D.1641_4 = *r_3;

which just for the point to make the invalid input produce unexpected
output is possibly creating many of the hard-to-fix issues like this one.

Danny, how easy is it to "fix" the above?  It looks like another 
attractive solution to this problem.

Comment 162 Daniel Berlin 2007-05-28 11:24:19 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should

On 28 May 2007 11:14:20 -0000, rguenther at suse dot de
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #161 from rguenther at suse dot de  2007-05-28 11:14 -------
> Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
>  new does not change the dynamic type as it should
>
> Btw, it could save us many bugs (or bug reports) if if PTA says must-alias
> we'd trust it, instead of using TBAA to say "but it shouldn't".  Like for
> the (invalid)
>
> int foo(void *p)
> {
>   double *q = p;
>   int *r = p;
>   *q = 1.0;
>   return *r;
> }
>
> where we correctly figure that
>
> Points-to sets
> q_2 = same as p
> r_3 = same as p
>

These are just collapsed because of pointer equivalences.
There is no guarantee that we will discover all pointer equivalences,
and their are in fact cases where we won't.

Comment 163 Ian Lance Taylor 2007-05-28 17:30:29 UTC

Richi, I tested my patch on every test case I saved.  Can you just point me at the one I missed?  Thanks.

Comment 164 Ian Lance Taylor 2007-05-30 23:18:57 UTC

Created attachment 13637 [details]
Patch

This is a modification of the previous patch which correctly handles the test case in comment #85.

Comment 165 Richard Biener 2007-06-04 14:02:25 UTC

The lates patch fixes the FreePOOMA testsuite regressions.  I'll give it a
performance testing spin on vangelis tonight.

Comment 166 Richard Biener 2007-06-05 16:20:25 UTC

It causes a 10% performance regression for tramp3d.  There appear to be no significant changes in libstdc++ performance testing.  "Fixing" tramp3d-v4
with the below patch cures the performance regression.

--- tramp3d-v4.cpp      2007-05-16 15:02:47.000000000 +0200
+++ tramp3d-v4x.cpp     2007-06-05 18:11:40.000000000 +0200
@@ -29583,10 +29583,6 @@
   VectorEngine()
   {
     PoomaCTAssert<(ElementProperties<T>::hasTrivialDefaultConstructor && ElementProperties<T>::hasTrivialDestructor && ElementProperties<T>::concrete)>::test();
-    for (int i = 0; i < D; ++i)
-      {
- ElementProperties<T>::construct(&x_m[i]);
-      }
   }
   inline VectorEngine(const VectorEngine<D,T,Full>&);
   template<class X>

(ElementProperties<T>::construct calls placement new)

Comment 167 Ian Lance Taylor 2007-06-05 20:48:10 UTC

Can you give me a .ii file for the performance regression, and point me at the relevant function?

Comment 168 rguenther@suse.de 2007-06-05 21:17:08 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

On Tue, 5 Jun 2007, ian at airs dot com wrote:

> ------- Comment #167 from ian at airs dot com  2007-06-05 20:48 -------
> Can you give me a .ii file for the performance regression, and point me at the
> relevant function?

http://www.suse.de/~rguenther/tramp3d/tramp3d-v4.cpp.gz

Amongst the interesting functions (yep, there are multiple) are
those called Momentumflux*::operator(), one particular example is
Adv5::Z::MomentumfluxX<Dim>::operator(), which mangles as
_ZNK4Adv51Z13MomentumfluxXILi3EEclI5FieldI22UniformRectilinearMeshI10MeshTraitsILi3Ed21UniformRectilinearTag12CartesianTagLi3EEEd7CompFwdI6EngineILi3E6VectorILi3Ed4FullE10BrickViewUE3LocILi1EEEES4_ISA_dSG_ESM_SM_EEvRKT_RKT0_RKT1_RKT2_RKSI_ILi3EE
it has this initialization loop (which is fixed by the tramp3d patch)
inside the computational kernel (triple nested loop):

<bb 35>:
  D.760598_367 = &D.464122.engine_m.x_m[i_368];
  <<<change_dynamic_type (double *) D.760598_367)>>>
  iftmp.913_369 = &D.464122.engine_m.x_m[i_368];
  if (1)
    goto <bb 36>;
  else
    goto <bb 37>;

<bb 36>:
  *iftmp.913_369 = 0.0;

<bb 37>:
  i_370 = i_368 + 1;

<bb 38>:
  # i_368 = PHI <0(34), i_370(37)>
  if (i_368 <= 2)
    goto <bb 35>;
  else
    goto <bb 39>;

(that's after forwprop1 actually).  It's important that we unroll
this loop completely (to make us recognize the 0.0 stores are
all super-seeded by later stores) and that we move all loop invariant loads
out of the triple-nested loops, crossing this initialization loop.

The optimized dump for all these functions should be 'easy to grasp and
obviously fast' - at least that's what it used to be.  Now with this
patch we stil have

  <<<change_dynamic_type (double *) &D.767646.engine_m.x_m[0])>>>  
D.767646.engine_m.x_m[0] = 0.0;
  <<<change_dynamic_type (double *) &D.767646.engine_m.x_m[1])>>>  
D.767646.engine_m.x_m[1] = 0.0;
  <<<change_dynamic_type (double *) &D.767646.engine_m.x_m[2])>>>
  D.767646.engine_m.x_m[2] = 0.0;

in there and loads of index/domain variables on the MEM expressions like

  MEM[base: &D.767646, index: D.1312916, step: 8] = 
D.767266->origin_m.engine_m.x_m[i] + D.767266->spacings_m.engine_m.x_m[i] 
* (double) (MEM[base: &D.767512, index: D.1312916, step: 4] - 
D.767266->D.225459.physicalCellDomain_m.D.114276.D.113975.domain_m[i].D.110801.D.45225.D.45039.domain_m[0]);

Richard.

Comment 169 Ian Lance Taylor 2007-06-06 05:33:17 UTC

What options are you using when you compile?

I don't see the symbol you mention, although I do see

_ZNK10ScalarCodeIN4Adv51Z13MomentumfluxZILi3EEEEclI5FieldI22UniformRectilinearMeshI10MeshTraitsILi3Ed21UniformRectilinearTag12CartesianTagLi3EEEd10MultiPatchI7GridTag6RemoteI5BrickEEESJ_SJ_EEvRKT_RK8IntervalIXsrSK_10dimensionsEERKT0_RKT1_

Comment 170 rguenther@suse.de 2007-06-06 08:41:22 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

On Wed, 6 Jun 2007, ian at airs dot com wrote:

> ------- Comment #169 from ian at airs dot com  2007-06-06 05:33 -------
> What options are you using when you compile?
> 
> I don't see the symbol you mention, although I do see
> 
> _ZNK10ScalarCodeIN4Adv51Z13MomentumfluxZILi3EEEEclI5FieldI22UniformRectilinearMeshI10MeshTraitsILi3Ed21UniformRectilinearTag12CartesianTagLi3EEEd10MultiPatchI7GridTag6RemoteI5BrickEEESJ_SJ_EEvRKT_RK8IntervalIXsrSK_10dimensionsEERKT0_RKT1_

This was simply with -O2.  Note that the above is on 64bit x86_64-linux.

Richard.

Comment 171 Ian Lance Taylor 2007-06-08 07:49:50 UTC

Created attachment 13666 [details]
Patch

This variant of the previous patch does better on at least some of the tramp3d functions.  Here we eliminate the CHANGE_DYNAMIC_TYPE_EXPR nodes during the first aliasing pass, which prevents them from retaining references to local variables and generally clogging up optimizations.

For this test case it generates the same results as mainline, which was not true of the previous patch.

typedef __SIZE_TYPE__ size_t;

inline void* operator new(size_t, void* __p) throw() { return __p; }

template <class T, int D>
class Vector
{
public:
   Vector()
   {
     for (int i = 0; i < D; ++i)
        new (&x_m[i]) T();
   }
   T& operator[](int i) { return x_m[i]; }

private:
   T x_m[D];
};

struct sia
{
  int ai[3];
};

struct s
{
  struct si
  {
    sia* p;
  } asi[3];
  float* pd;
};

class c
{
  int foo(int, int, int);
  s sm;
};


extern void bar(Vector<float, 3>*, int);
int c::foo(int f1, int f2, int f3)
{
  float sum = 0;
  for (int i = 0; i < 3; ++i)
    {
      for (int j = 0; j < 3; ++j)
	{
	  Vector<float, 3> v;
	  v[0] = 1.0;
	  v[1] = 2.0;
	  v[2] = 3.0;
	  for (int k = 0; k < 3; ++k)
	    {
	      float f = (f1 * this->sm.asi[0].p->ai[0]
			 + f2 * this->sm.asi[1].p->ai[0]
			 + f3 * this->sm.asi[2].p->ai[0]);
	      sum += f * v[k];
	    }
	  *this->sm.pd = sum;
	}
    }
  return sum;
}

Comment 172 rguenther@suse.de 2007-06-08 13:47:06 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

On Fri, 8 Jun 2007, ian at airs dot com wrote:

> 
> 
> ------- Comment #171 from ian at airs dot com  2007-06-08 07:49 -------
> Created an attachment (id=13666)
>  --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13666&action=view)
> Patch
> 
> This variant of the previous patch does better on at least some of the tramp3d
> functions.  Here we eliminate the CHANGE_DYNAMIC_TYPE_EXPR nodes during the
> first aliasing pass, which prevents them from retaining references to local
> variables and generally clogging up optimizations.

Quick testing shows that this patch indeed shows better performance.
I'll give it complete testing coverage tonight.

Richard.

Comment 173 Richard Biener 2007-06-09 09:48:45 UTC

Full testing shows no (or at least < 0.5%) performance regression on tramp3d
runtime.  Compile time seems to go up by 0.5% (that looks consistent, about the
same with previous patches) for tramp3d.  Testcases not using placement new
don't seem to be affected.

The patch still fixes the FreePOOMA problems that caused me to file this PR.

So, from my point of view the patch is ready to be exposed to more eyes.

Thanks,
Richard.

Comment 174 Paolo Carlini 2007-06-09 09:55:09 UTC

Great!

Comment 175 Benjamin Kosnik 2007-06-09 12:03:49 UTC

Nice, thanks Ian.

Comment 176 Mark Mitchell 2007-06-09 19:29:29 UTC

Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

rguenth at gcc dot gnu dot org wrote:

> So, from my point of view the patch is ready to be exposed to more eyes.

The C++ bits are fine.

Comment 177 ian@gcc.gnu.org 2007-06-12 17:47:49 UTC

Subject: Bug 29286

Author: ian
Date: Tue Jun 12 17:47:37 2007
New Revision: 125653

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=125653
Log:
./:
	PR libstdc++/29286
	* tree.def: Add CHANGE_DYNAMIC_TYPE_EXPR.
	* tree.h (CHANGE_DYNAMIC_TYPE_NEW_TYPE): Define.
	(CHANGE_DYNAMIC_TYPE_LOCATION): Define.
	(DECL_NO_TBAA_P): Define.
	(struct tree_decl_common): Add no_tbaa_flag field.
	* tree-ssa-structalias.c (struct variable_info): Add
	no_tbaa_pruning field.
	(new_var_info): Initialize no_tbaa_pruning field.
	(unify_nodes): Copy no_tbaa_pruning field.
	(find_func_aliases): Handle CHANGE_DYNAMIC_TYPE_EXPR.
	(dump_solution_for_var): Print no_tbaa_pruning flag.
	(set_uids_in_ptset): Add no_tbaa_pruning parameter.  Change all
	callers.
	(compute_tbaa_pruning): New static function.
	(compute_points_to_sets): Remove CHANGE_DYNAMIC_TYPE_EXPR nodes.
	Call compute_tbaa_pruning.
	* tree-ssa-alias.c (may_alias_p): Test no_tbaa_flag for pointers.
	* gimplify.c (gimplify_expr): Handle CHANGE_DYNAMIC_TYPE_EXPR.
	* gimple-low.c (lower_stmt): Likewise.
	* tree-gimple.c (is_gimple_stmt): Likewise.
	* tree-ssa-operands.c (get_expr_operands): Likewise.
	* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Likewise.
	* tree-inline.c (estimate_num_insns_1): Likewise.
	(copy_result_decl_to_var): Likewise.
	* expr.c (expand_expr_real_1): Likewise.
	* tree-pretty-print.c (dump_generic_node): Likewise.
	* tree-inline.c (copy_decl_to_var): Copy DECL_NO_TBAA_P flag.
	* omp-low.c (omp_copy_decl_2): Likewise.
	* print-tree.c (print_node): Print DECL_NO_TBAA_P flag.
	* doc/c-tree.texi (Expression trees): Document
	CHANGE_DYNAMIC_TYPE_EXPR.
cp/:
	PR libstdc++/29286
	* init.c (avoid_placement_new_aliasing): New static function.
	(build_new_1): Call it.
testsuite/:
	PR libstdc++/29286
	* g++.dg/init/new16.C: New test.
	* g++.dg/init/new17.C: New test.
	* g++.dg/init/new18.C: New test.
	* g++.dg/init/new19.C: New test.

Added:
    trunk/gcc/testsuite/g++.dg/init/new16.C
    trunk/gcc/testsuite/g++.dg/init/new17.C
    trunk/gcc/testsuite/g++.dg/init/new18.C
    trunk/gcc/testsuite/g++.dg/init/new19.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cp/ChangeLog
    trunk/gcc/cp/init.c
    trunk/gcc/doc/c-tree.texi
    trunk/gcc/expr.c
    trunk/gcc/gimple-low.c
    trunk/gcc/gimplify.c
    trunk/gcc/omp-low.c
    trunk/gcc/print-tree.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-gimple.c
    trunk/gcc/tree-inline.c
    trunk/gcc/tree-pretty-print.c
    trunk/gcc/tree-ssa-alias.c
    trunk/gcc/tree-ssa-dce.c
    trunk/gcc/tree-ssa-operands.c
    trunk/gcc/tree-ssa-structalias.c
    trunk/gcc/tree.def
    trunk/gcc/tree.h

Comment 178 Ian Lance Taylor 2007-06-12 18:10:56 UTC

Fixed on mainline.  No plans to backport the patch to previous releases.

Comment 179 Johannes Schaub 2010-06-20 00:01:35 UTC

(In reply to comment #158)
> Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement new does not change the
> dynamic type as it should
> 
> "rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes:
> 
> [...]
> 
> | > Now, if I understand your argument below correctly, you are saying
> | > that even if we fixed the union rules for C++, we may still be facing
> | > the same problem because assignment to POD objects does not just mean
> | > that the object was there before, but that we are actually starting a
> | > new one.  Do I understand your argument correctly?
> | 
> | Yes. 
> 
> Then, I'm not sure I agree with that.  I'll raise it as part of the
> union problem.
> 

I'm sorry for commenting this late, but were there any outcomes of the discussion about changing the dynamic type by assignments for non-PODs (maybe on the reflector)? The way I see it, and how Richard may see it, is by 3.8/1 which says

  The lifetime of an object of type T begins when
   - storage with the proper alignment and size for type T is obtained
  The lifetime of an object of type T ends when
   - the storage which the object occupies is reused or released.

This isn't specific to unions, but it never seems to define what constitutes a "reuse", but Richard seems to interpret this to include assignments and so do I, and i've always used this interpretation to explain the mechanism by which unions change their active member.