Placement new is specified to end life-time of the object starting life-time of a new one. The current implementation messes up with aliasing though, so stores are re-ordered: #include <new> struct Foo { long i; }; struct Bar { void *p; }; long foo(int n) { Foo *f = new Foo; f->i = 1; for (int i=0; i<n; ++i) { Bar *b = new (f) Bar; b->p = 0; f = new (f) Foo; f->i = i; } return f->i; } Produces .099t.optimized: long int foo(int) (n) { struct Foo * f; void * D.2530; <bb 2>: D.2530 = operator new (8); f = (struct Foo *) D.2530; f->i = 1; if (n > 0) goto <L7>; else goto <L2>; <L7>:; f->i = (long int) ((unsigned int) n - 1); ((struct Bar *) f)->p = 0B; <L2>:; return f->i; } where the stores to Bar::p and Foo:i are reordered. Until .147r.bbro everything looks like above on RTL level, too: ... ;; Start of basic block 3, registers live: 1 [dx] 3 [bx] 7 [sp] (note:HI 21 19 22 3 [bb 3] NOTE_INSN_BASIC_BLOCK) (note:HI 22 21 48 3 NOTE_INSN_DELETED) (insn 48 22 24 3 (set (reg:DI 0 ax [63]) (zero_extend:DI (subreg:SI (plus:DI (reg:DI 3 bx [orig:61 n ] [61]) (const_int -1 [0xffffffffffffffff])) 0))) 195 {*lea_1_zext} (nil) (expr_list:REG_DEAD (reg:DI 3 bx [orig:61 n ] [61]) (nil))) (insn:HI 24 48 25 3 (set (mem/s:DI (reg/v/f:DI 1 dx [orig:58 f ] [58]) [3 <variable>.i+0 S8 A64]) (reg:DI 0 ax [63])) 81 {*movdi_1_rex64} (insn_list:REG_DEP_TRUE 23 (nil)) (expr_list:REG_DEAD (reg:DI 0 ax [63]) (nil))) (insn:HI 25 24 26 3 (set (mem/s/f:DI (reg/v/f:DI 1 dx [orig:58 f ] [58]) [5 <variable>.p+0 S8 A64]) (const_int 0 [0x0])) 81 {*movdi_1_rex64} (nil) (nil)) ;; End of basic block 3, registers live: 1 [dx] 7 [sp] ... but now (unfortunately) scheduling reorders them back (on x86_64), so the assembly looks good again: _Z3fooi: .LFB15: pushq %rbx .LCFI0: movl %edi, %ebx movl $8, %edi call _Znwm testl %ebx, %ebx movq %rax, %rdx movq $1, (%rax) jle .L2 leal -1(%rbx), %eax movq $0, (%rdx) movq %rax, (%rdx) .L2: popq %rbx movq (%rdx), %rax ret But with -O2 -fno-schedule-insns2 we finally get wrong-code for this: _Z3fooi: .LFB15: pushq %rbx .LCFI0: movl %edi, %ebx movl $8, %edi call _Znwm movq %rax, %rdx movq $1, (%rax) testl %ebx, %ebx jle .L2 leal -1(%rbx), %eax movq %rax, (%rdx) movq $0, (%rdx) .L2: movq (%rdx), %rax popq %rbx ret
The root of the problem is "wrong" alias information: <bb 2>: # SMT.6_27 = V_MAY_DEF <SMT.6_25>; # SMT.7_28 = V_MAY_DEF <SMT.7_26>; D.2530_3 = operator new (8); f_4 = (struct Foo *) D.2530_3; # SMT.6_29 = V_MAY_DEF <SMT.6_27>; f_4->i = 1; i_5 = 0; goto <bb 4> (<L1>); <L0>:; D.2542_7 = 8; __p_8 = f_1; D.2544_9 = __p_8; D.2544_10 = D.2544_9; D.2535_11 = D.2544_10; b_12 = (struct Bar *) D.2535_11; # SMT.7_30 = V_MAY_DEF <SMT.7_24>; b_12->p = 0B; D.2545_13 = 8; __p_14 = f_1; D.2547_15 = __p_14; D.2547_16 = D.2547_15; D.2536_17 = D.2547_16; f_18 = (struct Foo *) D.2536_17; D.2537_19 = (long int) i_2; # SMT.6_31 = V_MAY_DEF <SMT.6_23>; f_18->i = D.2537_19; i_20 = i_2 + 1; # SMT.7_24 = PHI <SMT.7_28(2), SMT.7_30(3)>; # SMT.6_23 = PHI <SMT.6_29(2), SMT.6_31(3)>; # i_2 = PHI <i_5(2), i_20(3)>; # f_1 = PHI <f_4(2), f_18(3)>; <L1>:; if (i_2 < n_6) goto <L0>; else goto <L2>; <L2>:; # VUSE <SMT.6_23>; D.2538_21 = f_1->i; return D.2538_21; the stores to b_12->p and f_18->i appear to be independent. But there is an implicit barrier between them from the std::new.
4.1 get's it correct by luck.
Hi Richard. For sure, I'm missing many details here, having to do with alias analysis, but I'm puzzled anyway ;) I mean, the current libsupc++ "implementation" of placement new is: // Default placement versions of operator new. inline void* operator new(std::size_t, void* __p) throw() { return __p; } inline void* operator new[](std::size_t, void* __p) throw() { return __p; } Can you explain a bit how a fix would look like?
One way to paper over the problem is to move std::new out-of-line :( Otherwise I cannot see how we can fix this in libsupc++ without gcc help. Basically we somehow need to insert (at least) a memory barrier here, like with inline void* operator new(std::size_t, void* __p) throw() { __asm__ volatile ("" : : : "memory"); return __p; } but that might pessimize code too much.
(In reply to comment #4) > One way to paper over the problem is to move std::new out-of-line :( Otherwise > I cannot see how we can fix this in libsupc++ without gcc help. Basically we > somehow need to insert (at least) a memory barrier here, like with > > inline void* operator new(std::size_t, void* __p) throw() { > __asm__ volatile ("" : : : "memory"); > return __p; > } > > but that might pessimize code too much. In case, memory barriers are available, and already used in guard.cc, but indeed, naively, something seems wrong with such a pure-library approach: a memory barrier needed for single thread code?!?!
Actually I think it should be: inline void* operator new(std::size_t, void* __p) throw() { void *p1; __asm__ ("" : "=r"(p1): "0"(__p): "memory"); return __p; } Which is less pessimize. Note the reason why you need this memory barrier is because the C++ standard says inplacement news change the dynamic type.
Note here is a slightly better testcase as now we know that Foo and Bar are the same size: #include <new> struct Foo { void *i; }; struct Bar { void *p; }; long foo(int n, int *t) { Foo *f = new Foo; f->i = t; for (int i=0; i<n; ++i) { Bar *b = new (f) Bar; b->p = 0; f = new (f) Foo; f->i = t; } return f->i; }
Let's suppose for a moment we actually try to fix this issue in the library: is adding a barrier conforming to the letter of 18.4.1.3/2-3, 5-6, that is: Returns: ptr. Notes: Intentionally performs no other action.
(In reply to comment #8) > Let's suppose for a moment we actually try to fix this issue in the library: is > adding a barrier conforming to the letter of 18.4.1.3/2-3, 5-6, that is: > > Returns: ptr. > Notes: Intentionally performs no other action. There has to be some communutation between the library and the compiler to tell it that the memory location is being reused as defined by 3.8 of the standard.
(In reply to comment #9) > There has to be some communutation between the library and the compiler to tell > it that the memory location is being reused as defined by 3.8 of the standard. Yes. But considering that the placement forms cannot be replaced by an user program, isn't conceptually cleaner for the C++ front-end to simply recognize those forms and do the right thing?
I don't really know. But for this issue I would like to introduce middle-end no-op builtins __builtin_clobber (void*) clobber the pointer argument alias set like we do for malloc/free (and other calls): int foo(void) { int i, *p, *q; p = malloc(4); *p = 0; free(p); q = malloc(4); *q = 1; i = *q; free(q); return i; } <bb 2>: # HEAP.4_10 = V_MAY_DEF <HEAP.4_8>; # HEAP.5_11 = V_MAY_DEF <HEAP.5_9>; D.1530_1 = malloc (4); p_2 = (int *) D.1530_1; # HEAP.4_12 = V_MAY_DEF <HEAP.4_10>; *p_2 = 0; # HEAP.4_13 = V_MAY_DEF <HEAP.4_12>; # HEAP.5_14 = V_MAY_DEF <HEAP.5_11>; free (p_2); # HEAP.4_15 = V_MAY_DEF <HEAP.4_13>; # HEAP.5_16 = V_MAY_DEF <HEAP.5_14>; D.1531_3 = malloc (4); q_4 = (int *) D.1531_3; # HEAP.5_17 = V_MAY_DEF <HEAP.5_16>; *q_4 = 1; # VUSE <HEAP.5_17>; i_5 = *q_4; # HEAP.4_18 = V_MAY_DEF <HEAP.4_15>; # HEAP.5_19 = V_MAY_DEF <HEAP.5_17>; free (q_4); D.1532_6 = i_5; return D.1532_6; or even just a way to add a VUSE of the pointer alias sets, or even void * __builtin_new_object (void *p) which would do # VUSE <p> # HEAP.n = V_MAY_DEF <HEAP.n> p = __builtin_new_object (p) i.e. associate a new alias tag with the return value. Can you benchmark the workaround in the library?
Yes, maybe we can do something in the C++ frontend.
And here is a testcase that fails also for 4.0.0 and self contained also: #include <new> int foo(int n, int *f, long *f1) { long t = 0; *f = 1; long *fp = new(f) long; *fp = 1; for (int i=0; i<n; ++i) { t += *fp; int *b = new (f) int; *b = i + *f1; fp = new (f) long; *fp = t*t; } t+=*fp; return t+*f1; } extern "C" void abort (); int main(void) { if (sizeof(int) != sizeof(long) return 0; int *a = new int; long *b = new long; if (foo(2, a, b) != 6) abort (); return 0; }
(In reply to comment #13) > And here is a testcase that fails also for 4.0.0 and self contained also: replace main with: int main(void) { if (sizeof(int) != sizeof(long)) return 0; int *a = new int; long *b = new long; *a = 0; *b = 0; if (foo(2, a, b) != 6) abort (); return 0; } ---- This is what you get when thinking about 2 steps ahead of yourself.
I cannot see any plausible way to argue that this is a library bug. Implementation of placement new should not need barriers or other compiler intrinsics. It's twisted to argue that the C++ standard itself would prevent implementation of placement new without non-standard facilities, since the people that wrote the standard all certainly expected the implementation to be just: void *operator new(size_t, void *p) { return p; } Variations on that definition also occur in various user-defined new operators, including those for class-scoped operators. I don't think we can reasonably ask people to change this code, which works with every other C++ compiler. Therefore, the only possible conclusion is that either the compiler is buggy or that the examples are invalid. Unfortunately, I don't think Andrew's example is invalid. There are two possible approaches to fixing the compiler: either (a) the C++ front end should mark all new operators as special, in some way that the middle end then uses to avoid this over-optimization, or (b) relax the aliasing rules used by the middle end. I think the right solution is (b). Fundamentally, I'm not sure that the object models in C and C++ are sufficiently well-defined to permit the kind of optimization that we are discussing. I also think that people use functions not named "operator new" to do the kinds of things being discussed (i.e., to allocate blocks of memory) and that C and C++ are intended to support that usage. Certainly, in C, people can't use "operator new". So, I'd probably suggest that we relax the aliasing rules, at least for heap-allocated objects. As an example of the kind of problems the standard poses, [basic.life] says: "The lifetime of an object is a runtime property of the object. The lifetime of an object of type T begins when: --storage with the proper alignment and size for type T is obtained, and --if T is a class type with a non-trivial constructor (_class.ctor_), the constructor call has completed. The lifetime of an object of type T ends when: --if T is a class type with a non-trivial destructor (_class.dtor_), the destructor call starts, or --the storage which the object occupies is reused or released." So, by that definition, it might seem that: int i = 3; *((float *)(&i)) = 7.0; is valid; we're reusing the "int" memory, as a "float", and it has the right size and alignment (on most systems). But, that would effectively nullify the [basic.lval] language that suggests that you can't access an object through a type other than its dynamic type -- or, at least, it would limit such access to writes. I don't think anyone would expect the dynamic type of "i" to be "float" in this example, even if someone had written: new (&i) float; Surely, the dynamic type of something explicitly declared "int" must be "int". But, for something with dynamic storage duration, perhaps we should assume that any write to the storage, with a type other than the one that it already has, forces it to possibly alias things of both types.
The dynamic type of the object at &i is indeed float and has the value 7.0 (iff align and sizes work out). We permitted this so that can can do: class C { ... }; char buf[1024]; new (&buf[n]) C and have the dynamic type of the storage be C. You can know the dynamic type is C, by doing a virtual dispatch, or by asking for the name of the type with rtti. Likewise, you can do: new (&i) float(7.0) with the same effects. The dynamic type is float, and the value is 7.0, and this is in all ways the same as: *(float*)&i = 7.0 Thinking about this some more, it would appear that it is unsafe to reorder writes of otherwise non-conflicting types past each other as type based analysis alone isn't enough to ensure they don't conflict. It might be reasonable to add the C rules that the dynamic type _must_ match the static (declared) type, if there is one for types other than pointer or pointer to member the the language standard and to clarify the dynamic type of an object when a bitwise copy of the object is made to better match the C rules. Reads should be fine as they are required to be the same type as the write (or character). In practice, if the optimizer can't see the placement new, it can't reorder things so this is safe; if it can see it, then it should do value based analysis in addition to type analysis and conclude they conflict (or can conflict) and still avoid doing the wrong thing. I do agree, that for now, this isn't a bug in the library. It could be made a bug in the library by requring a placement new operator as the only way in which storage can be reused as a different dynamic type.
I disagree with Mike's analysis. I certainly understand the goals of placement new, but I don't think that anyone intended this code: int i; *(float *)&i = 7.0; to be valid. I don't even think people intended this variant: int i; new (&i) float(7.0); to be valid. Note that a consequence of allowing this is that: int f() { int i = 3; new (&i) float(7.0); return i; } is undefined -- not because of the overwriting of i, but because the return statement reads memory of the wrong type. Mike's position is that accessing a variable using its static type is not always safe; I find this bizarre. My personal opinion is that the authors of the standard just didn't think of all the possible interactions of the various aspects. I believe placement new was intended to handle only arrays of characters, consistent with the special dispensation given to character pointers. However, I don't think my differences with Mike can be resolved without going directly to the standards committee. Fortunately, from the compiler point of view, I'm saying "punt" and Mike is saying "punt harder". So, it's probably just a question of in how many cases we decide to disable type-based alias analysis.
What is your position based upon? Mine is based upon having been in the room when we decided what the C rules probably were, what the C++ rules could be and the up side and down side of each choice and where we decided what our intent was going to be and lots of word smithing sessions as well. You can see some of the depth with which we discussed it in our formulation of the wording in the standard: 6 Similarly, before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any lvalue which refers to the original object may be used but only in limited ways. Such an lvalue refers to allocated storage (_basic.stc.dynamic.deallocation_), and using the properties of the lvalue which do not depend on its value is well-defined. If an lvalue-to-rvalue conversion (_conv.lval_) is applied to such an lvalue, the program has undefined behavior; if the original object will be or was of a non-POD class type, the program has undefined behavior if: --the lvalue is used to access a non-static data member or call a non- static member function of the object, or --the lvalue is implicitly converted (_conv.ptr_) to a reference to a base class type, or --the lvalue is used as the operand of a static_cast (_expr.static.cast_) (except when the conversion is ultimately to char& or unsigned char&), or --the lvalue is used as the operand of a dynamic_cast (_expr.dynamic.cast_) or as the operand of typeid. 7 If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is cre- ated at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will auto- matically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if: --the storage for the new object exactly overlays the storage location which the original object occupied, and --the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and --the original object was a most derived object (_intro.object_) of type T and the new object is a most derived object of type T (that is, they are not base class subobjects). [Example: struct C { int i; void f(); const C& operator=( const C& ); }; const C& C::operator=( const C& other) { if ( this != &other ) { this->~C(); // lifetime of *this ends new (this) C(other); // new object of type C created f(); // well-defined } return *this; } C c1; C c2; c1 = c2; // well-defined c1.f(); // well-defined; c1 refers to a new object of type C --end example] 8 If a program ends the lifetime of an object of type T with static (_basic.stc.static_) or automatic (_basic.stc.auto_) storage duration and if T has a non-trivial destructor,35) the program must ensure that an object of the original type occupies that same storage location when the implicit destructor call takes place; otherwise the behavior of the program is undefined. This is true even if the block is exited with an exception. We explcitly did consider overlaying one type with a completely different type and did so intend for the standard to allow that. Further, our intent of saying that automatics and objects with static duration could be so changed provided certain conditions were met was meant no only to state that this could be done, but to state the exact sitations in which it was defined to do so. > Mike's position is that accessing a variable using its static type is not always safe; I find this bizarre. Welcome to C++, let me quote the standard: 8 If a program ends the lifetime of an object of type T with static (_basic.stc.static_) or automatic (_basic.stc.auto_) storage duration and if T has a non-trivial destructor,35) the program must ensure that an object of the original type occupies that same storage location when the implicit destructor call takes place; otherwise the behavior of the program is undefined. If we didn't mean exactly what we wrote, why on earth would we write it? Hint, in this, we meant exactlty what was written. I was there, in every single meeting where we talked about it. I know what our intent was, and I know what the standard says. At no time does the standard says it is bizarre. It prescribe what you, the user _must_ do, if it is to work. You have to swap the type back to a valid type, before the system has a chance to notice that the type is wrong, for it to work, and then, only if there is a non-trivial dtor. A trivial dtor doesn't even need to do that, as stated. You can imagine it doesn't state
Mike suggests: "it would appear that it is unsafe to reorder writes of otherwise non-conflicting types past each other as type based analysis alone isn't enough to ensure they don't conflict." That would be bad in the general case. It would not be as bad as prohibiting the reorder of reads and writes, but it would be bad. Let's find a way such that we do not have to do that. Fortunately I believe that in a correct program we only have a problem when we can actually see the placement new (can any disprove that)? I personally don't have a problem with saying that placement new is special. When placement new is used, it has to move the pointer into alias set 0. With regard to Mark's comment #15, the problem here is not heap allocation, as I see it. The C standard explains how to use heap allocation: use a char* pointer. The problem is doing something akin to heap allocation without using a char* pointer.
Subject: Re: [4.0/4.1/4.2 Regression] placement new does not change the dynamic type as it should ian at airs dot com wrote: > Fortunately I believe that in a correct program we only have a problem when we > can actually see the placement new (can any disprove that)? I personally don't > have a problem with saying that placement new is special. When placement new > is used, it has to move the pointer into alias set 0. What about in C, as opposed to C++? I agree that the obvious special cases are access through char*, arrays of characters, and placement new. But, I'm afraid that there are lots of other allocation functions out there that are not spelled "operator new", and requiring decoration for them isn't something that (as far as I know) other compilers require.
In C a general allocation function should work with a char array. A specific allocation function should use a union. I don't think there are many existing exceptions to these guidelines. I think that code like that in PR 29272, which casts Node* to void* to Foo*, is an unusual case. And I believe we can handle that code correctly because of the use of memcpy. And if the code didn't use memcpy it would be clearly noncomformant--which isn't to say that we should deliberately break it, but we don't need to try extra hard to make it work. So I don't see a serious problem in C either. Am I missing something/
Subject: Re: [4.0/4.1/4.2 Regression] placement new does not change the dynamic type as it should ian at airs dot com wrote: > ------- Comment #21 from ian at airs dot com 2006-10-03 23:44 ------- > In C a general allocation function should work with a char array. A specific > allocation function should use a union. I don't think there are many existing > exceptions to these guidelines. > So I don't see a serious problem in C either. Am I missing something/ I think there are two remaining issues: int i; *(float*)(&i) = 7.0; IIUC, Mike's position is that this is valid -- and that, in fact, after this point "i" can no longer be accessed as an int. Do you agree? Or do you think that this is (should be?) undefined behavior? I think you are saying that: int i; new (&i) float; is valid, and that after this point accessing "i" as an int is invalid. Is that correct? If so, I assume that you conider the following C code valid as well: int i; float f; memcpy (&i, &f, sizeof (i)); ? Also, in C++, given: extern void f(int *); void g() { int i; f(&i); /* HERE */ } do you believe that at the point marked HERE the dynamic type of "i" is indeterminate? I think that if we allow the examples above, we have to assume that something may have reset the dynamic type of "i" here, meaning that we cannot assume that future stores through "float *" in the function do not alias "i".
One point to remember is that C does not allow re-using of storage with a different type (which is what PR29272 is about and why that testcase is invalid). The storage type is either the declared one or the one assigned by virtue of the first assignment (or memcpy). So, int i; float f; memcpy(&f, &i, sizeof(f)); is valid, it doesn't change fs dynamic type but assigns it the bit-pattern of i. What the C++ standard seems to imply is that the storage type of a bunch of memory (or an object with automatic storage) changes on "assignment". So, indeed for C++ re-ordering writes is not allowed, and escaping pointers must be assumed to change dynamic type. So for C++ and 4.2 the best solution looks like to disable type based aliasing completely. For C I'm not sure the behavior the standard mandates is very appealing - at least changing dynamic type via the use of memcpy should be allowed as a GCC extension (like we allow type-punning via the union trick).
Subject: Re: [4.0/4.1/4.2 Regression] placement new does not change the dynamic type as it should "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes: | ------- Comment #22 from mark at codesourcery dot com 2006-10-04 05:39 ------- | Subject: Re: [4.0/4.1/4.2 Regression] placement new | does not change the dynamic type as it should | | ian at airs dot com wrote: | > ------- Comment #21 from ian at airs dot com 2006-10-03 23:44 ------- | > In C a general allocation function should work with a char array. A specific | > allocation function should use a union. I don't think there are many existing | > exceptions to these guidelines. | | > So I don't see a serious problem in C either. Am I missing something/ | | I think there are two remaining issues: | | int i; | *(float*)(&i) = 7.0; | | IIUC, Mike's position is that this is valid -- and that, in fact, after | this point "i" can no longer be accessed as an int. Do you agree? I don't see how that code is supported by the paragraphs quoted by Mike. There was a recent discussion (at most two weeks ago) on the C++ standard reflector -core about similar topic. I do agree with Mark that the proper way to resolve this is to go to the C++ committee. It would be incredible that C++ was less type-strict than C. -- Gaby
So the solution for 4.3 here, from what i understand, seems pretty easy at the tree level, no matter what reading we wish to give this. If we add a placement_new_expr, and not try to revisit our interpretation of the standard, we can just DTRT and fix placement new. This would be best for optimizations, and IMHO, for users. Otherwise, we are at the point in 4.3 (In the past month this has changed, it's not really possible to backport this stuff to 4.2 very easily) where we have two types of variables from an aliasing perspective: SSA variables Global/static variables For both, we know if they are *actually* dereferenced (and not just escaping to a call, and thus "possibly dereferenced"). SSA variables never escape, and all casting will have been done in a statement that leads to the SSA variable. The SSA variable will only appear on the LHS once The only time they are currently pruned using TBAA information is at the dereference site (IE b_3 = *foo_5). This seems right to me, as it should have the correct dynamic type at this point, and anywhere else we prune using TBAA info without an explicit dereference is a bug. Under the reading that stores with different types are legal and start new object lifetimes, we would have to change the pruning to only prune at *load* dereference sites. Under this reading, we would never be able to reorder stores based on TBAA because they are never illegal under the reading that they start new object lifetimes. For globals/statics ("bare variables"), we currently will prune even if they are only "maybe-dereferenced". I think this is a bug under our current interpretation, and we should fix it. Because bare variables may appear on the LHS multiple times, we have a couple options, depending on the readings. Note that any of these require IPA and in the case of true globals, whole-program mode, in order to *ever be able to reorder a store/load based on TBAA*. We could gather the types that occur, and prune at load dereference sites assuming it is one of these, *if the variable does not escape the module*. Most statics do escape through library calls that we could add attributes to, but don't escape otherwise. So without attributing, you will lose roughly 90-100% of TBAA enabled store/load reordering of globals and statics. With attributing, you will probably lose 10-20% of TBAA enabled *load* reordering for statics, 100% of TBAA enabled reordering for globals (unless you promote them), and 100% of TBAA enabled store reordering. We could disable TBAA entirely for globals and statics, and you will just lose 100% everywhere :) TBAA accounts for about 7-25% of alias disambiguations in the real world for C and C++ (combined). This is according to both papers published by Intel about ICC, papers published by MS about VC, papers published by IBM about XLC, and testing of Google's codebase with and without -fno-strict-aliasing (with the caveat that we know our codebase is not entirely strict aliasing safe) That said, it is wildly variant from program to program as to how effective it is, and pretty unpredictable just from staring at code. I'll leave the decision about what we want to do up to someone else, i'm just explaining the aliasing options we have.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should dberlin at gcc dot gnu dot org wrote: > If we add a placement_new_expr, and not try to revisit our interpretation of > the standard, we can just DTRT and fix placement new. This would be best for > optimizations, and IMHO, for users. I agree that treating placement new specially makes sense. The first argument to a placement new operator could be considered to have an unspecified dynamic type on entrance to the operator, while the return value has the dynamic type specified by the operator. (So that the pointer returned by "new (x) int" has type "int *".) I'm not sure that placement_new_expr is the best way to accomplish this, but, maybe it is. Another possibility would be to define an attribute or attributes to specify the dynamic type of arguments and return types, and then have the C++ front end annotate all placement new operators with those attributes. I don't think that the other approach you sketch is very attractive because it sounds like we're going to lose a lot of our TBAA opportunities.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On 1 Jan 2007 00:41:44 -0000, mark at codesourcery dot com <gcc-bugzilla@gcc.gnu.org> wrote: > > > ------- Comment #26 from mark at codesourcery dot com 2007-01-01 00:41 ------- > Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement > new does not change the dynamic type as it should > > dberlin at gcc dot gnu dot org wrote: > > > If we add a placement_new_expr, and not try to revisit our interpretation of > > the standard, we can just DTRT and fix placement new. This would be best for > > optimizations, and IMHO, for users. > > I agree that treating placement new specially makes sense. The first > argument to a placement new operator could be considered to have an > unspecified dynamic type on entrance to the operator, while the return > value has the dynamic type specified by the operator. (So that the > pointer returned by "new (x) int" has type "int *".) Right. > > I'm not sure that placement_new_expr is the best way to accomplish this, > but, maybe it is. Another possibility would be to define an attribute > or attributes to specify the dynamic type of arguments and return types, > and then have the C++ front end annotate all placement new operators > with those attributes. It would be nice if we could transform those attributes on gimplification to something like an an "alias preserving cast" (or something of that nature) that states that the cast is type unioning for alias purposes (IE that the possible types of the result for TBAA/etc purposes is the union of the type of the cast and the type of the cast's operand).. Not a fully fleshed out idea, just something that popped into my head.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should dberlin at dberlin dot org wrote: > It would be nice if we could transform those attributes on > gimplification to something like an > an "alias preserving cast" (or something of that nature) that states > that the cast is type unioning for alias purposes (IE that the > possible types of the result for TBAA/etc purposes is the union of the > type of the cast and the type of the cast's operand).. That sounds reasonable to me.
Created attachment 13497 [details] Patch Here is one approach which fixes the test case. This introduces a new tree code, ALIASING_CONVERT_EXPR. It is conveyed into RTL via a flag on REGS: REG_ALIAS_ALL. I didn't try to really union the alias sets, I just said that the result of placement new can alias anything. This patch is essentially untested. I'm not very happy with this approach because it doesn't fail safe: it's too easy to lose the special aliasing, and then the problem appears again, but only with a more complicated test case. A safer approach might be to change the type returned by placement new and mark it as TYPE_REF_CAN_ALIAS_ALL, but then I'm worried about type conversion and type comparison problems, since it isn't actually a different type.
Only one suggestion for the patch, instead of: - if (is_gimple_cast (t) || TREE_CODE (t) == VIEW_CONVERT_EXPR) + if (is_gimple_cast (t) || TREE_CODE (t) == VIEW_CONVERT_EXPR + || TREE_CODE (t) == ALIASING_CONVERT_EXPR) Shouldn't ALIASING_CONVERT_EXPR be considered a gimple cast? Yes you might need to change some places which use is_gimple_cast but it seems like a better idea to treat it as a cast rather than anything else.
Also should we fold the case where ALIASING_CONVERT_EXPR and the argument type are the same. Also I think this right now causes a regression on the tunk interacting with the patch which fixed PR 31146.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should ian at airs dot com wrote: > Here is one approach which fixes the test case. This introduces a new tree > code, ALIASING_CONVERT_EXPR. It is conveyed into RTL via a flag on REGS: > REG_ALIAS_ALL. I didn't try to really union the alias sets, I just said that > the result of placement new can alias anything. This patch is essentially > untested. I think this is a reasonable approach I agree that it will require care to make sure this is threaded through the compiler in all places (for example, we may need variants of STRIP_NOPs that do/don't strip it), but anything else is going to be too pessimistic. I think that creating a separate type, with TYPE_REF_CAN_ALIAS_ALL set, is worse: it's not the type of the expression that matters, but the action of the expression itself. It's the act of placement-newing that destroys type information; after that, the type that you have is what you expect.
Created attachment 13504 [details] Patch Here is a much simpler patch which also fixes the problem. This simply inserts a memory clobber before each placement new. This is safe, but is not as efficient as possible. How much do we care about optimizing placement new? How often is placement new used in a way which we could in practice optimize anyhow? For the test case in this PR, the penalty is pretty severe: gcc can no longer eliminate the loop. But this code is obviously artificial in that the loop serves no purpose.
First blush, I'm not enthusiastic about this last proposal. Seems to me just a concealed way to do what Richard suggested at the beginning of this thread. As such, I'm still finding it, in spirit at least, against the letter of 18.4.1.3/1-3, according to which placement new returns the ptr, and "*intentionally* (emphasis mine) performs no other action".
(In reply to comment #34) > "*intentionally* (emphasis mine) performs no other action". There is no other action taken, it is just a barrior, not even really an action in my mind.
Well, the patch in comment #33 really does what I did with the patch to libstdc++. If we take the first patch (from comment #29) and change its effects on the alias machinery so that for q = ALIAS_CONVERT_EXPR (p) we have two VDEFs on qs and ps alias tags like # SMT.1_2 = VDEF <SMT.1_1> # SMT.2_2 = VDEF <SMT.2_1> q = ALIAS_CONVERT_EXPR (p) (and optimize q = ALIAS_CONVERT_EXPR (p) to q = p if the types of q and p are the same, as pinskia noted), then we are safe at the tree level (the ALIAS_CONVERT_EXPR represents a load/store barrier for qs and pq type). At expansion time we'd need to do the same, which would be emitting clobbers for mem expressions representing dereferences of q/p. I believe we cannot really do better here.
By the way, would it be possible to have a look at the assembly of competitors? I'm under the impression that a quick look would more than enough to understand whether we are optimizing "enough"...
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should rguenth at gcc dot gnu dot org wrote: > ------- Comment #36 from rguenth at gcc dot gnu dot org 2007-05-03 21:59 ------- > Well, the patch in comment #33 really does what I did with the patch to > libstdc++. I think that Ian's second patch (the heavy hammer of a memory clobber, generated by the front end) is also reasonable. It's more heavy-handed than Ian's earlier solution, but certainly less intrusive and easier to get right. I think that either of these solutions will make user code work as expected, which is the most important consideration. I guess my inclination would be to go with the heavy hammer, until/unless we start finding real code that's seriously impacted.
From my perspective as a user, I wouldn't find a memory clobber in front of every placement new all that terrible. People don't do that in tight loops, and even if they did all that would happen is that the clobber presents a barrier across which no memory accesses can be moved and across which no dead stores etc can be eliminated. Presumably with every placement new/delete, there comes some housekeeping of available memory, so there will be a significant amount of code before and after, and so a single barrier isn't going to have the effect of slowing down a code by any significant factor. This appears to me to be a case where one could overengineer a solution. I would just go with the memory clobber and see whether we get bug reports of people complaining about missed optimizations :-) W.
Created attachment 13506 [details] Patch Here is yet another patch, an improvement over the previous one. It finally dawned on me that we can use an asm to represent the exact memory operations we care about, namely a use of the memory accessed via the old type and a def of the memory accessed via the new type. This way we don't mess with alias sets at all per se, we just put down a barrier for exactly the memory addresses that matter (of course the effect may be to VUSE and VDEF SMT tags, but the aliasing machinery takes care of that for us). This patch still needs some more testing--in particular I need to make sure it does the right thing for templates--but it passes bootstrap and the g++ testsuite. Richard, could you test it on your C++ test case(s)? Thanks.
In case it wasn't clear already, many thanks Ian and everyone for fixing this annoying issue.
I put it on the tester. A testcase that does regress (simplified from tramp3d): template <class T, int D> class Vector { public: Vector() { for (int i = 0; i < D; ++i) new (&x_m[i]) T(); } T& operator[](int i) { return x_m[i]; } private: T x_m[D]; }; void foo(Vector<double, 3> *m) { Vector<double, 3> v; v[0] = 1.0; v[1] = 2.0; v[3] = 3.0; *m = v; } where we can no longer optimize the redundant stores to v: <bb 2>: v.x_m[0] = 0.0; __asm__ __volatile__("":"=m" v.x_m[0]:"m" v.x_m[0]); v.x_m[1] = 0.0; __asm__ __volatile__("":"=m" v.x_m[1]:"m" v.x_m[1]); v.x_m[2] = 0.0; __asm__ __volatile__("":"=m" v.x_m[2]:"m" v.x_m[2]); v.x_m[0] = 1.0e+0; v.x_m[1] = 2.0e+0; v.x_m[3] = 3.0e+0; *m = v; return; vs. <bb 2>: v.x_m[2] = 0.0; v.x_m[0] = 1.0e+0; v.x_m[1] = 2.0e+0; v.x_m[3] = 3.0e+0; *m = v; return; and assembly: _Z3fooP6VectorIdLi3EE: .LFB17: xorl %r9d, %r9d movq %r9, -40(%rsp) movq %r9, -32(%rsp) movq %r9, -24(%rsp) movabsq $4607182418800017408, %r8 movabsq $4611686018427387904, %rsi movq -24(%rsp), %rax movq %r8, -40(%rsp) movq %rsi, -32(%rsp) movq -40(%rsp), %rcx movq -32(%rsp), %rdx movq %rax, 16(%rdi) movq %rcx, (%rdi) movq %rdx, 8(%rdi) ret vs. _Z3fooP6VectorIdLi3EE: .LFB17: movabsq $4607182418800017408, %r8 movabsq $4611686018427387904, %rsi movq $0, -24(%rsp) movq %r8, -40(%rsp) movq %rsi, -32(%rsp) movq -40(%rsp), %rcx movq -32(%rsp), %rdx movq -24(%rsp), %rax movq %rcx, (%rdi) movq %rdx, 8(%rdi) movq %rax, 16(%rdi) ret this happens in hot tramp3d loops and is a quite common idiom for initializing storage. To fix this we need to avoid creating the asm if the type of the original storage is the same as the other.
(In reply to comment #42) > this happens in hot tramp3d loops and is a quite common idiom for initializing > storage. To fix this we need to avoid creating the asm if the type of the > original storage is the same as the other. Many thanks Richard. Indeed, I find this already mentioned optimization a very sensible thing to add. (By the way, someone should tell the tramp3d developers that using placement new for a POD type like double isn't a very smart idea ;) The only tricky point for default inizialization is whether using memset or a plain loop)
(In reply to comment #43) > (By the way, someone should tell the tramp3d developers that using placement > new for a POD type like double isn't a very smart idea ;) The only tricky point > for default inizialization is whether using memset or a plain loop) Forgot to add a detail that may be of interest, however: C++03 *fails* to classify as POD some very common and simple types which are morally PODs: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2172.html Thus let's not assume users to be too good here: sometimes, in generic code, they may end-up calling placement new also for very simple types.
Of course in the tramp3d case the code is written in such a generic way that it supports non-POD data members... (And yes, "upstream" has this optimized, but tramp3d-v4.cpp is about having a gcc testcase, not perfectly user-written code ;))
http://www.suse.de/~gcctest/c++bench/tramp3d/ 2nd graph from the top :( boost ICEs: /gcc/spec/sb-vangelis-head-64/boost_1_33_1/boost/wave/util/unput_queue_iterator.hpp:221: warning: suggest parentheses around && within || ../cpp.cpp: In member function 'typename boost::spirit::parser_result<boost::spirit::sequence<A, B>, ScannerT>::type boost::spirit::sequence<A, B>::parse(const ScannerT&) const [with ScannerT = boost::spirit::scanner<const char*, boost::spirit::scanner_policies<boost::spirit::skipper_iteration_policy<boost::spirit::iteration_policy>, boost::spirit::match_policy, boost::spirit::action_policy> >, A = boost::spirit::sequence<boost::spirit::action<boost::spirit::symbols<int, char, boost::spirit::impl::tst<int, char> >, boost::spirit::ref_value_actor<int, boost::spirit::assign_action> >, boost::spirit::action<boost::spirit::int_parser<int, 10, 1u, -0x00000000000000001>, boost::spirit::ref_value_actor<int, boost::spirit::assign_action> > >, B = boost::spirit::action<boost::spirit::int_parser<int, 10, 1u, -0x00000000000000001>, boost::spirit::ref_value_actor<int, boost::spirit::assign_action> >]': ../cpp.cpp:715: internal compiler error: in get_indirect_ref_operands, at tree-ssa-operands.c:1704 Please submit a full bug report, with preprocessed source if appropriate. See <URL:http://gcc.gnu.org/bugs.html> for instructions. the run is not yet finished (currently mico is running).
Unfortunately there is no obvious way to avoid creating the asm if the types are the same. The reason is that the dynamic types are different, and we don't know the dynamic type. Look at your original test case: Bar *b = new (f) Bar; b->p = 0; f = new (f) Foo; Here f has type Foo*. So if we test whether the types are the same, we conclude that we don't need any blockage in the last statement quoted above. But that misses the fact that although the static type of *f is Foo, the dynamic type of *f is Bar. Any ideas?
Doh, you are right. So we're back to doing it in the aliasing machinery and with a new tree-code I think. On IRC Danny said he "can do it" ;)
Created attachment 13543 [details] Patch Like sands through the hourglass, so are the patches to PR 29286. This version creates a new tree code. At the RTL level the aliasing is represented by a list of merged alias sets. I think this is the best one so far. Any comments?
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On 11 May 2007 23:22:14 -0000, ian at airs dot com <gcc-bugzilla@gcc.gnu.org> wrote: > > > ------- Comment #49 from ian at airs dot com 2007-05-12 00:22 ------- > Created an attachment (id=13543) > --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13543&action=view) > Patch > > Like sands through the hourglass, so are the patches to PR 29286. > > This version creates a new tree code. At the RTL level the aliasing is > represented by a list of merged alias sets. I think this is the best one so > far. Any comments? > I like this a lot, and it looks like it should do exactly the right thing.
This looks indeed very promising. I'll give it a spin on tramp3d. The only thing I noticed is that + case CHANGE_DYNAMIC_TYPE_EXPR: + /* We can access the location with the given type. We don't + track pointers in RTL, so we handle this by assuming that any + reference to one type can alias any reference to the other + type across the whole function. No code is generated. */ + { + tree type = CHANGE_DYNAMIC_TYPE_NEW_TYPE (exp); + tree ptr = CHANGE_DYNAMIC_TYPE_LOCATION (exp); + tree ptr_type = TREE_TYPE (ptr); + + gcc_assert (TREE_CODE (type) == POINTER_TYPE); + gcc_assert (TREE_CODE (ptr_type) == POINTER_TYPE); + merge_alias_sets_in_function (cfun, + get_alias_set (TREE_TYPE (type)), + get_alias_set (TREE_TYPE (ptr_type))); + return const0_rtx; + } doesn't seem to handle the case the original pointed to type is not the old dynamic type (which you rightfully were concerned about in comment #47). So it looks like (in absence of points-to information) we need to make the new dynamic type alias everything?
But I get Program received signal SIGSEGV, Segmentation fault. 0x00000000006e646f in find_assert_locations (bb=0x2b56e8984480) at ../../trunk/gcc/tree-vrp.c:3193 3193 if (flag_delete_null_pointer_checks && POINTER_TYPE_P (TREE_TYPE (op))) (gdb) call debug_bb (bb) ;; basic block 3, loop depth 1, count 0 ;; prev block 2, next block 4 ;; pred: 6 [91.0%] (true,exec) ;; succ: 4 [85.0%] (true,exec) 5 [15.0%] (false,exec) <bb 3>: D.926012_14 = D.926053_23; <<<change_dynamic_type (struct Loc *) D.926018_15)>>> this_16 = __cur_1; if (__cur_1 != 0B) goto <bb 4>; else goto <bb 5> (<L9>); compiling tramp3d. Supposedly because we walk all USEs in the stmt and find (struct Loc *) which is not exactly a "use".
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On 12 May 2007 11:11:03 -0000, rguenth at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org> wrote: > > > ------- Comment #52 from rguenth at gcc dot gnu dot org 2007-05-12 12:11 ------- > But I get > > Program received signal SIGSEGV, Segmentation fault. > 0x00000000006e646f in find_assert_locations (bb=0x2b56e8984480) > at ../../trunk/gcc/tree-vrp.c:3193 > 3193 if (flag_delete_null_pointer_checks && POINTER_TYPE_P (TREE_TYPE > (op))) > (gdb) call debug_bb (bb) > ;; basic block 3, loop depth 1, count 0 > ;; prev block 2, next block 4 > ;; pred: 6 [91.0%] (true,exec) > ;; succ: 4 [85.0%] (true,exec) 5 [15.0%] (false,exec) > <bb 3>: > D.926012_14 = D.926053_23; > <<<change_dynamic_type (struct Loc *) D.926018_15)>>> > this_16 = __cur_1; > if (__cur_1 != 0B) > goto <bb 4>; > else > goto <bb 5> (<L9>); > > > compiling tramp3d. Supposedly because we walk all USEs in the stmt and > find (struct Loc *) which is not exactly a "use". Probably need to teach tree-ssa-operands that only one of the operands is really a use. > > > -- > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29286 > > ------- You are receiving this mail because: ------- > You are on the CC list for the bug, or are watching someone who is. >
Regarding comment #51: I think the code is OK. What it does is, at the RTL level, make the two types alias each other for the entire function. This is different from what was happening on comment #47; that patch was making the two types alias at that specific point in the program. At the tree level we want to alias specific pointers, and the new patch retains that. It would be nice to do that at the RTL level also, but we unfortunately have no mechanism for that. So people who use placement new with different types, neither of which is char*, may see some optimization regression in some cases, because the scheduler will be more constrained than is really required. I don't see a way to avoid that. I also don't think it's very important.
Regarding comment #52: do you have a reasonably sized test case? I didn't see any problems when I ran the C++ testsuite. I think tree-ssa-operands.c does the right thing: case CHANGE_DYNAMIC_TYPE_EXPR: get_expr_operands (stmt, &CHANGE_DYNAMIC_TYPE_LOCATION (expr), opf_use); return; But there may well be something I misunderstand.
I'm trying to reduce it. The problem seems to be somewhere else (-O -fstrict-aliasing): /space/rguenther/tramp3d/install/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../include/c++/4.3.0/bits/stl_uninitialized.h: In function '_ForwardIterator std::__uninitialized_copy_aux(_InputIterator, _InputIterator, _ForwardIterator) [with _InputIterator = __gnu_cxx::__normal_iterator<const Loc<1>*, std::vector<Loc<1>, std::allocator<Loc<1> > > >, _ForwardIterator = Loc<1>*]': /space/rguenther/tramp3d/install/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../include/c++/4.3.0/bits/stl_uninitialized.h:71: error: SSA name in freelist but still referenced D.927916_15 /space/rguenther/tramp3d/install/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../include/c++/4.3.0/bits/stl_uninitialized.h:71: internal compiler error: verify_stmts failed Please submit a full bug report, with preprocessed source if appropriate. See <URL:http://gcc.gnu.org/bugs.html> for instructions.
It's become a libstdc++ test, but anyway: -O -fstrict-aliasing namespace Pooma { typedef int Context_t; namespace Arch { } inline Context_t context() { } inline int contexts() { } } template<class DomT, class T, class NewDom1T> struct DomainTraitsScalar { }; template<class T> struct DomainTraits : public DomainTraitsScalar<T, T, T> { }; template<int Dim> class Grid; template<class DT> class DomainBase { }; template<int Dim, class DT> class Domain : public DomainBase<DT> { }; #include <vector> template<> class Grid<1> : public Domain<1, DomainTraits<Grid<1> > > { }; namespace Pooma { class PatchSizeSyncer { typedef Grid<1> Grid_t; PatchSizeSyncer(int contextKey, Grid_t &localGrid); int myContext_m; int numContexts_m; int localKey_m; Grid_t localGrid_m; typedef std::pair<int,Grid_t *> Elem_t; std::vector<Elem_t> gridList_m; }; } namespace Pooma { PatchSizeSyncer::PatchSizeSyncer(int contextKey, Grid_t &localGrid) : myContext_m(Pooma::context()), numContexts_m(Pooma::contexts()), localKey_m(contextKey), localGrid_m(localGrid) { if (myContext_m == 0) gridList_m.reserve(numContexts_m); } } /space/rguenther/tramp3d/install/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../include/c++/4.3.0/bits/stl_uninitialized.h: In function '_ForwardIterator std::__uninitialized_copy_aux(_InputIterator, _InputIterator, _ForwardIterator) [with _InputIterator = std::pair<int, Grid<1>*>*, _ForwardIterator = std::pair<int, Grid<1>*>*]': /space/rguenther/tramp3d/install/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../include/c++/4.3.0/bits/stl_uninitialized.h:71: error: SSA name in freelist but still referenced D.8564_13 /space/rguenther/tramp3d/install/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../include/c++/4.3.0/bits/stl_uninitialized.h:71: internal compiler error: verify_stmts failed Please submit a full bug report, with preprocessed source if appropriate. See <URL:http://gcc.gnu.org/bugs.html> for instructions.
Created attachment 13551 [details] Patch That test case turns out to be an oversight in my patch to tree-ssa-dce.c. Here is the new patch against mainline, which has a one line change to tree-ssa-dce.c compared to the old patch.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On 14 May 2007 03:45:25 -0000, ian at airs dot com <gcc-bugzilla@gcc.gnu.org> wrote: > > > ------- Comment #58 from ian at airs dot com 2007-05-14 04:45 ------- > Created an attachment (id=13551) > --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13551&action=view) > Patch > > That test case turns out to be an oversight in my patch to tree-ssa-dce.c. > Here is the new patch against mainline, which has a one line change to > tree-ssa-dce.c compared to the old patch. I believe your tree-ssa-dce part is wrong. I see no reason for them to be marked always necessary. They are only necessary if the result is used, passed somewhere else, etc. These should be properly marked in the regular course of DCE. > > > -- > > ian at airs dot com changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > Attachment #13543 [details]|0 |1 > is obsolete| | > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29286 > > ------- You are receiving this mail because: ------- > You are on the CC list for the bug, or are watching someone who is. >
But it doesn't have a result, does it? Given that, I wonder how moving stmts across it is prevented?
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On 14 May 2007 08:25:27 -0000, rguenth at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org> wrote: > > > ------- Comment #60 from rguenth at gcc dot gnu dot org 2007-05-14 09:25 ------- > But it doesn't have a result, does it? Given that, I wonder how moving stmts > across it is prevented? Okay, so then it needs an LHS that defines a new SSA name, otherwise, we'll end up with dead ones everywhere, and they will keep other dead code alive.
We ICE with template <class T, class T2, int D> class Vector { public: Vector() { for (int i = 0; i < D; ++i) new (&x_m[i]) T2(); } T2& operator[](int i) { return *reinterpret_cast<T2*>(&x_m[i]); } private: T x_m[D]; }; void foo(Vector<double, long, 3> *m) { Vector<double, long, 3> v; v[0] = 1; v[1] = 2; v[2] = 3; *m = v; } (adjust the storage type to be of same size for double/long): Program received signal SIGSEGV, Segmentation fault. 0x0000000000b11a5e in unmodifiable_var_p (var=0x0) at /space/rguenther/src/svn/trunk/gcc/tree-flow-inline.h:1598 1598 if (TREE_CODE (var) == SSA_NAME) #0 0x0000000000b11a5e in unmodifiable_var_p (var=0x0) at /space/rguenther/src/svn/trunk/gcc/tree-flow-inline.h:1598 #1 0x0000000000b194c3 in may_alias_p (ptr=0x2b537d358210, mem_alias_set=2, var=0x2b537d359e00, var_alias_set=3, alias_set_only=0 '\0') at /space/rguenther/src/svn/trunk/gcc/tree-ssa-alias.c:2685 #2 0x0000000000b17f2b in add_flow_insensitive_var_aliases (ai=0x14d7890, pvar=0x2b537d358210, set=2, tag=0x2b537d356ba0) at /space/rguenther/src/svn/trunk/gcc/tree-ssa-alias.c:2190 #3 0x0000000000b184d3 in compute_change_dynamic_type_aliasing ( vcdt=0x2b537d357a00, vai=0x14d7890) at /space/rguenther/src/svn/trunk/gcc/tree-ssa-alias.c:2242 #4 0x000000000098a387 in pointer_set_traverse (pset=0x14e4e10, fn=0xb17f84 <compute_change_dynamic_type_aliasing>, data=0x14d7890) at /space/rguenther/src/svn/trunk/gcc/pointer-set.c:181 #5 0x0000000000b18719 in compute_flow_insensitive_aliasing (ai=0x14d7890) at /space/rguenther/src/svn/trunk/gcc/tree-ssa-alias.c:2295 #6 0x0000000000b15b17 in compute_may_aliases () where in may_alias_p the ptr we passed does not have a symbol memory tag associated. (but var is SFT.3)
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On 14 May 2007 15:20:14 -0000, rguenth at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org> wrote: > > > ------- Comment #62 from rguenth at gcc dot gnu dot org 2007-05-14 16:20 ------- > We ICE with > > template <class T, class T2, int D> > class Vector > { > public: > Vector() > { > for (int i = 0; i < D; ++i) > new (&x_m[i]) T2(); > } > T2& operator[](int i) { return *reinterpret_cast<T2*>(&x_m[i]); } > > private: > T x_m[D]; > }; > > void foo(Vector<double, long, 3> *m) > { > Vector<double, long, 3> v; > v[0] = 1; > v[1] = 2; > v[2] = 3; > *m = v; > } > > (adjust the storage type to be of same size for double/long): > > Program received signal SIGSEGV, Segmentation fault. > 0x0000000000b11a5e in unmodifiable_var_p (var=0x0) > at /space/rguenther/src/svn/trunk/gcc/tree-flow-inline.h:1598 > 1598 if (TREE_CODE (var) == SSA_NAME) > > #0 0x0000000000b11a5e in unmodifiable_var_p (var=0x0) > at /space/rguenther/src/svn/trunk/gcc/tree-flow-inline.h:1598 > #1 0x0000000000b194c3 in may_alias_p (ptr=0x2b537d358210, mem_alias_set=2, > var=0x2b537d359e00, var_alias_set=3, alias_set_only=0 '\0') > at /space/rguenther/src/svn/trunk/gcc/tree-ssa-alias.c:2685 > #2 0x0000000000b17f2b in add_flow_insensitive_var_aliases (ai=0x14d7890, > pvar=0x2b537d358210, set=2, tag=0x2b537d356ba0) > at /space/rguenther/src/svn/trunk/gcc/tree-ssa-alias.c:2190 > #3 0x0000000000b184d3 in compute_change_dynamic_type_aliasing ( > vcdt=0x2b537d357a00, vai=0x14d7890) > at /space/rguenther/src/svn/trunk/gcc/tree-ssa-alias.c:2242 > #4 0x000000000098a387 in pointer_set_traverse (pset=0x14e4e10, > fn=0xb17f84 <compute_change_dynamic_type_aliasing>, data=0x14d7890) > at /space/rguenther/src/svn/trunk/gcc/pointer-set.c:181 > #5 0x0000000000b18719 in compute_flow_insensitive_aliasing (ai=0x14d7890) > at /space/rguenther/src/svn/trunk/gcc/tree-ssa-alias.c:2295 > #6 0x0000000000b15b17 in compute_may_aliases () > > where in may_alias_p the ptr we passed does not have a symbol memory tag > associated. (but var is SFT.3) unmodifiable_var_p is not allowed to alias anything, so we don't create SMT's for them.
Re: comment #59: CHANGE_DYNAMIC_TYPE_EXPR does not have a result. I set it up so that it is essentially volatile, and represents an additional type aliasing at that point in the program. One of the earlier patches set it up as a cast. But it was hard to convince myself that it really did the right thing. A lot of code in the compiler strips casts for one reason or another. One would have to examine each of those places and make sure that it did not strip the cast inappropriately. Or, conversely, that it did strip the cast when it should. This would tend to introduce a bit of code in many different parts of the compiler, all to handle a very unusual case. Generating a statement with no result seemed more reliable to me. It's true that there is a downside: it may cause a placement new to be preserved inappropriately. But that seems like an unlikely case to me. I doubt that very much code does a placement new which changes types and then does not use the resulting memory. I think it's OK to misoptimize such code slightly.
Created attachment 13553 [details] Patch Thanks for the new test case. Here is a new patch. The change compared to the previous patch is just that we check whether the variable has a memory tag (i.e., the variable has been written) before calling add_flow_insensitive_var_aliases, rather than after.
Will not be fixed in 4.2.0; retargeting at 4.2.1.
The patch in comment #65 passes bootstrap and the gcc and libstdc++-v3 testsuites on i686-pc-linux-gnu.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On 14 May 2007 21:35:58 -0000, ian at airs dot com <gcc-bugzilla@gcc.gnu.org> wrote: > > > ------- Comment #67 from ian at airs dot com 2007-05-14 22:35 ------- > The patch in comment #65 passes bootstrap and the gcc and libstdc++-v3 > testsuites on i686-pc-linux-gnu. > Ian has made a convincing argument that optimizing placement new just ain't gonna matter here, so i'm fine with his patch to DCE :)
Unless somebody feels strongly otherwise, I propose that we only fix this in mainline (i.e., 4.3). Any fix is going to be delicate, this bug is unusual and difficult to trigger, and there is a source code workaround.
The last patch looks good sofar.
But it doesn't fix the POOMA miscompilations if I change its Pool allocator to use placement new instead of this // Release a block to the pool. inline void free(void *b) { // Record a free. outstandingAllocs_m -= 1; // Cast the pointer to the right type. Link *p = (Link*)b; // Make it point to the current head of the free list. p->next_m = head_m; // Make the next one the new head of the list. // We can't do head_m = p->next_m since p will soon be treated // as something other than a Link. By doing this assignment // with memcpy, we ensure that p->next_m will be read before // it is clobbered. memcpy(&p->next_m, &head_m, sizeof(head_m)); // Make it the head of the free list. head_m = p; } memcpy hack. I'll try to investigate...
inline void* operator new(unsigned long, void* __p) throw() { return __p; } extern void bar(); float foo(double *p) { long *l = (long *)p; *l = 1; float *f = new (p) float; *f = 0.0; bar (); return *f; } with -O2 I see <bb 2>: l_2 = (long int *) p_1(D); # SMT.4_12 = VDEF <SMT.4_11(D)> *l_2 = 1; <<<change_dynamic_type (float *) p_1(D))>>> __p_8 = p_1(D); f_5 = (float *) __p_8; # SMT.5_14 = VDEF <SMT.5_13(D)> *f_5 = 0.0; # SMT.4_15 = VDEF <SMT.4_12> # SMT.5_16 = VDEF <SMT.5_14> bar (); # VUSE <SMT.5_16> D.2462_6 = *f_5; return D.2462_6; which looks wrong - the store to f_5 needs to alias *l_2?
For the following testcase, loop store motion re-orders the writes to *l and *f at -O2. (Note as is, the testcase is only valid for n == 1) inline void* operator new(unsigned long, void* __p) throw() { return __p; } extern void bar(); long foo(double *p, int n) { long *f; for (int i=0; i<n; ++i) { int *l = (int *)p; *l = 0; f = new (p) long; *f = -1; } bar (); return *f; } and we end up with <bb 2>: if (n > 0) goto <bb 3>; else goto <bb 4>; <bb 3>: f = (long int *) p; <<<change_dynamic_type (long int *) p)>>> # SMT.4_27 = VDEF <SMT.4_19(D)> *f = -1; # SMT.5_35 = VDEF <SMT.5_20(D)> *(int *) p = 0; <bb 4>: # SMT.4_23 = VDEF <SMT.4_32> # SMT.5_24 = VDEF <SMT.5_34> bar (); return *f; which is wrong. Runtime testcase that aborts: inline void* operator new(unsigned long, void* __p) throw() { return __p; } void __attribute__((noinline)) bar() {}; long __attribute__((noinline)) foo(double *p, int n) { long *f; for (int i=0; i<n; ++i) { int *l = (int *)p; *l = 0; f = new (p) long; *f = -1; } bar (); return *f; } extern "C" void abort(void); int main() { union { int i; long l; } u; if (foo((double *)&u, 1) != -1) abort (); return 0; }
Note that another solution to this whole problem is to record conflicts for all stored-to types that we cannot disambiguate by points-to analysis. This also solves the reading of the standard that allows int i = 1; (float *)&i = 0.0; at RTL expansion time we need to do the same apart from that we don't have points-to information available. The above supports the notion of "re-using memory changes its dynamic type". Which of course gets a little interesting for aggregates that are initialized piece-wise (not that this problem isn't present now).
In the chance of endlessly repeating myself here (;)) - what comment #73 boils down to is that we may not prune alias sets of stores based on TBAA results. In fact, we need to realize that C++ object lifetime rules need to be modeled somehow - it's those rules that count after all, not some magic side-effects of placement new. Of course I realize that without TBAA pruning we have no alias information left on the RTL side for stores (so stores effectively have alias set zero).
I don't believe that the C++ standards writers really meant to eliminate TBAA. And that is the inevitable consequence of the dynamic memory type approach if you are allowed to change the dynamic type in a function. If we really want to adopt that approach, then the simple fix is for the C++ frontend to set flag_strict_aliasing to 0. But I don't think we want to do that.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should ian at airs dot com wrote: > I don't believe that the C++ standards writers really meant to eliminate TBAA. > And that is the inevitable consequence of the dynamic memory type approach if > you are allowed to change the dynamic type in a function. I agree with Ian. I think there is good evidence that the authors of the standard intended C++ to be *more* typesafe than C, and we should read the standard in that way. It's unfortunate that neither the C or C++ standards is very precise about various aspects of the memory model, but that is what it is. I think trying to read the standard to divine the answers to these questions is essentially futile; this is a situation where we should accept that the standard doesn't say, and do our best to balance performance with existing practice and expectations. I don't fully understand the point of Comment #73. I thought the whole point of this series of patches was to make the compiler understand that memory returned by placement new could alias other memory, or to otherwise introduce a barrier that would prevent the compiler from reordering accesses across the call to operator new. If that's the case, why does the post-patch compiler still think that the writes to "f" and "l" don't alias?
The test case in comment #73 is just a standard aliasing violation. You are casting a double* to an int* and writing to it both ways. As I argued in comment #76, the only way this is going to work is if we eliminate TBAA in C++.
The test case in comment #71 doesn't use placement new either. This PR was originally about placement new. I think there is general agreement thta placement new needs to avoid aliasing problem. I don't think there is general agreement that arbitrary type casting needs to avoid aliasing problems. Have you found any problems with my most recent patch and placement new?
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should ian at airs dot com wrote: > ------- Comment #78 from ian at airs dot com 2007-05-18 07:14 ------- > The test case in comment #73 is just a standard aliasing violation. You are > casting a double* to an int* and writing to it both ways. I'm confused. The double-ness looks irrelevant to me; it could just as well be "void *". The only actual accesses to the memory are through an "int *" pointer and a "long *" pointer, and there's a placement new between the two. I thought the whole point of these patches was to allow placement new to change the type in exactly this way?
Yes, both testcases are valid and are using placement new. Note the loop is only to confuse the optimizers enough to re-order the stores and produce a miscompilation. Note the loop runs exactly once, and in essence we are doing int *p = XXX; /* integer memory */ *p = 0; long *q = new (p) long; *q = -1; and the compiler is re-ordering the stores which is wrong. *p and *q need to alias because of the placement new.
Oh, and the double-ness is to show that at RTL expansion we actually unify alias sets of long and double, not long and int, which is wrong again. See my comment #51.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "ian at airs dot com" <gcc-bugzilla@gcc.gnu.org> writes: | The test case in comment #71 doesn't use placement new either. | | This PR was originally about placement new. Yes. | I think there is general agreement | thta placement new needs to avoid aliasing problem. Yes. The intent of placement new is to specify a new dynamic type, and end the lifetime of previous object. This is type safe, beause the "old" object no longer exists. | I don't think there is | general agreement that arbitrary type casting needs to avoid aliasing problems. agree. -- Gaby
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "rguenth at gcc dot gnu dot org" <gcc-bugzilla@gcc.gnu.org> writes: | ------- Comment #81 from rguenth at gcc dot gnu dot org 2007-05-18 09:45 ------- | Yes, both testcases are valid and are using placement new. Note the loop | is only to confuse the optimizers enough to re-order the stores and produce | a miscompilation. Note the loop runs exactly once, and in essence we are doing | | int *p = XXX; /* integer memory */ | *p = 0; | long *q = new (p) long; This is OK as long as int and long have same alignment, sizeof, etc. For example, that code is invalid on 64-bit platform where int has alignment and the storage is not large enough to meet long's requirement. This is the case only when the "operator new" is not overloaded, but is the "standard one". The exact conditions are so tricky to explicit that it would just be OK for us to accept it. -- Gaby
To avoid confusion about what testcase we speak, let's talk about the one I replicated below. I believe this one is valid as the storage is a union instantiated in main() and a pointer to it is passed to foo(). It doesn't matter the argument pointer type is double*, as we access the memory first as int* (which is ok and starts lifetime of int typed memory) and then we EOL that object and start lifetime of a long via placement new (alignment is ok, as is size). So, anyone disagrees that the below testcase is valid? inline void* operator new(unsigned long, void* __p) throw() { return __p; } void __attribute__((noinline)) bar() {}; long __attribute__((noinline)) foo(double *p, int n) { long *f; for (int i=0; i<n; ++i) { int *l = (int *)p; *l = 0; f = new (p) long; *f = -1; } bar (); return *f; } extern "C" void abort(void); int main() { union { int i; long l; } u; if (foo((double *)&u, 1) != -1) abort (); return 0; }
Re comment #80, comment #81, comment #82. My patch handles the placement new in comment #73 to indicate an alias between double and long. The mis-ordered code is actually aliasing int and long. That aliasing is due to a cast to (int*). There is no placement new in that cast. There is no reason for the compiler to avoid exchanging the store via int* and the store via long*. Why should it? If you omitted the placement new, this would be a standard aliasing violation between int and double. With the placement new, it's still an aliasing violation between int and long. The placement new doesn't excuse you from all aliasing violations: it just excuses you from aliasing violations involving the actual types in question. If you want to make a principled argument that placement new excuses you from all aliasing violations, then again the result is that in a function which uses placement new we should disable TBAA for that function. That would be OK with me and easy to implement.
I'm not sure what to make of comment #84. We don't determine aliasing by alignment or size. We determine it by type. We don't currently treat int and long as aliasing each other even if they happen to have the same alignment and size. I believe this is correct according to the C standard but I am less familiar with the C++ standard. We could change the aliasing machinery in that way for C++ if it seems to be appropriate, but I would prefer to take that to a different PR.
Regarding comment #85, this again relies on the notion of dynamic type of a memory location such that the only possible end result is to eliminate TBAA when compiling C++. The only thing I can say about this sort of test case is that I agree that one can make an argument from the standard that this type of code is valid. However, as I've said before, e.g., in comment #76, I don't think that anybody really wants that end result. Any example that inherently relies on a type cast is going to lead you straight to eliminating TBAA. And note that your example is clearly invalid in C. Adding the placement new doesn't really have anything to do with whether it is valid or invalid in C++. Can you show me a test case which my patch gets wrong which does not involve type casting outside of placement new itself?
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should ian at airs dot com wrote: > ------- Comment #86 from ian at airs dot com 2007-05-18 17:24 ------- > Re comment #80, comment #81, comment #82. My patch handles the placement new > in comment #73 to indicate an alias between double and long. The mis-ordered > code is actually aliasing int and long. That aliasing is due to a cast to > (int*). There is no placement new in that cast. There is no reason for the > compiler to avoid exchanging the store via int* and the store via long*. Why > should it? I don't think the fact that "p" is a "double *" is relevant; it could just as well be "void *". This kind of code is unambiguously valid: void f(double *p) { (int*)p = 3; } void g() { int i; f((double *)&i); } Pedantically, the alignment of "double" has to be no stricter than "int" for this to be valid, but since we define pointer conversion as a no-op, it's always valid in GNU C. This is why I liked your earlier patch that made placement new a memory barrier. I think the right handling of placement new is basically to say "everything we know about types is forgotten here". One can limit that to things to which the operand might point, but points-to analysis (should not) tell you that "p" and "l" cannot point at the same place, since we have no idea what "p" points at. I don't think this is equivalent to totally disabling TBAA in C++.
I agree that this is valid: void f(double *p) { *(int*)p = 3; } void g() { int i; f((double *)&i); } But I don't think that is the question at hand. The variable is always being accessed in the same type, which is also the type of its declaration. The question at hand is this: void f(double* p) { *(int*)p = 3; long *l = new (p) long; *l = 4; } void g() { int i; f((double *)&i); } And the specific question is whether we are permitted to interchange the assignments to *p and *l. Let's consider this: void f(double* p) { *(int*)p = 3; long *l = (long*)p; *l = 4; } Is that valid? Is the compiler permitted to interchange the assignments to *p and *l? Consider that, as in comment #73, p might actually point to a union of int and long. Does that fact that that union might exist somewhere else make this test case valid? Presumably it does not. Presumably this is invalid. So if that is not valid, and the placement new case is valid, then what is the essential difference between the cases? The variable is being accessed via two different types. Why is that OK? You're right that don't have to abandon TBAA to make this work, that we can make it work by turning placement new into a memory barrier. But then you have to address comment #42. That approach will cause a performance regression for natural valid code. The question then becomes whether we are willing to pay that performance regression for normal code in order to support this sort of weird code. Can anybody see a way through this maze?
(In reply to comment #90) > Can anybody see a way through this maze? Humbly, I'd like to suggest again that we have a look to the assembly produced by other compilers. I remember that some GCC contributors have access to compilers such as ICC, XLC++, Sun, etc. and I don't think we should strive at all costs for perfect optimization, if that means keeping busy our best contributors for months and risking producing wrong code. We can always return to the issue later, as an enhancement.
> So if that is not valid, and the placement new case is valid, then what is the > essential difference between the cases? The variable is being accessed via two > different types. Why is that OK? > void f(double* p) { *(int*)p = 3; long *l = new (p) long; *l = 4; } > void g() { int i; f((double *)&i); } Because the memory that p was pointing to, stops being an int once a placement new happens. For F, it goes: > void f(double* p) > { p points to an variable that is an int > *(int*)p = 3 access the int via an int. > long *l = new (p) long; The memory is no longer an int, it has become a long, it cannot be accessed as an int no longer as that would be undefined. > *l = 4; Access the memory as a long and since the type is a long, this is well defined. >} Now after this function returns the variable can only be accessed as a long (well and via a character type). Hopefully this explains why this is valid. Now should we do anything about it, I don't know because how often does this happen in real life, I don't know.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should ian at airs dot com wrote: > void f(double* p) { *(int*)p = 3; long *l = new (p) long; *l = 4; } > void g() { int i; f((double *)&i); } > > And the specific question is whether we are permitted to interchange the > assignments to *p and *l. I do not think we are. > void f(double* p) { *(int*)p = 3; long *l = (long*)p; *l = 4; } > > Is that valid? Is the compiler permitted to interchange the assignments to *p > and *l? Consider that, as in comment #73, p might actually point to a union of > int and long. Does that fact that that union might exist somewhere else make > this test case valid? Presumably it does not. Presumably this is invalid. Agreed; this case is invalid. > So if that is not valid, and the placement new case is valid, then what is the > essential difference between the cases? The variable is being accessed via two > different types. Why is that OK? Because placement new changes the type of storage, in the same way that using ordinary ("delete") and then using (ordinary) "new" (but getting back the same memory pointer) does. The placement "new" operator is special. > You're right that don't have to abandon TBAA to make this work, that we can > make it work by turning placement new into a memory barrier. But then you have > to address comment #42. That approach will cause a performance regression for > natural valid code. The question then becomes whether we are willing to pay > that performance regression for normal code in order to support this sort of > weird code. I am willing to accept that performance regression. I don't consider that code "normal"; many C++ performance libraries now provide a way to produce an uninitialized container, precisely to avoid default construction. POOMA could use that technique. It would of course be better (though, in my opinion, not essential) to have a more gentle barrier. If we could tell the compiler to forget the type of anything that the argument to placement-new might point to, but not to assume that arbitrary weirdness has occurred, then the compiler could still eliminate the redundant stores. In other words, in Comment #42, the problem is that the volatile asm tells the compiler that not only must the stores/loads not be reordered across the barrier, but that stores before the barrier must actually occur because their may be some arbitrary action at the barrier that depends upon the values written. If we had a barrier that says just that the operations may not be reordered across the barrier -- but does not say that the operations before the barrier are side-effecting -- then we could still eliminate them as redundant.
I tried the original test case with icc, and it gets the right result. The assignment b->p = 0 is discarded. Unfortunately I'm not sure what this actually tells us.
But construction/initialization of uninitalized memory in <memory> happens with placement new! So we're back to square one. What this PR initially was about is a fixed type memory allocator in C++ which needs to change memory from allocated type T to free-space-managing-structure S at deallocation time and the other way around at allocation time. We absolutely _have_ to handle this case correct. And we need to optimize the <memory> routines that use placement new, because they resemble patterns used in libraries like POOMA or Boost.
(In reply to comment #95) > But construction/initialization of uninitalized memory in <memory> happens with > placement new! Placement new is used only for non-POD types, to be clear.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should rguenth at gcc dot gnu dot org wrote: > But construction/initialization of uninitalized memory in <memory> happens with > placement new! So we're back to square one. What this PR initially was about > is a fixed type memory allocator in C++ which needs to change memory from > allocated type T to free-space-managing-structure S at deallocation time and > the other way around at allocation time. We absolutely _have_ to handle > this case correct. And we need to optimize the <memory> routines that use > placement new, because they resemble patterns used in libraries like POOMA > or Boost. First and foremeost, we have to generate correct code. If that means the memory barrier solution, for now, then so be it.
(In reply to comment #97) > First and foremeost, we have to generate correct code. If that means > the memory barrier solution, for now, then so be it. Yes, but I'm a little worried myself not by <memory> but by containers like <list>, <map> and <set>, where we call allocator::contruct, which uses placement new, to "install" the value in the allocated node. In that case, unfortunately, per 20.4.1.1/12, we can't optimize it for POD types :( At minimum we should benchmark a bit...
To complete my reasoning: in case we end-up with some sort of very bad pessimization of placement new, probably we'll have to adjust such containers to not call allocator::contruct at all when the default allocator is involved && a POD type is contructed, similarly to std::vector and std::deque for the uninitialized functions.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "ian at airs dot com" <gcc-bugzilla@gcc.gnu.org> writes: | I'm not sure what to make of comment #84. We don't determine aliasing by | alignment or size. We determine it by type. We don't currently treat int and | long as aliasing each other even if they happen to have the same alignment and | size. That is GOOD. :-) | I believe this is correct according to the C standard but I am less | familiar with the C++ standard. It is also correct semantics with respect to C++. C++ goes further (I don't have C standard handy to check). The memory pointed to by p must have the "right" alignment requirements etc. So if you grab memory into p and it does not have the right alignment, then the new placement yields undefined behaviour. That is not a property we can check statically (until we get the alignment proposal into C++). | We could change the aliasing machinery in that | way for C++ if it seems to be appropriate, but I would prefer to take that to a | different PR. What I was trying to say was that C++ provides the same "dynamic type" (non-)aliasing guarantees, and goes even further. Did I manage to confuse you again? -- Gaby
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "ian at airs dot com" <gcc-bugzilla@gcc.gnu.org> writes: | But I don't think that is the question at hand. The variable is always being | accessed in the same type, which is also the type of its declaration. The | question at hand is this: | | void f(double* p) { *(int*)p = 3; long *l = new (p) long; *l = 4; } | void g() { int i; f((double *)&i); } | | And the specific question is whether we are permitted to interchange the | assignments to *p and *l. After the placement new, the lifetime of *p is ended, and the lifetime of *l starts there. I don't think that leaves room for exchanging the stores to *p and *l. | Let's consider this: | | void f(double* p) { *(int*)p = 3; long *l = (long*)p; *l = 4; } | | Is that valid? If p is pointing to a union object that contain both int and long, yes, it is valid. Otherwisem, it is not. | Is the compiler permitted to interchange the assignments to *p | and *l? It is valid if we have a union object, otherwise, no. -- Gaby
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "pcarlini at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes: | ------- Comment #96 from pcarlini at suse dot de 2007-05-18 21:12 ------- | (In reply to comment #95) | > But construction/initialization of uninitalized memory in <memory> happens with | > placement new! | | Placement new is used only for non-POD types, to be clear. yes, but it is also valid for POD types. -- Gaby
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "pcarlini at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes: | ------- Comment #98 from pcarlini at suse dot de 2007-05-18 21:27 ------- | (In reply to comment #97) | > First and foremeost, we have to generate correct code. If that means | > the memory barrier solution, for now, then so be it. | | Yes, but I'm a little worried myself not by <memory> but by containers like | <list>, <map> and <set>, where we call allocator::contruct, which uses | placement new, to "install" the value in the allocated node. In that case, | unfortunately, per 20.4.1.1/12, we can't optimize it for POD types :( At | minimum we should benchmark a bit... If the element type is a POD, we should use assignment, not placement new. -- Gaby
(In reply to comment #103) > If the element type is a POD, we should use assignment, not placement new. Agreed, in principle. But before adding a load of templates and dispatching, let's wait a bit for the outcome of this issue and benchmark carefully: the performance difference may be minor, given that we are dynamically allocating memory for the new node.
Let me again do a step back and look at the problem from another view ;) -- C++ aliasing imposes additional restrictions on transformations we are allowed to do to memory references compared to C type-based aliasing rules. Consider two memory references A and B which we need to know whether we can exchange them. - If we can prove that both memory references are to non-overlapping memory regions we always can exchange them (PTA can provide this knowledge). - We can always re-order two loads. TBAA gives us another source of information to disambiguate the two references. So, if TBAA says the two references do not conflict then - we can hoist loads across stores. *double = 1.0; x = *int; if *int aliases *double the program is invalid. - we can sink stores across loads. (this is just the opposite view of the above) - we _cannot_ re-order stores. - we _cannot_ sink loads across stores. x = *int; *double = 1.0; the store to double may change the dynamic type of what *int points to. - we _cannot_ hoist stores across loads. (opposite view of the above) Note that all the interesting stuff (hoisting loads and sinking stores) is not affected by the stricter C++ aliasing rules. The caveat is, that these rules do not map to our representation of aliasing (VOPs) nor to type-based queries of the alias oracle. -- So the proposal is to impose these additional restrictions ontop of our alias representation and fixup passes that do not honour them. One of them is loop load/store motion which messes up store ordering, another is scheduling. If anyone can come up with a clever way to encode the extra restrictions into our IL be my guest (I wrapped my brain around this for some days now, and only a may_reorder_accesses (A, B) style oracle can handle this, but not VOPs in SSA form or something similar - we'd need a DU chain and another set of VOPs but that looks way too costly)
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should rguenth at gcc dot gnu dot org wrote: > - we _cannot_ sink loads across stores. > > x = *int; > *double = 1.0; > > the store to double may change the dynamic type of what *int > points to. To be clear, you mean something like this, right: int i; int *ip = &i; double *dp = (double *)&i; int x; x = *ip; *dp = 1.0; ? I think that considering this code valid, and, therefore, forbidding the interchange of the last two statements, requires a perverse reading of the standard. Placement new allows you to change the dynamic type of storage; I don't think that just writing through a pointer does. A key goal of C++ relative to C was better type-safety. The placement new operator provides a facility for explicitly controlling object lifetime, for programmers that need this. Before we do anything to support the case above, we should have a crystal-clear ruling from the committee that says this is valid. Otherwise, this is exactly the kind of optimization that TBAA was designed to perform. For history, the reason I implemented TBAA in GCC was that the SGI MIPSPro C/C++ compiler did these kinds of optimizations ten years ago, and I was trying to catch us up when looking at POOMA performance on IRIX. G++ has had the freedom to interchange those stores for a long time, and I believe it should continue to have that choice.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On Tue, 22 May 2007, mark at codesourcery dot com wrote: > rguenth at gcc dot gnu dot org wrote: > > > - we _cannot_ sink loads across stores. > > > > x = *int; > > *double = 1.0; > > > > the store to double may change the dynamic type of what *int > > points to. > > To be clear, you mean something like this, right: > > int i; > int *ip = &i; > double *dp = (double *)&i; > int x; > x = *ip; > *dp = 1.0; > > ? > > I think that considering this code valid, and, therefore, forbidding the > interchange of the last two statements, requires a perverse reading of > the standard. Placement new allows you to change the dynamic type of > storage; I don't think that just writing through a pointer does. A key > goal of C++ relative to C was better type-safety. The placement new > operator provides a facility for explicitly controlling object lifetime, > for programmers that need this. I just wanted to point out that even with a "perverse" reading of the C++ standard we don't loose the most interesting memory operations (respectively load hoisting and store sinking). > Before we do anything to support the case above, we should have a > crystal-clear ruling from the committee that says this is valid. > Otherwise, this is exactly the kind of optimization that TBAA was > designed to perform. TBAA still performs the useful cases of load hoisting and store sinking. The "perverse" reading of the standard is not making programs invalid that are valid as of the strict TBAA interpretation - it just implements (IMHO) a cheap way of making placement new work. > For history, the reason I implemented TBAA in GCC was that the SGI > MIPSPro C/C++ compiler did these kinds of optimizations ten years ago, > and I was trying to catch us up when looking at POOMA performance on > IRIX. G++ has had the freedom to interchange those stores for a long > time, and I believe it should continue to have that choice. So, what kind of useful transformations rely on hoisting stores or sinking loads? Richard.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes: | Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement | new does not change the dynamic type as it should | | rguenth at gcc dot gnu dot org wrote: | | > - we _cannot_ sink loads across stores. | > | > x = *int; | > *double = 1.0; | > | > the store to double may change the dynamic type of what *int | > points to. | | To be clear, you mean something like this, right: | | int i; | int *ip = &i; | double *dp = (double *)&i; | int x; | x = *ip; | *dp = 1.0; | | ? | | I think that considering this code valid, and, therefore, forbidding the | interchange of the last two statements, requires a perverse reading of | the standard. I'm not sure Richard is suggesting that -- I believe, we all agree that the above is invalid. It introduces an assumption that was not present in Richard's previous message (Richard might want to make explicit his assumptions). Namely, that we do know the definition of of the object int i; therefore we know that we can not possibly change its dynamic type. Consider the following instead // tu-1.C void f(int* p) { *p = 90; // ... *(double *) p = 8.3748; }; Is the above code invalid, independent of context? I don't think you can find a wording in the standard that says it is invalid. Indeed, consider this: // tu-2.C void f(int*); void g() { union { int i; double d; } t; t.i = 42; f(&t); cout << t.d << endl; } I believe we can all agree the definition of g is valid. -- Gaby
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should gdr at cs dot tamu dot edu wrote: > Consider the following instead > > // tu-1.C > void f(int* p) { > *p = 90; > // ... > *(double *) p = 8.3748; > }; > > Is the above code invalid, independent of context? I don't think > you can find a wording in the standard that says it is invalid. IMO, the standard is just not clear with respect to aliasing. We cannot rely upon it, except as a guide. As I've said throughout this thread, it doesn't make sense to try to do "close reading" of the standard for aliasing issues because it just wasn't written with those issues in mind, just as there are all of the famous memory model issues in C. In any case, I consider the code above invalid. > Indeed, consider this: > > // tu-2.C > void f(int*); > void g() { > union { > int i; > double d; > } t; > > t.i = 42; > f(&t); > cout << t.d << endl; > } > > I believe we can all agree the definition of g is valid. No, I do not. And GCC historically has not; you are only allowed to use the union for type-punning if the accesses are through the union directly. That was the decision we made a long time ago regarding TBAA, and it even appears in the manual; the -fstrict-aliasing documentation says: "The practice of reading from a different union member than the one most recently written to (called ``type-punning'') is common. Even with @option{-fstrict-aliasing}, type-punning is allowed, provided the memory is accessed through the union type. So, the code above will work as expected. However, this code might not: @smallexample int f() @{ a_union t; int* ip; t.d = 3.0; ip = &t.i; return *ip; @} @end smallexample" The point here is that the compiler is allowed to decide that t.d does not alias "*ip" because the latter is not a direct access through the union.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes: | > Indeed, consider this: | > | > // tu-2.C | > void f(int*); | > void g() { | > union { | > int i; | > double d; | > } t; | > | > t.i = 42; | > f(&t); | > cout << t.d << endl; | > } | > | > I believe we can all agree the definition of g is valid. | | No, I do not. And GCC historically has not; you are only allowed to use | the union for type-punning if the accesses are through the union | directly. I am not talking of the GCC's historical behaviour here, but what the standard actually says. For the object "t", above the last write was to the double field, therefore the read is well-defined. -- Gaby
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should gdr at cs dot tamu dot edu wrote: > | No, I do not. And GCC historically has not; you are only allowed to use > | the union for type-punning if the accesses are through the union > | directly. > > I am not talking of the GCC's historical behaviour here, but what the > standard actually says. For the object "t", above the last write was > to the double field, therefore the read is well-defined. Suffice it to say that I disagree. I'm not debating that you can read the standard that way. But, I don't think the standard contemplated these issues in sufficient detail to make it useful in this respect. Pragmatically, I don't think that we should change GCC, after years of people using it with the current rules, to make it generate inferior code -- without clear guidance from the standards committee. IMO, that needs to go beyond a reading of the current standard; there needs to be a clear expression from the committee that, indeed, the compiler cannot use TBAA in the way that GCC has historically used it. I'm all for bringing G++ into better conformance with the standard, and agree that correctness is more important than optimization, but I don't believe that the standard was written with these considerations in mind, so I don't think it can be relied upon in this respect.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes: | But, I don't think the standard contemplated | these issues in sufficient detail to make it useful in this respect. The issues has been raised on the -core reflector. -- Gaby
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should gdr at cs dot tamu dot edu wrote: > ------- Comment #112 from gdr at cs dot tamu dot edu 2007-05-22 17:46 ------- > Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the > dynamic type as it should > > "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes: > > | But, I don't think the standard contemplated > | these issues in sufficient detail to make it useful in this respect. > > The issues has been raised on the -core reflector. So that I understand, do you mean that they have been raised in the past, and settled, or that you've just raised them now? Thanks,
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes: | Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement | new does not change the dynamic type as it should | | gdr at cs dot tamu dot edu wrote: | > ------- Comment #112 from gdr at cs dot tamu dot edu 2007-05-22 17:46 ------- | > Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the | > dynamic type as it should | > | > "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes: | > | > | But, I don't think the standard contemplated | > | these issues in sufficient detail to make it useful in this respect. | > | > The issues has been raised on the -core reflector. | | So that I understand, do you mean that they have been raised in the | past, and settled, or that you've just raised them now? I just raised it, mainly because I do not believe your conclusions are right. The type information you can get from write and read are not just those appearing lexically in a scope. In fact, the semantics is defined in terms of the run time read and write access. That is what makes "C a strongly typed and weakly check language" (DMR). This whole issue does not mean you have to abandon TBAA. It means you have be careful in how you use the information gained from TBAA. In particular, many conclusions are OK when you have the definition of the objects at hand. -- Gaby
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On 22 May 2007 16:54:24 -0000, mark at codesourcery dot com <gcc-bugzilla@gcc.gnu.org> wrote: > > > ------- Comment #113 from mark at codesourcery dot com 2007-05-22 17:54 ------- > Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement > new does not change the dynamic type as it should > > gdr at cs dot tamu dot edu wrote: > > ------- Comment #112 from gdr at cs dot tamu dot edu 2007-05-22 17:46 ------- > > Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the > > dynamic type as it should > > > > "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes: > > > > | But, I don't think the standard contemplated > > | these issues in sufficient detail to make it useful in this respect. > > > > The issues has been raised on the -core reflector. > > So that I understand, do you mean that they have been raised in the > past, and settled, or that you've just raised them now? > And if you've raised them now, can you please try to ensure that the discussion covers what to do if the union'ing happens outside the current translation unit? IE What to do about cases where 1. One union member is written 2. We are being passed the address of a union member at the same offset 3. It then it is casted to a pointer to the type of the union member from 1. 4. It is stored/loaded It is literally impossible to know whether the pointers being passed into a function are from unions or not, since the union'ing may occur outside of our translation unit. This is true unless it is a static function whose callers do not pass in pointers that are ever acquired externally. If you take a strict reading of the standard that union'ing makes everything okay, even when you can't see their may have been a union, then you simply cannot perform TBAA except for completely local variables, or when you are dealing with static function arguments that meet the parameters above, or when you can get a guarantee you have the whole program. The ideal resolution here would be that all accesses through a union must be must be visibly accessing through a union type. This is one of the proposed resolutions to the similar problem in C (there were also proposals that said something about TBAA changing at function boundaries, but these are ridiculous given the fact that it does not specify what happens if you inline, etc, so that the function boundaries aren't where they were originally).
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On 22 May 2007 17:01:40 -0000, gdr at cs dot tamu dot edu <gcc-bugzilla@gcc.gnu.org> wrote: > > > ------- Comment #114 from gdr at cs dot tamu dot edu 2007-05-22 18:01 ------- > Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the > dynamic type as it should > > "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes: > > | Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement > | new does not change the dynamic type as it should > | > | gdr at cs dot tamu dot edu wrote: > | > ------- Comment #112 from gdr at cs dot tamu dot edu 2007-05-22 17:46 > ------- > | > Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change > the > | > dynamic type as it should > | > > | > "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes: > | > > | > | But, I don't think the standard contemplated > | > | these issues in sufficient detail to make it useful in this respect. > | > > | > The issues has been raised on the -core reflector. > | > | So that I understand, do you mean that they have been raised in the > | past, and settled, or that you've just raised them now? > > I just raised it, mainly because I do not believe your conclusions are > right. > > The type information you can get from write and read are not just > those appearing lexically in a scope. In fact, the semantics is defined > in terms of the run time read and write access. > That is what makes "C a strongly typed and weakly check language" (DMR). > > This whole issue does not mean you have to abandon TBAA. It means you > have be careful in how you use the information gained from TBAA. In > particular, many conclusions are OK when you have the definition of > the objects at hand. Uh, you do more or less have to abandon TBAA for pointer arguments unless you can account for every single caller of your function, and ensure that none of them are passing you pointers external to what your compiler can see. This case is *extremely rare* outside of the user giving us a whole program guarantee. TBAA's main use is in fact, in disambiguating pointer arguments. If you take that away, it becomes pretty much useless. It's guarantees about local objects were never interesting.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "dberlin at dberlin dot org" <gcc-bugzilla@gcc.gnu.org> writes: [...] | And if you've raised them now, can you please try to ensure that the | discussion covers what to do if the union'ing happens outside the | current translation unit? | | IE What to do about cases where | 1. One union member is written | 2. We are being passed the address of a union member at the same offset | 3. It then it is casted to a pointer to the type of the union member from 1. | 4. It is stored/loaded Thanks for reminding all those points. I'll ensure that I make those points stand in subsequence messages. -- Gaby
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should gdr at cs dot tamu dot edu wrote: > Thanks for reminding all those points. I'll ensure that I make those > points stand in subsequence messages. I think it's also worth pointing out to the committee that the more aggressive aliasing rules (in which only access directly through a union is allowed) have been GNU C/C++ practice for a long time. I would guess that we made this change around the year 2000. So, there's a large body of code that conforms to the requirements of the aggressive interpretation.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes: | Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement | new does not change the dynamic type as it should | | gdr at cs dot tamu dot edu wrote: | | > Thanks for reminding all those points. I'll ensure that I make those | > points stand in subsequence messages. | | I think it's also worth pointing out to the committee that the more | aggressive aliasing rules (in which only access directly through a union | is allowed) have been GNU C/C++ practice for a long time. Yes, I will. My guess of what happened was that C++98, which was based on C90, did not pick up the new rules for C99; and the 2003 revision focused mainly on DRs so missed opportunity to incorporate those rules. The meeting at Toronto is supposedly the last meeting for freasing features. I'll make sure this issue gets attention -- I already sent a message to the reflector; I'll also write a paper along with the above suggestions and recommendations. | I would guess | that we made this change around the year 2000. So, there's a large body | of code that conforms to the requirements of the aggressive | interpretation. Yes; those programs will continue to be conformant. -- Gaby
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should gdr at cs dot tamu dot edu wrote: > | I would guess > | that we made this change around the year 2000. So, there's a large body > | of code that conforms to the requirements of the aggressive > | interpretation. > > Yes; those programs will continue to be conformant. Indeed -- but that's not the point I was trying to make. The point is that your changes would force G++ to generate inferior code -- for a codebase that already works with the more aggressive interpretation. In other words, from a GNU "product marketing" point of view, rather than from a C++ language standards point of view, the change you want to make is just going to hurt our users, who have made their code work with the aggressive interpretation. It's only going to help people moving to G++ from other compilers that do not use the aggressive interpretation. And, we already have -fno-strict-aliasing for those folks. I'm not trying to tell you not to make the argument that you think is best for C++ as a language; obviously, that's your right. I'm just pointing out that the change you want to make probably isn't going to help users who are already working with G++. And, although I don't have the time/energy that you seem to have to work on these standards issues, I do plan to oppose your interpretation on the reflector.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes: [...] | And, although I don't have the time/energy that you seem to have to work | on these standards issues, I do plan to oppose your interpretation on | the reflector. I believe you'll have hard time proving that my interpretation is INcorrect according to the *current* standard (C++98, C++03), and under obvious reading. Since, I'm not suggesting a change to the current standsard to match my interpretation -- since it already does -- and the only change I'll be proposing is to make C++0x match C99 rules, if you're going to oppose my proposed changes, then you are going to oppose a change that allows TBAA in the form currently implemented in GCC. I'm not sure, that really is what you wanted. -- Gaby
When the standard was originally written, I do think we may have missed out on some finer points of the C object model, mainly to do with restrictions on what one is not permitted to do stemming from the declared type. The intent was to very closely match the C object model, but formalize it better and add just enough to do subtypes and dynamic types and so on. I think it is reasonable to leave it as strict as C is for C style code so that it can be optimized well. I think it is reasonable to push the tighening language into the standard, as that is what we meant and our wording was meant to be as tight as C. The effective type of an object for an access to its stored value is the declared type of the object, if any.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "mrs at apple dot com" <gcc-bugzilla@gcc.gnu.org> writes: | I think it is reasonable to push the tighening language into the | standard, as that is what we meant and our wording was meant to be | as tight as C. As C99, I guess. I'm making the distinction between what the C++ standard currently says, and what we would like it to say to benefit from some optimizations with not too heroic effort. That is why I would like a core issue, and a resolution to change the current wording so that access to union mambers are spelled out clearly. | The effective type of an object for an access to its stored value is the | declared type of the object, if any. for C++, we need also to take into account placement-new (the original subject reported here). -- Gaby
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On Tue, 22 May 2007, dberlin at dberlin dot org wrote: > ------- Comment #116 from dberlin at gcc dot gnu dot org 2007-05-22 18:13 ------- > Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the > dynamic type as it should > > On 22 May 2007 17:01:40 -0000, gdr at cs dot tamu dot edu > <gcc-bugzilla@gcc.gnu.org> wrote: > > > > > > ------- Comment #114 from gdr at cs dot tamu dot edu 2007-05-22 18:01 ------- > > Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the > > dynamic type as it should > > > > "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes: > > > > | Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement > > | new does not change the dynamic type as it should > > | > > | gdr at cs dot tamu dot edu wrote: > > | > ------- Comment #112 from gdr at cs dot tamu dot edu 2007-05-22 17:46 > > ------- > > | > Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change > > the > > | > dynamic type as it should > > | > > > | > "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes: > > | > > > | > | But, I don't think the standard contemplated > > | > | these issues in sufficient detail to make it useful in this respect. > > | > > > | > The issues has been raised on the -core reflector. > > | > > | So that I understand, do you mean that they have been raised in the > > | past, and settled, or that you've just raised them now? > > > > I just raised it, mainly because I do not believe your conclusions are > > right. > > > > The type information you can get from write and read are not just > > those appearing lexically in a scope. In fact, the semantics is defined > > in terms of the run time read and write access. > > That is what makes "C a strongly typed and weakly check language" (DMR). > > > > This whole issue does not mean you have to abandon TBAA. It means you > > have be careful in how you use the information gained from TBAA. In > > particular, many conclusions are OK when you have the definition of > > the objects at hand. > > Uh, you do more or less have to abandon TBAA for pointer arguments > unless you can account for every single caller of your function, and > ensure that none of them are passing you pointers external to what > your compiler can see. This case is *extremely rare* outside of the > user giving us a whole program guarantee. > > TBAA's main use is in fact, in disambiguating pointer arguments. If > you take that away, it becomes pretty much useless. > It's guarantees about local objects were never interesting. But you can still perform hoisting loads of incoming pointer arguments and sinking stores to incoming pointer arguments. Please read comment #105 and come up with a testcase where we wouldn't be allowed to do a useful transformation we do now. So I believe making placement new work with our current scheme will severely pessimize placement new users, but if we slightly change rules for everyone we'll be all happy. Richard.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes: [...] | But you can still perform hoisting loads of incoming pointer arguments | and sinking stores to incoming pointer arguments. Please read comment | #105 and come up with a testcase where we wouldn't be allowed to do | a useful transformation we do now. So I believe making placement new | work with our current scheme will severely pessimize placement new | users, but if we slightly change rules for everyone we'll be all happy. Update. The only comment I have so far on the -core reflector is to the effect that the reading that the program I posted earlier violates NO aliasing rule. I'll follow with the proposal to bring the rules in line with recent C99 rules. -- Gaby
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On Wed, 23 May 2007, gdr at cs dot tamu dot edu wrote: > ------- Comment #125 from gdr at cs dot tamu dot edu 2007-05-23 14:22 ------- > Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the > dynamic type as it should > > "rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes: > > [...] > > | But you can still perform hoisting loads of incoming pointer arguments > | and sinking stores to incoming pointer arguments. Please read comment > | #105 and come up with a testcase where we wouldn't be allowed to do > | a useful transformation we do now. So I believe making placement new > | work with our current scheme will severely pessimize placement new > | users, but if we slightly change rules for everyone we'll be all happy. > > Update. > > The only comment I have so far on the -core reflector is to the effect > that the reading that the program I posted earlier violates NO > aliasing rule. I'll follow with the proposal to bring the rules in > line with recent C99 rules. Note that it is important to retain the capability to implement memory allocators in C++ that are allowed to re-use memory for different typed objects. Which is not possible with C99 rules. Richard.
Re comment #105. The case where TBAA is most useful is on a deeply pipelined in-order processor with multiple function units and a high latency memory cache. One example I've worked on is an embedded VLIW processor with vector instructions. TBAA is of relatively little interest on an out-of-order processor. It's interesting to talk about PTA when you see the actual objects and you see how the pointers are taken. But in practice many real functions simply take pointers to arrays in memory. PTA can say nothing useful about those pointers, since they could point to anything. TBAA can say a lot. Losing the ability to sinks loads across stores and vice-versa means putting additional constraints on the scheduler which makes it harder to exploit the multiple function units. Conversely, it constrains the register allocator by tending to extend register lifetimes. Also, in your list of cases, you missed x = *int; *double = 1.0; x = *int; With TBAA we can eliminate the second load as a duplicate. With your proposed aliasing change we can not. I don't understand why you argue in comment #124 that our current scheme will esverely penalize placement new. On mainline there is no penalty for placement new. So you must be referring to one of the patches. But I don't think we've agreed on any of them at the moment. And I think we see the outlines of a successful patch: make placement new return a pointer which effectively aliases everything. That will permit us to reorder loads and eliminate dead stores. It won't permit us to arbitrarily re-order loads and stores, but I'm skeptical that that will count as a severe penalty.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On Wed, 23 May 2007, ian at airs dot com wrote: > ------- Comment #127 from ian at airs dot com 2007-05-23 15:23 ------- > Re comment #105. > > The case where TBAA is most useful is on a deeply pipelined in-order processor > with multiple function units and a high latency memory cache. One example I've > worked on is an embedded VLIW processor with vector instructions. TBAA is of > relatively little interest on an out-of-order processor. > > It's interesting to talk about PTA when you see the actual objects and you see > how the pointers are taken. But in practice many real functions simply take > pointers to arrays in memory. PTA can say nothing useful about those pointers, > since they could point to anything. TBAA can say a lot. > > Losing the ability to sinks loads across stores and vice-versa means putting > additional constraints on the scheduler which makes it harder to exploit the > multiple function units. Conversely, it constrains the register allocator by > tending to extend register lifetimes. Sure, scheduling is one of the places we sink loads or hoist stores. > > Also, in your list of cases, you missed > x = *int; > *double = 1.0; > x = *int; > With TBAA we can eliminate the second load as a duplicate. With your proposed > aliasing change we can not. We still can. We can hoist the second load before the store and so make both loads load the same value. The fact that there is a second load of int after the store to double disambiguates the two memory locations. > I don't understand why you argue in comment #124 that our current scheme will > esverely penalize placement new. On mainline there is no penalty for placement > new. So you must be referring to one of the patches. Yes, I was refering to patches that make changes to forbid sinking a load across a store. With our current IL this will also forbid hoisting a loop-invariant load - which is the key to good performance on tramp3d. > But I don't think we've > agreed on any of them at the moment. And I think we see the outlines of a > successful patch: make placement new return a pointer which effectively aliases > everything. That will permit us to reorder loads and eliminate dead stores. > It won't permit us to arbitrarily re-order loads and stores, but I'm skeptical > that that will count as a severe penalty. But... void foo(int *p) { float *f = (float *)p; new (p) float; *f = 1.0; } there is no "new pointer" from placement new. All placement new does is (magically) change the dynamic pointed-to type. Or would you argue the above is invalid? It is hard to make non-trivial cases work and not impose a full memory barrier at the point of the placement new. In the above case you would need to make the alias sets of float conflict with everything in which case only trivial cases will allow to DSE stores to float or hoist loads of floats. As I said - with our current IL design I see no nice way to express the constraints placement new sets without imposing more constraints than necessary. (This is also true for my proposed changes -- those constraints would be implicit in the IL, just optimization passes would not need to look for PLACEMENT_NEW_EXPRs but simply follow rules if they do optimizations on memory operations) Maybe to find a compromise how about making PLACEMENT_NEW_EXPR effective only after RTL expansion? So, during tree optimization completely ignore it (from the alias IL representation perspective) and follow the changed rules I proposed and after RTL expansion make placement new effects explicit, like by merging target type alias sets with alias set zero. Richard.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes: | Note that it is important to retain the capability to implement | memory allocators in C++ that are allowed to re-use memory for different | typed objects. Yes, the C++ committee will never allow a rule that bans platcement-new and memory allocators :-) -- Gaby
In this example void foo(int *p) { float *f = (float *)p; new (p) float; *f = 1.0; } the pointer is p. In fact the relevant pointer is always the argument to placement new, and every pointer which PTA can associate with it. We may simply have an impasse here. You have a set of rules which will change the compiler to support placement new while giving better results for your code. I believe that your change will penalize the code I used to work with.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On Wed, 23 May 2007, ian at airs dot com wrote: > ------- Comment #130 from ian at airs dot com 2007-05-23 16:43 ------- > In this example > > void foo(int *p) > { > float *f = (float *)p; > new (p) float; > *f = 1.0; > } > > the pointer is p. In fact the relevant pointer is always the argument to > placement new, and every pointer which PTA can associate with it. I don't read that into the semantics of placement new ;) Placement new doesn't care about the pointer used to refer to the memory it operates on. > We may simply have an impasse here. You have a set of rules which will change > the compiler to support placement new while giving better results for your > code. I believe that your change will penalize the code I used to work with. Ok, fair enough. I'll try to teach load-store-motion for loops to not re-order the inserted stores compared to the order on the loop exit path. This is the only transformation on the tree-level I came across that violates my proposed semantics. Richard.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On 23 May 2007 08:35:12 -0000, rguenther at suse dot de <gcc-bugzilla@gcc.gnu.org> wrote: > > > ------- Comment #124 from rguenther at suse dot de 2007-05-23 09:35 ------- > Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement > new does not change the dynamic type as it should > > On Tue, 22 May 2007, dberlin at dberlin dot org wrote: > > > ------- Comment #116 from dberlin at gcc dot gnu dot org 2007-05-22 18:13 ------- > > Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the > > dynamic type as it should > > > > On 22 May 2007 17:01:40 -0000, gdr at cs dot tamu dot edu > > <gcc-bugzilla@gcc.gnu.org> wrote: > > > > > > > > > ------- Comment #114 from gdr at cs dot tamu dot edu 2007-05-22 18:01 ------- > > > Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the > > > dynamic type as it should > > > > > > "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes: > > > > > > | Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement > > > | new does not change the dynamic type as it should > > > | > > > | gdr at cs dot tamu dot edu wrote: > > > | > ------- Comment #112 from gdr at cs dot tamu dot edu 2007-05-22 17:46 > > > ------- > > > | > Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change > > > the > > > | > dynamic type as it should > > > | > > > > | > "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes: > > > | > > > > | > | But, I don't think the standard contemplated > > > | > | these issues in sufficient detail to make it useful in this respect. > > > | > > > > | > The issues has been raised on the -core reflector. > > > | > > > | So that I understand, do you mean that they have been raised in the > > > | past, and settled, or that you've just raised them now? > > > > > > I just raised it, mainly because I do not believe your conclusions are > > > right. > > > > > > The type information you can get from write and read are not just > > > those appearing lexically in a scope. In fact, the semantics is defined > > > in terms of the run time read and write access. > > > That is what makes "C a strongly typed and weakly check language" (DMR). > > > > > > This whole issue does not mean you have to abandon TBAA. It means you > > > have be careful in how you use the information gained from TBAA. In > > > particular, many conclusions are OK when you have the definition of > > > the objects at hand. > > > > Uh, you do more or less have to abandon TBAA for pointer arguments > > unless you can account for every single caller of your function, and > > ensure that none of them are passing you pointers external to what > > your compiler can see. This case is *extremely rare* outside of the > > user giving us a whole program guarantee. > > > > TBAA's main use is in fact, in disambiguating pointer arguments. If > > you take that away, it becomes pretty much useless. > > It's guarantees about local objects were never interesting. > > But you can still perform hoisting loads of incoming pointer arguments > and sinking stores to incoming pointer arguments. Please read comment > #105 and come up with a testcase where we wouldn't be allowed to do > a useful transformation we do now. So I believe making placement new > work with our current scheme will severely pessimize placement new > users, but if we slightly change rules for everyone we'll be all happy. > Have fun encoding this into the IL :) If TBAA can't say things about particular load/store pairs, without having to do a lot of work to see what is in between them, it's not going to be useful to us.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should ian at airs dot com wrote: > The case where TBAA is most useful is on a deeply pipelined in-order processor > with multiple function units and a high latency memory cache. One example I've > worked on is an embedded VLIW processor with vector instructions. TBAA is of > relatively little interest on an out-of-order processor. The original motivating case for me was stuff like: void f (int *a, double *d) { for (int i = 1; i < N; ++i) { a[i] += i; d[i] = d[i-1] * a[i]; } } That's not the right code, but the point is that TBAA can allow us to avoid reloading d[i-1] from one iteration to the next, despite the store to a[i]. That reduces memory access and instruction count. Ordinary PTA does not allow this. Of course, Gaby's memory model doesn't allow this optimization either; we have to worry that "a" and "d" are both in some union somewhere. That's why Gaby's model is so bad from an optimization point of view; it makes the compiler assume a worst-case situation, even though that worst-case situation almost never actually happens. I'm not an expert on CPU models, so I'm not sure how out-of-order vs. in-order might matter here. > And I think we see the outlines of a > successful patch: make placement new return a pointer which effectively aliases > everything. That will permit us to reorder loads and eliminate dead stores. > It won't permit us to arbitrarily re-order loads and stores, but I'm skeptical > that that will count as a severe penalty. That's exactly what I think.
But using a union for type-punning is a gcc extension (and of course the extension is only for access through the union), so with strict C99/C++ semantics we can avoid reloading d[i-1] even if a and d were in the same union because the code would then be invalid. So the union case is a non-issue here (it was only used to make available enough properly aligned storage for the particular testcase).
Re comment #132 -- you cannot encode this into the IL :/ And I don't propose to do so. And I say any working and optimal (from optimization perspective) variant of a fix for this PR has the same problem.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should rguenth at gcc dot gnu dot org wrote: > ------- Comment #134 from rguenth at gcc dot gnu dot org 2007-05-23 19:54 ------- > But using a union for type-punning is a gcc extension (and of course the > extension > is only for access through the union), so with strict C99/C++ semantics we can > avoid reloading d[i-1] even if a and d were in the same union because the code > would then be invalid. Gaby's claim, as I understand it, is that writing to a union member, even through a pointer, rather than directly through the union, is valid, and activates that part of the union. So, it is not a GCC extension. For code like: a[i] = i; d[i] = d[i-1] + a[i]; I guess you can argue that a[i] does not alias d[i-1], even in Gaby's model, because a[i] is written to right before the access to d[i-1]. But, you don't know that a[m] doesn't alias d[n] for arbitrary m and n. So, it's easy to create variations on the case I posted that can't be optimized, if you agree to Gaby's model. > So the union case is a non-issue here (it was only used to > make available enough properly aligned storage for the particular testcase). I agree that union case *should* be a non-issue in this context, where we were discussing how to fix placement new, but Gaby has made it one because he is claiming that placement new is not the only way to change the dynamic type of memory. Gaby's claim is that given an arbitrary pointer "p", saying: (int *)p = 3; is the same as saying: *(new (p) int) = 3; That makes life for the optimizers much, much harder.
Created attachment 13607 [details] patch to perserve store ordering in loop load/store motion <quote> Gaby's claim is that given an arbitrary pointer "p", saying: (int *)p = 3; is the same as saying: *(new (p) int) = 3; That makes life for the optimizers much, much harder. </quote> I say so as well (that those are the same), but I don't agree that this makes life for optimizers much harder. Anyway, this is a proposed patch for loop load/store motion that fixes the last posted testcase. Scheduling issues remain.
Note the patch is not 100% correct as we need to preserve store ordering on the path to all exit edges which may be different for each exit.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On Wed, 23 May 2007, mark at codesourcery dot com wrote: > Of course, Gaby's memory model doesn't allow this optimization either; > we have to worry that "a" and "d" are both in some union somewhere. DR#236 <http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_236.htm> was what eventually said for C that you don't need to worry about that; I'd think the aim should be to get C++ to agree with that ruling. DR#283, to appear in TC3, is what specifies type punning through unions.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should rguenth at gcc dot gnu dot org wrote: > <quote> > Gaby's claim is that given an arbitrary > pointer "p", saying: > > (int *)p = 3; > > is the same as saying: > > *(new (p) int) = 3; > > That makes life for the optimizers much, much harder. > </quote> > > I say so as well (that those are the same), but I don't agree that this > makes life for optimizers much harder. Placement new is rare; assignments to pointers are everywhere. Note that the first case does not need to use an explicit cast. In a function: void f(int *p) { *p = 3; } under Gaby's interpretation, we cannot be sure that "p" points to an "int" before this function, so we can't be sure the write to "*p" doesn't clobber memory of other types. TBAA is one of the few ways to disambiguate pointers in the absence of whole-program optimization, and this model really undermines TBAA. Frankly, I'm surprised that you are taking this position. This is a change to the compiler that can only hurt high-performance C++ applications, which is an area I know you care about. I know that you're unhappy about how Ian's patches might hurt programs that use placement-new in an inner loop, but this model will impose the same penalties on programs that write to pointers in an inner loop.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should joseph at codesourcery dot com wrote: > DR#236 <http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_236.htm> was > what eventually said for C that you don't need to worry about that; I'd > think the aim should be to get C++ to agree with that ruling. Thank you for the pointer. That seems directly on point, and makes C99 match the existing GCC practice: we don't need to worry that pointers might point to unions. Gaby, would you please forward that to the C++ reflector, so that the reflector has that information as well? They should be aware that the model you're proposing is at odds with C99. Thanks,
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On Wed, 23 May 2007, mark at codesourcery dot com wrote: > ------- Comment #140 from mark at codesourcery dot com 2007-05-23 21:07 ------- > Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement > new does not change the dynamic type as it should > > rguenth at gcc dot gnu dot org wrote: > > > <quote> > > Gaby's claim is that given an arbitrary > > pointer "p", saying: > > > > (int *)p = 3; > > > > is the same as saying: > > > > *(new (p) int) = 3; > > > > That makes life for the optimizers much, much harder. > > </quote> > > > > I say so as well (that those are the same), but I don't agree that this > > makes life for optimizers much harder. > > Placement new is rare; assignments to pointers are everywhere. > > Note that the first case does not need to use an explicit cast. In a > function: > > void f(int *p) { > *p = 3; > } > > under Gaby's interpretation, we cannot be sure that "p" points to an > "int" before this function, so we can't be sure the write to "*p" > doesn't clobber memory of other types. TBAA is one of the few ways to > disambiguate pointers in the absence of whole-program optimization, and > this model really undermines TBAA. In C inside the function f we can indeed be sure *p points to an "int". Now what matters is, that even in C for double g(int *p, double *d) { f(p); return *d; } we cannot be sure (in absence of whole-program optimization or the body of f available) that the call to f does not clobber *d through the value of the pointer p. As it can look like void f(int *p) { double *d = p; *d = 1.0; } > Frankly, I'm surprised that you are taking this position. This is a > change to the compiler that can only hurt high-performance C++ > applications, which is an area I know you care about. I know that > you're unhappy about how Ian's patches might hurt programs that use > placement-new in an inner loop, but this model will impose the same > penalties on programs that write to pointers in an inner loop. No it won't. Or at least - I belive so - unless I see a patch that manages to implement placement new with the same minor restrictions. If you discount scheduling on in-order machines, what would be an optimization that can be no longer done with Gabys and my scheme? I believe there are none. Also other compilers manage to not miscompile in the face of placement new but still optimize loops with them. Richard.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should rguenther at suse dot de wrote: > >> void f(int *p) { >> *p = 3; >> } >> >> under Gaby's interpretation, we cannot be sure that "p" points to an >> "int" before this function, so we can't be sure the write to "*p" >> doesn't clobber memory of other types. TBAA is one of the few ways to >> disambiguate pointers in the absence of whole-program optimization, and >> this model really undermines TBAA. > > In C inside the function f we can indeed be sure *p points to an "int". Not before the assignment to "p". In: void f(int *p, double *q) { double d = *q; *p = 3; return d; } your interpretation does not allow moving the load of "*q" after the store to "*p". That's clearly limiting the freedom of the optimizer. Now, we can argue about how much that matters -- but it's inarguable that it's a limitation. > If you discount scheduling on in-order machines, what would be an > optimization that can be no longer done with Gabys and my scheme? > I believe there are none. Also other compilers manage to not > miscompile in the face of placement new but still optimize loops > with them. I'm lost. What does Gaby's model have to do with placement new? We're all agreed that (a) placement new can change the dynamic type of memory, (b) therefore GCC currently has a bug, (c) we want the fix to have as little optimization impact as possible. Gaby's model says that we know less about dynamic types than we presently think we do, because there might be a union out there somewhere. (Fortunately, as Joseph points out, C99 has already answered this question. Surely we can agree that making C99 and C++ different in this respect is a bad idea.) If "*p = 3" changes the dynamic type of "*p", that just means we know even less. The less we know, the less optimization we can do. Making "*p = 3" change the dynamic type of "*p" can't possibly help us implement placement new more efficiently. Whatever conservative assumptions we want to make about "*p = 3" we could make about "new (p) int" instead. If you have a patch that fixes the placement new problem, making us generate correct code, and with minimal/no impact on optimization, that's great! But, that can't possibly, in and of itself, be a reason to change the rules we're using for TBAA.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On Wed, 23 May 2007, mark at codesourcery dot com wrote: > rguenther at suse dot de wrote: > > > >> void f(int *p) { > >> *p = 3; > >> } > >> > >> under Gaby's interpretation, we cannot be sure that "p" points to an > >> "int" before this function, so we can't be sure the write to "*p" > >> doesn't clobber memory of other types. TBAA is one of the few ways to > >> disambiguate pointers in the absence of whole-program optimization, and > >> this model really undermines TBAA. > > > > In C inside the function f we can indeed be sure *p points to an "int". > > Not before the assignment to "p". In: > > void f(int *p, double *q) { > double d = *q; > *p = 3; > return d; > } > > your interpretation does not allow moving the load of "*q" after the > store to "*p". That's clearly limiting the freedom of the optimizer. Yes, it's limiting the freedom of optimizers. > Now, we can argue about how much that matters -- but it's inarguable > that it's a limitation. > > > If you discount scheduling on in-order machines, what would be an > > optimization that can be no longer done with Gabys and my scheme? > > I believe there are none. Also other compilers manage to not > > miscompile in the face of placement new but still optimize loops > > with them. > > I'm lost. > > What does Gaby's model have to do with placement new? Only so much that we seem to agree on the semantics of placement new. Gaby extends this semantics to any store, so *p = X; is equivalent to *(new (p) __typeof__ *p) = X; to which semantics we thus can agree (not to whether those two should be the same, mandated by the standard or liked by some of us or not). > Gaby's model says that we know less about dynamic types than we > presently think we do, because there might be a union out there > somewhere. (Fortunately, as Joseph points out, C99 has already answered > this question. Surely we can agree that making C99 and C++ different in > this respect is a bad idea.) I don't think dragging in unions helps us here ;) Maybe Gaby can clarify if and how unions relate here, but I didn't percieve any previous comment as making implicit unions relevant here apart from what GCC and apperantly C99 agree to. > If "*p = 3" changes the dynamic type of "*p", that just means we know > even less. The less we know, the less optimization we can do. True. > Making "*p = 3" change the dynamic type of "*p" can't possibly help us > implement placement new more efficiently. I disagree here. Making "*p = 3" change the dynamic type of "*p" will make the placement new issue moot - the current library implementation is fine then and we don't need any new explicit or implicit side-effects of it. > Whatever conservative > assumptions we want to make about "*p = 3" we could make about "new (p) > int" instead. True. I say making them about "*p = 3" is way easier as we are changing semantics of memory operations and *p = 3 is one, but placement new is not. > If you have a patch that fixes the placement new problem, making us > generate correct code, and with minimal/no impact on optimization, > that's great! But, that can't possibly, in and of itself, be a reason > to change the rules we're using for TBAA. Well, it depends if you read it as changing TBAA rules. Does preserving store ordering in loop load/store motion change TBAA rules? Not in itself - but clearly changed TBAA rules would force us to. Same for limiting scheduling of memory operations. Btw, the necessary patch may be as simple as Index: alias.c =================================================================== --- alias.c (revision 124998) +++ alias.c (working copy) @@ -2284,7 +2284,8 @@ write_dependence_p (rtx mem, rtx x, int || MEM_ALIAS_SET (mem) == ALIAS_SET_MEMORY_BARRIER) return 1; - if (DIFFERENT_ALIAS_SETS_P (x, mem)) + /* We cannot rely on alias set differences for C++ aliasing semantics. */ + if (0 && DIFFERENT_ALIAS_SETS_P (x, mem)) return 0; /* A read from read-only memory can't conflict with read-write memory. */ but as I'm currently lacking a testcase that fails due to scheduling I'm not 100% sure. Richard.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On 23 May 2007 18:57:03 -0000, rguenth at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org> wrote: > > > ------- Comment #135 from rguenth at gcc dot gnu dot org 2007-05-23 19:56 ------- > Re comment #132 -- you cannot encode this into the IL :/ And I don't propose > to > do so. And I say any working and optimal (from optimization perspective) > variant > of a fix for this PR has the same problem. So instead you are going to change every single optimization to preserve things like store ordering, even though the underlying aliasing information says it's okay to reorder (because it's broken in the presence of these restrictions)? No thanks! This would be a massive rathole to go down in the end, and completely at odds with the idea that our middle end IL is language independent, and like *C*. C99 has a clear proposed resolution to DR 236, and it says you need to visibly use a union (which takes care of the other issue mentioned here) If placement new needs to mean a memory barrier to get things right for placement new/C++, it needs to mean a memory barrier to get things right for now. Changing language independent optimizations to preserve something magical in the face of wrong aliasing information is not the way to fix this PR. I would support turning off TBAA over that solution, given the choice.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should rguenther at suse dot de wrote: > Only so much that we seem to agree on the semantics of placement new. > Gaby extends this semantics to any store, so > > *p = X; > > is equivalent to > > *(new (p) __typeof__ *p) = X; > > to which semantics we thus can agree (not to whether those two should > be the same, mandated by the standard or liked by some of us or not). I think I understand. Let me just restate this, to make sure: (a) Gaby's model makes the first assignment above equivalent to the second (b) Thus, in Gaby's model, if we solve either case, we solve both. I agree with that statement. (I don't like the model -- but I agree with the logic.) >> Making "*p = 3" change the dynamic type of "*p" can't possibly help us >> implement placement new more efficiently. > > I disagree here. Making "*p = 3" change the dynamic type of "*p" will > make the placement new issue moot - the current library implementation > is fine then and we don't need any new explicit or implicit side-effects > of it. > >> Whatever conservative >> assumptions we want to make about "*p = 3" we could make about "new (p) >> int" instead. > > True. I say making them about "*p = 3" is way easier as we are changing > semantics of memory operations and *p = 3 is one, but placement new is > not. I think I understand what you're saying here too; again, I'll restate to make sure: (a) In the model where "*p = 3" changes the dynamic type of memory, we don't need to do anything special to handle placement new. (b) It's relatively easy to implement support for "*p = 3" changing the dynamic type of memory. (c) Therefore, it's relatively easy to fix our placement new problem. I agree with those statements too. However, I don't like this approach because I believe it will result in inferior code. I think that you're looking at the proposed placement new patches, then looking at what they do to a particular codebase, which happens to use placement-new in an inner loop, and becoming unhappy with the patches. I suspect that the changes required to support the "*p = 3" model, while perhaps better for that case, will be worse for many others. I can't prove that. But, I did implement TBAA after looking at what other compilers did, specifically to improve performance of (ironically) POOMA. So, I'm afraid that you're going to find that if we allow memory writes to change the type of memory, that we will get worse performance. That's why I'm much more comfortable with a change that only affects placement new. At least, if placement new is slow, we can tell users not to use it in inner loops. If using pointers are slow, there's nothing we can do.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes: | Of course, Gaby's memory model doesn't allow this optimization either; | we have to worry that "a" and "d" are both in some union somewhere. | That's why Gaby's model is so bad from an optimization point of view; it A meta comment: I would highly appreciate if you stopped calling it "Gaby's model". It is the *current C++ standard object model*. If you don't like it; that is fine. But, you don't have to like it. (I don't). But, please don't call it "Gaby's model". The fact that you don't like it does not suddenly make it not the standard model. -- Gaby
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes: | Gaby's claim, as I understand it, is that writing to a union member, | even through a pointer, rather than directly through the union, is | valid, and activates that part of the union. Yes; that is what the current C++ standard says. C99 tried to fix things with the notion of effective type, but that notion does not fully address the issue and it is not part of C++98 and C++03. What I'm doing is making sure that we all know where (potential existing codes) we are starting from and make sure that we do understand the implications of the changes we would like to make to make optimizers easier. I spent some of my time this afternoon going through this with the original inventor of C++ and trying to see whether we have choice to make things less hard. It is clear to us that, *if* we have a choice -- note *if* we have a choice -- when the aliasing through union should be made appearant; but for the moment, that is hard to formulate correctly and simply (with the constraint of not breaking the existing object model badly). -- Gaby
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes: | ------- Comment #140 from mark at codesourcery dot com 2007-05-23 21:07 ------- | Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement | new does not change the dynamic type as it should | | rguenth at gcc dot gnu dot org wrote: | | > <quote> | > Gaby's claim is that given an arbitrary | > pointer "p", saying: | > | > (int *)p = 3; | > | > is the same as saying: | > | > *(new (p) int) = 3; | > | > That makes life for the optimizers much, much harder. | > </quote> | > | > I say so as well (that those are the same), but I don't agree that this | > makes life for optimizers much harder. | | Placement new is rare; assignments to pointers are everywhere. Naked placement new may be rare; but, placement new is general are not rare because of the STL-style containers and algorithms. | | Note that the first case does not need to use an explicit cast. In a | function: | | void f(int *p) { | *p = 3; | } | | under Gaby's interpretation, we cannot be sure that "p" points to an | "int" before this function, so we can't be sure the write to "*p" | doesn't clobber memory of other types. Note that is is a problem only with PODs -- because only those can appear in unions. That does not help much, but it is a distinction you have to make when you're considering what the standard says. -- Gaby
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes: | Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement | new does not change the dynamic type as it should | | joseph at codesourcery dot com wrote: | | > DR#236 <http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_236.htm> was | > what eventually said for C that you don't need to worry about that; I'd | > think the aim should be to get C++ to agree with that ruling. | | Thank you for the pointer. That seems directly on point, and makes C99 | match the existing GCC practice: we don't need to worry that pointers | might point to unions. | | Gaby, would you please forward that to the C++ reflector, so that the | reflector has that information as well? They should be aware that the | model you're proposing is at odds with C99. | Yes, I'll right now. Since I brought upt this issue, Nick MacLaren sent me a note he had circulated about these very issues on C99. It is also being discussed on the -core reflector. -- Gaby
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes: [...] | > Gaby's model says that we know less about dynamic types than we | > presently think we do, because there might be a union out there | > somewhere. (Fortunately, as Joseph points out, C99 has already answered | > this question. Surely we can agree that making C99 and C++ different in | > this respect is a bad idea.) | | I don't think dragging in unions helps us here ;) Maybe Gaby can clarify | if and how unions relate here, but I didn't percieve any previous comment | as making implicit unions relevant here apart from what GCC and | apperantly C99 agree to. I believe we all agree that placement new changes the dynamic type. I brought in the union example to point of a fundamental problem with this issue. I have been following the discussion without saying much, until I realized that the interpretation Mark was offering is a redefinition of the C++ object model that conflicts with the current standard text. That was the point of the union example. In the example void f(int* p, double* q) { *p = 42; *q = 3.12; } All we know is that after the store to *p, the object there will have type int (if it did not already have one). Similarly, for the store to *q, the object there will have type double. Can the stores be rearranged? Under the current C++ rules (which were inherited from C90, and not C99) "yes" if we know that the objects are distinct. Can we infer the disjoinctness from the types? "Not always" under current C++ rules for union, and in this specific case, the answer is "no". I never said I liked that model. I was merely pointing out that people where on the slope of redefining the object model. I spent the afternoon trying to see how C++ can move forward. The "effective types" of C99 has its own sets of incompleteness and inconsistency problems that Nick MacLaren has brought to my attention since I raised the issue on the C++ -core reflector. -- Gaby
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "mark at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org> writes: [...] | However, I don't like this approach because I believe it will result in | inferior code. I believe (without offering data, sorry) it *may* yield inferior code. However, my original point was to make sure that we understood your interpretation was a change to the C++ standard object model. Given the consequences, I think it was appropriate to raise the issue to the C++ committee. So far, the only response I got was the sentiment that it is a change in semantics. Given C99's move, it is essential for C++ to get to a better state of the object model. However, C99's notion of "effective type" is not without its own set of problems (already in C99). The details can be found in Nick MacLaren's paper. -- Gaby
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On Thu, 23 May 2007, gdr at cs dot tamu dot edu wrote: > ------- Comment #151 from gdr at cs dot tamu dot edu 2007-05-24 00:58 ------- > Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the > dynamic type as it should > > "rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes: > > [...] > > | > Gaby's model says that we know less about dynamic types than we > | > presently think we do, because there might be a union out there > | > somewhere. (Fortunately, as Joseph points out, C99 has already answered > | > this question. Surely we can agree that making C99 and C++ different in > | > this respect is a bad idea.) > | > | I don't think dragging in unions helps us here ;) Maybe Gaby can clarify > | if and how unions relate here, but I didn't percieve any previous comment > | as making implicit unions relevant here apart from what GCC and > | apperantly C99 agree to. > > I believe we all agree that placement new changes the dynamic type. > > > I brought in the union example to point of a fundamental problem with > this issue. I have been following the discussion without saying much, > until I realized that the interpretation Mark was offering is a > redefinition of the C++ object model that conflicts with the current > standard text. That was the point of the union example. In the > example > > void f(int* p, double* q) { > *p = 42; > *q = 3.12; > } > > All we know is that after the store to *p, the object there will have > type int (if it did not already have one). Similarly, for the store > to *q, the object there will have type double. Can the stores be > rearranged? Under the current C++ rules (which were inherited from > C90, and not C99) "yes" if we know that the objects are distinct. > Can we infer the disjoinctness from the types? "Not always" under > current C++ rules for union, and in this specific case, the answer is > "no". Right, current C++ rules forbid exchanging the stores. But I read it as it is because the stores may start lifetime of a different dynamically typed object on the same memory location which p and q may point to. Whether there is a union "placed" at this memory location is irrelevant. I read the standard as if the above is equivalent to *(new (p) int) = 42; *(new (q) double) = 3.12; which as long as we cannot prove that p does not point to the same memory location as q means that we cannot reorder the stores. Richard.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On Wed, 23 May 2007, mark at codesourcery dot com wrote: > Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement > new does not change the dynamic type as it should > > rguenther at suse dot de wrote: > > >> Whatever conservative > >> assumptions we want to make about "*p = 3" we could make about "new (p) > >> int" instead. > > > > True. I say making them about "*p = 3" is way easier as we are changing > > semantics of memory operations and *p = 3 is one, but placement new is > > not. > > I think I understand what you're saying here too; again, I'll restate to > make sure: > > (a) In the model where "*p = 3" changes the dynamic type of memory, we > don't need to do anything special to handle placement new. > (b) It's relatively easy to implement support for "*p = 3" changing the > dynamic type of memory. > (c) Therefore, it's relatively easy to fix our placement new problem. > > I agree with those statements too. > > However, I don't like this approach because I believe it will result in > inferior code. I think that you're looking at the proposed placement > new patches, then looking at what they do to a particular codebase, > which happens to use placement-new in an inner loop, and becoming > unhappy with the patches. I suspect that the changes required to > support the "*p = 3" model, while perhaps better for that case, will be > worse for many others. > > I can't prove that. But, I did implement TBAA after looking at what > other compilers did, specifically to improve performance of (ironically) > POOMA. So, I'm afraid that you're going to find that if we allow memory > writes to change the type of memory, that we will get worse performance. So I did some benchmarking with my two proposed patches (which I believe will fix all current issues - that I know of) on IA64 which should be sensitive to the more restrictive scheduling. [You can look yourself at the results starting from http://www.suse.de/~gcctest/, the run without the patches is from late May23th the run with the patch from early May24th] I'll try to summarize only here. - SPECfpu2000 shows both winners and loosers, the net change is slightly positive for -O3 and slightly negative for -O3 -funroll-loops (slightly means +3 points vs. -4 points) - SPECint2000 shows more variance in single tests, crafty winning for peak, parser loosing for base, eon winning overall, likewise bzip. Overall we get a drop in SPECint of about 3 points (0.3%) - Polyhedron is mostly the same, aermod regressing 1.5%, induct improving 1%. - The set of C++ benchmarks (including TraMP-3d, DLV, Boost wave and others) show no changes. Of course this doesn't prove anything - it only hints that it might be not as bad as you thought. And no, I don't have a Power6 machine to test on yet. Richard.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On Thu, 24 May 2007, rguenther at suse dot de wrote: > So I did some benchmarking with my two proposed patches (which I believe > will fix all current issues - that I know of) on IA64 which should be > sensitive to the more restrictive scheduling. [You can look yourself > at the results starting from http://www.suse.de/~gcctest/, the run > without the patches is from late May23th the run with the patch from > early May24th] > > I'll try to summarize only here. > > - SPECfpu2000 shows both winners and loosers, the net change is > slightly positive for -O3 and slightly negative for -O3 -funroll-loops > (slightly means +3 points vs. -4 points) > > - SPECint2000 shows more variance in single tests, crafty winning for > peak, parser loosing for base, eon winning overall, likewise bzip. > Overall we get a drop in SPECint of about 3 points (0.3%) > > - Polyhedron is mostly the same, aermod regressing 1.5%, induct > improving 1%. > > - The set of C++ benchmarks (including TraMP-3d, DLV, Boost wave and > others) show no changes. > > Of course this doesn't prove anything - it only hints that it might be > not as bad as you thought. And no, I don't have a Power6 machine to > test on yet. Btw, in case you are curious - on http://www.suse.de/~gcctest/c++bench/tramp3d/split-run.html you can see the effects of the asm() memory barrier as proposed by one of Ians patches and the effect of forcing -fno-strict-aliasing on (the two spikes in runtime, the later one is -fno-strict-aliasing, the earlier one is the asm()). Both approaches disable moving loop invariant loads which is one thing you retain with my proposed slightly changed TBAA rules. Richard.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes: [...] | > I brought in the union example to point of a fundamental problem with | > this issue. I have been following the discussion without saying much, | > until I realized that the interpretation Mark was offering is a | > redefinition of the C++ object model that conflicts with the current | > standard text. That was the point of the union example. In the | > example | > | > void f(int* p, double* q) { | > *p = 42; | > *q = 3.12; | > } | > | > All we know is that after the store to *p, the object there will have | > type int (if it did not already have one). Similarly, for the store | > to *q, the object there will have type double. Can the stores be | > rearranged? Under the current C++ rules (which were inherited from | > C90, and not C99) "yes" if we know that the objects are distinct. | > Can we infer the disjoinctness from the types? "Not always" under | > current C++ rules for union, and in this specific case, the answer is | > "no". | | Right, current C++ rules forbid exchanging the stores. But I read it | as it is because the stores may start lifetime of a different | dynamically typed object on the same memory location which p and q | may point to. Which is equivalent to having the union, in all aspects. Now, if I understand your argument below correctly, you are saying that even if we fixed the union rules for C++, we may still be facing the same problem because assignment to POD objects does not just mean that the object was there before, but that we are actually starting a new one. Do I understand your argument correctly? | Whether there is a union "placed" at this memory | location is irrelevant. I read the standard as if the above is | equivalent to | | *(new (p) int) = 42; | *(new (q) double) = 3.12; | | which as long as we cannot prove that p does not point to the | same memory location as q means that we cannot reorder the stores. -- Gaby
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On Thu, 24 May 2007, gdr at cs dot tamu dot edu wrote: > > > ------- Comment #156 from gdr at cs dot tamu dot edu 2007-05-24 10:29 ------- > Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the > dynamic type as it should > > "rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes: > > [...] > > | > I brought in the union example to point of a fundamental problem with > | > this issue. I have been following the discussion without saying much, > | > until I realized that the interpretation Mark was offering is a > | > redefinition of the C++ object model that conflicts with the current > | > standard text. That was the point of the union example. In the > | > example > | > > | > void f(int* p, double* q) { > | > *p = 42; > | > *q = 3.12; > | > } > | > > | > All we know is that after the store to *p, the object there will have > | > type int (if it did not already have one). Similarly, for the store > | > to *q, the object there will have type double. Can the stores be > | > rearranged? Under the current C++ rules (which were inherited from > | > C90, and not C99) "yes" if we know that the objects are distinct. > | > Can we infer the disjoinctness from the types? "Not always" under > | > current C++ rules for union, and in this specific case, the answer is > | > "no". > | > | Right, current C++ rules forbid exchanging the stores. But I read it > | as it is because the stores may start lifetime of a different > | dynamically typed object on the same memory location which p and q > | may point to. > > Which is equivalent to having the union, in all aspects. > > Now, if I understand your argument below correctly, you are saying > that even if we fixed the union rules for C++, we may still be facing > the same problem because assignment to POD objects does not just mean > that the object was there before, but that we are actually starting a > new one. Do I understand your argument correctly? Yes. For non-PODs I'm not sure if it is allowed to start object lifetime via assignment rather than only by placement new. Richard. > | Whether there is a union "placed" at this memory > | location is irrelevant. I read the standard as if the above is > | equivalent to > | > | *(new (p) int) = 42; > | *(new (q) double) = 3.12; > | > | which as long as we cannot prove that p does not point to the > | same memory location as q means that we cannot reorder the stores.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should "rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes: [...] | > Now, if I understand your argument below correctly, you are saying | > that even if we fixed the union rules for C++, we may still be facing | > the same problem because assignment to POD objects does not just mean | > that the object was there before, but that we are actually starting a | > new one. Do I understand your argument correctly? | | Yes. Then, I'm not sure I agree with that. I'll raise it as part of the union problem. | For non-PODs I'm not sure if it is allowed to start object | lifetime via assignment rather than only by placement new. Assignment for non-POD is equivalent to calling a member function for an object, which means that the object already existed. So, no you cannot start lifetime for object of non-POD type by assignment. -- Gaby
Created attachment 13613 [details] Patch This version of the patch is based on some code from Daniel Berlin. Here we determine which pointers can potentially be associated with a placement new. We disable TBAA specifically for those pointers. I'm not 100% confident that this patch always does the right thing. But it does work for the test cases I tried. Richi, I'd be interested in results you get from your performance tests.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On Sat, 25 May 2007, ian at airs dot com wrote: > ------- Comment #159 from ian at airs dot com 2007-05-25 23:21 ------- > Created an attachment (id=13613) > --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13613&action=view) > Patch > > This version of the patch is based on some code from Daniel Berlin. Here we > determine which pointers can potentially be associated with a placement new. > We disable TBAA specifically for those pointers. > > I'm not 100% confident that this patch always does the right thing. But it > does work for the test cases I tried. > > Richi, I'd be interested in results you get from your performance tests. Please - this patch doesn't fix FreePOOMA nor does it fix the runtime testcase I posted here before. We still have <bb 3>: l_7 = (int *) p_6(D); # SMT.31_21 = VDEF <SMT.31_18> *l_7 = 0; <<<change_dynamic_type (long int *) p_6(D))>>> __p_14 = p_6(D); f_10 = (long int *) __p_14; # SMT.30_22 = VDEF <SMT.30_17> *f_10 = -1; the store to *f_10 needs to alias the store to *l_7. Can you just add the runtime testcase to your patches and make sure at least that one is working? Thanks, Richard.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should Btw, it could save us many bugs (or bug reports) if if PTA says must-alias we'd trust it, instead of using TBAA to say "but it shouldn't". Like for the (invalid) int foo(void *p) { double *q = p; int *r = p; *q = 1.0; return *r; } where we correctly figure that Points-to sets q_2 = same as p r_3 = same as p but still we get (with -fstrict-aliasing) <bb 2>: q_2 = (double *) p_1(D); r_3 = (int *) p_1(D); # SMT.4_7 = VDEF <SMT.4_6(D)> *q_2 = 1.0e+0; # VUSE <SMT.5_8(D)> D.1641_4 = *r_3; which just for the point to make the invalid input produce unexpected output is possibly creating many of the hard-to-fix issues like this one. Danny, how easy is it to "fix" the above? It looks like another attractive solution to this problem.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On 28 May 2007 11:14:20 -0000, rguenther at suse dot de <gcc-bugzilla@gcc.gnu.org> wrote: > > > ------- Comment #161 from rguenther at suse dot de 2007-05-28 11:14 ------- > Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement > new does not change the dynamic type as it should > > Btw, it could save us many bugs (or bug reports) if if PTA says must-alias > we'd trust it, instead of using TBAA to say "but it shouldn't". Like for > the (invalid) > > int foo(void *p) > { > double *q = p; > int *r = p; > *q = 1.0; > return *r; > } > > where we correctly figure that > > Points-to sets > q_2 = same as p > r_3 = same as p > These are just collapsed because of pointer equivalences. There is no guarantee that we will discover all pointer equivalences, and their are in fact cases where we won't.
Richi, I tested my patch on every test case I saved. Can you just point me at the one I missed? Thanks.
Created attachment 13637 [details] Patch This is a modification of the previous patch which correctly handles the test case in comment #85.
The lates patch fixes the FreePOOMA testsuite regressions. I'll give it a performance testing spin on vangelis tonight.
It causes a 10% performance regression for tramp3d. There appear to be no significant changes in libstdc++ performance testing. "Fixing" tramp3d-v4 with the below patch cures the performance regression. --- tramp3d-v4.cpp 2007-05-16 15:02:47.000000000 +0200 +++ tramp3d-v4x.cpp 2007-06-05 18:11:40.000000000 +0200 @@ -29583,10 +29583,6 @@ VectorEngine() { PoomaCTAssert<(ElementProperties<T>::hasTrivialDefaultConstructor && ElementProperties<T>::hasTrivialDestructor && ElementProperties<T>::concrete)>::test(); - for (int i = 0; i < D; ++i) - { - ElementProperties<T>::construct(&x_m[i]); - } } inline VectorEngine(const VectorEngine<D,T,Full>&); template<class X> (ElementProperties<T>::construct calls placement new)
Can you give me a .ii file for the performance regression, and point me at the relevant function?
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On Tue, 5 Jun 2007, ian at airs dot com wrote: > ------- Comment #167 from ian at airs dot com 2007-06-05 20:48 ------- > Can you give me a .ii file for the performance regression, and point me at the > relevant function? http://www.suse.de/~rguenther/tramp3d/tramp3d-v4.cpp.gz Amongst the interesting functions (yep, there are multiple) are those called Momentumflux*::operator(), one particular example is Adv5::Z::MomentumfluxX<Dim>::operator(), which mangles as _ZNK4Adv51Z13MomentumfluxXILi3EEclI5FieldI22UniformRectilinearMeshI10MeshTraitsILi3Ed21UniformRectilinearTag12CartesianTagLi3EEEd7CompFwdI6EngineILi3E6VectorILi3Ed4FullE10BrickViewUE3LocILi1EEEES4_ISA_dSG_ESM_SM_EEvRKT_RKT0_RKT1_RKT2_RKSI_ILi3EE it has this initialization loop (which is fixed by the tramp3d patch) inside the computational kernel (triple nested loop): <bb 35>: D.760598_367 = &D.464122.engine_m.x_m[i_368]; <<<change_dynamic_type (double *) D.760598_367)>>> iftmp.913_369 = &D.464122.engine_m.x_m[i_368]; if (1) goto <bb 36>; else goto <bb 37>; <bb 36>: *iftmp.913_369 = 0.0; <bb 37>: i_370 = i_368 + 1; <bb 38>: # i_368 = PHI <0(34), i_370(37)> if (i_368 <= 2) goto <bb 35>; else goto <bb 39>; (that's after forwprop1 actually). It's important that we unroll this loop completely (to make us recognize the 0.0 stores are all super-seeded by later stores) and that we move all loop invariant loads out of the triple-nested loops, crossing this initialization loop. The optimized dump for all these functions should be 'easy to grasp and obviously fast' - at least that's what it used to be. Now with this patch we stil have <<<change_dynamic_type (double *) &D.767646.engine_m.x_m[0])>>> D.767646.engine_m.x_m[0] = 0.0; <<<change_dynamic_type (double *) &D.767646.engine_m.x_m[1])>>> D.767646.engine_m.x_m[1] = 0.0; <<<change_dynamic_type (double *) &D.767646.engine_m.x_m[2])>>> D.767646.engine_m.x_m[2] = 0.0; in there and loads of index/domain variables on the MEM expressions like MEM[base: &D.767646, index: D.1312916, step: 8] = D.767266->origin_m.engine_m.x_m[i] + D.767266->spacings_m.engine_m.x_m[i] * (double) (MEM[base: &D.767512, index: D.1312916, step: 4] - D.767266->D.225459.physicalCellDomain_m.D.114276.D.113975.domain_m[i].D.110801.D.45225.D.45039.domain_m[0]); Richard.
What options are you using when you compile? I don't see the symbol you mention, although I do see _ZNK10ScalarCodeIN4Adv51Z13MomentumfluxZILi3EEEEclI5FieldI22UniformRectilinearMeshI10MeshTraitsILi3Ed21UniformRectilinearTag12CartesianTagLi3EEEd10MultiPatchI7GridTag6RemoteI5BrickEEESJ_SJ_EEvRKT_RK8IntervalIXsrSK_10dimensionsEERKT0_RKT1_
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On Wed, 6 Jun 2007, ian at airs dot com wrote: > ------- Comment #169 from ian at airs dot com 2007-06-06 05:33 ------- > What options are you using when you compile? > > I don't see the symbol you mention, although I do see > > _ZNK10ScalarCodeIN4Adv51Z13MomentumfluxZILi3EEEEclI5FieldI22UniformRectilinearMeshI10MeshTraitsILi3Ed21UniformRectilinearTag12CartesianTagLi3EEEd10MultiPatchI7GridTag6RemoteI5BrickEEESJ_SJ_EEvRKT_RK8IntervalIXsrSK_10dimensionsEERKT0_RKT1_ This was simply with -O2. Note that the above is on 64bit x86_64-linux. Richard.
Created attachment 13666 [details] Patch This variant of the previous patch does better on at least some of the tramp3d functions. Here we eliminate the CHANGE_DYNAMIC_TYPE_EXPR nodes during the first aliasing pass, which prevents them from retaining references to local variables and generally clogging up optimizations. For this test case it generates the same results as mainline, which was not true of the previous patch. typedef __SIZE_TYPE__ size_t; inline void* operator new(size_t, void* __p) throw() { return __p; } template <class T, int D> class Vector { public: Vector() { for (int i = 0; i < D; ++i) new (&x_m[i]) T(); } T& operator[](int i) { return x_m[i]; } private: T x_m[D]; }; struct sia { int ai[3]; }; struct s { struct si { sia* p; } asi[3]; float* pd; }; class c { int foo(int, int, int); s sm; }; extern void bar(Vector<float, 3>*, int); int c::foo(int f1, int f2, int f3) { float sum = 0; for (int i = 0; i < 3; ++i) { for (int j = 0; j < 3; ++j) { Vector<float, 3> v; v[0] = 1.0; v[1] = 2.0; v[2] = 3.0; for (int k = 0; k < 3; ++k) { float f = (f1 * this->sm.asi[0].p->ai[0] + f2 * this->sm.asi[1].p->ai[0] + f3 * this->sm.asi[2].p->ai[0]); sum += f * v[k]; } *this->sm.pd = sum; } } return sum; }
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should On Fri, 8 Jun 2007, ian at airs dot com wrote: > > > ------- Comment #171 from ian at airs dot com 2007-06-08 07:49 ------- > Created an attachment (id=13666) > --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13666&action=view) > Patch > > This variant of the previous patch does better on at least some of the tramp3d > functions. Here we eliminate the CHANGE_DYNAMIC_TYPE_EXPR nodes during the > first aliasing pass, which prevents them from retaining references to local > variables and generally clogging up optimizations. Quick testing shows that this patch indeed shows better performance. I'll give it complete testing coverage tonight. Richard.
Full testing shows no (or at least < 0.5%) performance regression on tramp3d runtime. Compile time seems to go up by 0.5% (that looks consistent, about the same with previous patches) for tramp3d. Testcases not using placement new don't seem to be affected. The patch still fixes the FreePOOMA problems that caused me to file this PR. So, from my point of view the patch is ready to be exposed to more eyes. Thanks, Richard.
Great!
Nice, thanks Ian.
Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the dynamic type as it should rguenth at gcc dot gnu dot org wrote: > So, from my point of view the patch is ready to be exposed to more eyes. The C++ bits are fine.
Subject: Bug 29286 Author: ian Date: Tue Jun 12 17:47:37 2007 New Revision: 125653 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=125653 Log: ./: PR libstdc++/29286 * tree.def: Add CHANGE_DYNAMIC_TYPE_EXPR. * tree.h (CHANGE_DYNAMIC_TYPE_NEW_TYPE): Define. (CHANGE_DYNAMIC_TYPE_LOCATION): Define. (DECL_NO_TBAA_P): Define. (struct tree_decl_common): Add no_tbaa_flag field. * tree-ssa-structalias.c (struct variable_info): Add no_tbaa_pruning field. (new_var_info): Initialize no_tbaa_pruning field. (unify_nodes): Copy no_tbaa_pruning field. (find_func_aliases): Handle CHANGE_DYNAMIC_TYPE_EXPR. (dump_solution_for_var): Print no_tbaa_pruning flag. (set_uids_in_ptset): Add no_tbaa_pruning parameter. Change all callers. (compute_tbaa_pruning): New static function. (compute_points_to_sets): Remove CHANGE_DYNAMIC_TYPE_EXPR nodes. Call compute_tbaa_pruning. * tree-ssa-alias.c (may_alias_p): Test no_tbaa_flag for pointers. * gimplify.c (gimplify_expr): Handle CHANGE_DYNAMIC_TYPE_EXPR. * gimple-low.c (lower_stmt): Likewise. * tree-gimple.c (is_gimple_stmt): Likewise. * tree-ssa-operands.c (get_expr_operands): Likewise. * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Likewise. * tree-inline.c (estimate_num_insns_1): Likewise. (copy_result_decl_to_var): Likewise. * expr.c (expand_expr_real_1): Likewise. * tree-pretty-print.c (dump_generic_node): Likewise. * tree-inline.c (copy_decl_to_var): Copy DECL_NO_TBAA_P flag. * omp-low.c (omp_copy_decl_2): Likewise. * print-tree.c (print_node): Print DECL_NO_TBAA_P flag. * doc/c-tree.texi (Expression trees): Document CHANGE_DYNAMIC_TYPE_EXPR. cp/: PR libstdc++/29286 * init.c (avoid_placement_new_aliasing): New static function. (build_new_1): Call it. testsuite/: PR libstdc++/29286 * g++.dg/init/new16.C: New test. * g++.dg/init/new17.C: New test. * g++.dg/init/new18.C: New test. * g++.dg/init/new19.C: New test. Added: trunk/gcc/testsuite/g++.dg/init/new16.C trunk/gcc/testsuite/g++.dg/init/new17.C trunk/gcc/testsuite/g++.dg/init/new18.C trunk/gcc/testsuite/g++.dg/init/new19.C Modified: trunk/gcc/ChangeLog trunk/gcc/cp/ChangeLog trunk/gcc/cp/init.c trunk/gcc/doc/c-tree.texi trunk/gcc/expr.c trunk/gcc/gimple-low.c trunk/gcc/gimplify.c trunk/gcc/omp-low.c trunk/gcc/print-tree.c trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-gimple.c trunk/gcc/tree-inline.c trunk/gcc/tree-pretty-print.c trunk/gcc/tree-ssa-alias.c trunk/gcc/tree-ssa-dce.c trunk/gcc/tree-ssa-operands.c trunk/gcc/tree-ssa-structalias.c trunk/gcc/tree.def trunk/gcc/tree.h
Fixed on mainline. No plans to backport the patch to previous releases.
(In reply to comment #158) > Subject: Re: [4.0/4.1/4.2/4.3 Regression] placement new does not change the > dynamic type as it should > > "rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes: > > [...] > > | > Now, if I understand your argument below correctly, you are saying > | > that even if we fixed the union rules for C++, we may still be facing > | > the same problem because assignment to POD objects does not just mean > | > that the object was there before, but that we are actually starting a > | > new one. Do I understand your argument correctly? > | > | Yes. > > Then, I'm not sure I agree with that. I'll raise it as part of the > union problem. > I'm sorry for commenting this late, but were there any outcomes of the discussion about changing the dynamic type by assignments for non-PODs (maybe on the reflector)? The way I see it, and how Richard may see it, is by 3.8/1 which says The lifetime of an object of type T begins when - storage with the proper alignment and size for type T is obtained The lifetime of an object of type T ends when - the storage which the object occupies is reused or released. This isn't specific to unions, but it never seems to define what constitutes a "reuse", but Richard seems to interpret this to include assignments and so do I, and i've always used this interpretation to explain the mechanism by which unions change their active member.