Bug 23383 - builtin array operator new is not marked with malloc attribute
builtin array operator new is not marked with malloc attribute
Status: NEW
Product: gcc
Classification: Unclassified
Component: c++
4.1.0
: P2 enhancement
: ---
Assigned To: Not yet assigned to anyone
: alias, missed-optimization
: 35467 35559 57823 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2005-08-14 05:39 UTC by Andrew Pinski
Modified: 2013-07-08 04:53 UTC (History)
10 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2007-07-01 00:43:13


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Andrew Pinski 2005-08-14 05:39:15 UTC
Testcase:
int f(void)
{
  int t;
  int *a = new int[1024];
  int *b = new int[1024];
  *a = 1;
  *b = 2;
  t = *a;
  delete a;
  delete b;
  return t;
}

the return is not turned into 1 but still have "return t" in the final_cleanup.
Comment 1 Andrew Pinski 2005-08-23 16:34:29 UTC
Confirmed:
  *a = 1;
  *b = 2;
  t = *a;
  operator delete (a);
  operator delete (b);
  return t;
Comment 2 Andrew Pinski 2008-03-05 07:48:02 UTC
*** Bug 35467 has been marked as a duplicate of this bug. ***
Comment 3 Andrew Pinski 2008-03-12 20:29:59 UTC
*** Bug 35559 has been marked as a duplicate of this bug. ***
Comment 4 Chris Lattner 2008-06-04 03:21:34 UTC
This would not be legal, there is no reason operator new can't return a pointer that already exists in the program. 
Comment 5 davidxl 2008-06-04 04:15:59 UTC
(In reply to comment #4)
> This would not be legal, there is no reason operator new can't return a pointer
> that already exists in the program. 
> 

This is certainly a flaw in the C++ standard (it requires p returned from operator new to be different from previous calls to the operator unless the previous call's return is passed to operator delete -- but it does not require the pointer returned to be not pointing to any known location) -- and I do not see a reason this is intentional. 

For compiler optimization, it would better to assume the aggressive default, and provide an option to turn it off in case user provided definition violates the aliasing assumption.  Someone in the language committee might need to bring this up.

David
Comment 6 Chris Lattner 2008-06-04 04:32:36 UTC
This has been extensively discussed on the C++ reflector.  They decided (informally, on the reflector) that people should be able to globally override operator new to do logging, etc, which can make malloc have arbitrary side effects.
Comment 7 Chris Lattner 2008-06-04 04:32:58 UTC
That said, it would be fine to add support for this under a non-standards-mode option of some sort of course.
Comment 8 davidxl 2008-06-04 04:46:45 UTC
(In reply to comment #6)
> This has been extensively discussed on the C++ reflector.  They decided
> (informally, on the reflector) that people should be able to globally override
> operator new to do logging, etc, which can make malloc have arbitrary side
> effects.
> 
yes, things like saving pointers to user visible locations are bad.  In this kind of situation, optimizer people's voice seems not loud enough :(. 

One way to solve this problem is to require user who override the default definition to always provide a declaration (before any use of operator new) -- if such declaration is not seen, the operator new will be treated as builtin (the default) implementation which has the nice property.

David


Comment 9 Chris Lattner 2008-06-04 04:48:03 UTC
This isn't possible.  The user can override operator new at the very last minute: e.g. by interposing a shared object with LD_PRELOAD.  There is no way that a compiler or even LTO optimizing linker can know about this.  A special non-standard compiler flag is required.
Comment 10 davidxl 2008-06-04 05:23:49 UTC
(In reply to comment #9)
> This isn't possible.  The user can override operator new at the very last
> minute: e.g. by interposing a shared object with LD_PRELOAD.  There is no way
> that a compiler or even LTO optimizing linker can know about this.  A special
> non-standard compiler flag is required.
> 

Then runtime override would be illegal (in theory) (if builtin operator is assumed), though the program may still run fine.

so LLVM's LTO completely gives up when seeing operator new or there is a special flag?

David
Comment 11 Chris Lattner 2008-06-04 05:34:58 UTC
Expecting people to modify their source is bad news.

LLVM's LTO does nothing for operator new, it treats it as any other external function with undefined behavior.
Comment 12 Richard Biener 2008-06-04 07:58:12 UTC
Interesting things start to happen once you inline allocator functions as well.
See PR29286 and PR33407 which we still don't handle 100% correct.
Comment 13 davidxl 2008-06-04 16:48:11 UTC
(In reply to comment #12)
> Interesting things start to happen once you inline allocator functions as well.
> See PR29286 and PR33407 which we still don't handle 100% correct.
> 

I browsed through the two bugs -- it seems that compiler should get this right regardless -- local pointer analysis should detect the must aliasing and should overrule the type based aliasing decision when the placement new is inlined. If not inlined, compiler should know the exact semantics of placement new (return == arg), or treat it conservatively. 

David
Comment 14 Richard Biener 2008-06-04 17:03:36 UTC
We do the exact opposite - type-based rules override points-to must-alias
information (or really may-alias information).  Also for the proposed scheme
to work you need to guarantee that you always can compute correct points-to
relations (I mean, if points-to information says pt_anything and if you
then assume must-alias and thus a conflict then you simply disable TBAA
completely).
Comment 15 davidxl 2008-06-04 17:34:20 UTC
(In reply to comment #14)
> We do the exact opposite - type-based rules override points-to must-alias
> information (or really may-alias information).  Also for the proposed scheme
> to work you need to guarantee that you always can compute correct points-to
> relations (I mean, if points-to information says pt_anything and if you
> then assume must-alias and thus a conflict then you simply disable TBAA
> completely).
> 

Right, in general, type alias rules should override field and flow insensitive pointer aliasing information as they really have very low confidence level (especially for pt_anything case which is just a baseless guess) -- but  precise/trustworthy aliasing info should be checked before assertion based alias information and decide whether to proceed. 

For example:

if (no_alias_according_to_conservative_pointer_info) return no_alias;
if (no_alias_according_to_precise_pointer_info) return no_alias;
if (must_alias or definitely_may_alias) return may/must_alias;   (1)

// now proceed with type based rules, etc.


This is in theory. In practice, it can be tricky to tag the confidence level of aliasing info.

David


Comment 16 davidxl 2012-01-04 00:28:55 UTC
A related topic - aliasing property of realloc -- the malloc attribute is not applied in the glibc header and the comment is like

/* __attribute_malloc__ is not used, because if realloc returns
   the same pointer that was passed to it, aliasing needs to be allowed
   between objects pointed by the old and new pointers.  *


It is true that the realloc can return an address is physically aliased with the pointer passed to it -- but assuming no-alias by the compiler should cause no harm -- as all code motions/CSEs across the realloc call will not be possible because realloc may modify/use the memory location.


Any comment on this? 

David


(In reply to comment #15)
> (In reply to comment #14)
> > We do the exact opposite - type-based rules override points-to must-alias
> > information (or really may-alias information).  Also for the proposed scheme
> > to work you need to guarantee that you always can compute correct points-to
> > relations (I mean, if points-to information says pt_anything and if you
> > then assume must-alias and thus a conflict then you simply disable TBAA
> > completely).
> > 
> 
> Right, in general, type alias rules should override field and flow insensitive
> pointer aliasing information as they really have very low confidence level
> (especially for pt_anything case which is just a baseless guess) -- but 
> precise/trustworthy aliasing info should be checked before assertion based
> alias information and decide whether to proceed. 
> 
> For example:
> 
> if (no_alias_according_to_conservative_pointer_info) return no_alias;
> if (no_alias_according_to_precise_pointer_info) return no_alias;
> if (must_alias or definitely_may_alias) return may/must_alias;   (1)
> 
> // now proceed with type based rules, etc.
> 
> 
> This is in theory. In practice, it can be tricky to tag the confidence level of
> aliasing info.
> 
> David
Comment 17 rguenther@suse.de 2012-01-04 09:43:13 UTC
On Wed, 4 Jan 2012, xinliangli at gmail dot com wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23383
> 
> --- Comment #16 from davidxl <xinliangli at gmail dot com> 2012-01-04 00:28:55 UTC ---
> A related topic - aliasing property of realloc -- the malloc attribute is not
> applied in the glibc header and the comment is like
> 
> /* __attribute_malloc__ is not used, because if realloc returns
>    the same pointer that was passed to it, aliasing needs to be allowed
>    between objects pointed by the old and new pointers.  *
> 
> 
> It is true that the realloc can return an address is physically aliased with
> the pointer passed to it -- but assuming no-alias by the compiler should cause
> no harm -- as all code motions/CSEs across the realloc call will not be
> possible because realloc may modify/use the memory location.
> 
> 
> Any comment on this? 

The malloc attribute assumes that the contents of the memory pointed
to by the return value is undefined, so the comment is inaccurate
but the malloc attribute can indeed be not used.

We can explicitly handle REALLOC in the points-to code to honor the
fact that it is not (we do not at the moment).
Comment 18 davidxl 2012-01-04 17:11:26 UTC
(In reply to comment #17)
> On Wed, 4 Jan 2012, xinliangli at gmail dot com wrote:
> 
> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23383
> > 
> > --- Comment #16 from davidxl <xinliangli at gmail dot com> 2012-01-04 00:28:55 UTC ---
> > A related topic - aliasing property of realloc -- the malloc attribute is not
> > applied in the glibc header and the comment is like
> > 
> > /* __attribute_malloc__ is not used, because if realloc returns
> >    the same pointer that was passed to it, aliasing needs to be allowed
> >    between objects pointed by the old and new pointers.  *
> > 
> > 
> > It is true that the realloc can return an address is physically aliased with
> > the pointer passed to it -- but assuming no-alias by the compiler should cause
> > no harm -- as all code motions/CSEs across the realloc call will not be
> > possible because realloc may modify/use the memory location.
> > 
> > 
> > Any comment on this? 
> 
> The malloc attribute assumes that the contents of the memory pointed
> to by the return value is undefined, so the comment is inaccurate
> but the malloc attribute can indeed be not used.

Which part of the optimizer takes advantage of the 'undefinedness' of returned memory?


> 
> We can explicitly handle REALLOC in the points-to code to honor the
> fact that it is not (we do not at the moment).

ok.
Comment 19 rguenther@suse.de 2012-01-05 08:39:57 UTC
On Wed, 4 Jan 2012, xinliangli at gmail dot com wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23383
> 
> --- Comment #18 from davidxl <xinliangli at gmail dot com> 2012-01-04 17:11:26 UTC ---
> (In reply to comment #17)
> > On Wed, 4 Jan 2012, xinliangli at gmail dot com wrote:
> > 
> > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23383
> > > 
> > > --- Comment #16 from davidxl <xinliangli at gmail dot com> 2012-01-04 00:28:55 UTC ---
> > > A related topic - aliasing property of realloc -- the malloc attribute is not
> > > applied in the glibc header and the comment is like
> > > 
> > > /* __attribute_malloc__ is not used, because if realloc returns
> > >    the same pointer that was passed to it, aliasing needs to be allowed
> > >    between objects pointed by the old and new pointers.  *
> > > 
> > > 
> > > It is true that the realloc can return an address is physically aliased with
> > > the pointer passed to it -- but assuming no-alias by the compiler should cause
> > > no harm -- as all code motions/CSEs across the realloc call will not be
> > > possible because realloc may modify/use the memory location.
> > > 
> > > 
> > > Any comment on this? 
> > 
> > The malloc attribute assumes that the contents of the memory pointed
> > to by the return value is undefined, so the comment is inaccurate
> > but the malloc attribute can indeed be not used.
> 
> Which part of the optimizer takes advantage of the 'undefinedness' of returned
> memory?

points-to analysis.  It assumes that the returned blob of memory
points to nothing (yet).  So for

 int i;
 int **p = malloc (8);
 *p = &i;
 int **q = realloc (p, 8);

you'd get that *q points to nothing insteda of i.

> > We can explicitly handle REALLOC in the points-to code to honor the
> > fact that it is not (we do not at the moment).
> 
> ok.

which needs to be fixed here.

Richard.
Comment 20 davidxl 2012-01-05 18:11:18 UTC
(In reply to comment #19)
> On Wed, 4 Jan 2012, xinliangli at gmail dot com wrote:
> 
> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23383
> > 
> > --- Comment #18 from davidxl <xinliangli at gmail dot com> 2012-01-04 17:11:26 UTC ---
> > (In reply to comment #17)
> > > On Wed, 4 Jan 2012, xinliangli at gmail dot com wrote:
> > > 
> > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23383
> > > > 
> > > > --- Comment #16 from davidxl <xinliangli at gmail dot com> 2012-01-04 00:28:55 UTC ---
> > > > A related topic - aliasing property of realloc -- the malloc attribute is not
> > > > applied in the glibc header and the comment is like
> > > > 
> > > > /* __attribute_malloc__ is not used, because if realloc returns
> > > >    the same pointer that was passed to it, aliasing needs to be allowed
> > > >    between objects pointed by the old and new pointers.  *
> > > > 
> > > > 
> > > > It is true that the realloc can return an address is physically aliased with
> > > > the pointer passed to it -- but assuming no-alias by the compiler should cause
> > > > no harm -- as all code motions/CSEs across the realloc call will not be
> > > > possible because realloc may modify/use the memory location.
> > > > 
> > > > 
> > > > Any comment on this? 
> > > 
> > > The malloc attribute assumes that the contents of the memory pointed
> > > to by the return value is undefined, so the comment is inaccurate
> > > but the malloc attribute can indeed be not used.
> > 
> > Which part of the optimizer takes advantage of the 'undefinedness' of returned
> > memory?
> 
> points-to analysis.  It assumes that the returned blob of memory
> points to nothing (yet).  So for
> 
>  int i;
>  int **p = malloc (8);
>  *p = &i;
>  int **q = realloc (p, 8);
> 
> you'd get that *q points to nothing insteda of i.

The malloc attribute documentation is confusing:

malloc
The malloc attribute is used to tell the compiler that a function may be treated as if any non-NULL pointer it returns cannot alias any other pointer valid when the function returns. This will often improve optimization. Standard functions with this property include malloc and calloc. realloc-like functions have this property as long as the old pointer is never referred to (including comparing it to the new pointer) after the function returns a non-NULL value. 


It does not mention the undefineness explicitly.

It might be better to fix the semantics of this attribute to only specify the aliasing property so that it can also be applied to realloc. Points-to needs to be fixed to special case realloc for the initialization.

David


> 
> > > We can explicitly handle REALLOC in the points-to code to honor the
> > > fact that it is not (we do not at the moment).
> > 
> > ok.
> 
> which needs to be fixed here.
> 
> Richard.
Comment 21 Jakub Jelinek 2012-01-05 18:26:17 UTC
But can't a valid code also compare the result from realloc with the old pointer, and if they are equal, do something, otherwise do something else?
I think it is pretty common e.g. if the malloced block contains pointers to parts of the malloced area and upon realloc that didn't return the passed address wants to adjust all those pointers.
Having a malloc attribute on realloc would still break this.
I'd say we want realloc attribute and handle it where we currently handle BUILT_IN_REALLOC.
Comment 22 davidxl 2012-01-05 18:54:51 UTC
(In reply to comment #21)
> But can't a valid code also compare the result from realloc with the old
> pointer, and if they are equal, do something, otherwise do something else?
> I think it is pretty common e.g. if the malloced block contains pointers to
> parts of the malloced area and upon realloc that didn't return the passed
> address wants to adjust all those pointers.
> Having a malloc attribute on realloc would still break this.
> I'd say we want realloc attribute and handle it where we currently handle
> BUILT_IN_REALLOC.

Right, the case you describe may be common. On the other hand, the compiler probably should not optimize away pointer comparisons without knowing if the pointer is used after 'free' or 'free' like function 'realloc'.

David
Comment 23 rguenther@suse.de 2012-01-09 08:37:21 UTC
On Thu, 5 Jan 2012, jakub at gcc dot gnu.org wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23383
> 
> Jakub Jelinek <jakub at gcc dot gnu.org> changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |jakub at gcc dot gnu.org
> 
> --- Comment #21 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-01-05 18:26:17 UTC ---
> But can't a valid code also compare the result from realloc with the old
> pointer, and if they are equal, do something, otherwise do something else?
> I think it is pretty common e.g. if the malloced block contains pointers to
> parts of the malloced area and upon realloc that didn't return the passed
> address wants to adjust all those pointers.

Sure.

> Having a malloc attribute on realloc would still break this.

We cannot use the malloc attribute on realloc, ever.

> I'd say we want realloc attribute and handle it where we currently handle
> BUILT_IN_REALLOC.

Not sure it's worth it besides handling BUILT_IN_REALLOC explicitly.
Comment 24 rguenther@suse.de 2012-01-09 08:39:37 UTC
On Thu, 5 Jan 2012, xinliangli at gmail dot com wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23383
> 
> --- Comment #20 from davidxl <xinliangli at gmail dot com> 2012-01-05 18:11:18 UTC ---
> (In reply to comment #19)
> > On Wed, 4 Jan 2012, xinliangli at gmail dot com wrote:
> > 
> > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23383
> > > 
> > > --- Comment #18 from davidxl <xinliangli at gmail dot com> 2012-01-04 17:11:26 UTC ---
> > > (In reply to comment #17)
> > > > On Wed, 4 Jan 2012, xinliangli at gmail dot com wrote:
> > > > 
> > > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23383
> > > > > 
> > > > > --- Comment #16 from davidxl <xinliangli at gmail dot com> 2012-01-04 00:28:55 UTC ---
> > > > > A related topic - aliasing property of realloc -- the malloc attribute is not
> > > > > applied in the glibc header and the comment is like
> > > > > 
> > > > > /* __attribute_malloc__ is not used, because if realloc returns
> > > > >    the same pointer that was passed to it, aliasing needs to be allowed
> > > > >    between objects pointed by the old and new pointers.  *
> > > > > 
> > > > > 
> > > > > It is true that the realloc can return an address is physically aliased with
> > > > > the pointer passed to it -- but assuming no-alias by the compiler should cause
> > > > > no harm -- as all code motions/CSEs across the realloc call will not be
> > > > > possible because realloc may modify/use the memory location.
> > > > > 
> > > > > 
> > > > > Any comment on this? 
> > > > 
> > > > The malloc attribute assumes that the contents of the memory pointed
> > > > to by the return value is undefined, so the comment is inaccurate
> > > > but the malloc attribute can indeed be not used.
> > > 
> > > Which part of the optimizer takes advantage of the 'undefinedness' of returned
> > > memory?
> > 
> > points-to analysis.  It assumes that the returned blob of memory
> > points to nothing (yet).  So for
> > 
> >  int i;
> >  int **p = malloc (8);
> >  *p = &i;
> >  int **q = realloc (p, 8);
> > 
> > you'd get that *q points to nothing insteda of i.
> 
> The malloc attribute documentation is confusing:
> 
> malloc
> The malloc attribute is used to tell the compiler that a function may be
> treated as if any non-NULL pointer it returns cannot alias any other pointer
> valid when the function returns. This will often improve optimization. Standard
> functions with this property include malloc and calloc. realloc-like functions
> have this property as long as the old pointer is never referred to (including
> comparing it to the new pointer) after the function returns a non-NULL value. 

Ugh, it should not mention realloc here.

> It does not mention the undefineness explicitly.
> 
> It might be better to fix the semantics of this attribute to only specify the
> aliasing property so that it can also be applied to realloc. Points-to needs to
> be fixed to special case realloc for the initialization.

It can't be "fixed" to not assume the undefinedness and still be useful.
For any non-builtin with the malloc attribute it would need to assume
the pointed to memory points to anything.  That will be pretty useless
once you put pointers in the allocated memory.

I'll fixup the documentation.
Comment 25 Andrew Pinski 2013-07-08 04:53:48 UTC
*** Bug 57823 has been marked as a duplicate of this bug. ***