This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] Adjust 'malloc' attribute documentation to match implementation


On Tuesday 21 February 2012 10:19:15 Richard Guenther wrote:
> On Mon, Feb 20, 2012 at 8:55 PM, Tijl Coosemans <tijl@coosemans.org> wrote:
>> On Monday 9 January 2012 10:05:08 Richard Guenther wrote:
>>> Since GCC 4.4 applying the malloc attribute to realloc-like
>>> functions does not work under the documented constraints because
>>> the contents of the memory pointed to are not properly transferred
>>> from the realloc argument (or treated as pointing to anything,
>>> like 4.3 behaved).
>>>
>>> The following adjusts documentation to reflect implementation
>>> reality (we do have an implementation detail that treats the
>>> memory blob returned for non-builtins as pointing to any global
>>> variable, but that is neither documented nor do I plan to do
>>> so - I presume it is to allow allocation + initialization
>>> routines to be marked with malloc, but even that area looks
>>> susceptible to misinterpretation to me).
>>>
>>> Any comments?
>>
>> The new text says the memory must be undefined, but gives calloc as an
>> example for which the memory is defined to be zero. Also, GCC has
>> built-ins for strdup and strndup with the malloc attribute and GLIBC
>> further adds it to wcsdup (wchar_t version of strdup) and tempnam. In
>> all of these cases the memory is defined.
>>
>> Isn't the reason the attribute doesn't apply to realloc simply because
>> the returned pointer may alias the one given as argument, rather than
>> having defined memory content?
> 
> The question is really what the alias-analysis code can derive from a
> function that is declared with the malloc attribute.  The most useful
> property for alias analysis would be that the non-aliasing holds
> transitively, thus reading (with any level of indirection) from the returned
> pointer does not produce memory that is aliased by any other pointer.
> That's what happens for 'malloc' (also for 'calloc' - you can't do any
> further indirections through the NULL pointers the memory holds).  It
> does not happen for realloc.  Currently the alias-analysis code does
> assume exactly this property (only very slightly weakened, possibly
> because we broke some code I guess).
> 
> Internally, all builtins with interesting allocation properties are handled
> explicitly, so we probably should not rely on the malloc attribute present
> on those (and maybe simply drop it there).
> 
> The question is really what is useful for users, and what's the most natural
> behavior?  For example
> 
> int **my_initialized_malloc (int *p)
> {
>   int **q = malloc (sizeof (int *));
>   *q = p;
>   return q;
> }
> 
> would not qualify for the 'malloc' attribute (but we've taken measures to not
> miscompile this kind of code; it seems to be a very common misconception
> to annotate these with 'malloc').
> 
> I'm not sure how to exactly constrain the documentation for 'malloc' better.
> Maybe
> 
> The @code{malloc} attribute is used to tell the compiler that a function
> may be treated as if any non-@code{NULL} pointer it returns cannot
> alias any other pointer valid when the function returns and that the memory
> does not contain any pointer value.
> 
> ?  Because that is what is relevant.  That you can in no way extract
> a pointer value from the memory pointed to by the return value.  Because
> alias analysis will assume any such extracted pointer value points
> nowhere (so, extracting a NULL pointer is ok).
> 
> The reasoning why the string functions have the malloc attribute was
> probably that strings do not contain pointer values.  Of course they
> can, you can store a character encoding of a pointer, copy the
> string and decode it from the copy again.  We'd miscompile then
> 
>  int i = 1;
>  int *p = &i;
>  char ptr[16];
>  ... inline encode p into ptr ...
>  char *x = strdup (ptr);
>  int *q = ... inline decode x to q
>  *q = 2;
>  return i;
> 
> to return 1 because we do not see that q may point to i.  Of course
> we properly handle the transfer of pointers for str[n]dup, so the
> 'malloc' attribute on it is a lie...

Thanks, that was very informative.
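
To check that I read the str[n]dup example correctly, here it is written
out in full as a rough sketch.  The hex encoding, the helper names
nibble_to_hex and hex_to_nibble, the function name
store_through_copied_string and the buffer sizes are my own assumptions;
the original only says "inline encode" and "inline decode".

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup */
#include <stdlib.h>
#include <string.h>

/* Hypothetical helpers: one hex digit per nibble, computed with plain
   arithmetic so that no library call ever sees the pointer itself.  */
static char
nibble_to_hex (unsigned n)
{
  return (char) (n < 10 ? '0' + n : 'a' + (n - 10));
}

static unsigned
hex_to_nibble (char c)
{
  return (unsigned) (c <= '9' ? c - '0' : c - 'a' + 10);
}

/* Returns the final value of i, or -1 if strdup fails.  */
int
store_through_copied_string (void)
{
  int i = 1;
  int *p = &i;
  int *q;
  unsigned char raw[sizeof p];
  char ptr[2 * sizeof p + 1];
  size_t k;

  /* Encode the object representation of p as a hex string.  */
  memcpy (raw, &p, sizeof p);
  for (k = 0; k < sizeof p; k++)
    {
      ptr[2 * k] = nibble_to_hex (raw[k] >> 4);
      ptr[2 * k + 1] = nibble_to_hex (raw[k] & 0xf);
    }
  ptr[2 * sizeof p] = '\0';

  /* The duplicate now carries the encoded address of i.  */
  char *x = strdup (ptr);
  if (x == NULL)
    return -1;

  /* Decode the copy back into a pointer.  */
  for (k = 0; k < sizeof q; k++)
    raw[k] = (unsigned char) (hex_to_nibble (x[2 * k]) << 4
                              | hex_to_nibble (x[2 * k + 1]));
  memcpy (&q, raw, sizeof q);
  free (x);

  *q = 2;
  return i;
}
```

The function should return 2.  Returning 1 would be the miscompilation
described above, where the store through q is not seen as a store to i.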

Is it correct to say that the attribute applies to deep copies, but not to
shallow ones?

How about the following text:

@item malloc
@cindex @code{malloc} attribute
The @code{malloc} attribute is used to tell the compiler that a pointer
returned by a function is either @code{NULL} or points to a newly
allocated object and that any pointer within that object is either
uninitialised, @code{NULL} or pointing to a newly allocated object for
which the same conditions hold recursively.  The compiler assumes that
existing variables and memory cannot be accessed through the returned
pointer, which will often improve optimization.
Standard functions with this property include @code{malloc} and
@code{calloc}.  @code{realloc}-like functions do not have this
property as the returned pointer may alias the one given as argument
or the memory pointed to may contain initialised pointers.
@code{strdup}-like functions have this property as long as the string
does not encode a memory address.  More generally, the attribute applies
to deep memory copies, but not to shallow ones.
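
To illustrate what I mean by deep versus shallow, here is a small sketch.
The type struct box and the functions box_deep_copy and box_shallow_copy
are hypothetical names made up for this mail.  Whether the implementation
really supports the attribute on the deep copy is exactly what I am
asking; the annotation below only follows the wording proposed above.

```c
#include <stdlib.h>

/* Hypothetical record type holding one pointer.  */
struct box
{
  int *value;
};

/* Deep copy: the result and everything reachable from it is newly
   allocated, so under the proposed wording it may carry the attribute.  */
__attribute__ ((malloc)) struct box *
box_deep_copy (const struct box *src)
{
  struct box *copy = malloc (sizeof *copy);
  if (copy == NULL)
    return NULL;
  copy->value = malloc (sizeof *copy->value);
  if (copy->value == NULL)
    {
      free (copy);
      return NULL;
    }
  *copy->value = *src->value;
  return copy;
}

/* Shallow copy: the result contains a pointer to pre-existing memory,
   so the attribute must not be used here.  */
struct box *
box_shallow_copy (const struct box *src)
{
  struct box *copy = malloc (sizeof *copy);
  if (copy != NULL)
    copy->value = src->value;
  return copy;
}
```

A store through the value pointer of a deep copy cannot change the
original's int, while the same store through a shallow copy does, which
is why only the former fits the description.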

