PATCH: Fix assorted bounds violations

Greg McGary
Tue Aug 29 16:38:00 GMT 2000

Jeffrey A Law writes:

>   > The
>   > low bounds must move in a fashion parallel to the pointer values, and
>   > the high bounds must move as well as expand to encompass the
>   > additional space.
> ?!?  Sorry, this is where I don't see where you're going -- yes, we're
> expanding a malloc'd array, but I don't see how we're expanding things
> *inside* the array. ie, if there are items within the old space
> why can't they be naively copied into the new space?  I don't see how
> reallocing the buffer changes the address/bounds of the items *within*
> that buffer.
> Or are we indeed resizing items within the realloc'd buffer too?

Yes, we are--at least for now.

Only the item on whose behalf we expanded the buffer needs to be
resized, but I didn't look hard enough to figure out how to do that.
I should add a FIXME comment to that effect.  The most important thing
this patch accomplishes is to stop cpp from crashing.
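
As a rough illustration of the rebasing described above, here is a minimal
sketch in plain C. The three-word `struct bp` and the helper `bp_grow` are
hypothetical names invented for this example, not gcc's actual BP
representation or the code in the patch. It assumes `item` is the object on
whose behalf the buffer is grown, so its high bound expands to the new end
of the buffer, and that `new_size` is at least the old size.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bounded pointer: a value plus the range [low, high).  */
struct bp
{
  char *value;
  char *low;
  char *high;
};

/* Grow the buffer described by BUF to NEW_SIZE bytes and rebase ITEM,
   a bounded pointer into that buffer.  Returns 0 on success, -1 if
   realloc fails (in which case nothing is changed).  */
static int
bp_grow (struct bp *buf, struct bp *item, size_t new_size)
{
  /* Record offsets first: the old addresses must not be used once
     realloc has possibly moved the block.  */
  size_t buf_value_off = (size_t) (buf->value - buf->low);
  size_t item_value_off = (size_t) (item->value - buf->low);
  size_t item_low_off = (size_t) (item->low - buf->low);

  char *new_base = realloc (buf->low, new_size);
  if (new_base == NULL)
    return -1;

  buf->low = new_base;
  buf->value = new_base + buf_value_off;
  buf->high = new_base + new_size;

  /* The low bound moves in parallel with the pointer value...  */
  item->low = new_base + item_low_off;
  item->value = new_base + item_value_off;
  /* ...and the high bound moves and expands over the added space.  */
  item->high = new_base + new_size;
  return 0;
}
```

Other items in the buffer would only need their three words shifted by the
same distance, without any change in extent.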

Note that there's more work to do with cpp because its allocators have
not yet been enhanced to set bounds.  I haven't even gotten as far as
identifying those allocators.  That comes later.  In the meantime,
read on...

> My objection is that I don't believe (right now) that it should be
> necessary to change a conforming program to get proper bounds checking.

Well then, Brother Jeff, I must preach more earnestly! 8^)

> Possibly.  Not sure about this one.  We'd need to tackle it separately.

Tackle away at your earliest convenience.

>   > Bounds need to be set at the time a pointer is assigned.  If we're
>   > taking the address of an object, gcc can synthesize the bounds based
>   > on the referent object's decl.  Storage allocators, OTOH, must
>   > explicitly set bounds based on the size they were requested to
>   > allocate.  
> But can't we do that in the caller?  ie, the typical usage will be
> something like ptr = ggc_alloc (somesize);
> At that point don't you have the information you need to set the bounds 
> properly?  

In general, no.  Sometimes gcc has enough information, but it must
know some things for certain:

    1) that this is a call to an allocator (the malloc attribute might
       be sufficient),

    2) how to identify which args to the allocator bear on object
       size, and what formula computes the size.

The simplest case of malloc is doable: set low_bound to the returned
pointer and high_bound to be the returned pointer plus malloc's first
argument.  The case of calloc is a bit more complex: set high_bound to
be the returned pointer plus the product of the first and second args.
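
The two simple cases can be sketched in ordinary C as follows. This only
models what the bounds would be; `struct bp`, `bp_from_malloc`,
`bp_from_calloc` and `bp_ok` are hypothetical names for illustration, not
anything gcc or libc provides.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bounded pointer: a value plus the range [low, high).  */
struct bp
{
  char *value;
  char *low;
  char *high;
};

/* malloc case: high bound is the returned pointer plus the first arg.  */
static struct bp
bp_from_malloc (size_t size)
{
  struct bp p;
  p.value = malloc (size);
  p.low = p.value;
  p.high = p.value ? p.value + size : NULL;
  return p;
}

/* calloc case: high bound is the returned pointer plus the product of
   the two args.  A conforming calloc fails if the product overflows,
   so the multiplication is only reached when it is safe.  */
static struct bp
bp_from_calloc (size_t nmemb, size_t size)
{
  struct bp p;
  p.value = calloc (nmemb, size);
  p.low = p.value;
  p.high = p.value ? p.value + nmemb * size : NULL;
  return p;
}

/* Nonzero if an access of LEN bytes at P.value stays within bounds.  */
static int
bp_ok (struct bp p, size_t len)
{
  return p.value != NULL
	 && p.value >= p.low
	 && p.value <= p.high
	 && len <= (size_t) (p.high - p.value);
}
```

With these, `bp_ok (bp_from_calloc (4, 8), 32)` accepts a 32-byte access
and rejects a 33-byte one.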

How can gcc handle this hypothetical allocator?

struct name
{
  int fiddly_bits;
  char string[1];
};

struct name *
alloc_name (int name_length)
{
  ... explicitly manage a heap ...
  ... return memory of length sizeof (struct name) + name_length ...
}

We can't preprogram gcc to understand that it should derive the
high_bound on calls to alloc_name by adding sizeof (struct name) to
the value of the first argument.  We certainly shouldn't booger gcc's
attribute interface to allow programmers to communicate that to gcc.
If alloc_name were simple-minded and just called malloc, then it would
get a good BP because malloc took care of bounds creation.  However,
for this discussion, alloc_name manages its own heap by calling malloc
to create very large buffers and then carves up that space to make the
individual name structs.  alloc_name needs to be responsible for
returning a pointer with proper bounds.  That's just one of the few
burdens placed on programs that use BPs.  Programs that use only
standard allocators to directly create program objects needn't do
anything special.  Programs that provide their own allocators must be
responsible for setting bounds.  This goes for obstacks too.  I still
need to provide a patch for them.
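
To show what "responsible for setting bounds" might look like for such an
allocator, here is a sketch in portable C that models the bounds
explicitly. `struct name_bp`, `POOL_SIZE`, and the single fixed pool are
assumptions made up for the example; a real BP-aware alloc_name would set
bounds through whatever interface the compiler provides rather than
through a struct like this.

```c
#include <stddef.h>
#include <stdlib.h>

struct name
{
  int fiddly_bits;
  char string[1];
};

/* Hypothetical bounded pointer to a struct name.  */
struct name_bp
{
  struct name *value;
  char *low;
  char *high;
};

#define POOL_SIZE 4096		/* arbitrary; an assumption */

static char *pool;		/* one large buffer obtained from malloc */
static size_t pool_used;	/* bytes already carved out of it */

static struct name_bp
alloc_name (int name_length)
{
  struct name_bp p = { NULL, NULL, NULL };
  size_t size, rounded;

  if (name_length < 0)
    return p;
  if (pool == NULL && (pool = malloc (POOL_SIZE)) == NULL)
    return p;

  size = sizeof (struct name) + (size_t) name_length;
  /* Round the carving up to a multiple of sizeof (struct name) so the
     next object stays suitably aligned.  */
  rounded = (size + sizeof (struct name) - 1)
	    / sizeof (struct name) * sizeof (struct name);
  if (rounded > POOL_SIZE - pool_used)
    return p;

  p.low = pool + pool_used;
  /* The bounds cover the requested size only, neither the rounding
     slop nor the rest of the pool.  */
  p.high = p.low + size;
  p.value = (struct name *) (void *) p.low;
  pool_used += rounded;
  return p;
}
```

The point of the sketch is the `p.high` line: only alloc_name knows that
the object ends at `sizeof (struct name) + name_length`, since malloc only
ever saw a request for `POOL_SIZE` bytes.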

> Presumably I'm missing something fundamental about when/where the bounds
> for pointers in dynamically allocated memory are set up.

They are explicitly set up by the allocators.  Allocators that act as a
simple wrapper for other allocators (e.g. xmalloc for malloc) needn't
do anything special.  Allocators that draw from raw, untyped pools of
storage need to explicitly set bounds on what they return.

> In general, how are things handled with "malloc"?  Why does this routine have
> to do something different?

Malloc explicitly sets bounds on the pointer it returns:
	low_bound = value; high_bound = value + requested size,
and can hardly do otherwise.  Before setting bounds, malloc has an
internal pointer to a chunk of memory that is oversized to accommodate
bookkeeping information and possibly to round up to an alignment boundary.
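
A toy model of that situation, layered on the system malloc, might look
like the sketch below. `toy_malloc`, `toy_free`, `union chunk_header` and
`struct bp` are hypothetical names for this example and do not describe
any real malloc implementation. The chunk is larger than the request
(header plus rounding), but the bounds handed back cover only the
requested size.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bounded pointer: a value plus the range [low, high).  */
struct bp
{
  char *value;
  char *low;
  char *high;
};

/* Hypothetical bookkeeping header that precedes every payload.  The
   extra members only force a strict size and alignment.  */
union chunk_header
{
  size_t payload_size;
  long double force_align;
  void *force_ptr_align;
};

static struct bp
toy_malloc (size_t size)
{
  struct bp p = { NULL, NULL, NULL };
  union chunk_header *chunk;
  size_t rounded;

  /* Refuse requests whose header and rounding would overflow.  */
  if (size > (size_t) -1 - 2 * sizeof (union chunk_header))
    return p;

  /* Internally the chunk is oversized: header plus rounded payload.  */
  rounded = (size + sizeof (union chunk_header) - 1)
	    / sizeof (union chunk_header) * sizeof (union chunk_header);
  chunk = malloc (sizeof (union chunk_header) + rounded);
  if (chunk == NULL)
    return p;
  chunk->payload_size = rounded;

  /* low_bound = value; high_bound = value + requested size.  The
     header and the rounding slop stay outside the bounds.  */
  p.value = (char *) (chunk + 1);
  p.low = p.value;
  p.high = p.value + size;
  return p;
}

static void
toy_free (struct bp p)
{
  if (p.low != NULL)
    free ((union chunk_header *) (void *) p.low - 1);
}
```

A caller of `toy_malloc (13)` gets bounds spanning exactly 13 bytes, even
though the underlying chunk is larger.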

> I would recommend you go ahead and install the parts of this patch that
> have been approved (if you haven't done so already).  No sense in making
> them wait while we hash out the issues with these two hunks.

Already done.  Thanks.

