This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC] replace malloc with a decl on the stack
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Marc Glisse <marc dot glisse at inria dot fr>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Tue, 12 Nov 2013 18:10:11 +0100
- Subject: Re: [RFC] replace malloc with a decl on the stack
- Authentication-results: sourceware.org; auth=none
- References: <alpine dot DEB dot 2 dot 10 dot 1311100140300 dot 4153 at laptop-mg dot saclay dot inria dot fr> <20131111100814 dot GA6574 at domone dot podge> <alpine dot DEB dot 2 dot 10 dot 1311120101060 dot 17040 at laptop-mg dot saclay dot inria dot fr> <20131112104938 dot GA20441 at domone dot podge> <alpine dot DEB dot 2 dot 10 dot 1311121248180 dot 23017 at stedding dot saclay dot inria dot fr> <20131112145033 dot GA24799 at domone dot podge> <alpine dot DEB dot 2 dot 10 dot 1311121637580 dot 23017 at stedding dot saclay dot inria dot fr>
On Tue, Nov 12, 2013 at 05:01:31PM +0100, Marc Glisse wrote:
> On Tue, 12 Nov 2013, Ondřej Bílka wrote:
>
> >On Tue, Nov 12, 2013 at 01:41:24PM +0100, Marc Glisse wrote:
> >>On Tue, 12 Nov 2013, Ondřej Bílka wrote:
> >>
> >>>>I am trying to get something to actually work and be accepted in
> >>>>gcc. That may mean being conservative.
> >>>
> >>>That also may mean that you will cover only cases where it is not needed.
> >>>
> >>>A malloc will have a small per-thread cache for small requests that does
> >>>not need any locking. The performance difference will be quite small, and
> >>>there may be a define that causes constant-size mallocs to be inlined.
> >>>
> >>>Sizes from 256 bytes up are the interesting case.
> >>
> >>I have to disagree here. When the allocated size is large enough,
> >>the cost of malloc+free often becomes small compared to whatever
> >>work you are doing in that array. It is when the size is very small
> >>that speeding up malloc+free is essential. And you are
> >>underestimating the cost of those small allocations.
> >>
> >No, just aware that these are important and there will be optimizations
> >that convert these. For example:
> >
> >#define malloc(s) ({ \
> > static pool p; \
> > __builtin_constant_p (s) \
> > ? alloc_from_pool (&p) \
> > : malloc (s); \
> >})
>
> Seems to be missing some bits.
>
It is an example; its purpose is to show an idea, not to be complete.
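
To make the idea concrete, here is a minimal sketch of such a macro with the
missing bits filled in. Everything beyond the shape of the quoted macro is an
assumption made for illustration: the pool layout, the slot sizes, the extra
size argument to alloc_from_pool, and the name pool_malloc (used instead of
redefining malloc). It relies on GNU C statement expressions and
__builtin_constant_p, and it never returns pool slots; the custom free
handling is not shown.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-size pool: a static arena of equally sized slots. */
enum { POOL_SLOTS = 64, POOL_SLOT_BYTES = 128 };

typedef struct pool {
  _Alignas(max_align_t) unsigned char arena[POOL_SLOTS][POOL_SLOT_BYTES];
  size_t used;                      /* number of slots handed out so far */
} pool;

/* Hand out the next free slot; fall back to malloc when the request does
   not fit in a slot or the arena is exhausted.  Slots are never reclaimed
   in this sketch. */
static void *alloc_from_pool(pool *p, size_t size) {
  if (size <= POOL_SLOT_BYTES && p->used < POOL_SLOTS)
    return p->arena[p->used++];
  return malloc(size);
}

/* Each expansion gets its own static pool.  Requests whose size is a
   compile-time constant go to that pool; all others go to plain malloc. */
#define pool_malloc(s) ({                                             \
  static pool p_;                                                     \
  __builtin_constant_p(s) ? alloc_from_pool(&p_, (s)) : malloc(s);    \
})
```

A call such as pool_malloc(32) is served from the per-site arena, while
pool_malloc(n) with a run-time n behaves like malloc(n). Pointers that come
from the arena must not be passed to free in this sketch.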
> >How will you find small constant allocations with this in place?
>
> I won't. If your code is already optimized, the compiler has nothing
> left to do, that's fine. (not that I am convinced your optimization
> works that well)
>
What if it decreases the running time of all constant-size allocations by 6%?
Converting to stack allocation would eliminate the overhead entirely, but the
eliminated sites contributed only 5% of the runtime.
> >>I started on this because of an application that spends more than
> >>half of its time in malloc+free and where (almost) no allocation is
> >>larger than 100 bytes. Changing the code to not use malloc/free but
> >>other allocation strategies is very complicated because it would
> >>break abstraction layers. I used various workarounds that proved
> >>rather effective, but I would have loved for that to be unnecessary.
> >
> >See my memory pool, which uses custom free functionality: you only need
> >to change malloc, and free is handled automatically.
>
> Do you mean the incomplete macro above, or your STACK_ALLOC macro
> from the other post? (don't know how that one works either, "size"
> appears out of nowhere in STACK_FREE)
>
That is also an example where the actual logic could be supplied later; it
should be __stack_new instead of size.
I am not talking about stack conversion but about a memory pool; a
proof of concept is here:
https://www.sourceware.org/ml/libc-alpha/2013-11/msg00258.html
> As I already said, I know how to write efficient code, but that's
> hard on the abstraction layers (before inlining, you have to go at
> least 20 layers up in the CFG to find a common ancestor for malloc
> and free), and I'd be happy if the compiler could help a bit in easy
> cases.
>
This is more about using libraries for allocation that are flexible
enough.
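
For reference, the "easy case" under discussion, where a small constant-size
malloc and its matching free are visible in one function, can be sketched as
below. This is a hand-written before/after illustration of the kind of rewrite
the RFC is about, not the output of any GCC pass; the function names and the
buffer size are hypothetical.

```c
#include <stdlib.h>

enum { N = 16 };   /* small, compile-time constant element count */

/* Before: the buffer lives on the heap and is freed in the same function. */
static int sum_squares_heap(int k) {
  int *buf = malloc(N * sizeof *buf);
  if (!buf)
    return -1;                      /* allocation failure path */
  int sum = 0;
  for (int i = 0; i < N; i++)
    buf[i] = (i + k) * (i + k);
  for (int i = 0; i < N; i++)
    sum += buf[i];
  free(buf);
  return sum;
}

/* After: the same computation with the buffer as a declaration on the
   stack.  The malloc/free pair and the failure path disappear. */
static int sum_squares_stack(int k) {
  int buf[N];
  int sum = 0;
  for (int i = 0; i < N; i++)
    buf[i] = (i + k) * (i + k);
  for (int i = 0; i < N; i++)
    sum += buf[i];
  return sum;
}
```

The two versions return the same value whenever malloc succeeds; they differ
only in where the buffer lives and in the failure path.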
> >Then there are parts where coordination is necessary; one is determining
> >whether stack allocation is possible. A possible way would be to first
> >turn eligible malloc calls into
> >
> >malloc_stack(size, color)
> >
> >as a hint to the allocator. I added a color parameter to handle partial
> >overlap: if you build a graph with an edge wherever two allocations
> >partially overlap and color it, then you can assign a stack to each color
> >class and proceed as normal.
>
> That would be great, yes. I'll be looking forward to your patches.
>
> (note that the limits of alias analysis mean that gcc often has no
> idea which free goes with which malloc).
>
Wait, you have a free with the same SSA_NAME as the malloc, and alias analysis
cannot tell which malloc corresponds to that free?
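
Coming back to the color parameter of malloc_stack quoted earlier, a minimal
sketch of the coloring step could look like this. It assumes each allocation
is described by a half-open lifetime interval [start, end), treats two
allocations as conflicting when their lifetimes intersect without one
containing the other (so they cannot share a LIFO stack), and uses a simple
greedy coloring. The type and function names are hypothetical, and greedy
coloring is only one possible choice.

```c
#include <stddef.h>

/* Lifetime of one allocation as a half-open interval [start, end). */
typedef struct { int start, end; } lifetime;

/* Two lifetimes partially overlap when they intersect but neither one
   contains the other.  Nested or disjoint lifetimes can share a stack. */
static int partially_overlap(lifetime a, lifetime b) {
  int intersect = a.start < b.end && b.start < a.end;
  int a_in_b = b.start <= a.start && a.end <= b.end;
  int b_in_a = a.start <= b.start && b.end <= a.end;
  return intersect && !a_in_b && !b_in_a;
}

/* Greedy coloring: each allocation gets the smallest color not used by an
   earlier allocation it conflicts with.  color[i] is the stack index for
   allocation i; the return value is the number of stacks needed. */
static int color_allocations(const lifetime *l, size_t n, int *color) {
  int ncolors = 0;
  for (size_t i = 0; i < n; i++) {
    int c = 0;
    for (;;) {
      int clash = 0;
      for (size_t j = 0; j < i; j++)
        if (color[j] == c && partially_overlap(l[i], l[j])) {
          clash = 1;
          break;
        }
      if (!clash)
        break;
      c++;
    }
    color[i] = c;
    if (c + 1 > ncolors)
      ncolors = c + 1;
  }
  return ncolors;
}
```

Each color then names one stack, and within a color class all lifetimes are
nested or disjoint, so allocations in that class can be pushed and popped in
LIFO order.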
> --
> Marc Glisse
--
We've picked COBOL as the language of choice.