The following function really should be compiled to an empty function (DSE should first remove the store and then free of a malloc with no change inbetween and we should remove both calls). void *malloc(__SIZE_TYPE__); void free(void*); int f(void) { char *i = malloc(1); *i = 1; free (i); } This is something which is useful when we want to much higer level optimizations.
Subject: Re: New: Missing DSE/malloc/free optimization > void *malloc(__SIZE_TYPE__); > void free(void*); > int f(void) > { > char *i = malloc(1); > *i = 1; > free (i); > } > > This is something which is useful when we want to much higer level optimizations. This would require an attribute free, which is on the tree-profiling branch Otherwise, you wouldn't know that free is really "free"
I have a patch for the DSE part. The malloc part shouldn't be too difficult. The free part is more interesting. Basically the pointer use in free (p) should mark the malloc necessary but really nothing is there to mark the free necessary if it is (the free is necessary iff the malloc is necessary but nothing provides this link to provide easy integration with the DCE worklist implementation). Maybe a two-stage implementation would work where we replace an unnecessary malloc () with a NULL assignment (so the free becomes a no-op and is eventually folded away after CCP). A little bit hackish - and dependent on free (NULL) (if it happens to persist) to not fault.
Hm, it's not _that_ simple. A fab hack is easier I guess.
And we want to optimize void *malloc(__SIZE_TYPE__); void free(void*); void abort(void); int f(void) { char *i = malloc(1); if (i == (void *)0) abort (); *i = 1; free (i); } the same (removing the check and the abort() call). Which asks for doing the task all at once. Somewhere.
Subject: Bug 19831 Author: rguenth Date: Wed Jul 1 12:27:33 2009 New Revision: 149140 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=149140 Log: 2009-07-01 Richard Guenther <rguenther@suse.de> PR tree-optimization/19831 * tree-ssa-dce.c (propagate_necessity): Calls to functions that only act as barriers do not make any previous stores necessary. * tree-ssa-structalias.c (handle_lhs_call): Delay making HEAP variables global, do not add a constraint from nonlocal. (find_func_aliases): Handle escapes through return statements. (compute_points_to_sets): Make escaped HEAP variables global. * gcc.dg/tree-ssa/20041122-1.c: Enable TBAA, scan FRE dump, make allocated memory escape. Un-XFAIL. * gcc.dg/vect/pr21591.c: Make allocated memory escape. * gcc.dg/vect/pr31699.c: Likewise. * gcc.dg/tree-ssa/ssa-dce-7.c: New testcase. libmudflap/ * testsuite/libmudflap.c/fail11-frag.c: Make allocated memory escape. * testsuite/libmudflap.c/fail12-frag.c: Likewise. * testsuite/libmudflap.c/fail16-frag.c: Likewise. * testsuite/libmudflap.c/fail31-frag.c: Likewise. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-dce-7.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.dg/tree-ssa/20041122-1.c trunk/gcc/testsuite/gcc.dg/vect/pr21591.c trunk/gcc/testsuite/gcc.dg/vect/pr31699.c trunk/gcc/tree-ssa-dce.c trunk/gcc/tree-ssa-structalias.c trunk/libmudflap/ChangeLog trunk/libmudflap/testsuite/libmudflap.c/fail11-frag.c trunk/libmudflap/testsuite/libmudflap.c/fail12-frag.c trunk/libmudflap/testsuite/libmudflap.c/fail16-frag.c trunk/libmudflap/testsuite/libmudflap.c/fail31-frag.c
I am not working on the other piece.
I'll give it another try somewhen. Don't mark the free as necessary, nor its operand. If a malloc ends up not being marked as necessary and all uses of the lhs are free calls remove the pairs, otherwise do nothing in eliminate_unnecessary_stmts.
Hm, and for invalid code like int main() { int *p = __builtin_malloc (sizeof (int) * 4); *p++ = 4; *p++ = 8; __builtin_free (p); return 0; } we shouldn't ICE either ... but here the idea wouldn't work (malloc is not marked necessary but free doesn't match up with it either).
Author: rguenth Date: Thu Sep 8 09:21:39 2011 New Revision: 178683 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=178683 Log: 2011-09-08 Richard Guenther <rguenther@suse.de> PR tree-optimization/19831 * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Do not mark allocation functions as necessary. * gcc.dg/tree-ssa/ssa-dce-8.c: New testcase. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-dce-8.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-dce.c
Author: rguenth Date: Thu Sep 8 13:00:23 2011 New Revision: 178687 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=178687 Log: 2011-09-08 Richard Guenther <rguenther@suse.de> PR tree-optimization/19831 * tree-ssa-dce.c (mark_all_reaching_defs_necessary_1): Also skip builtins with vdefs that do not really store something. (propagate_necessity): For calls to free that we can associate with an allocation function do not mark the freed pointer definition necessary. (eliminate_unnecessary_stmts): Remove a call to free if the associated call to an allocation function is not necessary. * gcc.dg/tree-ssa/pr19831-1.c: New testcase. * gcc.dg/tree-ssa/pr19831-2.c: Likewise. * gcc.dg/tree-ssa/pr19831-3.c: Likewise. * gcc.dg/errno-1.c: Adjust. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/pr19831-1.c trunk/gcc/testsuite/gcc.dg/tree-ssa/pr19831-2.c trunk/gcc/testsuite/gcc.dg/tree-ssa/pr19831-3.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.dg/errno-1.c trunk/gcc/tree-ssa-dce.c
More cases handled now. What we still miss is conditional free, see gcc.dg/tree-ssa/pr19831-3.c for XFAILed testcases.
but you need some
*** Bug 53082 has been marked as a duplicate of this bug. ***
Discussing PR 53082 here since it was closed as a dup. It asks for 2 things: 1) Replacing a=malloc(n);b=malloc(m); ... free(a);free(b); with a single malloc-free pair: a=malloc(n+m);b=a+n; ... free(a);. 2) Replacing some malloc calls with alloca. I am wondering about the circumstances where those things are safe to do. The second one can only be safe if the size is a constant or VRP tells us it is at most a small constant. And then for both, nothing too weird must happen in between, but I am not sure what bad things are possible exactly. The "easy" case would be with the 2 malloc calls consecutive (what can't we put in between?), the 2 free calls consecutive, everything in a single basic block (this sounds way too restrictive), the argument of free an SSA_NAME defined with the malloc call, nothing that may throw (or any other way we can avoid coming back to execute free) in "...". Then I have a hard time finding bad cases (except that on error, malloc returns 0, and we put 0+n in b, maybe it requires a -fmalloc-cannot-fail). Are there other places in the compiler with a similar set of checks? And then we would also need to extend everything to new/delete :-( (unless maybe we install an LTO version of libstdc++ and, with -fwhole-program (taken as a promise not to LD_PRELOAD another new/delete), gcc accepts to inline new/delete and sees malloc and free, but that seems a long shot...)
See the discussion that starts at: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130923/188747.html (it continues in the following weeks) for optimization 2 of comment #14 in clang. The key idea is that if, in the function where malloc is called, we can check that on all paths (including EH) to the exit of the function free is called on the return value of malloc, then we can replace malloc and all the corresponding free with a stack allocation (whether we want to, depending on the size, is a different issue). Of course there are some details like: if we test whether malloc returned 0, that's one branch where we don't need free, etc. Also, the optimization can make a crash happen much earlier: p=malloc(n);fun(p);free(p); where fun calls free, would normally fail in the second free (double-free) and will instead fail in fun (free of a pointer to the stack). I think that's fine, but people may be surprised if there are a lot of side-effects (I/O for instance) between the 2 free. The hardest part here is deciding whether we always reach a free (this is how we prove that there is no hidden call to realloc/free). There could be an infinite loop in between. Any function we call might itself have an infinite loop or call exit. And worst of all, there might be a call to longjmp. That seems to make it impossible to perform the optimization when there is a function call, in all but the most contrived cases (a call to setjmp between malloc and the function call, or in C++ if the EH landing pad for the call contains a non-inlined call to a non-trivial destructor). So to make this possible, the first thing to do seems to forbid longjmp: -fno-longjmp. But even then it is still hard to know if regular functions always return (infinite loops, exit). There are *LOOPING_CONST_OR_PURE* flags in gcc (and a hardcoded list of builtins in call_may_noreturn_p), we could generalize them to a may_not_return / always_returns flag (and associated attribute), that gcc could recursively infer on many functions (a similar no_longjmp flag could also be added to avoid the need for -fno-longjmp). I think this is the most promising direction. Assuming that we never do the things that would break the optimization (call realloc or free in a function that won't return) seems too complicated to explain properly, so I wouldn't even propose a -fI-promise-it-is-ok.
Some more transformations for the list: p = malloc (n); memcpy (p, q, m); free (q); ==> p = realloc (q, n); it isn't equivalent, in particular it could be slower if m is much smaller than what q points to, but I think it should generally be safe and profitable. Doing it without the memcpy is more questionable. --- And in case we could actually see where q was coming from above: q = malloc (n); ... p = realloc (q, n); // --> just p = q; Maybe I shouldn't list more without implementing at least one first... --- A detail I had forgotten for the heap to stack (malloc+free -> alloca) transformation is that the stack deallocation (which we clearly want, especially if the allocation is in a loop) can't be inserted exactly where free was. Otherwise, for: p = malloc (m); q = malloc (n); free (p); free (q); // deallocation not in reverse order! bad things could happen. Besides, there could already be VLAs in the function, and even without doing stack deallocation on free there are already strange interactions if we call alloca from inside the scope of a VLA.
void f (double * __restrict a) { int * __restrict p = (int*) __builtin_malloc (sizeof (int)); *p = 42; __builtin_free (p); ++*a; // Breaks the optimization! } Normally, gcc now manages to remove unused malloc+write+free sequences. However, it seems that completely unrelated operations, even after free, can prevent this optimization (it is the detection that *p=42 is dead that fails). That limits the effect in real-world code... (funny that if I remove the line with free(p), gcc does manage to optimize)
On Fri, 25 Oct 2013, glisse at gcc dot gnu.org wrote: > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19831 > > --- Comment #17 from Marc Glisse <glisse at gcc dot gnu.org> --- > void f (double * __restrict a) { > int * __restrict p = (int*) __builtin_malloc (sizeof (int)); > *p = 42; > __builtin_free (p); > ++*a; // Breaks the optimization! > } > > Normally, gcc now manages to remove unused malloc+write+free sequences. > However, it seems that completely unrelated operations, even after free, can > prevent this optimization (it is the detection that *p=42 is dead that fails). > That limits the effect in real-world code... > > (funny that if I remove the line with free(p), gcc does manage to optimize) That's just DCE doing more than it is designed to do in some very rare corner cases. Look why DSE doesn't do its job instead.
Author: glisse Date: Tue Oct 29 13:19:08 2013 New Revision: 204160 URL: http://gcc.gnu.org/viewcvs?rev=204160&root=gcc&view=rev Log: 2013-10-29 Marc Glisse <marc.glisse@inria.fr> PR tree-optimization/19831 gcc/ * tree-ssa-alias.c (stmt_kills_ref_p_1): Handle BUILT_IN_FREE. gcc/testsuite/ * gcc.dg/tree-ssa/alias-25.c: New file. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/alias-25.c (with props) Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-alias.c Propchange: trunk/gcc/testsuite/gcc.dg/tree-ssa/alias-25.c ('svn:eol-style' added) Propchange: trunk/gcc/testsuite/gcc.dg/tree-ssa/alias-25.c ('svn:keywords' added)
The original cases are all fixed but what remains is us failing to elide void f () { void *p = __builtin_malloc (1); if (!p) __builtin_abort (); __builtin_free (p); } if that's even desirable. Note that we mark the abort() call as necessary since it has side-effects which is contrary to this expectation. If you think of, say, extern int alloc_fails; extern int alloc_success; void f () { void *p = __builtin_malloc (1); if (!p) alloc_fails++; else alloc_success++; __builtin_free (p); } then the question is really whether we may assume that malloc does not return NULL. We can then avoid marking 'p' necessary on NULL checks on malloc returned pointers and upon DCEing the malloc/free pair replace the def of 'p' with a non-zero constant. Related would be to play similar things with realloc - replace void *p = realloc (q, n); free (p); with free (q); and void *p = realloc (q, n); if (p == q) not_copied++; free (p); with void *p = q; if (p == q) not_copied++; free(q); so here we'd not only have to deal with != NULL but other pointer equality checks.
Hello! You can examine a list of the required documents here in one file: xpawel.com/miss-edythe-mccullough/gcc-bugzilla-12.zip -----Original Message----- On Thursday, 25 March 2021, 15:07, <gcc-bugzilla@gcc.gnu.org> wrote: > Hello! > > You can examine a list of the required documents here in one file: > > xpawel.com/miss-edythe-mccullough/gcc-bugzilla-12.zip