This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 04/08] PR jit/63854: Remove xstrdup from ipa/cgraph fprintf calls


On Wed, 2014-11-26 at 09:13 +0100, Uros Bizjak wrote:
> Hello!
> 
> > cgraph*.c and ipa-*.c use xstrdup on strings when dumping them via
> > fprintf, leaking all of the duplicated buffers.
> >
> > Is/was there a reason for doing this?
> 
> Yes, please see [1] and PR 53136 [2]. As said in [1]:
> 
> "There is a problem with multiple calls of cgraph_node_name in fprintf
> dumps. Please note that C++ uses caching in
> cxx_printable_name_internal (aka LANG_HOOKS_DECL_PRINTABLE_NAME), so
> when cxx_printable_name_internal is called multiple times from printf
> (i.e. fprintf "%s/%i -> %s/%i"), it can happen that the first string
> gets evicted by the second call, before fprintf is fully evaluated."
> 
> > Taking them out fixes these leaks (seen when dumping is enabled):
> 
> But you will get "Invalid read of size X" instead.
> 
> The patch at [1] fixed these, but introduced memory leaks, which were
> tolerable at the time:
> 
> "I think that small memory leak is tolerable here (the changes are
> exclusively in the dump code), and follows the same approach as in
> java frontend."
> 
> It seems that these assumptions are not valid anymore.
> 
> [1] https://gcc.gnu.org/ml/gcc-patches/2012-04/msg01904.html
> [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53136

Aha!   Thanks.

I was only running under valgrind when testing the jit code, and the jit
code looks like a frontend to the rest of GCC, it doesn't provide an
implementation of LANG_HOOKS_DECL_PRINTABLE_NAME.  I ran the rest of the
gcc testsuite outside of gcc.  So perhaps the duplications of the
transient decl strings is still needed, and my patch would undo that
fix.  Oops.

I'm about to disappear for a US holiday, but maybe a good solution here
is to invent something like a "xstrdup_for_dump" function like this:

  /* When using fprintf (or similar), problems can arise with
     transient generated strings.  Many string-generation APIs
     only support one result being alive at once (e.g. by
     returning a pointer to a statically-allocated buffer).

     If there is more than one generated string within one
     fprintf call: the first string gets evicted or overwritten
     by the second, before fprintf is fully evaluated.
     See e.g. PR/53136.

     This function provides a workaround for this, by providing
     a simple way to create copies of these transient strings,
     without the need to have explicit cleanup:

         fprintf (dumpfile, "string 1: %s string 2:%s\n",
                  xstrdup_for_dump (EXPR_1),
                  xstrdup_for_dump (EXPR_2));
     
     The copied string is eventually freed, from code within
     toplev::finalize.  */

  extern const char *
  xstrdup_for_dump (const char *transient_str);

We could then use this instead of xstrdup at these callsites, which
would document what's going on.

(I'm not in love with the name, but hopefully the idea is clear).

This could be implemented either:

(A) in terms of the "long-term allocator" idea here:
  https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02420.html
which uses a obstack on the gcc::context, eventually freeing the
allocations from toplev::finalize (although that's currently only called
from libgccjit.so, but we could call it from cc1 etc when running under
valgrind).

(B) or in terms of ggc_xstrdup, with the assumption that no GC can occur
in anything within a fprintf call.

That would document what's going on, ensure that we don't have
use-after-free issues, and stop memory leak warnings from valgrind
concerning these duplicates.

Thoughts?
Dave


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]