Bug 64206 - fake.so is unlinked too early for some users
Summary: fake.so is unlinked too early for some users
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: jit (show other bugs)
Version: 5.0
: P3 normal
Target Milestone: ---
Assignee: David Malcolm
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-12-06 00:07 UTC by Ulrich Drepper
Modified: 2015-04-13 19:56 UTC (History)
0 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2014-12-06 00:00:00


Attachments
Work-in-progress patch to fix this (5.17 KB, patch)
2014-12-08 16:09 UTC, David Malcolm
Details | Diff
Updated patch (1.93 KB, patch)
2014-12-09 20:05 UTC, David Malcolm
Details | Diff
Little hello world (729 bytes, text/x-c++src)
2015-04-13 18:02 UTC, Ulrich Drepper
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Ulrich Drepper 2014-12-06 00:07:48 UTC
Some users of the DSO created by the JIT (probably mostly debuggers) might have a hard time getting to the file before it gets unlinked.  For some gdb versions, for instance, this is fatal.  Try gdb 7.8.1, for instance, see

https://bugzilla.redhat.com/show_bug.cgi?id=1170861

That is certainly gdb's fault but it's an example about the type of problems that can appear.

It certainly is useful to unlink the file as quickly as possible so that in case of a problem crash nothing is left behind.  But there at least should be the possibility to prevent the early unlink.  Dave suggested to tie this to the enabling of debuginfo generation in libgccjit.  I'm actually not entirely sure that's the best possibility since even without debuginfo the debugger can use the ELF symbols to place breakpoints etc.  Maybe a boolean option?

As a solution it should be quite easy to transfer ownership of the file and directory from playback::context to result.
Comment 1 David Malcolm 2014-12-06 01:44:03 UTC
(In reply to drepper.fsp+rhbz@gmail.com from comment #0)
> Some users of the DSO created by the JIT (probably mostly debuggers) might
> have a hard time getting to the file before it gets unlinked.  For some gdb
> versions, for instance, this is fatal.  Try gdb 7.8.1, for instance, see
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1170861
> 
> That is certainly gdb's fault but it's an example about the type of problems
> that can appear.

For reference, this works for me with:
  gdb-7.6.50.20130731-19.fc20.x86_64

> It certainly is useful to unlink the file as quickly as possible so that in
> case of a problem crash nothing is left behind.  But there at least should
> be the possibility to prevent the early unlink.  Dave suggested to tie this
> to the enabling of debuginfo generation in libgccjit.  I'm actually not
> entirely sure that's the best possibility since even without debuginfo the
> debugger can use the ELF symbols to place breakpoints etc.  Maybe a boolean
> option?
> 
> As a solution it should be quite easy to transfer ownership of the file and
> directory from playback::context to result.

I'm working on a patch that supports delaying the cleanup of the tempdir, with the cleanup normally occurring at the end of the compile, after the .so has been dlopened, but deferring it to when gcc_jit_result_release occurs if the user set GCC_JIT_BOOL_OPTION_DEBUGINFO.

FWIW I think of this flag as a
  "make debugging work please, I'm willing to take a performance hit"
request by the user, rather than specifically for the generation of debuginfo.

I want to avoid a proliferation of lots of options that need to be enabled, since this makes the library more difficult to use.  Obviously a tradeoff though.  (Maybe that flag should be renamed?)
Comment 2 David Malcolm 2014-12-08 16:09:45 UTC
Created attachment 34219 [details]
Work-in-progress patch to fix this

The attached patch implements the ideas we talked about (needs some comments and a ChangeLog).

Uli: does this fix the issue for you?
Comment 3 David Malcolm 2014-12-09 20:00:39 UTC
Author: dmalcolm
Date: Tue Dec  9 20:00:07 2014
New Revision: 218533

URL: https://gcc.gnu.org/viewcvs?rev=218533&root=gcc&view=rev
Log:
Add jit-tempdir.{c|h}

gcc/jit/ChangeLog:
        PR jit/64206
	* Make-lang.in (jit_OBJS): Add jit/jit-tempdir.o.
	* jit-common.h (gcc::jit::tempdir): New forward decl.
	* jit-playback.c: Include jit-tempdir.h.
	(gcc::jit::playback::context::context): Initialize m_tempdir.
	(gcc::jit::playback::context::~context): Move tempdir
	cleanup to new file jit-tempdir.c
	(make_tempdir_path_template): Move to new file jit-tempdir.c.
	(gcc::jit::playback::context::compile): Move tempdir creation
	to new tempdir object in new file jit-tempdir.c.
	(gcc::jit::playback::context::make_fake_args): Get path from
	tempdir object rather than from member data.
	(gcc::jit::playback::context::convert_to_dso): Likewise.
	(gcc::jit::playback::context::dlopen_built_dso): Likewise.
	(gcc::jit::playback::context::dump_generated_code): Likewise.
	(gcc::jit::playback::context::get_path_c_file): New function.
	(gcc::jit::playback::context::get_path_s_file): New function.
	(gcc::jit::playback::context::get_path_so_file): New function.
	* jit-playback.h (gcc::jit::playback::context::get_path_c_file):
	New function.
	(gcc::jit::playback::context::get_path_s_file): New function.
	(gcc::jit::playback::context::get_path_so_file): New function.
	(gcc::jit::playback::context): Move fields "m_path_template",
	"m_path_tempdir", "m_path_c_file", "m_path_s_file",
	"m_path_so_file" to new jit::tempdir class; add field "m_tempdir".
	* jit-tempdir.c: New file.
	* jit-tempdir.h: New file.


Added:
    trunk/gcc/jit/jit-tempdir.c
    trunk/gcc/jit/jit-tempdir.h
Modified:
    trunk/gcc/jit/ChangeLog
    trunk/gcc/jit/Make-lang.in
    trunk/gcc/jit/jit-common.h
    trunk/gcc/jit/jit-playback.c
    trunk/gcc/jit/jit-playback.h
Comment 4 David Malcolm 2014-12-09 20:05:36 UTC
Created attachment 34233 [details]
Updated patch
Comment 5 David Malcolm 2014-12-09 20:08:06 UTC
(In reply to David Malcolm from comment #4)
> Created attachment 34233 [details]
> Updated patch

I've introduced a gcc::jit::tempdir class in r218533 (which appears to be a reasonable simplification in itself)

Attachment 34233 [details] implements the idea of using this tempdir class to defer cleanup until later.

Does this fix the symptoms you're seeing?
Comment 6 David Malcolm 2015-01-09 17:01:37 UTC
Author: dmalcolm
Date: Fri Jan  9 17:01:04 2015
New Revision: 219395

URL: https://gcc.gnu.org/viewcvs?rev=219395&root=gcc&view=rev
Log:
PR jit/64206: delay cleanup of tempdir if the user has requested debuginfo

gcc/jit/ChangeLog:
	PR jit/64206
	* docs/internals/test-hello-world.exe.log.txt: Update, the log now
	shows tempdir creation/cleanup.
	* docs/_build/texinfo/libgccjit.texi: Regenerate.
	* jit-logging.h (class gcc::jit::log_user): Add gcc::jit::tempdir
	to the list of subclasses in the comment.
	* jit-playback.c (gcc::jit::playback::context::context): Add a
	comment clarifying when the tempdir gets cleaned up.
	(gcc::jit::playback::context::compile): Pass the context's logger,
	if any, to the tempdir.
	(gcc::jit::playback::context::dlopen_built_dso): When creating the
	gcc::jit::result, if GCC_JIT_BOOL_OPTION_DEBUGINFO is set, hand
	over ownership of the tempdir to it.
	* jit-result.c: Include "jit-tempdir.h".
	(gcc::jit::result::result): Add tempdir param, saving it as
	m_tempdir.
	(gcc::jit::result::~result): Delete m_tempdir.
	* jit-result.h (gcc::jit::result::result): Add tempdir param.
	(gcc::jit::result::m_tempdir): New field.
	* jit-tempdir.c (gcc::jit::tempdir::tempdir): Add logger param;
	add JIT_LOG_SCOPE.
	(gcc::jit::tempdir::create): Add JIT_LOG_SCOPE to log entry/exit,
	and log m_path_template and m_path_tempdir.
	(gcc::jit::tempdir::~tempdir): Add JIT_LOG_SCOPE to log
	entry/exit, and log the unlink and rmdir calls.
	* jit-tempdir.h: Include "jit-logging.h".
	(class gcc::jit::tempdir): Make this be a subclass of log_user.
	(gcc::jit::tempdir::tempdir): Add logger param.
	* notes.txt: Update to show the two possible places where the
	tempdir can be cleaned up.


Modified:
    trunk/gcc/jit/ChangeLog
    trunk/gcc/jit/docs/_build/texinfo/libgccjit.texi
    trunk/gcc/jit/docs/internals/test-hello-world.exe.log.txt
    trunk/gcc/jit/jit-logging.h
    trunk/gcc/jit/jit-playback.c
    trunk/gcc/jit/jit-result.c
    trunk/gcc/jit/jit-result.h
    trunk/gcc/jit/jit-tempdir.c
    trunk/gcc/jit/jit-tempdir.h
    trunk/gcc/jit/notes.txt
Comment 7 David Malcolm 2015-01-09 17:07:14 UTC
I ran into this issue myself.  I've committed the workaround as of r219395.
Comment 8 Ulrich Drepper 2015-04-13 15:03:12 UTC
(In reply to David Malcolm from comment #5)
> Does this fix the symptoms you're seeing?

Sorry for the delay, I'm terribly behind.

I just tried it and I don't see any improvement.  This is on Fedora 21 with a mainline gcc.  By the call to the loaded function is made the entire directory the jit uses is gone.  This is before the call to gcc_jit_results_release.  Since I don't have a fixed gdb (or more correct: BFD) I still see the gdb crash.

I haven't looked at the logic of your patch.  From my perspective the right solution is still to enable, on request, to delay removing all the files until the call to gcc_jit_result_release.  There is already this interface available, let's use it for one more thing.  For production runs we probably want the current behavior.

I don't think you need any code to reproduce, I see the problem with a trivial hello world like the one below.  Just put a breakpoint on line 55 (the call to some_fn) and step into the function.  I immediately get

Can't read data for section '.eh_frame' in file '/tmp/libgccjit-a07Nh7/fake.so'

and upon issuing p $pc gdb will crash (gdb 7.8.2-38.fc21).
Comment 9 Ulrich Drepper 2015-04-13 18:02:21 UTC
Created attachment 35307 [details]
Little hello world

Probably copied from the documentation, nothing special.
Comment 10 David Malcolm 2015-04-13 18:42:22 UTC
(In reply to drepper.fsp+rhbz@gmail.com from comment #9)
> Created attachment 35307 [details]
> Little hello world
> 
> Probably copied from the documentation, nothing special.

You need to set GCC_JIT_BOOL_OPTION_DEBUGINFO on the context for the generated code to be debuggable:
https://gcc.gnu.org/onlinedocs/jit/topics/contexts.html#GCC_JIT_BOOL_OPTION_DEBUGINFO

You need something
  ctxt.set_bool_option (GCC_JIT_BOOL_OPTION_DEBUGINFO, 1);

This leads to:
  (1) adding -g to the command-line options of the notional compile, so that fake.so includes DWARF data
  (2) delaying cleanup of fake.so until after the gcc_jit_result is released
Comment 11 David Malcolm 2015-04-13 18:42:53 UTC
(In reply to David Malcolm from comment #10)
> You need something
"something like", I meant to say
Comment 12 Ulrich Drepper 2015-04-13 19:56:39 UTC
(In reply to David Malcolm from comment #11)
> (In reply to David Malcolm from comment #10)
> > You need something
> "something like", I meant to say

Sorry, David, I missed that.  At least nothing crashes with this extension.