Bug 19823 - java fails with non-executable memory
Summary: java fails with non-executable memory
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: libgcj (show other bugs)
Version: 4.0.0
: P2 normal
Target Milestone: 3.4.4
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-02-08 14:37 UTC by Michael Matz
Modified: 2005-02-22 19:02 UTC (History)
5 users (show)

See Also:
Host: i686-linux-gnu
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2005-02-10 06:11:14


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Matz 2005-02-08 14:37:18 UTC
Newer linux kernels (2.6.11 in this case) check the executable for a 
PT_GNU_STACK program header, and if it exists default to provide 
non-executable memory (for stack _and_ malloced memory) on CPUs which 
support this (all x86-64 CPUs and newer x86 ones). 
 
It seems that gij is not prepared to handle this.  This can be seen 
in some testresults, e.g. here: 
http://gcc.gnu.org/ml/gcc-testresults/2005-02/msg00223.html 
 
This is the autovect branch, but it's the same on all GCC versions (3.3 
through 4.0).  This is with such a new kernel, with an older kernel which 
didn't do this yet all those testcases work. 
 
Currently Andi Kleen proposed again to switch this off on the kernel side 
as a hot fix, because too much software currently breaks.  But somewhen it 
will be activated for sure, and then GCC should be able to cope with this. 
 
My guess is, that there only are missing some mprotect calls at the right 
places.
Comment 1 Andrew Pinski 2005-02-08 14:41:41 UTC
I think this is a libffi problem in that it creates a closure but does not make it exutable.
Comment 2 Andrew Haley 2005-02-08 19:25:41 UTC
I don't think this is a libffi problem.  gcj allocates trampolines on the heap,
not the stack.

I think this is a multilib problem.
Comment 3 Bryce McKinlay 2005-02-09 04:11:06 UTC
Note that TestProxy is failing too, which does seem to indicate a problem with
ffi closures.

boehm-gc does have code to call mprotect() on its heap (see PROTECT/UNPROTECT in
os_dep.c), but it isn't used because we don't build the GC with MPROTECT_VDB.

It sounds like, if the heap is now not exec-enabled by default, that the GC will
have to unconditionally call mprotect() in order to set PROT_EXEC when
initializing its heap?

cc'ing Hans for his comments.
Comment 4 Hans Boehm 2005-02-09 05:38:24 UTC
I believe that the GC alters permissions on the heap only if either
- It is running in incremental mode, or
- It is built with USE_MMAP, and thus uses mmap to allocate the heap.
I think we talked about always doing the latter, but we don't.  Thus I think the
heap is allocated with sbrk.  If the kernel no longer gives execute permission
for sbrk memory, then that would propagate through to the Java heap.  That would
presunably break libffi.  It should be possible to confirm a lot of this with
strace.

I think the collector is configured to disable execute permission whenever it
resets permissions.  Thus if we used mmap, libffi would presumably break
consistently everywhere it tries to put trampolines in the heap.  (It doesn't
need trampolines on all platforms.)

If all of this is correct, then building the collector with USE_MMAP, and
telling it to enable execute permission, might be a work-around.
Comment 5 Andrew Haley 2005-02-09 10:47:48 UTC
I changed the gc settings to enable USE_MMAP on Linux.
I had to do this because at least one Linux kernel didn't give exec permission
on the heap.  That change did work at the time.

2004-01-20  Andrew Haley  <aph@redhat.com>

	* include/private/gcconfig.h (USE_MMAP): Define for all Linux.
Comment 6 Andreas Jaeger 2005-02-09 18:27:13 UTC
I looked at my build dir (4.0 CVS) and everything looks ok.  I guess 
I should run some strace - could somebody tell me how to invoke gij 
from the commandline in a simple case so that I can check this? 
Comment 7 Tom Tromey 2005-02-09 20:09:39 UTC
I'm not so sure that the patch to always define USE_MMAP really
has that effect.  By my reading this patch is revision 1.36, but
if you diff against 1.35 and then go look at include/private/gcconfig.h,
it looks like this addition takes place in the middle of a big
"#ifdef M68K".

Am I misreading this?  I have no explanation of why things work on
my system.
Comment 8 Hans Boehm 2005-02-10 00:33:10 UTC
I agree with Tom's interpretation.  The define for USE_MMAP only affects 
M68K/LINUX.  Confirmed with strace on IA/64.

I'd prefer to see the USE_MMAP definition in a gcj-specific configuration 
file.  I'd like to move to using sbrk and then mmap on Linux.  If we do that, 
presumably we want USE_MMAP to still mean consistent use of mmap.  And we 
presumably then want to do that only on platforms on which libffi requires it.
Comment 9 Andreas Jaeger 2005-02-10 06:11:12 UTC
I agree, the USE_MMAP is not global.  I'm trying now the following patch to see 
whether we're really on the right track: 
============================================================ 
Index: boehm-gc/include/private/gcconfig.h 
--- boehm-gc/include/private/gcconfig.h 13 Oct 2004 10:34:21 -0000      1.39 
+++ boehm-gc/include/private/gcconfig.h 10 Feb 2005 06:08:42 -0000 
@@ -1888,6 +1888,10 @@ 
 #   endif 
 # endif 
 
+#if defined(LINUX) 
+# define USE_MMAP 
+#endif 
+ 
 #if defined(LINUX) && defined(USE_MMAP) 
     /* The kernel may do a somewhat better job merging mappings etc.   */ 
     /* with anonymous mappings.                                               
*/ 
 
If you have a patch for sbrk usage, I can try that one also. 
Comment 10 Andreas Jaeger 2005-02-10 09:22:54 UTC
 
With my patch, the results look good again (this is on x86-64 with multilibs): 
 
               === libjava Summary for unix === 
 
# of expected passes            3726 
# of expected failures          14 
# of untested testcases         16 
 
So, we are on the right track. 
Comment 11 Andrew Haley 2005-02-10 10:58:42 UTC
It looks like the patch was applied to the wrong place in the file: it certainly
was my intention to apply it to all Linux.  And indeed, my testing was not on
m68k, but on x86-64.

The obvious question is whether gc allocated memory should have exec permission
by default.



Comment 12 Andrew Haley 2005-02-14 14:56:13 UTC
We need to make this change soon.  I'd like to do something that Hans would
approve of, but I don't know exactly what that might be.

Andreas Jaeger says his patch works.  Unless someone comes up with something
better, we should go with that.
Comment 13 Jakub Jelinek 2005-02-14 15:07:21 UTC
Is it desirable to have all memory allocated by the GC executable though?
I think libgcj always knows in advance what memory will it need executable,
so wouldn't it be better to provide separate allocation entrypoint
(I think void *GC_malloc_atomic_executable (size_t) would be enough)
that would be giving executable memory and otherwise you'd get non-executable
one (if OS allows that)?
Comment 14 Andrew Haley 2005-02-14 15:18:26 UTC
This seems to me like creeping featurism.

We need to distinguish between fixing this bug in a simple way and adding "nice"
new properties that would require a change to the garbage collector's API.  At
this stage in gcc 4.0 development I don't think that is appropriate.
Comment 15 Hans Boehm 2005-02-14 19:38:25 UTC
I'd like to see USE_MMAP set in boehm-gc/configure.ac instead of gcconfig.h.  
There's already a comment there that we're not setting NO_EXECUTE_PERMISSION 
for this reason, where the upstream version does set it.  I also think we've 
been trying to keep gcconfig.h shared, with gcj-specific adjustments in 
configure.ac.

It seems to me that, with this minor adjustment, this is clearly the right 
short term fix.

I'm not sure that we should be turning on execute permission on all platforms, 
though.  I'm pretty sure it's unnecessary on IA64.  I think it should be 
unnecessary on all platforms on which function pointers point to an (ip, GOT) 
pair.  IIRC, that includes PowerPC?  At least IA64 does not put trampoline code 
in the closure.

Having said that, I'm not sure that having the Java heap be executable is a 
major security risk.

There can occasionally be a large performance down side to making the heap 
executable on some platforms, at least in the presence of incremental GC.  I 
know that some old Un*x versions ensured d-cache and i-cache coherency whenever 
you made an mprotect call with the execute bit set.  And this could take a 
substantial amount of time. 

Having the collector allocate both executable and non-executable memory sounds 
non-trivial.  Probably it would be easiest to add a new "kind" of pointer-free, 
executable memory, and have the GC adjust permissions when a new block was 
moced between kinds.  But that may not be cheap (see above).  And there are 
issues if the physical page size is larger than the heap block size.

The alternative is to essentially keep a separate heap, which would require a 
major code reorganization, and introduces a new and different kind of 
fragmentation.  I suspect it would be far easier to have the interpreter 
allocate closures outside the Java heap.
Comment 16 GCC Commits 2005-02-16 04:10:46 UTC
Subject: Bug 19823

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	bryce@gcc.gnu.org	2005-02-16 04:10:42

Modified files:
	boehm-gc       : ChangeLog configure.ac configure.host configure 
	boehm-gc/include: gc_config.h.in 
	boehm-gc/include/private: gc_priv.h 

Log message:
	2005-02-15  Bryce McKinlay  <mckinlay@redhat.com>
	
	PR libgcj/19823
	* configure.host: Set gc_use_mmap on *-linux*.
	* configure.ac: Define USE_MMAP if gc_use_mmap is set.
	* include/private/gc_priv.h: Include gc_config.h.
	* configure, include/gc_config.h.in: Rebuilt.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/boehm-gc/ChangeLog.diff?cvsroot=gcc&r1=1.218&r2=1.219
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/boehm-gc/configure.ac.diff?cvsroot=gcc&r1=1.11&r2=1.12
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/boehm-gc/configure.host.diff?cvsroot=gcc&r1=1.7&r2=1.8
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/boehm-gc/configure.diff?cvsroot=gcc&r1=1.94&r2=1.95
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/boehm-gc/include/gc_config.h.in.diff?cvsroot=gcc&r1=1.5&r2=1.6
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/boehm-gc/include/private/gc_priv.h.diff?cvsroot=gcc&r1=1.16&r2=1.17

Comment 17 Bryce McKinlay 2005-02-16 04:22:20 UTC
Michael (or someone else who has seen this bug),

Could you confirm that this patch fixes it?

Thanks
Comment 19 Andrew Pinski 2005-02-22 19:02:02 UTC
Fixed in 3.4.4.