Bug 51615 - Condition Variable queue state corruption and infinite loop
Summary: Condition Variable queue state corruption and infinite loop
Status: RESOLVED WONTFIX
Alias: None
Product: gcc
Classification: Unclassified
Component: libgcj
Version: 4.7.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-12-18 22:22 UTC by N8GCBP7SHNBTI79GINADGKJPRTLOCO2A
Modified: 2016-09-30 22:54 UTC

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Description N8GCBP7SHNBTI79GINADGKJPRTLOCO2A 2011-12-18 22:22:11 UTC
When attempting to run ecj-3.8M4.jar on a large number of files, gij hangs.  (On a small number of files, it runs fine, curiously enough.)

Invoking gdb (7.3.1), I see that the thread is stuck in
(gdb) bt
#0  _Jv_CondWait (cv=0x1729a48, mu=<optimized out>, millis=<optimized out>, nanos=<optimized out>)
    at ../.././../gcc-4.7-20111210/libjava/posix-threads.cc:241
#1  0x00000000419d99b8 in java::lang::Object::wait (this=0x1704900, timeout=250, 
    nanos=<optimized out>) at ../.././../gcc-4.7-20111210/libjava/java/lang/natObject.cc:226
#2  0x0000000042486aa4 in ffi_call_v9 () at ../.././../gcc-4.7-20111210/libffi/src/sparc/v9.S:83
#3  0x0000000042486400 in ffi_call (cif=0x1815f08, fn=<optimized out>, rvalue=0x7fdff7f97e8, 
    avalue=0x7fdff7f9680) at ../.././../gcc-4.7-20111210/libffi/src/sparc/ffi.c:415
#4  0x00000000424830c8 in ffi_java_raw_call (cif=<optimized out>, fn=<optimized out>, 
    rvalue=<optimized out>, raw=<optimized out>)
    at ../.././../gcc-4.7-20111210/libffi/src/java_raw_api.c:300
#5  0x00000000419b6430 in _Jv_InterpMethod::run (retp=0x7fdff7f9aa0, args=0x419b6d9c, meth=0x13a0c00)
    at ../.././../gcc-4.7-20111210/libjava/interpret-run.cc:613
#6  0x0000000042483028 in ffi_java_translate_args (cif=<optimized out>, rvalue=<optimized out>, 
    avalue=<optimized out>, user_data=<optimized out>)
    at ../.././../gcc-4.7-20111210/libffi/src/java_raw_api.c:314
#7  0x00000000424867e0 in ffi_closure_sparc_inner_v9 (closure=<optimized out>, rvalue=0x7fdff7f9aa0, 
    gpr=0x7fdff7f9bc0, fpr=0x7fdff7f9ac0) at ../.././../gcc-4.7-20111210/libffi/src/sparc/ffi.c:621
#8  0x0000000042486b90 in ffi_closure_v9 () at ../.././../gcc-4.7-20111210/libffi/src/sparc/v9.S:181
#9  0x0000000041dd2ce8 in java.lang.Thread.run()void (this=<optimized out>)
    at /var/ports/usr/ports/lang/gcc47/work/gcc-4.7-20111210/libjava/java/lang/Thread.java:761
#10 0x00000000419ddbec in _Jv_ThreadRun (thread=<optimized out>)
    at ../.././../gcc-4.7-20111210/libjava/java/lang/natThread.cc:335
#11 0x00000000419e78a8 in really_start (x=<optimized out>)
    at ../.././../gcc-4.7-20111210/libjava/posix-threads.cc:639
#12 0x000000004249950c in GC_start_routine (arg=0x12ca120)
    at ../.././../gcc-4.7-20111210/boehm-gc/pthread_support.c:1301
#13 0x0000000043c68890 in ?? () from /lib/libthr.so.3
#14 0x0000000043c68890 in ?? () from /lib/libthr.so.3

and if I inspect the condition variable's queue, the first entry's next pointer points back to itself:

(gdb) print cv.first
$14 = (_Jv_Thread_t *) 0x446c4830
(gdb) print cv.first.next 
$15 = (_Jv_Thread_t *) 0x446c4830

This is obviously bad: the loop we're stuck in follows ->next pointers until it sees a NULL, which it never will.  Note that current has also become corrupted in the same way:

(gdb) print current 
$16 = (_Jv_Thread_t *) 0x446c4860
(gdb) print current.next
$17 = (_Jv_Thread_t *) 0x446c4860
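
To make the failure mode concrete, here is a minimal sketch of why a queue entry whose next pointer refers to itself turns a "follow next until NULL" loop into an infinite one. It is written in Java for illustration only and is not the libgcj code: the Waiter class, the countBounded method and the step limit are all hypothetical names introduced here. The step limit exists only so the sketch terminates; the loop described above has no such bound.

```java
final class WaitQueueSketch {
    // Hypothetical stand-in for a wait-queue entry; not libgcj's _Jv_Thread_t.
    static final class Waiter {
        Waiter next;
    }

    // Follows next links until null, like the loop described above, but gives
    // up after maxSteps entries so that a corrupted queue cannot hang the caller.
    // Returns the number of entries, or -1 if no null terminator was reached.
    static int countBounded(Waiter first, int maxSteps) {
        int n = 0;
        for (Waiter t = first; t != null; t = t.next) {
            if (n == maxSteps) {
                return -1; // still no terminator: the queue is probably cyclic
            }
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Waiter a = new Waiter();
        Waiter b = new Waiter();
        a.next = b; // healthy queue: a -> b -> null
        System.out.println(countBounded(a, 1000)); // prints 2

        Waiter c = new Waiter();
        c.next = c; // corrupted queue: the entry is linked to itself
        System.out.println(countBounded(c, 1000)); // prints -1
    }
}
```

Without the bound, the second call would never return, which matches the hang observed in the backtrace.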

I am on a FreeBSD/sparc64 machine, running 8.2 and using gcc47 from ports (which means exactly 4.7.0 20111210).  It's quite easy to get into this state, so if I've left something out please don't hesitate to ask.
Comment 1 Igor Pashev 2013-06-15 16:06:49 UTC
Here is what I see on illumos:

21071:	gij-4.7 -classpath build/bootstrap/eclipse-ecj.jar:/usr/share/ant/lib/
-----------------  lwp# 1 / thread# 1  --------------------
 fffffd7f4e16f427 lwp_park (0, 0, 0)
 fffffd7f4e167b06 mutex_lock_impl () + 156
 fffffd7f4e167bdb mutex_lock () + b
 fffffd7f42984a81 _Z13_Jv_MutexLockP11_Jv_Mutex_t () + 51
 0000000000733840 ???????? ()
 f2f1e00000000001 ???????? ()
 0000000000000000 ???????? ()
 0000000000000002 ???????? ()
-----------------  lwp# 2 / thread# 2  --------------------
 fffffd7f4e16f427 lwp_park (0, 0, 0)
 fffffd7f4e168f8f cond_wait_queue () + 4f
 fffffd7f4e1695e2 __cond_wait () + b2
 fffffd7f4e169612 cond_wait () + 22
 fffffd7f4e169649 pthread_cond_wait () + 9
 fffffd7f42984c1c _Z12_Jv_CondWaitP23_Jv_ConditionVariable_tP11_Jv_Mutex_txi () + 13c
 0000000000000000 ???????? ()
-----------------  lwp# 3 / thread# 3  --------------------
 fffffd7f4e16f427 lwp_park (0, 0, 0)
 fffffd7f4e167b06 mutex_lock_impl () + 156
 fffffd7f4e167bdb mutex_lock () + b
 fffffd7f42984c3d _Z12_Jv_CondWaitP23_Jv_ConditionVariable_tP11_Jv_Mutex_txi () + 15d
 0000000000000000 ???????? ()
-----------------  lwp# 4 / thread# 4  --------------------
 fffffd7f42984bc7 _Z12_Jv_CondWaitP23_Jv_ConditionVariable_tP11_Jv_Mutex_txi () + e7
 0100000000000000 ???????? ()


I also tried to build openjdk6, and gcj "compiles" 250 files forever :-) I waited for a day or two :-\
Comment 2 Igor Pashev 2013-06-15 16:10:34 UTC
See also #34574
Comment 3 Igor Pashev 2015-01-11 22:12:33 UTC
I managed to work around this issue by disabling multithreaded compilation in ECJ:


--- ecj-3.10.1.orig/src/org.eclipse.jdt.core/org/eclipse/jdt/internal/compiler/batch/Main.java
+++ ecj-3.10.1/src/org.eclipse.jdt.core/org/eclipse/jdt/internal/compiler/batch/Main.java
@@ -4106,7 +4106,7 @@ public void performCompilation() {
        this.batchCompiler.remainingIterations = this.maxRepetition-this.currentRepetition/*remaining iterations including this one*/;
        // temporary code to allow the compiler to revert to a single thread
        String setting = System.getProperty("jdt.compiler.useSingleThread"); //$NON-NLS-1$
-       this.batchCompiler.useSingleThread = setting != null && setting.equals("true"); //$NON-NLS-1$
+       this.batchCompiler.useSingleThread = setting == null || setting.equals("true"); //$NON-NLS-1$
 
        if (this.compilerOptions.complianceLevel >= ClassFileConstants.JDK1_6
                        && this.compilerOptions.processAnnotations) {


Then I recompiled eclipse-ecj.jar on a system without this issue (Debian Linux amd64), copied this jar to my system and rebuilt ecj.
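
For readers who want to see exactly what the one-line change does, the following sketch evaluates the unpatched and patched expressions from the diff side by side. Only the two boolean expressions and the property name jdt.compiler.useSingleThread come from the patch; the class and method names are hypothetical and exist only for this illustration.

```java
final class SingleThreadDefault {
    // Expression from the unpatched line: single-threaded only on request.
    static boolean original(String setting) {
        return setting != null && setting.equals("true");
    }

    // Expression from the patched line: single-threaded unless the property
    // is set to something other than "true".
    static boolean patched(String setting) {
        return setting == null || setting.equals("true");
    }

    public static void main(String[] args) {
        // Same property lookup as in the patched method; null when unset.
        String setting = System.getProperty("jdt.compiler.useSingleThread");
        System.out.println("original: " + original(setting));
        System.out.println("patched:  " + patched(setting));
    }
}
```

The two expressions differ only when the property is unset: the original yields false (multithreaded) and the patched version yields true (single-threaded). Judging from the unpatched expression alone, starting the compiler with the property set to true should select the same single-threaded path without a rebuild, but that was not tried in this thread.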




P. S. I think Debian Linux/amd64 is also affected, see Bug 51615, but multithreaded compilation works there.
Comment 4 Andrew Pinski 2016-09-30 22:54:15 UTC
Closing as won't fix as libgcj (and the java front-end) has been removed from the trunk.