Bug 17021 - libgcj verifier resolves classes too eagerly
Summary: libgcj verifier resolves classes too eagerly
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: libgcj (show other bugs)
Version: 4.0.0
: P2 normal
Target Milestone: 4.2.0
Assignee: Robert Schuster
URL:
Keywords:
: 17980 22463 24385 (view as bug list)
Depends on:
Blocks: 12725
  Show dependency treegraph
 
Reported: 2004-08-13 21:07 UTC by Bryce McKinlay
Modified: 2006-02-21 10:37 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2005-09-24 15:57:14


Attachments
new version of compatible (499 bytes, text/plain)
2005-10-18 16:38 UTC, Robert Schuster
Details
assignments to object (410 bytes, text/plain)
2005-10-18 16:40 UTC, Robert Schuster
Details
new version of to_array method (565 bytes, text/plain)
2005-10-18 17:01 UTC, Robert Schuster
Details
a test class for the aanewarray lazyfication (310 bytes, text/plain)
2005-10-18 17:09 UTC, Robert Schuster
Details
proposed fix (2.57 KB, text/plain)
2005-10-21 16:15 UTC, Robert Schuster
Details
proposed fix (3.04 KB, text/plain)
2005-10-30 19:14 UTC, Robert Schuster
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Bryce McKinlay 2004-08-13 21:07:25 UTC
NoClassDefFoundError should only be thrown if a code path that attempts to use
the missing class is actually used. GIJ, however, throws the error too early.
This can be seen by compiling the following test case and deleting the class
file for "Bar".

$ gcj -C Foo.java Bar.java 
$ rm Bar.class 
$ gij Foo

Exception in thread "main" java.lang.NoClassDefFoundError: while resolving
class: Foo 
   at java.lang.ClassLoader.resolveClass0(java.lang.Class)
(/usr/lib/libgcj.so.4.0.0)
   at java.lang.Class.initializeClass() (/usr/lib/libgcj.so.4.0.0)
   at java.lang.Class.forName(java.lang.String, boolean, java.lang.ClassLoader)
(/usr/lib/libgcj.so.4.0.0)
...

public class Foo
{
  public boolean missing_class_code_path = false;

  public static void main(String[] args)
  {
    new Foo().j();
  }

  void j()
  {
    if (missing_class_code_path)
      new Bar().x();
  }
}

public class Bar 
{
  public void x() {}
}
Comment 1 Andrew Pinski 2004-08-13 21:51:18 UTC
Confirmed, I thought I saw a bug like this somewhere.
Comment 2 Tom Tromey 2004-10-13 21:33:24 UTC
*** Bug 17980 has been marked as a duplicate of this bug. ***
Comment 3 Andrew Pinski 2005-07-13 18:12:42 UTC
*** Bug 22463 has been marked as a duplicate of this bug. ***
Comment 4 Bryce McKinlay 2005-07-13 18:18:34 UTC
Updated summary to reflect that this is actually caused by the libgcj verifier.

A workaround is to use "gij -noverify"
Comment 5 Robert Schuster 2005-10-16 00:44:17 UTC
*** Bug 24385 has been marked as a duplicate of this bug. ***
Comment 6 Robert Schuster 2005-10-18 00:18:27 UTC
With the help of Tromey and the team we made an interesting obervation:

class Foo{
     static Object o;

     void method(){
          o = new RemoveClassFile();
     }
}

This is accepted by you-know-which VM but rejected by GIJ (processing putstatic opcode in verify.cc leads to _Jv_BytecodeVerifier::ref_intersection::compatible which in turn resolves the RHS).

Now takes this situation:

class Foo{
     static Thread t;

     void method(){
          t = new RemoveClassFileFromClassWhichInheritsFromThread();
     }
}

This is rejected by the 'other' verifier, too if no class file could be found. The idea seems to be that the assignment test is short-cutted if the LHS is Object.

Now we have to find and fix every situation where this can happen. Here is a partial list:
- static assignment
- non-static assignment
- array assignment
- arguments to methods
- return values
- type state merges (only tromey knows what this is ;) )

*Important*:
If you have real world class files where this bug happens please send them to me.
Comment 7 Robert Schuster 2005-10-18 16:38:12 UTC
Created attachment 10019 [details]
new version of compatible

I added a check which looks whether the LHS of an expression is java.lang.Object. If this is the case we do not have to look what the RHS actually is.

(This change will be part of the final patch.)
Comment 8 Robert Schuster 2005-10-18 16:40:04 UTC
Created attachment 10020 [details]
assignments to object

This test class does all kinds of assignments to java.lang.Object variables. If you remove the class files for Stuff1 and Stuff2 this will work with the changes made to compatible().
Comment 9 Robert Schuster 2005-10-18 17:01:10 UTC
Created attachment 10021 [details]
new version of to_array method

This is the lazyfied version of the to_array method which is uses for the aanewarray opcode.
Comment 10 Robert Schuster 2005-10-18 17:09:55 UTC
Created attachment 10022 [details]
a test class for the aanewarray lazyfication

This tests lazy array instantiation which was not possible before. Need to check whether one of my changes fixed the "dontKnowIfThisCouldWork" method or not.
Comment 11 Robert Schuster 2005-10-19 16:14:51 UTC
In the JavaUtils class of axis 1.2 RC2 contains the following instruction:

639: invokevirtual <Method javax.activation.DataHandler.getContent ()java.lang.Object>

With a missing DataHandler class file this JavaUtils class can be verified successfully on sun' s jvm but fails at this instruction on gij.

I traced the problem back to a call to in 

_Jv_BytecodeVerifier::ref_intersection::compatible(_Jv_BytecodeVerifier::ref_intersection*, _Jv_BytecodeVerifier*)

In that method the names of the two classes (which are not resolved) are the following:

(gdb) putf self->data.name
$152 = 0x121054 "javax.activation.DataHandler"
(gdb) putf other_iter->data.name
$153 = 0x80736ac "Ljavax.activation.DataHandler;"

Obviously this denotes the same classnames but for Jv_equalUtf8Consts which is called in _Jv_BytecodeVerifier::ref_intersection::compatible() the strings are different.

I propose to fix the problem in the following way: Instead of
if (! self->is_resolved
  && ! other_iter->is_resolved
  && _Jv_equalUtf8Consts (self->data.name,
  other_iter->data.name))

do

if (! self->is_resolved
  && ! other_iter->is_resolved
  && ( _Jv_equalUtf8Consts (self->data.name, other_iter->data.name) || match_classname(self->data.name, other_iter->data.name ) )

match_classname is then implemented in a way that it would accept all these names as equal:
java/lang/Class
Ljava/lang/Class;
java.lang.Class
Ljava.lang.Class;

However there might be another approach to the problem: One could make sure that class names in _Jv_UTF8_Const objects are always in the Lp1/p2/p3/classname; form.

Please comment.
Comment 12 Robert Schuster 2005-10-21 16:13:36 UTC
Tromey suggested to not transform all Utf8Const object into a canonical form so I fixed the invokevirtual problem in the first way.

I introduced a new method _Jv_equalUt8Const_classnames method which reliably compares classnames of different formats and finds out whether they denote the same Java class.

However before I could test the new code I found out that gcj failed to load a certain "[java.lang.String". This was introduced by the new to_array code which assumed that classnames where given in the "Lp1/p2/cn;" format. Unfortunately this is not the case and I had to make that code aware of that fact.

I attach the final patch which fixes
a) the problems mentioned in this bug
b) the problems with array instantiations
c) the problem that return type checking always provoked class resolution

Please review.
Comment 13 Robert Schuster 2005-10-21 16:15:14 UTC
Created attachment 10041 [details]
proposed fix

This fixes this bug and bunch of related problems.
Comment 14 Bryce McKinlay 2005-10-25 20:36:43 UTC
Robert, thanks very much for working on this. Examining the behaviour of Sun's verifier a bit more shows that it does attempt to resolve classes where type compatibility can not be proven by a simple string comparison, so I think that your approach is correct. 

I have one pedantic concern about the implementation of _Jv_equalUtf8Const_classnames: 

Say we're comparing a class called "Lfoo" and one called "fool", and fool is given in the bytecode form while "Lfoo" is in the regular form. So, _Jv_equalUtf8Const_classnames would end up comparing the strings "Lfoo" and "Lfool;". Whats to stop it falsely returning true in this case? 

Also, how about a more concise name for this function: _Jv_equalUtf8Classnames?  

It will also need a ChangeLog entry, of course - other than these issues, this patch looks pretty good.
Comment 15 Robert Schuster 2005-10-30 19:14:32 UTC
Created attachment 10082 [details]
proposed fix

This patch includes all the lazyfications and a fix for to_array() involving primitive array names which caused gcj to crash. I also changed the name of the classname comparation method as suggested by bryce.

The special case where we have a class Lfoo and fool (the latter given as "Lfool;") is now handled correctly.

There was a need for a new helper method. I hope its name is well chosen.

I am currently waiting for my libgcj assignment papers. This gives you some time to comment on it. :)
Comment 16 Robert Schuster 2006-02-21 10:37:52 UTC
Fixed by the following patches:

Main patch:
http://gcc.gnu.org/ml/java-patches/2006-q1/msg00124.html

Build fix for ARM:
http://gcc.gnu.org/ml/java-patches/2006-q1/msg00143.html

Stylistic: remove casts
http://gcc.gnu.org/ml/java-patches/2006-q1/msg00144.html

Stylistic: Indentation
http://gcc.gnu.org/ml/java-patches/2006-q1/msg00228.html