Bug 23917 - gjdoc doesn't handle the -classpath option correctly
Summary: gjdoc doesn't handle the -classpath option correctly
Status: ASSIGNED
Alias: None
Product: classpath
Classification: Unclassified
Component: gjdoc (show other bugs)
Version: unspecified
: P3 minor
Target Milestone: ---
Assignee: Julian Scheid
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-09-16 14:39 UTC by Vadim Nasardinov
Modified: 2021-05-04 12:31 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2005-10-12 16:37:48


Attachments
gjdoc-classpath-broken.tar.gz (the test case) (1019 bytes, application/octet-stream)
2005-09-16 14:41 UTC, Vadim Nasardinov
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Vadim Nasardinov 2005-09-16 14:39:10 UTC
gjdoc 0.7.5 that ships with Fedora Core 4 doesn't handle the
-classpath option correctly.  I haven't tried the latest CVS version.
Apologies if this has been fixed already.  (I did look at today's
checkout of gjdoc/ChangeLog and didn't notice anything that would
suggest this had been addressed already.)

| $ cat /etc/fedora-release
| Fedora Core release 4 (Stentz)
| $ rpm -qf $(which gjdoc)
| gjdoc-0.7.5-3

Java source files may reference classes supplied by third-party
libraries.  For gjdoc to resolve these references, it has to be able
to find said third-party jars.  In theory, their location is specified
via the -classpath option.  Upon encountering a reference to a class
for which there is no source in the current sourcepath, gjdoc should
look inside the jars listed in the -classpath option, find the
referenced .class files, and parse its bytecode in order to extract
the information needed by gjdoc (superclasses, inherited methods, and
suchlike).

In practice, there is a number of issues with the handling of the
-classpath option in gjdoc.  I'm attaching a test case that
demonstrates the issues.  The contents of the test case are:

| $ find . -type f -printf '%P\n'
| Makefile
| 3rd-party/third/party/lib/Quux.java
| src/my/source/Foo.java

Quux.java is compiled and packaged as third-party.jar to simulate a
sloppily written and/or malicious third-party library.  Foo.java is a
class for which we are trying to generate javadoc with gjdoc.  Foo
extends Quux.

Here's the list of issues:

1. gjdoc tries to pass the value of the -classpath option to the JVM
   in which gjdoc itself is running.  It attempts this feat by doing
   the following

   | $ cvs status src/gnu/classpath/tools/gjdoc/Main.java | grep Working
   |    Working revision:    1.74
   | $ cat -n src/gnu/classpath/tools/gjdoc/Main.java | head -n1476 | tail -n9
   |   1468      options.put("-classpath", new OptionProcessor(2)
   |   1469        {
   |   1470          void process(String[] args)
   |   1471          {
   |   1472            reporter.printWarning("-classpath option could not be passed to the VM.  Faking it with ");
   |   1473            reporter.printWarning("    System.setProperty(\"java.class.path\", \"" + args[0] + "\");");
   |   1474            System.setProperty("java.class.path", args[0]);
   |   1475          }
   |   1476        });

   By the time the JVM is up and running, changing the
   "java.class.path" doesn't do any good.  AFAIK, it's a read-only
   property.  (You can change it, but that doesn't have any effect on
   the system classpath.)

   The effects of trying to set the -classpath flag can be seen by
   running the "doc1" target in the attached test case:

   | $ make doc1
   | mkdir -p build/{classes,jar}
   | gcj -C -d build/classes 3rd-party/third/party/lib/Quux.java
   | jar cf build/jar/third-party.jar -C build/classes .
   | mkdir -p build/doc1
   | gjdoc -d build/doc1 -classpath build/jar/third-party.jar -sourcepath src my.source
   | WARNING: -classpath option could not be passed to the VM.  Faking it with
   | WARNING:     System.setProperty("java.class.path", "build/jar/third-party.jar");
   | Loading classes for package my.source...
   | Constructing Javadoc information...
   | WARNING: Error while loading class Quux
   | WARNING: Cannot locate class third.party.lib.Quux referenced in class my.source.Foo
   | Resolving references in comments...
   | Resolving references in classes...
   | Resolving references in packages...
   | Resolving references in class comments...
   | Resolving references in package comments...
   | Running doclet...
   | Building cross-reference information...
   | Writing overview files...
   | Writing index...
   | Writing HTML files for package my.source
   | 4 warnings

2. The reason why gjdoc tries to set the system classpath to the value
   specified in the -classpath option is so that it can avoid the
   honest labor of bytecode parsing by leveraging reflection instead.

   By default, it doesn't use reflection however.  This can be seen by
   running "make doc2".  In this case, we work around the '-classpath'
   uselessness by setting the environment variable CLASSPATH instead.
   (In theory, gjdoc should ignore CLASSPATH, but it currently honors
   it.  Does this merit a ticket of its own?)

   | $ make doc2
   | mkdir -p build/doc2
   | export CLASSPATH=build/jar/third-party.jar && gjdoc -d build/doc2 -sourcepath src my.source
   | Loading classes for package my.source...
   | Constructing Javadoc information...
   | WARNING: Error while loading class Quux
   | WARNING: Cannot locate class third.party.lib.Quux referenced in class my.source.Foo
   | Resolving references in comments...
   | Resolving references in classes...
   | Resolving references in packages...
   | Resolving references in class comments...
   | Resolving references in package comments...
   | Running doclet...
   | Building cross-reference information...
   | Writing overview files...
   | Writing index...
   | Writing HTML files for package my.source
   | 2 warnings

3. We can get the Quux reference resolved by telling gjdoc to use
   reflection which is done by passing the gjdoc-specific -reflection
   flag.  This is demonstrated by "make doc3":

   | $ make doc3
   | mkdir -p build/doc3
   | export CLASSPATH=build/jar/third-party.jar && gjdoc -reflection -d build/doc3 -sourcepath src my.source
   | Loading classes for package my.source...
   | ------------------------------------------------------------------------------
   | *********               Harmful side effect in action!               *********
   | ------------------------------------------------------------------------------
   | Constructing Javadoc information...
   | WARNING: Cannot locate class third.party.lib.Quux on file system, falling back to reflection.
   | Resolving references in comments...
   | Resolving references in classes...
   | Resolving references in packages...
   | Resolving references in class comments...
   | Resolving references in package comments...
   | Running doclet...
   | Building cross-reference information...
   | Writing overview files...
   | Writing index...
   | Writing HTML files for package my.source
   | 1 warnings

   This time around, we are finally able to get semi-correct javadoc
   where Foo is shown to inherit from third.party.lib.Quux.  This is
   accomplished at the expense of running potentially harmful code in
   third-party.jar.  This is because gjdoc ends up running
   Class.forName("third.party.lib.Quux") which executes the static
   initializer of Quux.

   (I say "semi-correct", because it doesn't show methods inherited by
   Foo from Quux.  Should I file a separate ticket for this?)
Comment 1 Vadim Nasardinov 2005-09-16 14:41:24 UTC
Created attachment 9742 [details]
gjdoc-classpath-broken.tar.gz (the test case)
Comment 2 Julian Scheid 2005-10-12 16:37:48 UTC
Thanks for your detailed bug report, and sorry for the late follow-up.

Vadim Nasardinov <vadimn@redhat.com> wrote:
> The reason why gjdoc tries to set the system classpath to the value
> specified in the -classpath option is so that it can avoid the
> honest labor of bytecode parsing by leveraging reflection instead.

Seems like you know exactly why which decisions have been made in the
gjdoc code, huh?

It's not quiet as easy as you make it out to be, though.  For example,
the gcj-compiled gjdoc binary doesn't necessarily have access to the
bytecode of core classes (java.* etc.), but it does have access to
those classes via reflection.

So the reflection approach isn't only here to "avoid honest labor",
but instead serves a specific purpose.

I agree with you that where possible, a bytecode-parsing approach
would be preferrable.  Class loading not only poses a security risk
but wastes resources and might cause an Error to be thrown which would
interrupt the documentation generation process.

However, adding support for a bytecode toolkit will introduce an
additional dependency and yes, it will require "honest labor".

I will address this general issue on the cp-tools-discuss mailing list
in the next days.  Perhaps we can come up with a solution that only
uses reflection for "safe" classes (like those in java.lang), and uses
bytecode inspection in all other cases.

I'd like to add that at any rate, I'm treating this as a minor issue.
Given that gjdoc's main target audience are Free software developers,
it is very likely that the source code or at least the API
documentation for third party libraries is available.  And in this
case using the -sourcepath, -link or -linkexternal facilities is the
recommended way to include the library.

However I agree that it's a convenient option, and in some cases the
only option to include third-party libraries, so I confirm your 
report.


> gjdoc tries to pass the value of the -classpath option to the JVM in
> which gjdoc itself is running.  It attempts this feat by doing the
> following 
>
> [---snip---]
>
>   |   1472            reporter.printWarning("-classpath option could not be passed to the VM.  Faking it with ");
>   |   1473            reporter.printWarning("    System.setProperty(\"java.class.path\", \"" + args[0] + "\");");
>   |   1474            System.setProperty("java.class.path", args[0]);

The right way to do this is obviously a custom ClassLoader.  Of course
the above is a "useless attempt" to accomplish that "feat".  That's
why there is a warning which includes the word "faking".  Faking as 
in "to simulate; feign."

The original assumption was that the wrapper script would intercept
the -classpath option and pass it to the VM instead of to gjdoc, so
that the code path above would never be executed.  This assumption no
longer holds true now that there is a native gjdoc binary.  So yes,
confirmed.  I will try to fix that for the next release.


> In theory, gjdoc should ignore CLASSPATH, but it currently honors
> it.

http://java.sun.com/j2se/1.5.0/docs/tooldocs/solaris/javadoc.html#classpath
says:

>> As with other tools, if you do not specify -classpath, the Javadoc
>> tool uses the CLASSPATH environment variable, if it is set.
Comment 3 Andrew Haley 2005-11-04 11:00:31 UTC
 > Class.forName("third.party.lib.Quux") which executes the static
 > initializer of Quux.

It shouldn't.  A class or interface will be initialized at its first active use.
Comment 4 Vadim Nasardinov 2005-11-04 16:17:13 UTC
http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Class.html#forName%28java.lang.String%29
says

  : Returns the Class object associated with the class or interface
  : with the given string name. Invoking this method is equivalent to:
  : 
  :   Class.forName(className, true, currentLoader)
  :  
  : 
  : where currentLoader denotes the defining class loader of the
  : current class.


So,

| $ cvs status src/gnu/classpath/tools/gjdoc/RootDocImpl.java | grep Working
|    Working revision:    1.23
| $ cat -n src/gnu/classpath/tools/gjdoc/RootDocImpl.java | head -1196 | tail -9
|   1188              try {
|   1189                 Class clazz = Class.forName(scheduledClassName);
|   1190                 printWarning("Cannot locate class " + scheduledClassName + " on file system, falling back to reflection.");
|   1191                 ClassDoc result = new ClassDocReflectedImpl(clazz);
|   1192                 return result;
|   1193              }
|   1194              catch (Throwable ignore) {
|   1195                 unlocatableReflectedClassNames.add(scheduledClassName);
|   1196              }

if you're saying that Class.forName at line 1189 should be replaced
with

|   Class clazz = Class.forName(scheduledClassName,
|                               false,
|                               RootDocImpl.class.getClassLoader());
|

... then I agree.


Additionally, the following piece of code may not be too hard to fix
either:

| $ cvs status src/gnu/classpath/tools/gjdoc/Main.java | grep Working
|    Working revision:    1.75
| $ cat -n src/gnu/classpath/tools/gjdoc/Main.java | head -1478 | tail -9
|   1470      options.put("-classpath", new OptionProcessor(2)
|   1471        {
|   1472          void process(String[] args)
|   1473          {
|   1474            reporter.printWarning("-classpath option could not be passed to the VM.  Faking it with ");
|   1475            reporter.printWarning("    System.setProperty(\"java.class.path\", \"" + args[0] + "\");");
|   1476            System.setProperty("java.class.path", args[0]);
|   1477          }
|   1478        });


Instead of what we currently do at line 1476, we can create a custom
URLClassLoader and initialize it with the list of URLs specified by
the -classpath option.  We can later use this custom classloader for
resolving "scheduled" classes in
src/gnu/classpath/tools/gjdoc/RootDocImpl.java.
Comment 5 Julian Scheid 2005-11-04 16:37:57 UTC
Hi, below is a snippet from a conversation I had with Mark Wielaard on this topic.  I planned to do a write-up on the whole issue and send it to the mailing list but alas, I'm quiet short on time at the moment.  So here goes without further comments before it gets lost.

On Fri, 2005-10-14 at 18:37 +0200, Julian Scheid wrote:

>>> > Yeah. I looked at the report. Not really trivial to do nicely. Except by
>>> > doing your own byte code parser probably. In theory you can create a
>>> > separate class loader and do reflection on the classes in there without
>>> > triggering class initialization.
>
>> 
>> Really? I didn't know that, how would I go about that?  I thought as 
>> soon as I'm calling defineClass the static initializer is executed.


Not until the first "active use" of a Class. Which mean creating an
instance object, accessing/invoking an static member (field or method),
etc. But "inspection only" access of a newly defined Class shouldn't
trigger any static initializers. Of course depending on the runtime
used/bugs triggered...

The attached mauve test shows some things that shouldn't and some that
should trigger class initialization. (Note that I fixed a bug just
yesterday in jamvm for this mauve test, so in practice not all runtimes
might correctly pass this test...)

Cheers,

Mark



// Tags: JDK1.1
//  
// Copyright (C) 2004, Free Software Foundation, Inc.
// Contributed by Mark J. Wielaard (mark@klomp.org)
//   
// This file is part of Mauve.
//    
// Mauve is free software; you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation; either version 2, or (at your option)
// any later version.
//     
// Mauve is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
// GNU General Public License for more details.
//      
// You should have received a copy of the GNU General Public License
// along with Mauve; see the file COPYING.  If not, write to
// the Free Software Foundation, 59 Temple Place - Suite 330,
// Boston, MA 02111-1307, USA.

package gnu.testlet.java.lang.Class;

import gnu.testlet.*;
import java.lang.reflect.*;

// This tests VM Spec 2.17.4
// As discussed at http://gcc.gnu.org/ml/java-patches/2004-q2/msg00443.html 
public class init implements Testlet
{
  static boolean initI = false;
  static boolean initC1 = false;
  static boolean initC2 = false;
  static boolean initC3 = false;
  static boolean initC4 = false;
  static boolean initC5 = false;
  static boolean invokedM = false;

  interface I
  {
    static long l = init.initI();
    void m();
  }

  static class C1 implements I
  {
    static long l = init.initC1();
    public void m()
    {
      invokedM = true;
    }
  }

  static class C2 implements I
  {
    static long l = init.initC2();
    public void m() { }
  }

  static class C3 extends C2
  {
    static long l = init.initC3();
  }

  static class C4 extends C2
  {
    static long l = init.initC4();
    static boolean m2() {
      return true;
    }
  }

  static class C5 extends C4
  {
    static long l = init.initC5();
    public static int i;
  }

  public void test(TestHarness h)
  {
    try
      {
	// None of this should initialize anything
	Class i = new I[0].getClass().getComponentType();
	Method m = i.getDeclaredMethod("m", null);
	Field f = Class.forName(getClass().getName() + "$C5",
	  false, getClass().getClassLoader()).getField("i");

	// Static field access should initialize C3 and superclass C2 but not I
	h.check(!initC2);
	h.check(!initC3);
	if (C3.l == 123)
	    hashCode();
	h.check(initC2);
	h.check(initC3);

	// Static method invocation should initialize C4 but not I
	h.check(!initC4);
	if (C4.m2())
		hashCode();
	h.check(initC4);

	// Static field access should initialize C5
	h.check(!initC5);
	f.set(null, new Character((char)0xffff));
	h.check(C5.i == 0xffff);
	h.check(initC5);

	// Instantiation of a C should initialize C but not I
	h.check(!initC1);
	Object o = new C1();
	h.check(initC1);

	// Apparently, invocation of interface method initializes I
	h.check(!initI);
	h.check(!invokedM);
	m.invoke(o, null);
	h.check(initI);
	h.check(invokedM);
      }
    catch (NoSuchMethodException nsme)
      {
	h.debug(nsme);
	h.check(false);
      }
    catch (NoSuchFieldException e)
      {
	h.debug(e);
	h.check(false);
      }
    catch (InvocationTargetException ite)
      {
	h.debug(ite);
	h.check(false);
      }
    catch (IllegalAccessException iae)
      {
	h.debug(iae);
	h.check(false);
      }
    catch (ClassNotFoundException e)
      {
	h.debug(e);
	h.check(false);
      }
  }

  static long initI()
  {
    initI = true;
    return 5;
  }

  static long initC1()
  {
    initC1 = true;
    return 5;
  }

  static long initC2()
  {
    initC2 = true;
    return 5;
  }

  static long initC3()
  {
    initC3 = true;
    return 5;
  }

  static long initC4()
  {
    initC4 = true;
    return 5;
  }

  static long initC5()
  {
    initC5 = true;
    return 5;
  }
}