PASSING STRUCTS BY VALUE
------------------------

(HIGHLIGHTS/CONCLUSIONS START WITH >>.)

C# has 'struct' (value) types as well as reference types.  These are
useful for efficiency.  The following is a proposal for value types in
Java that does not modify the language, but instead defines a
*convention* that implementations may use to optimize value types.
The goal is to let a Java compiler (JIT or ahead-of-time) implement
value types the way C/C++ structs are implemented.

We use a "Point" with fields x and y as the running example.  Our goal
is that a Java programmer can create a Point object and pass it to
methods that expect a Point struct, with the compiler automatically
passing it by value.  It follows that the Point class has to be
"magic", so the compiler knows that it needs to be handled specially.

>> A "value class" is a Java class that the compiler/run-time passes
   by value instead of by reference.  A "value instance" is an
   instance of a value class.

On the other hand, we want legacy Java compilers to be able to
implement value types with reasonable efficiency, so that portable
code can use value types and still work on any Java implementation.

>> A "value class" is a normal Java class, but with a special "value
   type annotation", and following certain conventions.

If a class T inherits from a value class S, then passing a T to a
method expecting an S would require the T to be "truncated" to an S,
dropping any fields that T adds.  This causes unexpected and bad
behavior.  Furthermore, if we disallow value classes from having a
vtable pointer (as discussed later), then inheritance loses much of
its point.

>> A value class must be final.

Suppose a value class instance is passed to a method that modifies one
of its fields.  From the Java programmer's point of view, that should
modify the original instance that was passed in.  However, because the
compiler invisibly passes it by value, that does not happen.  So we
need to prevent modification.

>> All non-static fields of a value class must be final.

Stack-allocation and inlining of value-class instances:

A major reason for passing struct parameters by value is that they can
be stack-allocated, so we don't need heap allocation and GC.  For
performance we want to modify the Java implementation so that value
class instances are stack-allocated.  Specifically, given

    Point p = new Point(x, y);
    moveto(p);

or:

    moveto(new Point(x, y));

we want the compiler to stack-allocate the new Point.

>> The compiler can and should stack-allocate value instances.

This is simple enough once the compiler knows that Point is a "value
class", but there are a number of implications.

First, assume a method that takes a Point parameter p, and that p is
assigned to a field 'f' in some object 'o':

    o.f = p;

We cannot store an address in 'o', since the parameter 'p' is a value
on the stack, and it will become invalid as soon as the current method
returns.  Therefore, we actually have to copy the value of p into the
field f.

>> An object field whose type is a value class is implemented as a
   struct field, not a reference field.

More generally:

>> All assignment and passing of value class instances is done by
   structure copying, not reference copying.

Similarly, if we allow arrays of Point elements (which would be a more
extensive change than the current suggestion):

>> Elements of an array of a value class have to be structures.

It follows that instances of value classes do not have object
identity: if a value gets copied whenever it is accessed, then the
concept of a fixed identity has no meaning.

>> Equality of two value instances is defined in terms of
   field-by-field value comparison.

Using the 'new' operator implies creating a new object with its own
identity.  This would be misleading for value classes.

>> Value classes have no public constructors.  Instead, new values
   are created using static "factory" methods.

Conceptually, these factory methods "collapse" or "intern" values that
are equal.
This can be done by hashing on the values of the fields and keeping a
table of existing objects, just as the standard String intern method
does.  That way value identity and object identity would be the same,
as required for value types.  Of course the actual implementation, at
least in a value-class-aware Java implementation, need not do any
interning; instead it translates object equality to value
(field-for-field) equality.

Some proposals (paralleling C++ "plain old data types") prohibit
instance methods.  I don't think that such a restriction is needed.
We can have instance methods without a vtable pointer, as long as the
class is final.  In that case we can call instance methods directly,
without any run-time method lookup or vtable.  In essence we can treat
an instance method 'foo(args)' as syntactic sugar for
'static foo(Point this, args)'.  The 'this' argument can be passed by
reference or by value - it doesn't matter (in terms of semantics),
since all the fields are final.

>> Value classes may have both instance and static methods.

Is a value instance an Object?  Can you pass a Point to a method that
expects an Object?  Doing so would require that a Point contain a
vtable pointer.  Also, if we pass a stack-allocated Point to a method
that expects an Object, then at run-time we have a data pointer into
the stack.  This may cause problems and could confuse the GC,
especially if we allow collectable fields in value classes (as
discussed next).

>> (Tentatively) A value instance should not be converted to or
   coerced from Object.  (I.e. a value-class-aware compiler should
   reject code that does this.)  This may be revisited later.

What if a value class contains a field that points to collectable
data?  Could this complicate GC, given that value instances may live
on the stack?
This should not cause a problem, in that a parameter/variable/field
containing (say) two Object pointers should be implemented more-or-less
the same as two parameters/variables/fields that are plain Object
pointers.  One possible exception might be an unusual architecture
that passes structs in a funny place.  We might also have to modify
the reflection data to handle Object fields.

>> In the initial "release", value types cannot contain Object
   fields.  Only primitive types, other value types, and
   gnu.gcj.RawData (a non-gc'd 'void*' pointer) are allowed.  We may
   revisit this later.

We've talked about "value classes", but skirted the issue of how the
compiler can tell which classes are value classes.  There are various
options:

(1) Classes that satisfy the needed properties of a value class
    (including being final and having only final instance variables)
    are value classes.  This is difficult, because of the different
    semantics of value classes (lack of identity, possible lack of
    'isa'), and it seems difficult for the compiler to verify (at
    least locally) that the "value optimization" is safe.

(2) Use a special compiler flag to declare that a class is a value
    type.  I don't like this - if nothing else, it complicates
    Makefiles.

(3) Require value classes to inherit from some special class, like
    gnu.gcj.ValueObject.  The problem with this approach is that it
    makes it difficult to write Java code that is portable, but takes
    advantage of the "value optimization" when available (i.e. when
    using GCJ).

(4) Use some special declaration understood by GCJ that would get
    ignored by non-GCJ compilers.  For example we could use a special
    JavaDoc comment.  JDK 1.5 annotations may make sense.

My suggestion is for the compiler to test for the existence of a magic
static field name.  I suggest gnu$gcj$VALUE_CLASS.

>> A value class is distinguished by the existence of a static field
   named "gnu$gcj$VALUE_CLASS".

For performance this should be a final static primitive field
initialized to a constant.

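As a rough illustration of the conventions collected so far, the
following sketch checks a class against them using ordinary Java
reflection.  It is not how GCJ does or would do it (a compiler would
inspect the class file directly).  The names ValueClassCheck,
isValueClass, DemoPoint and MutablePoint are hypothetical and exist
only for this example.  Only three rules are checked (final class, no
public constructors, all non-static fields final) plus the presence of
the marker field.  The restriction on field types is not checked.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical class that follows the value-class conventions.
final class DemoPoint {
    static final boolean gnu$gcj$VALUE_CLASS = true;
    final float x, y;
    private DemoPoint(float x, float y) { this.x = x; this.y = y; }
    static DemoPoint make(float x, float y) { return new DemoPoint(x, y); }
}

// Hypothetical class that carries the marker but breaks the
// "all non-static fields are final" rule.
final class MutablePoint {
    static final boolean gnu$gcj$VALUE_CLASS = true;
    float x, y;
}

// Hypothetical checker; an assumption for illustration, not a GCJ API.
final class ValueClassCheck {
    static final String MARKER = "gnu$gcj$VALUE_CLASS";

    static boolean isValueClass(Class<?> c) {
        // A value class must be final.
        if (!Modifier.isFinal(c.getModifiers())) return false;
        // Value classes have no public constructors
        // (getConstructors returns only the public ones).
        if (c.getConstructors().length != 0) return false;
        boolean marked = false;
        for (Field f : c.getDeclaredFields()) {
            int mods = f.getModifiers();
            if (!Modifier.isStatic(mods)) {
                // All non-static fields must be final.
                if (!Modifier.isFinal(mods)) return false;
            } else if (f.getName().equals(MARKER)) {
                marked = true;
            }
        }
        return marked;
    }
}
```

With these definitions, ValueClassCheck.isValueClass(DemoPoint.class)
is true, while MutablePoint.class and String.class are both rejected.
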
Note that value classes are similar to the "un-boxed" struct types in
.Net / CLI.  We should look at this more closely, keeping in mind that
we may want to support C#/.Net in the future.

    public final class Point {
      // Marker field that identifies this class as a value class.
      static final boolean gnu$gcj$VALUE_CLASS = true;

      // Fields are final.
      public final float x, y;

      // No public constructor: instances come from the factory methods.
      private Point(float x, float y) {
        this.x = x;
        this.y = y;
      }

      public static Point make(float x, float y) {
        Point p = new Point(x, y);
        // Conceptually: p = intern(p);
        return p;
      }

      public static Point make(Point p) {
        return make(p.x, p.y);
      }

      // Field-by-field value equality.
      public static boolean equals(Point p1, Point p2) {
        return p1.x == p2.x && p1.y == p2.y;
      }

      public float getX() { return x; }
      public float getY() { return y; }
    }
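The "Conceptually: p = intern(p)" step can be made concrete for a
legacy (non-value-aware) Java implementation.  The sketch below is one
possible way to do it, not part of the proposal itself.  The class
name InternedPoint and the TABLE field are hypothetical.  It overrides
equals(Object) and hashCode so that a java.util.HashMap can serve as
the intern table, which is only acceptable here because a legacy
implementation treats the class as an ordinary Object.  It compares
floats by bit pattern, so unlike '==' it treats NaN as equal to itself
and 0.0f as different from -0.0f.  The table holds strong references,
so interned values are never collected.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interning variant of Point for a legacy implementation.
final class InternedPoint {
    static final boolean gnu$gcj$VALUE_CLASS = true;

    final float x, y;

    // Intern table: maps each distinct value to its canonical instance.
    private static final Map<InternedPoint, InternedPoint> TABLE =
        new HashMap<>();

    private InternedPoint(float x, float y) {
        this.x = x;
        this.y = y;
    }

    // Factory method: returns the canonical instance for (x, y), so
    // that equal values are also identical objects.
    static synchronized InternedPoint make(float x, float y) {
        InternedPoint p = new InternedPoint(x, y);
        InternedPoint existing = TABLE.putIfAbsent(p, p);
        return existing != null ? existing : p;
    }

    // Field-by-field comparison on the float bit patterns, kept
    // consistent with hashCode as HashMap requires.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof InternedPoint)) return false;
        InternedPoint q = (InternedPoint) o;
        return Float.floatToIntBits(x) == Float.floatToIntBits(q.x)
            && Float.floatToIntBits(y) == Float.floatToIntBits(q.y);
    }

    @Override
    public int hashCode() {
        return 31 * Float.floatToIntBits(x) + Float.floatToIntBits(y);
    }
}
```

With this factory, two calls to InternedPoint.make(1f, 2f) return the
same object, so '==' on references agrees with field-by-field
equality.  A value-class-aware implementation would skip the table
entirely and compare fields directly.
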