Thursday, November 29, 2012

Packed Objects in Java

Recently I spoke at EclipseCon Europe on some ideas for improving Java, citing the industry need for rapid innovation, and the business drivers which impact these priorities. Given the talk was co-sponsored by the OSGi community event, I focused specifically on how modularity can help drive improved performance, better software engineering discipline and some ideas on evolving Java and maintain compatibility by using versions to do more than just describe what’s in the box, essentially using versions to guide API evolution via JVM assisted mediation.

One of the charts I showed had a long “laundry list” of improvements – each topic a whole talk on its own which nicely triggered many fun technical discussions at the bar with various techies on their favorite wish list item.

For the low level types, I had listed a better Foreign Function Interface, better external memory access, packed structures for efficient memory use and I want these to be first class Java enhancements so the JDK libraries can benefit as well as end user applications.

And then I noticed this.

http://mechanical-sympathy.blogspot.co.uk/2012/10/compact-off-heap-structurestuples-in.html

It’s not the first time I’ve seen unsafe hackery used for performance and while it’s fun to see the abuse, and even though well done, it will be sensitive to processor architecture, memory and other hardware dependencies and the code will  even perform differently between releases of the same architectures. Maybe ok if you don’t care about portability and have the deep technical skills to deal with fragile code but not a great solution for wider adoption especially when the security code-review people find you (and they will).

The J9 VM team has been working on a generalized solution for more flexible data access which provides the user with more control over how objects will be layed out. You can find a high level overview from Marcel Mitran at http://www.slideshare.net/mmitran/ibm-java-packed-objects-mmit-20121120.  The goal is to improve density of data structures in Java, improve access to data not in Java heap format (usually external structures) and be able to define new data types beyond what Java’s object model allows today including value types if you also include read-only capabilities.

The main driver here has been our customers and their demanding applications,  most recently high performance big data analytics, virtualization, the need to support higher density cloud deployment and lightweight data structures needed to "pack" data to delivery better performance. Martins blog post triggers more discussions, specifically on the topic of reducing Java overhead.

While our "packed object" is somewhat related to the notion of "value types" (see JEP 169) there are some important differences that I will touch on as well. John Rose's blog post does an excellent job of introducing the implications and the advantages of having value types.

So, why have packed objects with more explicit control over layout ?

In Java, any non-primitive instance field/array element is a reference, which in a JVM implementation, implies a pointer to an object that has a header. Apart from the space overhead associated with the reference and the header, the resulting "pointer chasing" incurs data cache misses (e.g. in something as common as object serialization). In addition, as the layout of the Java objects is completely abstracted away from the application code, Java is inherently challenged when required to inter-operate with other 'native' data structures. Java typically requires copying and marshalling/boxing of native data structures into the Java heap before they can be manipulated. This can be cumbersome/awkward from an application perspective while also incurring significant overhead.

Value types can solve some of these issues (like packing new value types like complex), though it does not allow optimization of the "direct" native memory access scenario and my view is if you separate concerns between layout, packing and read-only attributes for objects, you can get all the functionality mentioned above, including the value type use case.

The solution to these problems involve giving the Java programmer the capability to specify how instances of a class should be treated by the JVM, and enable specific types of operations.

  1. Object inlining : Given a "packed" class A and an instance field in the class (say A.f) declared to be of "packed" class B, this would give the user the ability to specify the JVM allocate space corresponding to an instance of B (for A.f specifically) along with the space necessary for an instance of  A. This essentially gets rid of the header for B and co-locates the data access into a single object which improves performance.  For tightly coupled classes, where the two objects are always allocated together, this can be used for high performance data structures. 
  2. Object splitting : Given an instance of a "packed" class, this would give the user the ability to separate the "data" part of the instance from the "header" part where necessary, essentially allowing a user to describe external data via a Java definition. 
To benefit, the Java programmer would need to declare a class to be "packed" in their Java code thereby specifying that he would like the ability to "inline" an instance of that class into some other packed class instance (and/or to have other packed classes inlined into it).  For Arrays of packed objects, the header overhead is gone, which enables low overhead arrays of arbitrary types of objects,.

A user could also explicitly specify when they do not want an instance field to be inlined. Intuitively, object inlining allows data to be packed into an object (improving data locality) whereas object splitting allows an object to refer to its data even when it is not located immediately following the object's header (as it traditionally does in most JVMs); once the data is separated from the header it can really reside anywhere, including in native memory or inlined inside some other object. These two capabilities are thus key enablers for allowing "nested" packed fields (e.g. A.f in our earlier example). Because such nested fields have only their data inlined into the containing object, the JVM would need to (completely transparently to the user) create a packed object header (that points to the nested field's data) any time the nested packed field is read in the bytecodes. A reference to the packed object header still "looks" like a normal reference except that the data in the object is separated from the header and the data could be located anywhere in the address space.

We view this approach to be an intermediate step on the evolutionary path to introducing full blown value types in the Java language; yet it is successful at solving some important performance problems. One key difference from full blown value types is that the "packing" is only for data accessed via "nested" fields (namely, inlined instance fields or array elements) in objects on the Java heap; this should not be confused with the fact that the "data" itself can reside anywhere in the address space. In particular, declaring a local variable or parameter to be of a packed class type does not necessarily result in the data being "packed" on the stack. Local variables and parameters are "references" regardless of their declared type. This also means that no change in behaviour is required at the astore/aload, getstatic/putstatic and call/return bytecodes in the JVM. Immutability is another property that is typically implied when one discusses traditional value types; however, immutability is orthogonal to the solution, and would prevent some use cases such as read/write native memory structures. In allowing packed classes to optionally have an immutability property, we would enable evolution to value types. As immutability is not mandatory, common operations on traditional value types like equals, hashCode based solely on the field values is not required and can be supplied by the user if desired.

In this approach, a "packed object" lies somewhere in between a traditional object and a value type instance in its properties; since packed object references are synthesized transparently by the JVM, it is also free to share such references (or not) as appropriate. This means that the semantics of common operations on traditional object references like identity, hashCode, synchronization etc. are not well defined for packed object references because there may not be a reference pointer to each and every inlined instance and (there may be sensible ways to implement these methods in terms of the address where the data resides, rather than depending on the short lived synthesized references).

Note that this approach is different when compared with C#'s value types which overloads the meaning of the assignment operator to implicitly perform copy-by-reference, copy-by-value, or boxing. The latter does not align well with the evolution of Java which has steered away from operator overloading; this proposal provides consistent definition of the '=' operator as assignment-by-reference. It is suggested that assignment-by-value semantics should be made explicit through the introduction of new operators such as ':=' if this is accepted as a Java language change. Other necessary language changes might be around packed array allocations and accesses to explicitly specify to the JVM when a packed array is in use and avoid confusion with traditional (i.e. reference) arrays.

Feedback is most welcome.

4 Comments:

Blogger Martin Thompson said...

It is good to see parallel work on this subject which suggests a strong demand. Here is an alternative that could be implemented as an intrinsic without a language change.

https://github.com/mjpt777/examples/blob/master/src/java/uk/co/real_logic/intrinsics/StructuredArray.java

Some considerations that have occurred to a few of us working on this are:

1. Liveness of the container when an element/packed object is referenced.
2. How will constructors work on arrays?
3. Copy semantics, especially in the presence of final fields.

I do not believe an approach with purely immutable objects is useful for advanced data structures. Check out the implementation of the C# Dictionary for how to greatly beat Java HashMap performance using arrays of structures internally. I've measured a 10X difference on 2GB data sets.

11:02 PM  
Blogger John Duimovich said...

Thanks for your interest.

Your alternative is interesting. The main difference I saw is that PackedObjects can support primitive types, while StructuredArray will need to box them. I have seen similar approaches in libraries like the Trove collections which support primitive types with multiple specialized classes, basically as a way to get around limits in Java generics and pesky primitive types.

As to the questions ... I'd be interested in your answers too but here goes...

1) any internal element references or internal references will keep the container alive. That is of course unless the object is discovered to never escape a stack scope and thus torn apart and turned into locals :)

2) arrays themselves are initialized normally (empty) just like int[] so I assume your question is what happens on individual slots being filled. The issue is whether (or how) you would call individual constructors on essentially an "already allocated" object aka the inner object. My view is you need to separate initialize from create and call the init individually per slot or change spec to allow the proper constructor to be called on the slot to fill it in one at a time.

3) For final - you need to be precise in which field is final. If you mean the reference to an object inlined into the container, you would either disallow it or more likely ignore final fields as a simple solution since the case where a final reference was declared in an object - the object contents are still read/write, so an inlined version of that object makes the final a no-op. If you do allow inlined objects with final fields - my view is you need to allow "all or none" overwriting ... the consistency of the object needs to be maintained.

Your last point on immutable objects. Yes, correct and that's our approach in designing PackedObjects.

5:27 PM  
Blogger Ashwin Jayaprakash said...

Javolution's Struct emulates C's Struct (and Union), without auto-GC of course. It's a simple, type safe ByteBuffer wrapper.

1:17 PM  
Blogger John Duimovich said...

Ashwin,

If you only want C/C++ struct wrappers - the Structs approach is pretty good.

Our proposal for packed objects supports multiple cases
1) arrays of objects with the referenced objects inlined (headerless) - this saves lots of space and offers excellent reference locality for performance
2) struct layout for external references useful in foreign function interfaces (FFI)

Essentially a unified on-heap and off heap use case withing Java and as you note, supports GC.

Lots of discussion still to be had around the design choices to be made, how to integrate into current java, fun syntax choices and how this relates use cases like value types.

Thanks for the comment.

3:30 PM  

<< Home