Last updated 2004-06-30 by Roedy Green ©1996-2004 Canadian Mind Products
Java has no direct way of writing a complete binary object to a file, or of sending it over a communications channel. The object has to be taken apart with application code and sent as a series of primitives, then reassembled at the other end. Serialized objects contain the data but not the code for the class methods. Things get most complicated when there are references to other objects inside each object. Starting with JDK 1.1 there is a scheme called serialization.
Serialisation works by depth first recursion. This manages to avoid any forward references in the object stream. Referenced objects are embedded in the middle of the referencing object. There are also backward references, encoded as 71 00 7E xx xx, where 71 is the TC_REFERENCE tag and 00 7E xx xx is the handle: 0x7E0000 plus the number of the object, counted in order of first appearance in the stream.
While the lack of forward references simplifies decoding, the problem with this scheme is, you can overflow the stack if, for example, you serialized the head of a linked list with 1000 elements. Recursion requires about 50 times as much RAM stack space as the objects you are serialising. Another problem is there are no markers in the stream to warn of user-defined object formats. This means you can't use general purpose tools to examine streams. Tools would have to know the private formats, even to read the standard parts.
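One common way around the stack problem is to mark the links transient and write the elements out in a flat loop from a custom writeObject. The sketch below assumes a hypothetical Chain class with an inner Node type; both names, and the helper methods, are invented for illustration and are not part of any library.

```java
import java.io.*;

/** Hypothetical linked list that writes its nodes in a loop instead of by recursion. */
class Chain implements Serializable {
    private static final long serialVersionUID = 1L;

    /** Deliberately not Serializable: default serialisation never walks the links. */
    private static final class Node {
        final int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    private transient Node head;
    private transient int size;

    void push(int value) {
        Node n = new Node(value);
        n.next = head;
        head = n;
        size++;
    }

    int size() { return size; }

    long sum() {
        long total = 0;
        for (Node n = head; n != null; n = n.next) total += n.value;
        return total;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();              // no non-transient fields in this sketch
        out.writeInt(size);
        for (Node n = head; n != null; n = n.next) {
            out.writeInt(n.value);             // flat loop, constant stack depth
        }
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int count = in.readInt();
        Node tail = null;
        for (int i = 0; i < count; i++) {      // rebuild the chain in the original order
            Node n = new Node(in.readInt());
            if (tail == null) head = n; else tail.next = n;
            tail = n;
            size++;
        }
    }

    /** Serialises to memory and reads the copy back. */
    static Chain roundTrip(Chain original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Chain) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Chain chain = new Chain();
        for (int i = 0; i < 100_000; i++) chain.push(i);
        Chain copy = roundTrip(chain);
        System.out.println(copy.size() + " elements, sum " + copy.sum());
    }
}
```

Because the nodes never reach the default mechanism, the recursion depth no longer grows with the length of the list.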
If your object A references C, and B also references C, and you write out both A and B, there will be only one copy of C in the object stream, even if C changed between the writeObject calls to write out A and B. You have to use the sledgehammer ObjectOutputStream.reset() which discards all knowledge of the previous stream output (including Class descriptions) to ensure a second copy of C.
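A minimal sketch of the stale-copy behaviour just described. It uses a hypothetical Counter class in place of C and writes it directly rather than through A and B; the class and method names are invented for illustration.

```java
import java.io.*;
import java.util.Arrays;

class StaleCopyDemo {
    /** Stands in for the shared object C. */
    static class Counter implements Serializable {
        private static final long serialVersionUID = 1L;
        int value;
    }

    /** Writes c, changes it, writes it again, and returns the two values read back. */
    static int[] writeTwice(boolean resetBetween) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        Counter c = new Counter();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            c.value = 1;
            out.writeObject(c);
            c.value = 2;                       // change made after the first write
            if (resetBetween) out.reset();     // forget everything written so far
            out.writeObject(c);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            Counter first = (Counter) in.readObject();
            Counter second = (Counter) in.readObject();
            return new int[] { first.value, second.value };
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("no reset:   " + Arrays.toString(writeTwice(false)));
        System.out.println("with reset: " + Arrays.toString(writeTwice(true)));
    }
}
```

Without the reset, the second writeObject emits only a back reference, so the change to value is lost and both reads see 1. With the reset, the reads see 1 and then 2. Since JDK 1.4 there is also ObjectOutputStream.writeUnshared, which forces a fresh copy of the top-level object only.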
Happily, serialization of ArrayLists is clever. They take only a few bytes more than the equivalent array. It does not bother to serialise the empty slots at the end.
implements java.io.Serializable
Note the American spelling of Serialisable, substituting a z for the s!
You don't need to write any methods. Serializable is just a dummy marker interface that turns on serializability.
You don't have to write a readObject or writeObject method, but if you do, you still need the implements java.io.Serializable.
Don't confuse the custom readObject method of your class with the ObjectInputStream.readObject method you use to read a whole tree of objects.
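The points above can be sketched with two hypothetical classes, Point (marked Serializable) and Plain (not marked). Neither has a custom readObject or writeObject; the only readObject in sight is the one on ObjectInputStream. All names here are invented for illustration.

```java
import java.io.*;

class MarkerDemo {
    /** No extra methods: implementing the marker interface is all it takes. */
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        private final int x;
        private final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
        @Override public String toString() { return "(" + x + "," + y + ")"; }
    }

    /** Same idea, but without the marker interface. */
    static class Plain {
        int x;
    }

    /** Uses the stream-level writeObject and readObject to copy a whole object tree. */
    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(new Point(3, 4)));
        try {
            roundTrip(new Plain());
        } catch (NotSerializableException e) {
            // Thrown because Plain lacks the marker interface.
            System.out.println("not serializable: " + e.getMessage());
        }
    }
}
```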
You might wonder how serialisation manages to get at the non-transient private members via reflection. It uses AccessController.doPrivileged() to override the general security privileges.
static final long serialVersionUID = 3L;
This must change if any characteristics of the pickled object change. If you don't handle it manually, Java will assign one by hashing the class name, its interfaces and its field and method signatures. It will thus change whenever you make a very minor change, such as adding a method, that may not actually affect the pickled objects. This will make it more difficult to restore old object streams.
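To see which version number a class actually carries, you can ask java.io.ObjectStreamClass. The sketch below uses two hypothetical classes, Pinned and Unpinned, invented for illustration; the value computed for Unpinned depends on the class's shape, so no particular number is shown for it.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class VersionDemo {
    /** Version number handled manually. */
    static class Pinned implements Serializable {
        static final long serialVersionUID = 3L;
        String name;
    }

    /** No serialVersionUID declared: the runtime computes one from the class's shape. */
    static class Unpinned implements Serializable {
        String name;
    }

    /** Returns the version number the serialisation runtime will use for the class. */
    static long uidOf(Class<?> c) {
        return ObjectStreamClass.lookup(c).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Pinned:   " + uidOf(Pinned.class));
        System.out.println("Unpinned: " + uidOf(Unpinned.class));
    }
}
```

The JDK's serialver tool reports the same numbers from the command line.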
If you have a reference to a non-serializable object, you have no choice but to make it transient. You will have to figure out some way to reconstitute the reference in a custom readObject method.
/**
 * deserialise and ensure fields are re-interned when read back
 *
 * @param stream ObjectInputStream of objects
 *
 * @exception IOException
 * @exception ClassNotFoundException
 */
private void readObject ( ObjectInputStream stream ) throws IOException, ClassNotFoundException
    {
    stream.defaultReadObject();
    // reintern all Strings in the object
    name = name.intern();
    extension = extension.intern();
    }
You need implements java.io.Serializable on the class you are doing a writeObject on. Since writeObject also writes out all the objects pointed to by that object, and the objects those objects point to, ad infinitum, all those classes too must be marked implements Serializable. Any references to non-serializable classes must be marked transient. While you are at it, give each of those classes an explicit version number.
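A minimal sketch of the transient technique, assuming a hypothetical Message class that holds a java.util.zip.CRC32 (a real JDK class that is not Serializable). The reference is marked transient and rebuilt from the saved text in a custom readObject. The class and method names are invented for illustration.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

class TransientDemo {
    static class Message implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String text;
        private transient CRC32 crc;           // CRC32 is not Serializable, so it must be transient

        Message(String text) {
            this.text = text;
            this.crc = checksum(text);
        }

        private static CRC32 checksum(String s) {
            CRC32 c = new CRC32();
            c.update(s.getBytes(StandardCharsets.UTF_8));
            return c;
        }

        long crcValue() { return crc.getValue(); }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();            // restores text
            crc = checksum(text);              // reconstitute the transient reference
        }
    }

    static Message roundTrip(Message m) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(m);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Message) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Message original = new Message("hello");
        Message copy = roundTrip(original);
        System.out.println(original.crcValue() == copy.crcValue());
    }
}
```

Without the custom readObject, crc would come back null and crcValue would throw a NullPointerException.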
You can't serialise Images and send them via RMI to another platform, because Images are platform specific. You need to convert your Image to a platform independent format. You can use the JAI API or you can write a class with ints only and use a PixelGrabber to create an int array representation of that Image (you also need the height and width). Then you can send the int[] representation of the class over the ObjectStream and cast it back at the destination. Then use createImage from the java.awt.Toolkit on a MemoryImageSource to recreate the Image data type.
Serialisation, or serialization in American, is Java's way of providing persistent objects, or transmitting objects over a wire (in conjunction with RMI). People like to concoct flavourful terminology to describe the saving (pickling, freeze drying, swizzling) and restoring (depickling, deswizzling, reconstituting) processes.
In theory all you have to do is save an object and all its dependent objects will automatically go with it. However, there are many pitfalls; see the Java Gotchas. See also the serialization entry in the Java & Internet Glossary.
There is now an essay on the Sun Site about serialisation.
When you read in a stream, then, Serialization has to keep a map of all read-in objects, relating them to the "handle" numbers, so that when a given handle number is later encountered a reference to the proper object can be substituted, thus creating a valid newly reconstituted object.
Serialization has no way of knowing that object number 13 in your stream is never referenced again anyplace in the stream, so of course it has to keep everything in that map (which is ever-increasing in size!) forever!
Unless...
Unless you call the "reset" method on the stream. In which case everything starts all over again. (Object numbers restart from zero, etc., etc.)
"Wow!" you say, "what a simple solution." Yes, but...
Once you do a "reset", none of the objects previously written will be "known" to the stream, so once again the first reference to a given object will cause its data to be written to the stream. "Well, what's wrong with that?"
Answer: When you then read that stream, and the "reset" is seen (a special code in the stream), then all knowledge of already-read objects is lost and... yep, you guessed it: You'll read the same object again!!! If you aren't prepared for this and you don't program accordingly, the results can be disastrous.
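The effect on the reading side can be sketched as follows. The same array is written twice, with or without a reset in between, and the reader checks whether it got one object or two. The class and method names are hypothetical.

```java
import java.io.*;

class ResetIdentityDemo {
    /** Returns true if both reads yield the very same object. */
    static boolean sameObjectReadBack(boolean resetBetween) throws IOException, ClassNotFoundException {
        int[] data = { 1, 2, 3 };
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(data);
            if (resetBetween) out.reset();     // handle table cleared on both sides
            out.writeObject(data);             // same object, written a second time
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            Object first = in.readObject();
            Object second = in.readObject();
            return first == second;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("no reset, same object:   " + sameObjectReadBack(false));
        System.out.println("with reset, same object: " + sameObjectReadBack(true));
    }
}
```

Without the reset the reader hands back one shared object. With it, the reader builds a second, separate copy, so any code that relies on identity (==) across the reset point has to be prepared for duplicates.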
There is another negative consequence of doing "reset". The first time any class is written (or the first time after a "reset"), an incredible amount of junk that describes that class is written to the stream.
If you will only be serializing a handful of classes, and if you only need to do a "reset" every few hundred kilobytes, then this overhead isn't too onerous. But if you need to do a reset after every small group of objects, and if nearly every object in the group is a different type, then this overhead will bite you. (Note that even predefined system types, such as java.lang.Integer, must be "fully described" in the stream.)
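You can measure this overhead yourself. The sketch below writes three java.lang.Integer objects and records how many bytes each write adds to the stream, with and without a reset between writes. The class and method names are hypothetical, and the exact byte counts may vary between JDK versions, so none are quoted here.

```java
import java.io.*;
import java.util.Arrays;

class DescriptorOverheadDemo {
    /** Returns the number of bytes each of three Integer writes adds to the stream. */
    static int[] bytesPerWrite(boolean resetBetween) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        int[] added = new int[3];
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.flush();                       // push the stream header out before measuring
            int before = bytes.size();
            for (int i = 0; i < added.length; i++) {
                if (resetBetween && i > 0) out.reset();
                out.writeObject(Integer.valueOf(1000 + i));
                out.flush();
                added[i] = bytes.size() - before;
                before = bytes.size();
            }
        }
        return added;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("no reset:   " + Arrays.toString(bytesPerWrite(false)));
        System.out.println("with reset: " + Arrays.toString(bytesPerWrite(true)));
    }
}
```

Without a reset, only the first write pays for the class descriptions of Integer and its superclass Number; later writes refer back to them. With a reset before each write, every Integer pays the full price again.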
So what's the solution, if "reset" isn't appropriate to your needs? Dump Serialization. It's slow and clumsy and has a lot of overhead. But that may not be viable if you really do depend on its ability to maintain object references in large networks of objects (sometimes called "pickling" or "swizzling" and "depickling" or "deswizzling"). On the other hand, if you are simply sending pure numeric and textual data back and forth--if connections between objects are uninteresting to you--then do consider "rolling your own" instead of using Serialization.
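For the pure-data case, rolling your own can be as simple as the sketch below, which uses java.io.DataOutputStream and DataInputStream. The record layout (a name, a quantity and a price) and the class and method names are assumptions made up for illustration.

```java
import java.io.*;

class RollYourOwnDemo {
    /** Writes one record as raw primitives: no class descriptions, no handles. */
    static byte[] write(String name, int quantity, double price) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(name);                // 2-byte length, then modified UTF-8
            out.writeInt(quantity);            // 4 bytes
            out.writeDouble(price);            // 8 bytes
        }
        return bytes.toByteArray();
    }

    /** Reads the fields back in exactly the order they were written. */
    static String read(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String name = in.readUTF();
            int quantity = in.readInt();
            double price = in.readDouble();
            return name + " x" + quantity + " @ " + price;
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = write("widget", 3, 2.5);
        System.out.println(data.length + " bytes: " + read(data));
    }
}
```

The price of this compactness is that reader and writer must agree on the field order by hand, and references between objects are not preserved.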
// The fundamental asymmetry

// write
stream.writeObject( obj ); // ok
// read
obj = stream.readObject(); // ok

// write
stream.writeObject( this ); // ok
// read
this = stream.readObject(); // illegal

You can write out the current object, but you can't read it back. All you can do is read back creating some other object, then copy the fields into this object.
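The read-then-copy workaround might look like the sketch below. Settings is a hypothetical class, and save and loadFrom are names invented for illustration.

```java
import java.io.*;

/** Hypothetical settings object that can refresh itself from a serialised copy. */
class Settings implements Serializable {
    private static final long serialVersionUID = 1L;
    String user;
    int fontSize;

    /** Writing this is legal. */
    byte[] save() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(this);
        }
        return bytes.toByteArray();
    }

    /** Assigning to this is not, so read a temporary object and copy its fields across. */
    void loadFrom(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            Settings temp = (Settings) in.readObject();
            this.user = temp.user;
            this.fontSize = temp.fontSize;
        }
    }

    public static void main(String[] args) throws Exception {
        Settings saved = new Settings();
        saved.user = "alice";
        saved.fontSize = 14;
        byte[] data = saved.save();

        Settings live = new Settings();
        live.loadFrom(data);
        System.out.println(live.user + " " + live.fontSize);
    }
}
```

The field-by-field copy has to be kept in step with the class by hand, which is the cost of the asymmetry.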
If the objects are actually identical, e.g. you just added another method to the class, you can manually give both classes a version ID of the form:
static final long serialVersionUID = 3L;
If you don't provide such an ID, one is automatically generated for you by hashing together the class name, its interfaces and its field and method signatures.
If the objects are just a little bit different, e.g. there is a new field, you can also use the manual version number method. I don't recall the precise details, but under some circumstances the serial loader won't mind minor differences. It just zeros out new fields and drops unused ones. Keep in mind the serial loader does not use your constructor! You can't count on it to do any initialisation of transient fields, especially the new ones.
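The point about the constructor can be checked with a small experiment. Session below is a hypothetical class whose transient fields are set up by an initialiser and by the constructor; all names are invented for illustration.

```java
import java.io.*;
import java.util.ArrayList;
import java.util.List;

class ConstructorSkippedDemo {
    static class Session implements Serializable {
        private static final long serialVersionUID = 1L;
        String user;
        transient int hits = 42;               // initialiser runs only as part of construction
        transient List<String> log;

        Session(String user) {
            this.user = user;
            this.log = new ArrayList<>();
        }
    }

    static Session roundTrip(Session s) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(s);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Session) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Session copy = roundTrip(new Session("alice"));
        System.out.println("user=" + copy.user + " hits=" + copy.hits + " log=" + copy.log);
    }
}
```

After the round trip user is restored, but hits is 0 and log is null, because neither the constructor nor the field initialiser ran. A custom readObject is the place to put such fields back.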