Jun 26, 2015

Serialization: Part-1

Serialization is the process of converting a set of object instances (that may contain references to each other) into a linear stream of bytes. This stream of bytes can be sent through a socket, stored in a file, or simply manipulated as a stream of data. It can even be transferred between different JVMs. RMI uses this mechanism, but let's not dig into what RMI is here. We will focus on the various aspects of serialization: what data gets serialized, how it's done, some use cases of serialization, and so on.

Some uses of Serialization
Network 
Serialization is much easier than writing message formats for each type of message that is to be passed between a client/server pair. Just send the common objects back and forth and you're done. Designing messaging protocols is notoriously finicky. Once again Java saves the day and makes a previously arduous task a simple matter of a single method call.
Serialization over networks is the basis of Sun's communications infrastructure: RMI uses it, as do many of the communications subsystems in Jini.
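
To make the "just send the objects back and forth" idea concrete, here is a minimal sketch that pushes one object through a loopback socket inside a single JVM. The Greeting class, the sendOverLoopback( ) helper and the use of an ephemeral port are all assumptions made for illustration; a real client/server pair would live in separate programs and need proper error handling.

```java
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SocketSerializationDemo {

    // Hypothetical message type; any Serializable class would do.
    static class Greeting implements Serializable {
        private static final long serialVersionUID = 1L;
        final String text;
        final int sequence;

        Greeting(String text, int sequence) {
            this.text = text;
            this.sequence = sequence;
        }
    }

    // Sends one object over a loopback connection and returns what the receiver read.
    static Object sendOverLoopback(Serializable message) throws Exception {
        ExecutorService receiverThread = Executors.newSingleThreadExecutor();
        // Port 0 asks the operating system for any free port.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // Receiver side: accept one connection and read one object from it.
            Future<Object> received = receiverThread.submit(() -> {
                try (Socket connection = server.accept();
                     ObjectInputStream in = new ObjectInputStream(connection.getInputStream())) {
                    return in.readObject();
                }
            });
            // Sender side: connect and write the object with a single method call.
            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
                 ObjectOutputStream out = new ObjectOutputStream(client.getOutputStream())) {
                out.writeObject(message);
                out.flush();
            }
            return received.get();
        } finally {
            receiverThread.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Greeting copy = (Greeting) sendOverLoopback(new Greeting("hello", 1));
        System.out.println(copy.text + " #" + copy.sequence);
    }
}
```

Notice that no message format was designed anywhere: the sender calls writeObject( ) and the receiver calls readObject( ). Both sides do need the Greeting class on their classpath.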

File I/O 
As we stated before, an object may be serialized to any Java I/O byte stream class. So it's trivial to serialize an object or a whole hierarchy of them to a disk file. We use this feature of the language for two reasons:
1. Saving an application program's state for the next time it's started.
2. Caching objects on a disk before they're needed. These objects are complex hierarchies that are created from database access and extensive algorithmic processing, and thus must be created ahead of time.


RDBMS 
Another use of serialization is to store complete objects as BLOBs in a relational database. This can provide a way to use traditional databases with an object-oriented programming language without needing to move to a true object-oriented database.

Using Serialization


Serialization is a mechanism built into the core Java libraries for writing a graph of objects into a stream of data. This stream of data can then be programmatically manipulated, and a deep copy of the objects can be made by reversing the process. This reversal is often called deserialization. In particular, there are three main uses of serialization:

As a persistence mechanism
If the stream being used is FileOutputStream, then the data will automatically be written to a file.

As a copy mechanism
If the stream being used is ByteArrayOutputStream, then the data will be written to a byte array in memory. This byte array can then be used to create duplicates of the original objects.

As a communication mechanism
If the stream being used comes from a socket, then the data will automatically be sent over the wire to the receiving socket, at which point another program will decide what to do.
The important thing to note is that the use of serialization is independent of the serialization algorithm itself. If we have a serializable class, we can save it to a file or make a copy of it simply by changing the way we use the output of the serialization mechanism.
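
As a rough illustration of the copy mechanism, the sketch below serializes an object graph into a ByteArrayOutputStream and reads it straight back. The Node class and the deepCopy( ) helper are hypothetical names chosen for this example; the two nodes deliberately refer to each other to show that such references survive the trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class DeepCopyDemo {

    // Hypothetical class whose instances can point at each other.
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        Node next;

        Node(String name) {
            this.name = name;
        }
    }

    // Serializes the graph rooted at 'original' into memory, then deserializes it.
    static <T extends Serializable> T deepCopy(T original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    public static void main(String[] args) throws Exception {
        Node a = new Node("a");
        Node b = new Node("b");
        a.next = b;
        b.next = a; // a and b refer to each other

        Node copy = deepCopy(a);
        System.out.println("different instance: " + (copy != a));
        System.out.println("cycle preserved:    " + (copy.next.next == copy));
    }
}
```

Swapping the ByteArrayOutputStream for a FileOutputStream or a socket's output stream would turn the same writeObject( ) call into persistence or communication, which is the independence described above.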

As you might expect, serialization is implemented using a pair of streams. Even though the code that underlies serialization is quite complex, the way you invoke it is designed to make serialization as transparent as possible to Java developers. To serialize an object, create an instance of ObjectOutputStream and call the writeObject( ) method; to read in a serialized object, create an instance of ObjectInputStream and call the readObject( ) method.
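
Here is a small sketch of that pair of streams working together: one object is written to a temporary file and then read back. The roundTrip( ) helper and the temporary file name are assumptions for the example, not anything prescribed by the API.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;

class FileRoundTripDemo {

    // Writes 'value' to a temporary file and reads it back as a new object.
    static Object roundTrip(Serializable value) throws IOException, ClassNotFoundException {
        File file = File.createTempFile("serialization-demo", ".ser");
        try {
            // Writing side: ObjectOutputStream + writeObject()
            try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
                out.writeObject(value);
            }
            // Reading side: ObjectInputStream + readObject()
            try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
                return in.readObject();
            }
        } finally {
            file.delete(); // clean up the temporary file
        }
    }

    public static void main(String[] args) throws Exception {
        ArrayList<String> names = new ArrayList<>(Arrays.asList("alice", "bob"));
        Object restored = roundTrip(names);
        System.out.println("equal content:  " + names.equals(restored));
        System.out.println("same instance:  " + (names == restored));
    }
}
```

readObject( ) returns Object, so the caller normally casts the result to the type it expects.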

ObjectOutputStream
ObjectOutputStream, defined in the java.io package, is a stream that implements the "writing-out" part of the serialization algorithm. The methods implemented by ObjectOutputStream can be grouped into three categories: methods that write information to the stream, methods used to control the stream's behavior, and methods used to customize the serialization algorithm. RMI actually uses a subclass of ObjectOutputStream to customize its behavior.

The "write" methods
public void write(byte[] b);   
public void write(byte[] b, int off, int len);   
public void write(int data);   
public void writeByte(int data);   
public void writeBoolean(boolean data);   
public void writeChar(int data);   
public void writeBytes(String data);   
public void writeChars(String data);   
public void writeFloat(float data);   
public void writeDouble(double data);   
public void writeFields( );   
public void writeInt(int data);   
public void writeUTF(String s);   
public void writeLong(long data);   
public void writeObject(Object obj);   
public void writeShort(int data);   
public void defaultWriteObject( );   

For the most part, these methods should seem familiar. writeFloat( ), for example, takes a floating-point number and encodes the number as four bytes. There are, however, two new methods here: writeObject() and defaultWriteObject( ).
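
If you want to see the four bytes for yourself, one way is to compare the size of a stream holding one float with the size of a stream holding two. This is only a sketch and streamSize( ) is a made-up helper; the comparison is needed because an ObjectOutputStream also writes its own header and framing bytes, so the total size is larger than the raw data.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

class PrimitiveSizeDemo {

    // Returns the total number of bytes produced when the given floats are written.
    static int streamSize(float... values) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            for (float value : values) {
                out.writeFloat(value);
            }
        } // closing the stream flushes any buffered data into 'buffer'
        return buffer.size();
    }

    public static void main(String[] args) throws IOException {
        int oneFloat = streamSize(1.0f);
        int twoFloats = streamSize(1.0f, 2.0f);
        // The difference is the cost of the second float alone.
        System.out.println("bytes added by one more float: " + (twoFloats - oneFloat));
    }
}
```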

writeObject( ) serializes an object. In fact, writeObject( ) is often the instrument of the serialization mechanism itself. In the simplest and most common case, serializing an object involves doing two things: creating an ObjectOutputStream and calling writeObject( ) with a single "top-level" instance. The following code snippet shows the entire process, storing an object and all the objects to which it refers into a file:
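// Needs java.io.FileOutputStream and java.io.ObjectOutputStream; both streams are closed when the block exits.
try (FileOutputStream underlyingStream = new FileOutputStream("C:\\temp\\test");
     ObjectOutputStream serializer = new ObjectOutputStream(underlyingStream)) {
    serializer.writeObject(serializableObject);
}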
Of course, this works seamlessly with the other methods for writing data. That is, if you wanted to write two floats, a String, and an object to a file, you could do so with the following code snippet:
try (FileOutputStream underlyingStream = new FileOutputStream("C:\\temp\\test");
     ObjectOutputStream serializer = new ObjectOutputStream(underlyingStream)) {
    serializer.writeUTF(aString);
    serializer.writeFloat(firstFloat);
    serializer.writeFloat(secondFloat);
    serializer.writeObject(serializableObject);
}
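
Reading such a stream back means calling the matching "read" methods of ObjectInputStream in exactly the order the values were written. The sketch below assumes an in-memory buffer instead of the file used above, and writeThenRead( ) is a hypothetical helper that simply returns the four values it read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class MixedStreamDemo {

    // Writes a String, two floats and an object, then reads them back in the same order.
    static Object[] writeThenRead(String aString, float firstFloat, float secondFloat,
                                  Serializable serializableObject)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream serializer = new ObjectOutputStream(buffer)) {
            serializer.writeUTF(aString);
            serializer.writeFloat(firstFloat);
            serializer.writeFloat(secondFloat);
            serializer.writeObject(serializableObject);
        }
        try (ObjectInputStream deserializer =
                 new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            // The read order must mirror the write order above.
            String readString = deserializer.readUTF();
            float readFirst = deserializer.readFloat();
            float readSecond = deserializer.readFloat();
            Object readObject = deserializer.readObject();
            return new Object[] { readString, readFirst, readSecond, readObject };
        }
    }

    public static void main(String[] args) throws Exception {
        Object[] values = writeThenRead("hello", 1.5f, 2.5f, Integer.valueOf(42));
        for (Object value : values) {
            System.out.println(value);
        }
    }
}
```
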
The other new "write" method is defaultWriteObject(). defaultWriteObject( ) makes it much easier to customize how instances of a single class are serialized. However, defaultWriteObject( ) has some strange restrictions placed on when it can be called. Here's what the documentation says about defaultWriteObject( ): 
Write the nonstatic and nontransient fields of the current class to this stream. This may only be called from the writeObject method of the class being serialized. It will throw the NotActiveException if it is called otherwise. 

That is, defaultWriteObject( ) is a method that works only when it is called from one specific method, the writeObject( ) method of the class being serialized, while an instance of that class is being written. Since defaultWriteObject( ) is useful only when you are customizing the information stored for a particular class, this turns out to be a reasonable restriction. We'll talk more about defaultWriteObject( ) later, when we discuss how to make a class serializable.
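
As a small preview, here is a sketch of both sides of that restriction. The Temperature class, its fields and its custom encoding are invented for this example. Inside the class's private writeObject( ) method the call to defaultWriteObject( ) succeeds; called directly on a fresh stream it throws NotActiveException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotActiveException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class DefaultWriteObjectDemo {

    // Hypothetical class that customizes how its instances are serialized.
    static class Temperature implements Serializable {
        private static final long serialVersionUID = 1L;
        String sensorName;        // handled by defaultWriteObject()
        transient double celsius; // skipped by the default mechanism

        Temperature(String sensorName, double celsius) {
            this.sensorName = sensorName;
            this.celsius = celsius;
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();                       // nonstatic, nontransient fields
            out.writeInt((int) Math.round(celsius * 100));  // custom data: hundredths of a degree
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();          // restores sensorName
            celsius = in.readInt() / 100.0;  // restores the custom data
        }
    }

    // Serializes and deserializes a Temperature in memory.
    static Temperature roundTrip(Temperature original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Temperature) in.readObject();
        }
    }

    // Returns true if calling defaultWriteObject() outside writeObject() is rejected.
    static boolean failsOutsideWriteObject() throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.defaultWriteObject();
            return false;
        } catch (NotActiveException expected) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        Temperature copy = roundTrip(new Temperature("attic", 21.5));
        System.out.println(copy.sensorName + ": " + copy.celsius);
        System.out.println("rejected outside writeObject: " + failsOutsideWriteObject());
    }
}
```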

Continue to Part-2
