Practical Series, Performance Chapter: Serialization

 

 

Serialization scheme

  1. Java RMI uses Java serialization
  2. Spring Cloud uses JSON serialization
  3. Although Dubbo is compatible with Java serialization, Hessian serialization is used by default.

Java serialization

Principle

 

Serializable

  1. The JDK provides the input stream ObjectInputStream and the output stream ObjectOutputStream.
  2. They can only serialize and deserialize objects whose classes implement the Serializable interface.
// Only objects whose classes implement Serializable can be serialized; writing a plain Object throws
// java.io.NotSerializableException: java.lang.Object
ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(FILE_PATH));
oos.writeObject(new Object());
oos.close();
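
The rule above can be tried out with a full round trip. The following is a minimal sketch, assuming a hypothetical `Point` class and an in-memory `ByteArrayOutputStream` in place of the file at `FILE_PATH`; neither name comes from the original example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializableRoundTrip {
    // Hypothetical example class, not part of the original article.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Serializes any object into a byte array.
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        return bos.toByteArray();
    }

    // Reads a single object back from a byte array.
    static Object deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        }
    }

    // Returns true if writing the object fails with NotSerializableException.
    static boolean failsToSerialize(Object obj) throws IOException {
        try {
            serialize(obj);
            return false;
        } catch (NotSerializableException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        Point copy = (Point) deserialize(serialize(new Point(3, 4)));
        System.out.println(copy.x + "," + copy.y);               // 3,4
        System.out.println(failsToSerialize(new Object()));      // true
    }
}
```

The try-with-resources blocks close the streams even when an exception is thrown, which the shorter snippets in this article leave out for brevity.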

transient

  1. By default, ObjectOutputStream serializes only the non-transient instance variables of an object.
  2. Neither transient instance variables nor static variables are serialized.
@Getter
public class A implements Serializable {
 private transient int f1 = 1;
 private int f2 = 2;
 @Getter
 private static final int f3 = 3;
}
// Serialize
// Only the non-transient instance variables of the object are serialized
A a1 = new A();
ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(FILE_PATH));
oos.writeObject(a1);
oos.close();
// Deserialize
ObjectInputStream ois = new ObjectInputStream(new FileInputStream(FILE_PATH));
A a2 = (A) ois.readObject();
log.info("f1={}, f2={}, f3={}", a2.getF1(), a2.getF2(), a2.getF3()); // f1=0, f2=2, f3=3
ois.close();

serialVersionUID

  1. Every class that implements the Serializable interface has a version number, serialVersionUID; if the class does not declare one, the runtime computes it from the details of the class.
  2. During deserialization it is used to verify that the class that wrote the stream matches the class being loaded.
  3. If a class with the same name has a different version, the object cannot be deserialized.
@Data
@AllArgsConstructor
public class B implements Serializable {
 private static final long serialVersionUID = 1L;
 private int id;
}
@Test
public void test3() throws Exception {
 // serialize
 B b1 = new B(1);
 ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(FILE_PATH));
 oos.writeObject(b1);
 oos.close();
}
@Test
public void test4() throws Exception {
 // If the file on disk was written while B had serialVersionUID = 0 and B is then changed to serialVersionUID = 1, deserializing the file throws an exception.
 // java.io.InvalidClassException: xxx.B; local class incompatible: stream classdesc serialVersionUID = 0, local class serialVersionUID = 1
 ObjectInputStream ois = new ObjectInputStream(new FileInputStream(FILE_PATH));
 B b2 = (B) ois.readObject();
 ois.close();
}
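
To see which version number the runtime associates with a class, `java.io.ObjectStreamClass` can be queried directly. This is a small sketch, assuming two hypothetical classes `Declared` and `Computed` that are not part of the original example.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class SerialVersionUidDemo {
    // Hypothetical class with an explicitly declared version number.
    static class Declared implements Serializable {
        private static final long serialVersionUID = 1L;
        int id;
    }

    // Hypothetical class without a declared version number.
    static class Computed implements Serializable {
        int id;
    }

    // Returns the version number the serialization runtime uses for the class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(Declared.class)); // 1
        // Computed from the class details, so the value changes when the class changes.
        System.out.println(uidOf(Computed.class));
    }
}
```

Declaring the field explicitly keeps the version stable across compatible changes to the class, whereas the computed value can change whenever fields or methods are edited.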

writeObject/readObject

A class can define private writeObject and readObject methods to control exactly how its instances are serialized and deserialized.

@Data
@AllArgsConstructor
public class Student implements Serializable {
 private long id;
 private int age;
 private String name;
 // Serialize only some of the fields
 private void writeObject(ObjectOutputStream outputStream) throws IOException {
 outputStream.writeLong(id);
 outputStream.writeObject(name);
 }
 // Read the fields back in the order in which they were written
 private void readObject(ObjectInputStream inputStream) throws IOException, ClassNotFoundException {
 id = inputStream.readLong();
 name = (String) inputStream.readObject();
 }
}
Student s1 = new Student(1, 12, "Bob");
ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(FILE_PATH));
oos.writeObject(s1);
oos.close();
ObjectInputStream ois = new ObjectInputStream(new FileInputStream(FILE_PATH));
Student s2 = (Student) ois.readObject();
log.info("s2={}", s2); // s2=Student(id=1, age=0, name=Bob)
ois.close();
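
A custom readObject does not have to replace the default mechanism completely. The sketch below, which assumes a hypothetical `Name` class, lets `defaultReadObject()` restore the regular fields and then rebuilds a transient, derived field by hand.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class DefaultReadObjectDemo {
    // Hypothetical class: "full" is derived from the other fields and not written to the stream.
    static class Name implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String first;
        private final String last;
        private transient String full;

        Name(String first, String last) {
            this.first = first;
            this.last = last;
            this.full = first + " " + last;
        }

        String full() {
            return full;
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();      // restores first and last
            full = first + " " + last;   // rebuilds the transient field
        }
    }

    // Serializes the object to memory and reads it back.
    static Name roundTrip(Name name) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(name);
        }
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            return (Name) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(new Name("Ada", "Lovelace")).full()); // Ada Lovelace
    }
}
```

Without the readObject method, `full` would be null after deserialization because transient fields are skipped.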

writeReplace/readResolve

  1. writeReplace: replaces the object to be written before serialization
  2. readResolve: post-processes the object that is returned after deserialization
// Deserialization breaks the singleton pattern: it creates a new object through reflection without calling the private constructor
// This can be solved with readResolve()
public class Singleton1 implements Serializable {
 private static final Singleton1 SINGLETON_1 = new Singleton1();
 private Singleton1() {
 }
 public static Singleton1 getInstance() {
 return SINGLETON_1;
 }
}
Singleton1 s1 = Singleton1.getInstance();
ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(FILE_PATH));
oos.writeObject(s1);
oos.close();
ObjectInputStream ois = new ObjectInputStream(new FileInputStream(FILE_PATH));
Singleton1 s2 = (Singleton1) ois.readObject();
log.info("{}", s1 == s2); // false
ois.close();

public class Singleton2 implements Serializable {
 private static final Singleton2 SINGLETON_2 = new Singleton2();
 private Singleton2() {
 }
 public static Singleton2 getInstance() {
 return SINGLETON_2;
 }
 public Object writeReplace() {
 // No replacement is required before serialization
 return this;
 }
 private Object readResolve() {
 // After deserialization, return to the singleton directly
 return getInstance();
 }
}
Singleton2 s1 = Singleton2.getInstance();
ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(FILE_PATH));
oos.writeObject(s1);
oos.close();
ObjectInputStream ois = new ObjectInputStream(new FileInputStream(FILE_PATH));
Singleton2 s2 = (Singleton2) ois.readObject();
log.info("{}", s1 == s2); // true
ois.close();
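
The singleton example only uses readResolve. To illustrate writeReplace as well, here is a sketch in which a stand-in object is written to the stream in place of the real one. The classes `Temperature` and `TemperatureProxy` are hypothetical and only serve this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class WriteReplaceDemo {
    // Hypothetical value class whose own fields never reach the stream.
    static class Temperature implements Serializable {
        private static final long serialVersionUID = 1L;
        private final double celsius;

        Temperature(double celsius) {
            this.celsius = celsius;
        }

        double celsius() {
            return celsius;
        }

        // Called before serialization: the proxy is written instead of this object.
        private Object writeReplace() {
            return new TemperatureProxy(celsius);
        }
    }

    // Hypothetical stand-in that is actually written to the stream.
    static class TemperatureProxy implements Serializable {
        private static final long serialVersionUID = 1L;
        private final double celsius;

        TemperatureProxy(double celsius) {
            this.celsius = celsius;
        }

        // Called after deserialization: the caller receives a Temperature, not the proxy.
        private Object readResolve() {
            return new Temperature(celsius);
        }
    }

    // Serializes the object to memory and reads back whatever the stream yields.
    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Object result = roundTrip(new Temperature(21.5));
        System.out.println(result instanceof Temperature);    // true
        System.out.println(((Temperature) result).celsius()); // 21.5
    }
}
```

Because the real object is rebuilt through its constructor in readResolve, any validation in that constructor also applies to deserialized data.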

Defects

Unable to cross languages

Java serialization can only be used between applications and frameworks implemented in the Java language.

Vulnerable to attack

1. Java serialization is insecure

  • Official Java documentation: deserialization of untrusted data is inherently dangerous and should be avoided

2. ObjectInputStream.readObject()

  • It can instantiate almost any class on the classpath that implements the Serializable interface!
  • This means that while deserializing a byte stream, this method can execute code from any of those types, which is very dangerous.

3. An attack does not even need to execute code: an object that takes a very long time to deserialize is enough.

  • An attacker can build a chain of nested objects and send its serialized form to the program for deserialization.
  • This makes the number of hashCode calls grow exponentially, which can tie up the program for a very long time or cause a stack overflow.

4. Many serialization protocols define their own set of data structures for storing and retrieving objects, for example JSON serialization and ProtoBuf.

  • They support only a few basic types and array types, so deserialization cannot create instances of unexpected types
// Example for point 3: nested HashSets whose hashCode calls double at every level
int itCount = 27;
Set root = new HashSet();
Set s1 = root;
Set s2 = new HashSet();
for (int i = 0; i < itCount; i++) {
 Set t1 = new HashSet();
 Set t2 = new HashSet();
 t1.add("foo"); // Make t2 not equal to t1
 s1.add(t1);
 s1.add(t2);
 s2.add(t1);
 s2.add(t2);
 s1 = t1;
 s2 = t2;
}
ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(FILE_PATH));
oos.writeObject(root);
oos.close();
long start = System.currentTimeMillis();
ObjectInputStream ois = new ObjectInputStream(new FileInputStream(FILE_PATH));
ois.readObject();
log.info("take : {}", System.currentTimeMillis() - start);
ois.close();
// itCount - time taken (ms)
// 25 - 3460
// 26 - 7346
// 27 - 11161

Serialized streams are too large

1. The size of the serialized binary stream is one measure of serialization performance.

2. The larger the serialized byte array, the more storage space it occupies and the higher the cost of storage hardware.

  • When it is sent over the network it also takes up more bandwidth, which reduces the throughput of the system.

3. Java serialization uses ObjectOutputStream to encode objects into binary, which can be compared with a binary encoding written by hand with ByteBuffer from NIO.

@Data
class User implements Serializable {
 private String userName;
 private String password;
}
User user = new User();
user.setUserName("test");
user.setPassword("test");
// ObjectOutputStream
ByteArrayOutputStream os = new ByteArrayOutputStream();
ObjectOutputStream oos = new ObjectOutputStream(os);
oos.writeObject(user);
log.info("{}", os.toByteArray().length); // 107
// NIO ByteBuffer
ByteBuffer byteBuffer = ByteBuffer.allocate(2048);
byte[] userName = user.getUserName().getBytes();
byte[] password = user.getPassword().getBytes();
byteBuffer.putInt(userName.length);
byteBuffer.put(userName);
byteBuffer.putInt(password.length);
byteBuffer.put(password);
byteBuffer.flip();
log.info("{}", byteBuffer.remaining()); // 16
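
Part of the size difference comes from the metadata that Java serialization writes alongside the values. The sketch below looks for that metadata in the raw bytes; it assumes its own small `User` class with hard-coded values, and the helper names are illustrative only.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

class StreamContentDemo {
    // Stand-in for the User class above, with fixed values for the demonstration.
    static class User implements Serializable {
        private static final long serialVersionUID = 1L;
        String userName = "test";
        String password = "test";
    }

    // Serializes any object into a byte array.
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        return bos.toByteArray();
    }

    // Maps every byte to one char so ASCII names inside the stream can be searched.
    static String asLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = serialize(new User());
        String text = asLatin1(bytes);
        // Every stream starts with the two-byte magic number 0xACED.
        System.out.printf("magic=%02X%02X%n", bytes[0] & 0xFF, bytes[1] & 0xFF); // magic=ACED
        // The class name and the field names are written into the stream as text.
        System.out.println(text.contains(User.class.getName())); // true
        System.out.println(text.contains("userName"));           // true
        System.out.println(text.contains("password"));           // true
    }
}
```

The hand-written ByteBuffer encoding stores none of these names, which is why it needs only 16 bytes for the same two values.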

Slow serialization speed

  1. Serialization speed is an important indicator of serialization performance.
  2. Slow serialization reduces the efficiency of network communication and therefore increases the response time of the system.
int count = 100_000;
User user = new User();
user.setUserName("test");
user.setPassword("test");
// ObjectOutputStream
long t1 = System.currentTimeMillis();
for (int i = 0; i < count; i++) {
 ByteArrayOutputStream os = new ByteArrayOutputStream();
 ObjectOutputStream oos = new ObjectOutputStream(os);
 oos.writeObject(user);
 oos.flush();
 oos.close();
 byte[] bytes = os.toByteArray();
 os.close();
}
long t2 = System.currentTimeMillis();
log.info("{}", t2 - t1); // 731
// NIO ByteBuffer
long t3 = System.currentTimeMillis();
for (int i = 0; i < count; i++) {
 ByteBuffer byteBuffer = ByteBuffer.allocate(2048);
 byte[] userName = user.getUserName().getBytes();
 byte[] password = user.getPassword().getBytes();
 byteBuffer.putInt(userName.length);
 byteBuffer.put(userName);
 byteBuffer.putInt(password.length);
 byteBuffer.put(password);
 byteBuffer.flip();
 byte[] bytes = new byte[byteBuffer.remaining()];
}
long t4 = System.currentTimeMillis();
log.info("{}", t4 - t3); // 182

ProtoBuf

  1. ProtoBuf is a serialization framework released by Google that supports multiple languages.
  • In the performance test reports of serialization frameworks, ProtoBuf does well both in encoding and decoding time and in the size of the binary stream.
  2. ProtoBuf is based on a file with the .proto suffix, which describes the fields and their types; tools generate data structure files for different languages from it.
  3. When serializing a data object, ProtoBuf produces the Protocol Buffers encoding according to the description in the .proto file.

Storage format

  1. Protocol Buffers is a portable and efficient storage format for structured data.
  2. Protocol Buffers stores data in the T-L-V (Tag-Length-Value) format.
  • T is the tag of the field, a positive integer.
    • Protocol Buffers maps each field of an object to a positive integer, and the generated code guarantees the mapping.
    • Replacing field names with integers during serialization greatly reduces the amount of data transmitted.
  • L is the length of Value in bytes and generally takes only one byte.
  • V is the encoded value of the field.
  3. This format needs no separators, no spaces, and no redundant field names.
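
The T-L-V layout can be reproduced by hand for a single string field. This is a simplified sketch and not the ProtoBuf library itself: the class and method names are hypothetical, and it assumes a field number of at most 15 and a value shorter than 128 bytes so that tag and length each fit in one byte. In the Protocol Buffers wire format the tag byte combines the field number with a wire type, where 2 means length-delimited.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class TlvDemo {
    // Wire type 2 marks a length-delimited value such as a string.
    static final int WIRE_TYPE_LENGTH_DELIMITED = 2;

    // Encodes one string field as tag, length, value (simplified, see the limits above).
    static byte[] encodeStringField(int fieldNumber, String value) {
        byte[] payload = value.getBytes(StandardCharsets.UTF_8);
        if (fieldNumber < 1 || fieldNumber > 15 || payload.length > 127) {
            throw new IllegalArgumentException("sketch only handles single-byte tag and length");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write((fieldNumber << 3) | WIRE_TYPE_LENGTH_DELIMITED); // T: field number and wire type
        out.write(payload.length);                                  // L: length of the value in bytes
        out.write(payload, 0, payload.length);                      // V: the value itself
        return out.toByteArray();
    }

    // Formats bytes as space-separated hex pairs for display.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Field 1 with the value "test" takes 6 bytes in total.
        System.out.println(toHex(encodeStringField(1, "test"))); // 0A 04 74 65 73 74
    }
}
```

The field name never appears in the output; only the field number 1 is stored, packed into the first byte.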

Coding mode

 

1. ProtoBuf defines its own encoding, which covers almost all the basic data types of languages such as Java and Python.

2. Different data types can use different encodings and different storage formats.

3. Data encoded as Varint needs no stored length, because every byte marks whether another byte follows; T-V is used as the storage format.

4. Varint is a variable-length encoding. The most significant bit (msb) of each byte is a flag bit.

  • 0 indicates that the current byte is the last byte
  • 1 indicates that another byte follows.

5. An int32 value normally takes four bytes. With Varint encoding, a very small int32 value can be stored in a single byte.

  • Most integer values in practice are small (one Varint byte covers 0 to 127), so this compresses data well.
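
The Varint rules above fit into a few lines of code. The following is a sketch rather than the ProtoBuf implementation: the class name is hypothetical, and the value is treated as an unsigned 32-bit number, which is a simplification made for this example.

```java
import java.io.ByteArrayOutputStream;

class VarintDemo {
    // Encodes the value 7 bits at a time, lowest bits first.
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // msb = 1: another byte follows
            value >>>= 7;
        }
        out.write(value);                     // msb = 0: this is the last byte
        return out.toByteArray();
    }

    // Decodes a single Varint from the start of the array.
    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;    // append the 7 payload bits
            if ((b & 0x80) == 0) {
                break;                        // msb = 0 ends the number
            }
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(encode(1).length);    // 1
        System.out.println(encode(127).length);  // 1
        System.out.println(encode(128).length);  // 2
        byte[] bytes = encode(300);
        System.out.printf("%02X %02X%n", bytes[0] & 0xFF, bytes[1] & 0xFF); // AC 02
        System.out.println(decode(bytes));       // 300
    }
}
```

Only masks and shifts are involved, which is also why encoding and decoding are cheap.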

Codec

  1. ProtoBuf not only stores data compactly, it also encodes and decodes quickly.
  2. Encoding and decoding combine the .proto file format with the encoding format specific to Protocol Buffers.
  • Only simple arithmetic and bit-shift operations are needed to encode and decode.

Keywords: Java encoding JSON network

Added by llanitedave on Wed, 25 Sep 2019 13:43:18 +0300