Java Foundation
Chapter 21: Buffered streams, conversion streams, and serialization streams
Buffered streams
Overview
Buffered streams, also called efficient streams, wrap and enhance the four basic FileXxx streams. There are therefore four of them, grouped by data type:
- Byte buffered streams: BufferedInputStream, BufferedOutputStream
- Character buffered streams: BufferedReader, BufferedWriter
When a buffered stream object is created, it allocates an internal buffer array of a default size. Reads and writes go through this buffer, which reduces the number of system I/O calls and so improves read and write efficiency.
Byte buffered streams
Construction method
- public BufferedInputStream(InputStream in): creates a new buffered input stream.
- public BufferedOutputStream(OutputStream out): creates a new buffered output stream.
The code is as follows (example):

    // Create a byte buffered input stream
    BufferedInputStream bis = new BufferedInputStream(new FileInputStream("bis.txt"));
    // Create a byte buffered output stream
    BufferedOutputStream bos = new BufferedOutputStream(new FileOutputStream("bos.txt"));
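To show the two byte buffered streams working together, here is a minimal sketch of a file copy. The class name BufferedCopyDemo, the helper copy, and the temporary file names are assumptions made for illustration; they are not part of the original example.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class BufferedCopyDemo {

    // Copies src to dest through buffered byte streams and returns the number of bytes copied.
    public static long copy(String src, String dest) throws IOException {
        long total = 0;
        try (BufferedInputStream bis = new BufferedInputStream(new FileInputStream(src));
             BufferedOutputStream bos = new BufferedOutputStream(new FileOutputStream(dest))) {
            byte[] chunk = new byte[8 * 1024];
            int len;
            // read returns -1 at end of stream
            while ((len = bis.read(chunk)) != -1) {
                bos.write(chunk, 0, len);
                total += len;
            }
        } // try-with-resources flushes and closes both streams
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical files, created in a temp directory so the sketch is self-contained
        Path dir = Files.createTempDirectory("buffered-demo");
        Path src = dir.resolve("bis.txt");
        Path dest = dir.resolve("bos.txt");
        Files.write(src, "buffered streams".getBytes(StandardCharsets.UTF_8));
        System.out.println("Copied " + copy(src.toString(), dest.toString()) + " bytes");
    }
}
```

Using try-with-resources means the output buffer is flushed and both streams are closed even if the copy fails part way through.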
Character buffered streams
Construction method
- public BufferedReader(Reader in): creates a new buffered character input stream.
- public BufferedWriter(Writer out): creates a new buffered character output stream.
The code is as follows (example):

    // Create a character buffered input stream
    BufferedReader br = new BufferedReader(new FileReader("br.txt"));
    // Create a character buffered output stream
    BufferedWriter bw = new BufferedWriter(new FileWriter("bw.txt"));
Methods specific to character buffered streams
The basic methods of character buffered streams are the same as those of ordinary character streams, so they are not repeated here. Let's look at the methods that only the buffered versions provide.
- BufferedReader: public String readLine(): reads one line of text, without the line terminator; returns null at the end of the stream.
- BufferedWriter: public void newLine(): writes a line separator, as defined by the system property line.separator.
The following example demonstrates the readLine method:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    public class BufferedReaderDemo {
        public static void main(String[] args) throws IOException {
            // Create the stream object
            BufferedReader br = new BufferedReader(new FileReader("in.txt"));
            // Holds each line of text as it is read
            String line = null;
            // Read in a loop; readLine returns null at the end of the file
            while ((line = br.readLine()) != null) {
                System.out.print(line);
                System.out.println("------");
            }
            // Release resources
            br.close();
        }
    }
The following example demonstrates the newLine method:

    import java.io.BufferedWriter;
    import java.io.FileWriter;
    import java.io.IOException;

    public class BufferedWriterDemo {
        public static void main(String[] args) throws IOException {
            // Create the stream object
            BufferedWriter bw = new BufferedWriter(new FileWriter("out.txt"));
            // Write data
            bw.write("joker");
            // Write a line separator
            bw.newLine();
            bw.write("cyclone");
            bw.newLine();
            bw.write("W");
            bw.newLine();
            // Release resources
            bw.close();
        }
    }

Output (contents of out.txt):

    joker
    cyclone
    W
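readLine and newLine are often used as a pair: readLine strips the line terminator, and newLine writes one back. The sketch below copies text line by line under that assumption. The class name LineCopyDemo and the helper copyLines are hypothetical, and in-memory StringReader and StringWriter objects stand in for files so the example is self-contained.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

public class LineCopyDemo {

    // Copies text line by line and returns the number of lines copied.
    public static int copyLines(Reader in, Writer out) throws IOException {
        BufferedReader br = new BufferedReader(in);
        BufferedWriter bw = new BufferedWriter(out);
        int count = 0;
        String line;
        while ((line = br.readLine()) != null) { // readLine strips the line terminator
            bw.write(line);
            bw.newLine();                        // write the platform line separator back
            count++;
        }
        bw.flush(); // push buffered characters to the underlying writer
        return count;
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        int lines = copyLines(new StringReader("joker\ncyclone\nW"), out);
        System.out.println(lines + " lines copied");
        System.out.print(out);
    }
}
```

Because newLine uses the platform line separator, the copy normalizes whatever terminators the input used.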
Conversion streams
Character encoding and character set
Information in a computer is stored as binary numbers. The digits, English letters, punctuation marks, Chinese characters, and other characters we see on the screen are the result of converting those binary numbers. Storing characters in the computer according to a set of rules is called encoding. Conversely, parsing the stored binary numbers according to a set of rules and displaying the result is called decoding. If text is stored according to rule A and parsed according to rule A, the correct characters are displayed. If it is stored according to rule A but parsed according to rule B, the result is garbled text.
Encoding: characters (readable) -> bytes (unreadable)
Decoding: bytes (unreadable) -> characters (readable)
- Character encoding: a set of rules that maps natural-language characters to binary numbers.
- Code table: the mapping between the characters used in everyday life and the binary values used in the computer.
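The "store with rule A, parse with rule B" situation can be reproduced in a few lines. This is only a sketch: the class name EncodeDecodeDemo and the helper roundTrip are made up for illustration, and UTF-8 and ISO-8859-1 are used as the two rules because both are always available through StandardCharsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodeDecodeDemo {

    // Encodes text to bytes with one charset, then decodes those bytes with another.
    public static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith); // encoding: characters -> bytes
        return new String(bytes, decodeWith);     // decoding: bytes -> characters
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // the single character e with an acute accent

        // Same rule for both steps: the text survives
        String same = roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(same.equals(text)); // true

        // Different rules: the two UTF-8 bytes are read as two Latin-1 characters
        String mixed = roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(mixed.equals(text)); // false
        System.out.println(mixed.length());     // 2
    }
}
```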
Character set
- Charset: also known as a code table. It is the collection of all characters a system supports, including national scripts, punctuation marks, graphic symbols, digits, and so on.
To store and recognize the symbols of a character set accurately, the computer needs a character encoding, so every character set has at least one encoding. Common character sets include ASCII, GBK, and Unicode.
Once the encoding is specified, the corresponding character set is determined as well, so the encoding is what we ultimately care about.
- ASCII character set:
  - ASCII (American Standard Code for Information Interchange) is a computer encoding system based on the Latin alphabet. It is used to represent modern English and consists mainly of control characters (carriage return, backspace, line feed, and so on) and printable characters (uppercase and lowercase English letters, Arabic numerals, and Western symbols).
  - The basic ASCII character set uses 7 bits per character, for a total of 128 characters. The extended ASCII character set uses 8 bits per character, for a total of 256 characters, which makes room for common European characters.
- ISO-8859-1 character set:
  - The Latin code table, also known as Latin-1, is used to represent Western European languages, including Dutch, Danish, German, Italian, and Spanish.
  - ISO-8859-1 uses single-byte encoding and is compatible with ASCII.
- GBxxx character sets:
  - GB stands for "national standard" (guobiao). These are character sets designed to represent Chinese.
  - GB2312: the Simplified Chinese code table. A byte value below 127 keeps its original ASCII meaning, but two consecutive bytes with values above 127 together represent one Chinese character. This scheme covers roughly 7,000 characters in total (6,763 of them Chinese characters), along with mathematical symbols, Roman and Greek letters, and Japanese kana. Even the digits, punctuation, and letters that already exist in ASCII were re-encoded as two-byte characters, commonly called "full-width" characters; the original ones below 127 are called "half-width" characters.
  - GBK: the most commonly used Chinese code table. It is an extension of the GB2312 standard that uses a double-byte encoding scheme and contains 21,003 Chinese characters. It is fully compatible with GB2312 and also supports traditional Chinese characters and the Chinese characters used in Japanese and Korean.
  - GB18030: the latest Chinese code table. It contains 70,244 Chinese characters and uses a multi-byte encoding in which each character takes 1, 2, or 4 bytes. It supports the scripts of China's ethnic minorities as well as traditional Chinese characters and the Chinese characters used in Japanese and Korean.
- Unicode character set:
  - The Unicode encoding system is designed to represent any character in any language. It is an industry standard, also known as the unified code or universal code.
  - It uses up to four bytes to represent each letter, symbol, or ideograph. There are three encoding schemes: UTF-8, UTF-16, and UTF-32. UTF-8 is the most commonly used.
  - UTF-8 can represent any character in the Unicode standard. It is the preferred encoding for e-mail, web pages, and other applications that store or transmit text. The Internet Engineering Task Force (IETF) requires all Internet protocols to support UTF-8, so we also use UTF-8 when developing web applications. It uses one to four bytes per character, according to the following rules:
    - The 128 US-ASCII characters need only one byte.
    - Latin letters with diacritics and characters from some other alphabets need two bytes.
    - Most commonly used characters (including Chinese) need three bytes.
    - The remaining, rarely used Unicode supplementary characters need four bytes.
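The one-to-four-byte rule can be checked directly with String.getBytes. The following is a small sketch; the class name Utf8LengthDemo and the helper utf8Length are illustrative names, and the sample characters are written as Unicode escapes so the source file's own encoding does not matter.

```java
import java.nio.charset.StandardCharsets;

public class Utf8LengthDemo {

    // Returns how many bytes the given text occupies when encoded as UTF-8.
    public static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));            // 1: US-ASCII letter
        System.out.println(utf8Length("\u00e9"));       // 2: Latin letter with an accent
        System.out.println(utf8Length("\u4e2d"));       // 3: a common Chinese character
        System.out.println(utf8Length("\uD83D\uDE00")); // 4: supplementary character U+1F600
    }
}
```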
Problems caused by encoding
In IDEA, reading a text file from the project with FileReader works without problems, because IDEA uses UTF-8 by default. However, reading a text file created on a Windows system produces garbled text, because the default encoding of a Chinese-locale Windows system is GBK.
The code is as follows (example):

    import java.io.FileReader;
    import java.io.IOException;

    public class ReaderDemo {
        public static void main(String[] args) throws IOException {
            // The file is GBK encoded, but FileReader decodes it with the default encoding
            FileReader fileReader = new FileReader("E:\\File_GBK.txt");
            int read;
            while ((read = fileReader.read()) != -1) {
                System.out.print((char) read);
            }
            fileReader.close();
        }
    }
Output results:
���
InputStreamReader class
The conversion stream java.io.InputStreamReader is a subclass of Reader and a bridge from byte streams to character streams. It reads bytes and decodes them into characters using a specified character set. The character set can be given by name, or the platform's default character set can be used.
Construction method
- InputStreamReader(InputStream in): creates a character stream that uses the default character set.
- InputStreamReader(InputStream in, String charsetName): creates a character stream that uses the named character set.
The construction code is as follows (example):

    InputStreamReader isr = new InputStreamReader(new FileInputStream("in.txt"));
    InputStreamReader isr2 = new InputStreamReader(new FileInputStream("in.txt"), "GBK");
Reading with a specified encoding

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;

    public class ReaderDemo2 {
        public static void main(String[] args) throws IOException {
            // Define the file path. The file is GBK encoded
            String fileName = "D:\\file_gbk.txt";
            // Create a stream object with the default encoding (UTF-8 here)
            InputStreamReader isr = new InputStreamReader(new FileInputStream(fileName));
            // Create a stream object and specify the GBK encoding
            InputStreamReader isr2 = new InputStreamReader(new FileInputStream(fileName), "GBK");
            // Holds each character as it is read
            int read;
            // Reading with the default encoding produces garbled text
            while ((read = isr.read()) != -1) {
                System.out.print((char) read); // ��Һ�
            }
            isr.close();
            // Reading with the specified encoding decodes the text correctly
            while ((read = isr2.read()) != -1) {
                System.out.print((char) read); // hello everyone
            }
            isr2.close();
        }
    }
OutputStreamWriter class
The conversion stream java.io.OutputStreamWriter is a subclass of Writer and a bridge from character streams to byte streams. It encodes characters into bytes using a specified character set. The character set can be given by name, or the platform's default character set can be used.
Construction method
- OutputStreamWriter(OutputStream out): creates a character stream that uses the default character set.
- OutputStreamWriter(OutputStream out, String charsetName): creates a character stream that uses the named character set.
The construction code is as follows (example):

    OutputStreamWriter osw = new OutputStreamWriter(new FileOutputStream("out.txt"));
    OutputStreamWriter osw2 = new OutputStreamWriter(new FileOutputStream("out.txt"), "GBK");
Writing with a specified encoding

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStreamWriter;

    public class OutputDemo {
        public static void main(String[] args) throws IOException {
            // Define the file path
            String fileName = "D:\\out.txt";
            // Create a stream object with the default encoding (UTF-8 here)
            OutputStreamWriter osw = new OutputStreamWriter(new FileOutputStream(fileName));
            // Write two Chinese characters ("hello")
            osw.write("你好"); // saved as 6 bytes in UTF-8
            osw.close();

            // Define the second file path
            String fileName2 = "D:\\out2.txt";
            // Create a stream object and specify the GBK encoding
            OutputStreamWriter osw2 = new OutputStreamWriter(new FileOutputStream(fileName2), "GBK");
            // Write the same two characters
            osw2.write("你好"); // saved as 4 bytes in GBK
            osw2.close();
        }
    }
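The byte counts in the comments above can be checked without touching the file system. The sketch below writes into a ByteArrayOutputStream and uses the OutputStreamWriter constructor that takes a Charset object. The class name EncodedSizeDemo and the helper encodedSize are hypothetical, and the example assumes the JDK in use ships the GBK charset, which standard JDK distributions normally do.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodedSizeDemo {

    // Writes text through an OutputStreamWriter and returns the number of bytes produced.
    public static int encodedSize(String text, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (OutputStreamWriter osw = new OutputStreamWriter(bytes, charset)) {
            osw.write(text);
        } // closing the writer flushes the encoded bytes into the byte array
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        String text = "\u4f60\u597d"; // two Chinese characters, written as Unicode escapes
        System.out.println(encodedSize(text, StandardCharsets.UTF_8));  // 6
        System.out.println(encodedSize(text, Charset.forName("GBK")));  // 4
    }
}
```

Passing a Charset rather than a charset name also avoids the checked UnsupportedEncodingException that the String-based constructor declares.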
Understanding conversion streams
A conversion stream is the bridge between bytes and characters!
Case: converting a file's encoding (example)
Convert a GBK-encoded text file into a UTF-8-encoded text file.
Case analysis
- Read the text file with a conversion stream that specifies GBK.
- Write the text file out with a conversion stream that uses UTF-8.
The code is as follows (example):

    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.OutputStreamWriter;

    public class TransDemo {
        public static void main(String[] args) throws IOException {
            // 1. Define the file paths
            String srcFile = "file_gbk.txt";
            String destFile = "file_utf8.txt";
            // 2. Create the stream objects
            // 2.1 Conversion input stream, specifying GBK
            InputStreamReader isr = new InputStreamReader(new FileInputStream(srcFile), "GBK");
            // 2.2 Conversion output stream, specifying UTF-8 explicitly instead of relying on the default
            OutputStreamWriter osw = new OutputStreamWriter(new FileOutputStream(destFile), "UTF-8");
            // 3. Read and write data
            // 3.1 Define the character buffer
            char[] cbuf = new char[1024];
            // 3.2 Number of characters read in each pass
            int len;
            // 3.3 Read in a loop until the end of the file
            while ((len = isr.read(cbuf)) != -1) {
                // Write out the characters just read
                osw.write(cbuf, 0, len);
            }
            // 4. Release resources
            osw.close();
            isr.close();
        }
    }
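The same read-with-one-charset, write-with-another idea can be packaged as a reusable method that works on any pair of streams. This is a sketch rather than the article's method: the class name TranscodeDemo and the helper transcode are hypothetical, buffered wrappers are added around the conversion streams, and the demo assumes the GBK charset is available in the running JDK.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class TranscodeDemo {

    // Decodes bytes from 'in' using 'from' and re-encodes them to 'out' using 'to'.
    public static void transcode(InputStream in, Charset from, OutputStream out, Charset to)
            throws IOException {
        try (Reader reader = new BufferedReader(new InputStreamReader(in, from));
             Writer writer = new BufferedWriter(new OutputStreamWriter(out, to))) {
            char[] cbuf = new char[1024];
            int len;
            while ((len = reader.read(cbuf)) != -1) {
                writer.write(cbuf, 0, len);
            }
        } // closing the writer flushes the remaining encoded bytes
    }

    public static void main(String[] args) throws IOException {
        Charset gbk = Charset.forName("GBK");
        byte[] gbkBytes = "\u4f60\u597d".getBytes(gbk); // two Chinese characters in GBK
        ByteArrayOutputStream utf8 = new ByteArrayOutputStream();
        transcode(new ByteArrayInputStream(gbkBytes), gbk, utf8, StandardCharsets.UTF_8);
        System.out.println(gbkBytes.length + " GBK bytes -> " + utf8.size() + " UTF-8 bytes");
    }
}
```

To convert real files, FileInputStream and FileOutputStream objects can be passed in place of the in-memory streams.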
Serialization streams
Overview
Java provides a mechanism for object serialization. An object can be represented as a sequence of bytes that contains the object's data, the object's type, and the types of the data stored in the object. Once this byte sequence is written to a file, the object's information has effectively been persisted in that file.
Conversely, the byte sequence can be read back from the file to reconstruct the object, which is called deserialization. The object's data, the object's type, and the types of the data stored in the object are used to create the object in memory.
ObjectOutputStream class
The java.io.ObjectOutputStream class writes the primitive data and types of Java objects out to a file or other stream, which gives objects persistent storage.
Construction method
- public ObjectOutputStream(OutputStream out): creates an ObjectOutputStream that writes to the specified OutputStream.
The code is as follows (example):

    FileOutputStream fileOut = new FileOutputStream("employee.txt");
    ObjectOutputStream out = new ObjectOutputStream(fileOut);
Serialization operation
1. To serialize an object, two conditions must be met:
- The class must implement the java.io.Serializable interface. Serializable is a marker interface. A class that does not implement it cannot have any of its state serialized or deserialized, and attempting to do so throws NotSerializableException.
- All fields of the class must be serializable. If a field should not be serialized, it must be marked with the transient keyword.

    public class Employee implements java.io.Serializable {
        public String name;
        public String address;
        public transient int age; // transient members are not serialized

        public void addressCheck() {
            System.out.println("Address check : " + name + " -- " + address);
        }
    }
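The first condition can be observed directly: writing an object whose class does not implement Serializable fails with NotSerializableException. The sketch below uses hypothetical names (SerializableCheckDemo, Plain, Marked, canSerialize) and writes to an in-memory stream so no file is needed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializableCheckDemo {

    // Does not implement Serializable, so it cannot be written
    public static class Plain {
        public String name = "plain";
    }

    // Implements the marker interface, so it can be written
    public static class Marked implements Serializable {
        private static final long serialVersionUID = 1L;
        public String name = "marked";
    }

    // Returns true if the object can be written by ObjectOutputStream, false otherwise.
    public static boolean canSerialize(Object obj) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(canSerialize(new Marked())); // true
        System.out.println(canSerialize(new Plain()));  // false
    }
}
```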
2. Method for writing out an object
- public final void writeObject(Object obj): writes out the specified object.

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.ObjectOutputStream;

    public class SerializeDemo {
        public static void main(String[] args) {
            Employee e = new Employee();
            e.name = "Xiang taro";
            e.address = "Fengdu";
            e.age = 20;
            try {
                // Create the serialization stream object
                ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream("employee.txt"));
                // Write out the object
                out.writeObject(e);
                // Release resources
                out.close();
                // name and address are serialized; age is not
                System.out.println("Serialized data is saved");
            } catch (IOException i) {
                i.printStackTrace();
            }
        }
    }

Output results:

    Serialized data is saved
ObjectInputStream class
ObjectInputStream is the deserialization stream. It restores primitive data and objects that were previously serialized with ObjectOutputStream.
Construction method
- public ObjectInputStream(InputStream in): creates an ObjectInputStream that reads from the specified InputStream.
Deserialization operation 1.0
If the class file of an object can be found, the object can be deserialized by calling the ObjectInputStream method that reads an object:
- public final Object readObject(): reads an object.

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.ObjectInputStream;

    public class DeserializeDemo {
        public static void main(String[] args) {
            Employee e = null;
            try {
                // Create the deserialization stream
                FileInputStream fileIn = new FileInputStream("employee.txt");
                ObjectInputStream in = new ObjectInputStream(fileIn);
                // Read an object
                e = (Employee) in.readObject();
                // Release resources
                in.close();
                fileIn.close();
            } catch (IOException i) {
                // Catch I/O exceptions
                i.printStackTrace();
                return;
            } catch (ClassNotFoundException c) {
                // Catch the exception thrown when the class cannot be found
                System.out.println("Employee class not found");
                c.printStackTrace();
                return;
            }
            // No exception occurred, so print the fields
            System.out.println("Name: " + e.name);       // Xiang taro
            System.out.println("Address: " + e.address); // Fengdu
            System.out.println("age: " + e.age);         // 0
        }
    }

For the JVM to deserialize an object, it must be able to find the class file of the object's class. If the class file cannot be found, a ClassNotFoundException is thrown.
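Serialization and deserialization can also be exercised as one round trip in memory, which makes the effect of transient easy to see. The following is a sketch under stated assumptions: the class RoundTripDemo, the nested Person class, and the helpers serialize and deserialize are hypothetical stand-ins for the Employee example, and byte-array streams replace the file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class RoundTripDemo {

    // Hypothetical class mirroring the shape of Employee
    public static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        public String name;
        public transient int age; // skipped during serialization
    }

    // Serializes an object into a byte array.
    public static byte[] serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Reconstructs an object from a byte array produced by serialize.
    public static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        Person p = new Person();
        p.name = "Xiang taro";
        p.age = 20;
        Person copy = (Person) deserialize(serialize(p));
        System.out.println(copy.name); // Xiang taro
        System.out.println(copy.age);  // 0, because age is transient
    }
}
```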
Deserialization operation 2.0
**In addition, if the JVM can find the class file when deserializing an object, but the class file was modified after the object was serialized, deserialization also fails and an InvalidClassException is thrown.** This exception occurs for the following reasons:
- The serial version number of the class does not match the version number of the class descriptor read from the stream
- The class contains unknown data types
- The class has no accessible no-argument constructor
The serialization runtime associates a serial version number, serialVersionUID, with every serializable class, and a class can declare it explicitly as a static final long field. This version number is used to verify that the serialized object and the corresponding class match.

    public class Employee implements java.io.Serializable {
        // Add an explicit serial version number
        private static final long serialVersionUID = 1L;
        public String name;
        public String address;
        // Field added after serialization; once the class is recompiled,
        // deserialization still succeeds and this field gets its default value
        public int eid;

        public void addressCheck() {
            System.out.println("Address check : " + name + " -- " + address);
        }
    }
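The version number that the serialization runtime will use for a class can be inspected with java.io.ObjectStreamClass. The sketch below is illustrative only: the class name SerialVersionDemo, the nested Versioned class, and the helper versionOf are assumptions, not part of the Employee example.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialVersionDemo {

    // Hypothetical serializable class with an explicit version number
    public static class Versioned implements Serializable {
        private static final long serialVersionUID = 1L;
        public String name;
    }

    // Returns the serialVersionUID used for a class, or null if the class is not serializable.
    public static Long versionOf(Class<?> type) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(type);
        if (descriptor == null) {
            return null; // lookup returns null for classes that are not serializable
        }
        return descriptor.getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Versioned: " + versionOf(Versioned.class)); // Versioned: 1
        System.out.println("Object: " + versionOf(Object.class));       // Object: null
    }
}
```

When a class does not declare serialVersionUID, the runtime computes a default from the class structure, which is why editing such a class can make previously serialized data unreadable.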
Print stream
Overview
We usually print output to the console by calling the print and println methods. Both come from the java.io.PrintStream class, which can conveniently print values of various data types.
PrintStream class
Construction method
- public PrintStream(String fileName): creates a new print stream that writes to the file with the specified name.
The code is as follows (example):

    PrintStream ps = new PrintStream("ps.txt");
Changing the print destination
System.out is a PrintStream, but its destination is set by the system: it prints to the console. Since it is a stream object, however, we can play a small "trick" and change where it prints.

    import java.io.IOException;
    import java.io.PrintStream;

    public class PrintDemo {
        public static void main(String[] args) throws IOException {
            // Use the system print stream; 97 is printed on the console
            System.out.println(97);
            // Create a print stream that writes to the named file
            PrintStream ps = new PrintStream("ps.txt");
            // Redirect the system print stream to ps.txt
            System.setOut(ps);
            // Use the system print stream again; 97 is written to ps.txt
            System.out.println(97);
        }
    }
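Once System.out has been redirected, it stays redirected until it is set back. A common pattern is to remember the original stream and restore it afterwards. The sketch below does that while capturing the output in memory; the class name CaptureOutputDemo and the helper capture are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class CaptureOutputDemo {

    // Runs the action with System.out redirected to memory and returns what was printed.
    public static String capture(Runnable action) {
        PrintStream original = System.out; // remember the console stream
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (PrintStream ps = new PrintStream(buffer, true)) {
            System.setOut(ps);
            action.run();
        } finally {
            System.setOut(original); // always restore the original destination
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String printed = capture(() -> System.out.println(97));
        // Back on the console: show what was captured
        System.out.println("Captured: " + printed.trim());
    }
}
```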