Four common ways of parsing XML
1. Introduction to XML
XML (eXtensible Markup Language) is a meta-markup language: users can define whatever semantic tags they need. This is unlike HTML (HyperText Markup Language), which can only use a fixed set of predefined tags. An XML document is organized as a tree.
XML and HTML are therefore not the same: XML is designed to describe and carry data, whereas HTML is designed to display it.
Reasons for using XML: exchanging data between different applications (for example booking and payment), between different platforms (Mac and Windows), and between different clients (a website and a mobile app). The same XML file can link all of these together.
2. Four ways of parsing XML files
2.1 DOM parsing
DOM stands for Document Object Model. It is a platform- and language-independent programming interface specification for HTML and XML documents. Through the DOM API, a program can convert between an XML file and a DOM document, and traverse and modify the document's content. The core of the specification is the tree model: the whole document is read into memory before any of it can be processed.
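To make the tree model concrete, here is a minimal sketch that parses a small in-memory document with the JDK's built-in DOM parser and then walks the resulting tree. The class name DomTreeSketch, the helpers parse and countElements, and the sample XML are assumptions made for this illustration; they are not part of the examples later in the article.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class DomTreeSketch {
    // Hypothetical sample document, kept on one line so it contains no whitespace text nodes
    static final String SAMPLE =
            "<bookstore><book id=\"1\"><name>A</name></book><book id=\"2\"><name>B</name></book></bookstore>";

    // Reads the whole document into memory and returns the root of the DOM tree
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    // Recursively counts the element nodes in the subtree rooted at node
    static int countElements(Node node) {
        int count = node.getNodeType() == Node.ELEMENT_NODE ? 1 : 0;
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            count += countElements(children.item(i));
        }
        return count;
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println("Root element: " + doc.getDocumentElement().getNodeName());
        System.out.println("Element count: " + countElements(doc));
    }
}
```

Because the tree stays in memory after parsing, the same Document can be traversed as many times as needed, which is what the recursive helper relies on.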
2.2 JDOM parsing
JDOM brings a DOM-like document model to Java. It aims to provide a complete Java-based way to access, manipulate, and output XML data from Java code. It is an API for reading, writing, and manipulating XML in Java, designed to be simple and efficient.
2.3 SAX parsing
SAX stands for Simple API for XML. It is not an official W3C standard but a de facto community standard. Conceptually it is completely different from DOM: it is event-driven rather than document-driven. Event-driven means the program runs on a callback mechanism. The parser reads the document sequentially, from the outer elements to the inner ones, and calls handler methods as it meets each start tag, end tag, and run of text.
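The callback mechanism can be sketched as follows, using the SAX parser that ships with the JDK. The class name SaxEventSketch, the helper recordEvents, and the sample XML are assumptions made for this illustration. The handler only records the order in which the parser calls startElement and endElement.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventSketch {
    // Parses xml and returns the start/end events in the order the parser fired them
    static List<String> recordEvents(String xml) {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                events.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
        return events;
    }

    public static void main(String[] args) {
        // Prints [start:bookstore, start:book, start:name, end:name, end:book, end:bookstore]
        System.out.println(recordEvents("<bookstore><book><name>A</name></book></bookstore>"));
    }
}
```

The recorded order shows the outside-in traversal: each start event for an outer element arrives before the events of the elements nested inside it.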
2.4 DOM4j parsing
dom4j is a Java XML API, similar to JDOM, for reading and writing XML files. It is open source, performs well, and is powerful yet easy to use.
2.5 Purpose
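The goal of every example below is to read all the data in the XML file. The DOM examples depend on the node type, because getNodeName and getNodeValue return different things for each type:
NodeType | Value | Named constant | getNodeName returns | getNodeValue returns
Element | 1 | ELEMENT_NODE | element name | null
Attr | 2 | ATTRIBUTE_NODE | attribute name | attribute value
Text | 3 | TEXT_NODE | #text | node content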
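The three node types can be checked with a short sketch that parses a one-element document and prints the node type, node name, and node value of its Element, Attr, and Text nodes. The class name NodeTypeSketch, its helpers, and the sample XML are assumptions made for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeTypeSketch {
    // Formats one node as "nodeType nodeName nodeValue"
    static String describe(Node n) {
        return n.getNodeType() + " " + n.getNodeName() + " " + n.getNodeValue();
    }

    // Parses a hypothetical one-element sample and describes its Element, Attr, and Text nodes
    static List<String> describeSample() {
        String xml = "<book id=\"1\">Java</book>";
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element book = doc.getDocumentElement();
            List<String> out = new ArrayList<>();
            out.add(describe(book));                        // Element node
            out.add(describe(book.getAttributeNode("id"))); // Attr node
            out.add(describe(book.getFirstChild()));        // Text node
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        // Prints "1 book null", "2 id 1", and "3 #text Java" on separate lines
        for (String line : describeSample()) {
            System.out.println(line);
        }
    }
}
```

The null in the first line is why the DOM examples read an element's text through its child Text node or through getTextContent rather than through getNodeValue.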
3. Specific examples of XML parsing
3.1 DOM parsing example
1. Create a DocumentBuilderFactory object
2. Create a DocumentBuilder object and handle the exceptions it may throw
3. Parse the XML file with DocumentBuilder's parse(String fileName) method
4. parse returns an org.w3c.dom.Document object
5. Get all book nodes in the XML with getElementsByTagName, which returns a NodeList (booklist)
6. Get the number of book nodes with NodeList's getLength method
7. Traverse each book node {
Get each node with NodeList's item method
Get all attributes of the node
Traverse the attributes and read each name and value
}
8. Alternatively, if you know the attribute name, get the attribute value directly through Element
Parsing to get attribute values
See Code-1
Parsing to get child nodes
See Code-2
3.2 JDOM parsing example (JDOM is not part of the official Java API)
1. JDOM requires importing the corresponding jar package
2. Create a SAXBuilder object
3. Create an input stream and load the books.xml file into it
4. Pass the input stream to saxBuilder's build method, which returns a Document
5. Get the root node of the XML file with Document's getRootElement
6. Get the list of the root node's child nodes with root's getChildren
Parsing to get attributes and child nodes
See Code-4
Garbled characters when parsing with JDOM:
The first option is to set encoding="UTF-8" in the XML file's declaration.
To fix the encoding problem without modifying the XML file, wrap the input stream in an InputStreamReader that names the charset:
InputStreamReader isr = new InputStreamReader(in,"UTF-8");
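The same idea can be tried without any extra jar. The sketch below swaps JDOM for the JDK's built-in DOM parser so that it runs on a plain JDK: the bytes are decoded through an InputStreamReader with an explicit charset before the parser sees them. The class name EncodingSketch, the helper readName, and the sample data are assumptions made for this illustration. With JDOM, the reader would be passed to SAXBuilder's build method instead.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EncodingSketch {
    // Decodes the byte stream explicitly as UTF-8, then parses the resulting characters
    static String readName(InputStream in) {
        try {
            InputStreamReader isr = new InputStreamReader(in, StandardCharsets.UTF_8);
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(isr));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        // "\u4E66" is a non-ASCII character, written as an escape to keep this source ASCII-only
        byte[] utf8 = "<name>\u4E66</name>".getBytes(StandardCharsets.UTF_8);
        String name = readName(new ByteArrayInputStream(utf8));
        System.out.println("Round trip preserved the text: " + name.equals("\u4E66"));
    }
}
```

Using StandardCharsets.UTF_8 instead of the string "UTF-8" avoids the checked UnsupportedEncodingException; either form selects the same charset.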
3.3 SAX parsing example
The handler's key callbacks are startElement and endElement.
1. Create a factory instance with SAXParserFactory's static newInstance method
2. Create a parser instance with the factory's newSAXParser() method
3. Create a class that extends DefaultHandler, override its methods to do the business processing, and create an instance of it
4. Pass the handler instance to the parser's parse method
Parsing to get attributes and child nodes
See Code-3
3.4 dom4j parsing example
1. dom4j is a third-party library, so its jar package must be imported
2. Create a SAXReader object
3. Load the books.xml file with reader's read method
4. Get the root node
5. Get the child nodes
6. Get the attributes and attribute values of the child nodes
7. Get the node names and node values of the child nodes
See Code-5
4. Comparison of the four XML parsing methods
Basic methods: DOM and SAX, which ship with the JDK and need no extra jar packages
DOM: platform independent; the whole XML file is read into memory when parsing begins
SAX: event-driven; callbacks are triggered by the content as the XML is read
Extension methods: JDOM and dom4j, which require importing jar packages and are specific to the Java platform
DOM advantages: the tree structure is intuitive, easy to understand, and easy to code against.
The tree is kept in memory during parsing, so it is easy to modify.
DOM disadvantage: when the XML file is large, it consumes a lot of memory, which hurts parsing performance and can cause out-of-memory errors.
SAX advantages: event-driven mode with low memory consumption; suitable when you only need to read the data in the XML.
SAX disadvantages: it is harder to code, and it is difficult to access several pieces of data in the same XML file at the same time.
Code-1: Parsing XML with DOM to get attribute values (two methods)
DOMTest.java
package com;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;
public class DOMTest {
public static void main(String[] args) {
/**
* @author Stella
*/
//Create a DocumentBuilderFactory object
DocumentBuilderFactory bdf = DocumentBuilderFactory.newInstance();
//Create a DocumentBuilder object; the try/catch handles the exceptions it may throw
try {
DocumentBuilder bd = bdf.newDocumentBuilder();
//Parse the XML file with DocumentBuilder's parse(String fileName) method,
//which returns an org.w3c.dom.Document object
Document doc = bd.parse("web/books.xml");
//Get all book nodes in the XML with getElementsByTagName
NodeList booklist = doc.getElementsByTagName("book");
//Get the number of book nodes with NodeList's getLength method
System.out.println("There are " + booklist.getLength() + " books in total");
//Traverse each book node
for (int i = 0;i < booklist.getLength(); i++){
//Get each node with NodeList's item method
Node node = booklist.item(i);
//Method 1: get all attributes of the node
NamedNodeMap attr = node.getAttributes();
System.out.println("Book " + (i+1) + " has " + attr.getLength() + " attributes");
//Traverse all attributes
for (int j = 0; j < attr.getLength(); j++){
//item returns each attribute as a Node:
//Element, Attr, and Text are all node types
Node att = attr.item(j);
String name = att.getNodeName();
String value = att.getNodeValue();
System.out.println("Attribute name: " + name + " ---- attribute value: " + value);
}
//Method 2: if you know the attribute name, get its value directly through Element
Element attrEle = (Element) booklist.item(i);
String eleValue = attrEle.getAttribute("id");
System.out.println("Value of attribute id: " + eleValue);
}
} catch (ParserConfigurationException e) {
e.printStackTrace();
}catch (SAXException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
}
Code-2: Parsing XML with DOM to get node names and node values
DOMTest.java
package com;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;
public class DOMTest {
public static void main(String[] args) {
//Create a Document Builder Factory
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
try {
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse("web/books.xml");
NodeList booklist = doc.getElementsByTagName("book");
//Traverse each book node
for (int i = 0; i < booklist.getLength(); i++){
Node book = booklist.item(i);
//getChildNodes also returns text nodes, including any whitespace between elements
NodeList childNodes = book.getChildNodes();
System.out.println("Book " + (i+1) + " has " + childNodes.getLength() + " child nodes");
for (int j = 0; j < childNodes.getLength(); j++){
Node child = childNodes.item(j);
if(child.getNodeType()==Node.ELEMENT_NODE){
//Get the name of the element node
String name = child.getNodeName();
//getNodeValue returns null for an Element, so read its text with getTextContent
String value = child.getTextContent();
System.out.println("Child node name: " + name + "  child node value: " + value);
//Alternatively, read the value of the element's first child (its text node);
//an empty element has no first child, so check for null first
Node first = child.getFirstChild();
String valueContent = first == null ? "" : first.getNodeValue();
System.out.println("Child node name: " + name + "  child node value: " + valueContent);
}
}
}
} catch (ParserConfigurationException e) {
e.printStackTrace();
}catch (SAXException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
}
Code-3: Parsing XML with SAX to get attribute names, node names, and node values
SAXTest.java
package com;
import java.io.IOException;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import com.sun.handler.SAXParseHandler;
public class SAXTest {
public static void main(String[] args) {
//1. Get a SAXParserFactory instance
SAXParserFactory factory = SAXParserFactory.newInstance();
//2. Obtain a SAXParser instance from the factory (inside the try block below)
//3. Create an instance of SAXParseHandler
SAXParseHandler phandler = new SAXParseHandler();
try {
SAXParser parser = factory.newSAXParser();
parser.parse("web/books.xml", phandler);
} catch (ParserConfigurationException e) {
e.printStackTrace();
} catch (SAXException e) {
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
SAXParseHandler.java
package com.sun.handler;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class SAXParseHandler extends DefaultHandler{
int bookIndex = 0;
//Override startElement, which is called at each start tag while traversing the XML file
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
//Call the startElement method of the parent DefaultHandler
super.startElement(uri, localName, qName, attributes);
if(qName.equals("book")){
/**
* If the attribute name is known, get the value by name:
* String value = attributes.getValue("id");
* System.out.println("Value of book attribute id: " + value);
*/
bookIndex++;
System.out.println("Start of book " + bookIndex);
//If the names and number of attributes are unknown, iterate over attributes
for (int i = 0; i < attributes.getLength(); i++){
String name = attributes.getQName(i);
String value = attributes.getValue(i);
System.out.println("Attribute " + ( i + 1 ) + " name: " + name + " value: " + value);
}
}else if(!(qName.equals("book")||qName.equals("bookstore"))){
System.out.print("Node name: " + qName);
}
}
//Override endElement, which is called at each end tag
public void endElement(String uri, String localName, String qName) throws SAXException {
super.endElement(uri, localName, qName);
//Check whether a book element has ended
if(qName.equals("book")){
System.out.println("End of book " + bookIndex);
}
}
//Override startDocument, which marks the start of parsing
public void startDocument() throws SAXException {
super.startDocument();
System.out.println("SAX parsing started");
}
//Override endDocument, which marks the end of parsing
public void endDocument() throws SAXException {
super.endDocument();
System.out.println("SAX parsing finished");
}
//Override characters to obtain node values
public void characters(char[] ch, int start, int length) throws SAXException {
super.characters(ch, start, length);
//ch is the parser's character buffer; only the range [start, start + length) belongs to this event
String value = new String(ch, start, length);
if(!value.trim().equals("")){
System.out.println("Node value: " + value.trim());
}
}
}
//To preserve the structure of the XML data when parsing with SAX, store it in Java objects:
//Create Book.java
package com.sun.handler;
public class Book {
private String id;
private String name;
private String author;
private String year;
private String price;
private String language;
public String getId() {
return id;
}
public void setId(String id) {
this.id = id;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getAuthor() {
return author;
}
public void setAuthor(String author) {
this.author = author;
}
public String getYear() {
return year;
}
public void setYear(String year) {
this.year = year;
}
public String getPrice() {
return price;
}
public void setPrice(String price) {
this.price = price;
}
public String getLanguage() {
return language;
}
public void setLanguage(String language) {
this.language = language;
}
public String toString() {
return super.toString();
}
}
//Extend SAXParseHandler so that it fills in Book objects
SAXParseHandler.java
package com.sun.handler;
import java.util.ArrayList;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class SAXParseHandler extends DefaultHandler{
int bookIndex = 0;
String value = new String();
Book book = null;
private ArrayList<Book> booklist = new ArrayList<Book>();
public ArrayList<Book> getBooklist() {
return booklist;
}
//Override startElement, which is called at each start tag while traversing the XML file
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
//Call the startElement method of the parent DefaultHandler
super.startElement(uri, localName, qName, attributes);
if(qName.equals("book")){
//A new book element starts: create the Book object that will hold its data
book = new Book();
/**
* If the attribute name is known, get the value by name:
* String value = attributes.getValue("id");
* System.out.println("Value of book attribute id: " + value);
*/
bookIndex++;
System.out.println("Start of book " + bookIndex);
//If the names and number of attributes are unknown, iterate over attributes
for (int i = 0; i < attributes.getLength(); i++){
String name = attributes.getQName(i);
String value = attributes.getValue(i);
System.out.println("Attribute " + ( i + 1 ) + " name: " + name + " value: " + value);
if(name.equals("id")){
book.setId(value);
}
}
}else if(!(qName.equals("book")||qName.equals("bookstore"))){
System.out.print("Node name: " + qName);
}
}
//End Label Method for Rewriting Traversing xml Files
public void endElement(String uri, String localName, String qName) throws SAXException {
super.endElement(uri, localName, qName);
//Check whether a book element has ended
if(qName.equals("book")){
//The book is complete: save it in the ArrayList,
//then clear the shared book field so it is ready for the next book
booklist.add(book);
book = null;
System.out.println("End of book " + bookIndex);
}else if(qName.equals("name")){
book.setName(value);
}else if(qName.equals("author")){
book.setAuthor(value);
}else if(qName.equals("year")){
book.setYear(value);
}else if(qName.equals("language")){
book.setLanguage(value);
}else if(qName.equals("price")){
book.setPrice(value);
}else if(qName.equals("id")){
book.setId(value);
}
}
//Override startDocument, which marks the start of parsing
public void startDocument() throws SAXException {
super.startDocument();
System.out.println("SAX parsing started");
}
//Override endDocument, which marks the end of parsing
public void endDocument() throws SAXException {
super.endDocument();
System.out.println("SAX parsing finished");
}
//Override characters to obtain node values
public void characters(char[] ch, int start, int length) throws SAXException {
super.characters(ch, start, length);
//ch is the parser's character buffer; only the range [start, start + length) belongs to this event.
//The text is stored in the value field so that endElement can copy it into the current Book
value = new String(ch, start, length);
if(!value.trim().equals("")){
System.out.println("----Node value: " + value.trim());
}
}
}
Code-4: Parsing XML with JDOM to get attributes and child nodes
package com.JDOMtest;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import org.jdom2.Attribute;
import java.util.List;
import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.input.SAXBuilder;
public class JDOMTest {
public static void main(String[] args) {
//1. Create a SAXBuilder object
SAXBuilder saxBuilder = new SAXBuilder();
File file = new File("web/books.xml");
try {
if(!file.exists()){
file.createNewFile();
}
//2. Create an input stream and load the books.xml file into it
FileInputStream in = new FileInputStream(file);
//3. Pass the input stream to saxBuilder's build method, which returns a Document
Document doc = saxBuilder.build(in);
//4. Get the root node of the XML file with Document's getRootElement
Element root = doc.getRootElement();
//5. Get the child nodes of the root node with root's getChildren
List<Element> bookList = root.getChildren();
System.out.println("There are " + bookList.size() + " books in total");
//Get the attributes of each book
for (int i = 0; i < bookList.size(); i++){
Element book = bookList.get(i);
List<org.jdom2.Attribute> attr = book.getAttributes();
System.out.println("Book " + (i + 1) + " has " + attr.size() + " attributes");
for (int j = 0 ;j < attr.size(); j++){
System.out.print("Attribute name: " + attr.get(j).getName());
System.out.println(" ---- attribute value: " + attr.get(j).getValue());
}
}
//Get the child nodes and node values of each book
for (int i = 0; i < bookList.size(); i++){
Element book = bookList.get(i);
List<Element> bookElement = book.getChildren();
System.out.println("Book " + (i + 1) + " has " + bookElement.size() + " child nodes");
for (int j = 0; j < bookElement.size(); j++){
String name = bookElement.get(j).getName();
String value = bookElement.get(j).getValue();
System.out.print(" Node " + (j+1) + " ---- node name: " + name);
System.out.println(" ---- node value: " + value);
}
}
//The same traversal with for-each loops
for (Element book:bookList){
System.out.println("\n---- Start parsing book " + (bookList.indexOf(book) + 1));
List<org.jdom2.Attribute> attrs = book.getAttributes();
for (org.jdom2.Attribute attr:attrs){
//Get the attribute name and value
String name = attr.getName();
String value = attr.getValue();
System.out.print(" Attribute " + (attrs.indexOf(attr) + 1) + " ---- attribute name: " + name);
System.out.println(" ---- attribute value: " + value);
}
List<Element> ele = book.getChildren();
for (Element o:ele){
String name = o.getName();
String value = o.getValue();
System.out.print(" Node " + (ele.indexOf(o) + 1) + " ---- node name: " + name);
System.out.println(" ---- node value: " + value);
}
System.out.println("\n---- Finished parsing book " + (bookList.indexOf(book) + 1));
}
} catch (JDOMException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
}
Code-5: Parsing XML with dom4j to get attributes and child nodes
package com.DOM4j;
import java.io.File;
import java.util.Iterator;
import java.util.List;
import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;
public class DOM4jTest {
public static void main(String[] args) {
//1. dom4j is a third-party library, so its jar package must be imported
//2. Create a SAXReader object
SAXReader reader= new SAXReader();
try {
//3. Load the books.xml file with reader's read method
Document doc = reader.read(new File("web/books.xml"));
//4. Get the root node
Element root = doc.getRootElement();
//5. Get the child nodes
List<Element> books = root.elements();
int bookSize = books.size();
System.out.println("There are " + bookSize + " books in total");
for (Element book:books){
System.out.println("Start parsing book " + (books.indexOf(book) + 1));
//6. Get all attributes
List<Attribute> attrs = book.attributes();
int attrSize = attrs.size();
System.out.println("\t " + attrSize + " attributes in total:");
for (Attribute attr:attrs){
String name = attr.getName();
String value = attr.getValue();
System.out.println("\t Attribute " + (attrs.indexOf(attr)+1) + " name: " + name + " ----- attribute value: " + value);
}
//7. Get all child nodes
List<Element> bookEles = book.elements();
int bookEleSize = bookEles.size();
System.out.println("\t " + bookEleSize + " child nodes in total:");
for (Element ele:bookEles){
String name = ele.getName();
String value = ele.getStringValue();
System.out.println("\t Node " + (bookEles.indexOf(ele)+1) + " name: " + name + " ----- node value: " + value);
}
System.out.println("Finished parsing book " + (books.indexOf(book) + 1));
}
//Alternatively, traverse all book elements with an iterator
Iterator<Element> it = root.elementIterator();
while(it.hasNext()){
//Call next() on every pass; without it the loop would never end
Element book = it.next();
System.out.println("Book element: " + book.getName());
}
} catch (DocumentException e) {
e.printStackTrace();
}
}
}