3, Java Web Foundation (detailed explanation of xml documents)

1, Concept: Extensible Markup Language

1. Extensible:

Labels are custom< user> <student>

2. Functions:

  1. Store data
  2. configuration file
  3. Transmission in the network

3. The difference between XML and html

  1. xml tags are customized and html tags are predefined.
  2. The syntax of xml is strict and that of html is loose
  3. xml is used to store data, and html is used to display data

4. Grammar

(1) basic grammar:

  1. The suffix of the xml document. xml
  2. The first line of xml must be defined as a document declaration
  3. There is one and only one root tag in the xml document
  4. Attribute values must be enclosed in quotation marks (either single or double)
  5. The label must be closed correctly
  6. xml tag names are case sensitive

For example:

<?xml version="1.0" ?>
<users>
    <user id='1'>
        <name>zhangsan</name>
        <age>23</age>
        <gender>male</gender>
        <br/>
    </user>

    <user id='2'>
        <name>lisi</name>
        <age>24</age>
        <gender>female</gender>
    </user>
</users>

(2) components:

① Document declaration
  1. Format: <? XML attribute list? >
  2. Attribute list:
      * version: version number, required attribute
      * encoding: encoding method. Tells the parsing engine the character set used in the current document. The default value is ISO-8859-1
      * standalone: independent
        * value:
        * yes: does not depend on other files
        * no: dependent on other files
② Instruction (understanding): combined with css

<?xml-stylesheet type="text/css" href="a.css" ?>

③ Labels: custom label names

Rules:

  • The name can contain letters, numbers, and other characters
  • The name cannot start with a number or punctuation mark
  • The name cannot start with the letter Xml (or Xml, Xml, etc.)
  • The name cannot contain spaces
④ Properties:

Unique id attribute value

⑤ Text:

CDATA area: the data in this area will be displayed as is
  format: <! [CDATA [data]] >

(3) constraints: specify the writing rules of xml documents

① As a user (Programmer) of the framework:
  1. Ability to introduce constraint documents into xml
  2. Be able to simply read constraint documents
② Classification:
  1. DTD: a simple constraint technique
    For example: student.dtd

    <!ELEMENT students (student*) ><!--ELEMENT Indicates the meaning of the label,students Is the root label,student Is its sub standard,*Indicates that there can be 0 or more,+Indicates that there must be at least one-->
    <!ELEMENT student (name,age,sex)><!--express student The child tags are name,age,sex,They need to appear in this specified order,And there can only be one by default-->
    <!ELEMENT name (#PCDATA)><!--#PCDATA represents a string, indicating that the content in the name tag body is of string type -- >
    <!ELEMENT age (#PCDATA)><!--#PCDATA represents a string, indicating that the content in the age tag body is of string type -- >
    <!ELEMENT sex (#PCDATA)><!--#PCDATA represents a string, indicating that the content in the body of the sex tag is of string type -- >
    <!ATTLIST student number ID #REQUIRED><!--Attribute constraints, ATTLIST Is the meaning of attribute, indicating student The label has a name called number The type of the property is ID,ID Means the name is number Unique attribute value,#REQUIRED indicates that an attribute named number must appear -- >
    
  2. Schema: a complex constraint technique

③ DTD:

Importing dtd documents into xml documents
  * internal dtd: define constraint rules in xml documents (rarely used)
  <! DOCTYPE root label [constraint content] >

  * external dtd: define the constraint rules in the external dtd file
  local: <! DOCTYPE root signature SYSTEM "dtd file * * location" >

  network: <! DOCTYPE root signature PUBLIC "dtd file name" "dtd file location URL" >
       common methods

④ Schema:
  • What problem does dtd solve by designing schema
    The dtd document constrains the format of the label, but cannot constrain the content in the label body. For example, < sex > content < / sex > can only be man or woman, but dtd files cannot achieve this effect.
    schame constraint document content resolution:

  • introduce:
    1. Fill in the root element of the xml document
    2. Introduce xsi prefix. Xmlns: xsi=“ http://www.w3.org/2001/XMLSchema-instance "
    3. Import XSD file namespace. XSI: schemalocation=“ http://www.kejizhentan.cn/xml student.xsd"
    4. Declare a prefix for each xsd constraint to identify xmlns=“ http://www.kejizhentan.cn/xml "

xml introduces schema constraint document parsing:

For example: mvc configuration file

(4) parsing: operate the xml document and read the data in the document into memory

① Manipulating xml documents
  1. Parse (read): read the data in the document into memory
  2. Write: saves data in memory to an xml document. Persistent storage
② How to parse xml:
  1. DOM: load the markup language document into memory at one time to form a DOM tree in memory
    • Advantages: it is easy to operate and can perform all CRUD operations on documents
    • Disadvantages: occupy memory
  2. SAX: read line by line, event driven.
    • Advantages: no memory.
    • Disadvantages: it can only be read and cannot be added, deleted or modified
③ Common parsers for xml:
  1. JAXP: the parser provided by sun company supports dom and sax
  2. DOM4J: an excellent parser
  3. Jsoup: jsoup is a Java HTML parser that can directly parse a URL address and HTML text content. It provides a very labor-saving API, which can fetch and manipulate data through DOM, CSS and operation methods similar to jQuery.
  4. PULL: the built-in parser of Android operating system, sax mode.
④ Dom4j parsing
  1. Steps:
    a. Import the jar file (dom4j.jar)
    b. Create an input stream to an xml document
    InputStream is = new FileInputStream("file address");
    c. Create an xml reading tool object
    SAXReader sr = new SAXReader();
    d. Through the reading tool, read the input stream i of the xml document and get the document object
    Document doc = sr.reader(is);
    e. Obtain the root node object of the xml document through the document object
    Element root = doc.getRootElement();

  2. Common methods for Element objects
    An Element object represents the node of an xml document
    a. Method to get node name
    String getName();
    b. Method of obtaining node content
    String gerText();
    c. Get the matching first child node object according to the node name
    Element element("node name")
    d. Get all child nodes
    List<Element> elements();
    e. Gets the attribute value of the node
    String attributeValue("attribute value name");

dom4j.jar click download

The contents of books.xml file are as follows:

<?xml version="1.0" encoding="UTF-8"?>
<books>
	<book id="10001">
		<name>Journey to the West</name>
		<info>It tells the story of boss Tang leading three apprentices to start a business</info>
	</book>
	<book id="10002">
		<name>Water Margin</name>
		<info>It tells the story of three women and 105 men</info>
	</book>
	<book id="10003">
		<name>three countries</name>
		<info>It tells the story of three bosses' hard work</info>
	</book>
	<book id="10004">
		<name>The Dream of Red Mansion</name>
		<info>Tells the story of the rich second generation's family</info>
	</book>
</books>

The project structure is as follows:

The code is as follows:

public class TestDom4j {
    public static void main(String[] args) {
        InputStream is = null;
        try {
            //1. Create an input stream to an xml document
            is = TestDom4j.class.getClassLoader().getResourceAsStream("books.xml");
            //2. Create an xml reading tool
            SAXReader sr = new SAXReader();
            //3. Read the input stream of xml document through the reading tool to get the document object
            Document doc = sr.read(is);
            //4. Get the root node through the document object
            Element root = doc.getRootElement();
            //5. Common methods of element
            //5.1 obtain all child nodes through the root node (all child nodes of books are three book nodes)
            List<Element> es = root.elements();
            //5.2 loop traversal set
            for (Element book : es) {
                //5.3 get the id attribute of each child node book
                String idValue = book.attributeValue("id");
                //5.4 get the name child node of the book node
                Element name = book.element("name");
                //5.5 get the content in the name node
                String nameValue = name.getText();
                //5.6 get the child node info of the book node
                Element info = book.element("info");
                //5.7 get the content in info node
                String infoValue = info.getText();
                //5.8 print out the obtained content
                System.out.println("books id" + idValue+","+"Book Name:"+nameValue+","+"Book Introduction:"+infoValue);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

The results are as follows:

⑤ Jsup parsing

jsoup is a Java HTML parser, which can directly parse a URL address and HTML text content. It provides a very labor-saving API, which can fetch and manipulate data through DOM, CSS and operation methods similar to jQuery.

  1. Steps:
    a. Import jar package Click to download jsoup.jar
    b. Get Document object
    c. Get the corresponding label Element object
    d. Get data

    The contents of books.xml file are as follows:

    <?xml version="1.0" encoding="UTF-8"?>
    <books>
    	<book id="10001">
    		<name>Journey to the West</name>
    		<info>It tells the story of boss Tang leading three apprentices to start a business</info>
    	</book>
    	<book id="10002">
    		<name>Water Margin</name>
    		<info>It tells the story of three women and 105 men</info>
    	</book>
    	<book id="10003">
    		<name>three countries</name>
    		<info>It tells the story of three bosses' hard work</info>
    	</book>
    	<book id="10004">
    		<name>The Dream of Red Mansion</name>
    		<info>Tells the story of the rich second generation's family</info>
    	</book>
    </books>
    

    The project structure is as follows:

    The code is as follows:

    public class TestJsoup {
        public static void main(String[] args) throws IOException {
            //2. Obtain the Document object according to the xml Document
            //2.1 get the path of student.xml
            String path = TestJsoup.class.getClassLoader().getResource("books.xml").getPath();
            //2.2 parse the xml document, load the document into memory, and obtain the dom tree -- > document
            Document document = Jsoup.parse(new File(path), "utf-8");
            //3. Get the Element object Element
            Elements elements = document.getElementsByTag("name");
    
            System.out.println(elements.size());
            //3.1 get the Element object of the first name
            Element element = elements.get(0);
            //3.2 data acquisition
            String name = element.text();
            System.out.println(name);
        }
    }
    
  2. Use of objects:
    Jsup tool class, which can parse html or xml documents and return Document
    a. parse: parses html or xml documents and returns Document

    • parse (File in, String charsetName): parses xml or html files.
    • parse (String html): parses xml or html strings
    • parse (URL, int timeoutmillis): obtain the specified html or xml document object through the network path
      For example:
    public class TestJsoup {
        public static void main(String[] args) throws IOException {
            //2. Obtain the Document object according to the xml Document
            //2.1 get the path of student.xml
            String path = TestJsoup.class.getClassLoader().getResource("books.xml").getPath();
            //2.2 parse the xml document, load the document into memory, and obtain the dom tree -- > document
           /* Document document = Jsoup.parse(new File(path), "utf-8");
            System.out.println(document);*/
    
            //2.parse (String html): parse xml or html strings
            String str = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
                    "<books>\n" +
                    "\t<book id=\"10001\">\n" +
                    "\t\t<name>Journey to the West</name>\n" +
                    "\t\t<info>It tells the story of boss Tang leading three apprentices to start a business</info>\n" +
                    "\t</book>\n" +
                    "\t<book id=\"10002\">\n" +
                    "\t\t<name>Water Margin</name>\n" +
                    "\t\t<info>It tells the story of three women and 105 men</info>\n" +
                    "\t</book>\n" +
                    "\t<book id=\"10003\">\n" +
                    "\t\t<name>three countries</name>\n" +
                    "\t\t<info>It tells the story of three bosses' hard work</info>\n" +
                    "\t</book>\n" +
                    "\t<book id=\"10004\">\n" +
                    "\t\t<name>The Dream of Red Mansion</name>\n" +
                    "\t\t<info>Tells the story of the rich second generation's family</info>\n" +
                    "\t</book>\n" +
                    "</books>";
            Document document = Jsoup.parse(str);
            System.out.println(document);
    
            //3.parse (URL, int timeoutmillis): obtain the specified html or xml document object through the network path
            /*URL url = new URL("http://www.kejizhentan.com");//Represents a resource path in the network
            Document document = Jsoup.parse(url, 10000);
            System.out.println(document);*/
        }
    }
    

    b.Document: document object. Represents a dom tree in memory
    Method for obtaining Element object from document object

    • getElementById (String id): get a unique element object according to the id attribute value
    • getElementsByTag (String tagName): get the element object collection according to the tag name
    • getElementsByAttribute (String key): get the element object collection according to the attribute name
    • getElementsByAttributeValue (String key, String value): get the element object collection according to the corresponding attribute name and attribute value

    c. Elements: a collection of Element objects. It can be used as ArrayList < Element >
    d. Element: Method in element object

    1. Get child element object

      • getElementById (String id): get a unique element object according to the id attribute value
      • getElementsByTag (String tagName): get the element object collection according to the tag name
      • getElementsByAttribute (String key): get the element object collection according to the attribute name
      • getElementsByAttributeValue (String key, String value): get the element object collection according to the corresponding attribute name and attribute value
    2. Get property value
      *String attr(String key): get the attribute value according to the attribute name

    3. Get text content
      *String text(): get text content
      *String html(): get all the contents of the tag body (including the string contents of the word tag)

    e. Node: node object
    Is the parent class of Document and Element

    f. Shortcut query method: Based on jsoup
    1. selector: selector
        * method used: Elements select (String cssQuery)
       * syntax: refer to the syntax defined in the Selector class
    2. XPath: XPath is the XML path language. It is a language used to determine the location of a part in XML (a subset of Standard General Markup Language) documents
       * using the Xpath of jsup requires additional import of jar packages. Click to download the relevant jar
       * query w3cshool reference manual, using xpath syntax to complete the query

    The contents of books.xml file are as follows:

    	<?xml version="1.0" encoding="UTF-8"?>
    <books>
    	<book id="10001">
    		<name id = "1111">Journey to the West</name>
    		<info>It tells the story of boss Tang leading three apprentices to start a business</info>
    	</book>
    	<book id="10002">
    		<name id = "kejizhentan">Water Margin</name>
    		<info>It tells the story of three women and 105 men</info>
    	</book>
    	<book id="10003">
    		<name>three countries</name>
    		<info>It tells the story of three bosses' hard work</info>
    	</book>
    	<book id="10004">
    		<name>The Dream of Red Mansion</name>
    		<info>Tells the story of the rich second generation's family</info>
    	</book>
    </books>
    

    Project structure:

    The code is as follows:

    public class TestXpath {
        public static void main(String[] args) throws IOException, XpathSyntaxErrorException {
            //1. Get the path of student.xml
            String path = TestXpath.class.getClassLoader().getResource("books.xml").getPath();
            //2. Get Document object
            Document document = Jsoup.parse(new File(path), "utf-8");
    
            //3. Create a JXDocument object according to the document object
            JXDocument jxDocument = new JXDocument(document);
    
            //4. Query with xpath syntax
            //4.1 query all book Tags
            List<JXNode> jxNodes = jxDocument.selN("//book");
            for (JXNode jxNode : jxNodes) {
                System.out.println(jxNode);
            }
    
            System.out.println("--------------------");
    
            //4.2 query the name tag under all book tags
            List<JXNode> jxNodes2 = jxDocument.selN("//book/name");
            for (JXNode jxNode : jxNodes2) {
                System.out.println(jxNode);
            }
    
            System.out.println("--------------------");
    
            //4.3 query the name tag with id attribute under the book tag
            List<JXNode> jxNodes3 = jxDocument.selN("//book/name[@id]");
            for (JXNode jxNode : jxNodes3) {
                System.out.println(jxNode);
            }
            System.out.println("--------------------");
            //4.4 query the name tag with id attribute under the book tag, and the id attribute value is kenjizhentan
    
            List<JXNode> jxNodes4 = jxDocument.selN("//book/name[@id='kejizhentan']");
            for (JXNode jxNode : jxNodes4) {
                System.out.println(jxNode);
            }
        }
    
    }
    

    The results are as follows:

(5) generate xml documents through java

Guide package required dom4j.jar click download
step

① Create an empty document object (Document) through the document helper

Document doc = DocumentHelper.createDocument();

② Add the root node through the document object and get the newly added root node object

Element root = doc.addElement("root node name");

③ Enrich child nodes through root nodes

Element child node = root.addElement("element name");

④ Create a file output stream for outputting xml documents to a file

OutputStream os = new FileOutputStream("file path");

⑤ Converting a file output stream to an xml document output stream

XMLWriter xw = new XMLWriter(os);

⑥ Write out documents and free resources

xw .write(doc );
xw .close;
os.close;

The project structure is as follows:

The code is as follows:

public class TestXml {
    public static void main(String[] args) {
        //1. Create a document object through the document helper and temporarily store it in memory
        Document doc = DocumentHelper.createDocument();
        //2. Add root node through document object
        Element root = doc.addElement("books");
        //3. Enrich child nodes through root nodes
        for (int i = 0; i < 4; i++) {
            //3.1 add child nodes through the root node
            Element book = root.addElement("book");
            //3.2 add name and info nodes through the book node
            Element name = book.addElement("name");
            Element info = book.addElement("info");
            //3.3 setting the id attribute of book
            book.addAttribute("id",String.valueOf(1001+i));
            switch(i){
                case 0:
                    //3.4 setting the content of the name node
                    name.setText("Journey to the West");
                    //3.5 setting the content of info node
                    info.setText("It tells the story of boss Tang leading three apprentices to start a business");
                    break;
                case 1:
                    name.setText("Water Margin");
                    info.setText("It tells the story of three women and 105 men");
                    break;
                case 2:
                    name.setText("Romance of the Three Kingdoms");
                    info.setText("It tells the story of three bosses' hard work");
                    break;
                default:
                    name.setText("The Dream of Red Mansion");
                    info.setText("Tells the story of the rich second generation's family");
                    break;
            }
        }
        OutputStream os = null;
        XMLWriter xw = null;
        try {
            //4. Create file output stream
          os   = new FileOutputStream("books.xml");
          //5. Convert byte stream to xml document output stream
            xw = new XMLWriter(os);
            //6. Write the doc document in memory to disk and release resources
            xw.write(doc);
        } catch (FileNotFoundException | UnsupportedEncodingException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                if(xw != null){
                    xw.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
            try {
                if(os != null){
                    os.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        System.out.println("xml Document generation completed!!!!!");
    }
}

The generated xml content is as follows:

Keywords: html5 xml html

Added by mattchewone on Tue, 21 Sep 2021 05:54:06 +0300