A deep dive into String's intern() method

About String's intern() method

  1. The Java language has 8 primitive types plus one special type, String. To make these types faster and more memory-efficient at run time, Java provides constant pools for them. A constant pool is similar to a cache provided at the Java system level.

The constant pools for the eight primitive types are all managed by the system, while the constant pool for String is special. There are two main ways to use it:

  1. String objects declared directly in double quotes are stored directly in the constant pool.
  2. If a String object is not declared in double quotes, you can use the intern() method provided by String. The intern() method looks up the current string in the string constant pool and, if it is not there yet, adds it to the pool.
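
The two ways of getting a string into the pool can be sketched as follows. This is a minimal illustration, not code from the original test; the class name InternUsageSketch and the helper sameObject are made up for the example.

```java
class InternUsageSketch {
    // Hypothetical helper: true when both references point at the same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "hello";             // way 1: a double-quoted literal is pooled
        String built = new String("hello");   // a separate object on the heap, not the pooled one
        String interned = built.intern();     // way 2: ask for the pooled string with the same contents

        System.out.println(sameObject(literal, built));     // false
        System.out.println(sameObject(literal, interned));  // true
    }
}
```

The contents of all three strings are equal; only the object identity differs, which is exactly what == observes.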

The intern() method

  1. Java source
    /**
     * Returns a canonical representation for the string object.
     * <p>
     * A pool of strings, initially empty, is maintained privately by the
     * class {@code String}.
     * <p>
     * When the intern method is invoked, if the pool already contains a
     * string equal to this {@code String} object as determined by
     * the {@link #equals(Object)} method, then the string from the pool is
     * returned. Otherwise, this {@code String} object is added to the
     * pool and a reference to this {@code String} object is returned.
     * <p>
     * It follows that for any two strings {@code s} and {@code t},
     * {@code s.intern() == t.intern()} is {@code true}
     * if and only if {@code s.equals(t)} is {@code true}.
     * <p>
     * All literal strings and string-valued constant expressions are
     * interned. String literals are defined in section 3.10.5 of the
     * <cite>The Java&trade; Language Specification</cite>.
     *
     * @return  a string that has the same contents as this string, but is
     *          guaranteed to be from a pool of unique strings.
     */
    public native String intern();
  2. The source shows that String's intern() is a native method, but the comment is clear: "If the pool already contains an equal string, the string from the pool is returned. Otherwise, this string is added to the pool and a reference to it is returned."
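
The Javadoc above also states that s.intern() == t.intern() is true if and only if s.equals(t) is true. The sketch below checks that property on strings built at run time. The class InternContractSketch and the helper contractHolds are hypothetical names used only for this illustration.

```java
class InternContractSketch {
    // Checks the documented property: s.intern() == t.intern() exactly when s.equals(t).
    static boolean contractHolds(String s, String t) {
        return (s.intern() == t.intern()) == s.equals(t);
    }

    public static void main(String[] args) {
        // Built from char arrays at run time, so none of these is a compile-time constant.
        String s = new String(new char[] {'j', 'v', 'm'});
        String t = new String(new char[] {'j', 'v', 'm'});
        String u = new String(new char[] {'j', 'd', 'k'});

        System.out.println(s == t);                   // false: two distinct objects
        System.out.println(s.intern() == t.intern()); // true: equal contents share one pooled instance
        System.out.println(s.intern() == u.intern()); // false: different contents
        System.out.println(contractHolds(s, t) && contractHolds(s, u)); // true
    }
}
```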

Code Testing and Principle Analysis

Java Test Code

public class StringDemo {
    public static void main(String[] args) {
        String str = "abc";                 // literal, stored in the string constant pool
        String str1 = new String("abc");    // new object on the heap
        String str2 = new String("abc");    // another new object on the heap
        String str3 = "a";
        String str4 = "bc";
        String str5 = str3 + str4;          // concatenated at run time
        System.out.println(str1 == str2);                    // result 1
        System.out.println(str1.equals(str2));               // result 2
        System.out.println(str == str5);                     // result 3
        System.out.println(str == str1.intern());            // result 4
        System.out.println(str1.intern() == str2.intern());  // result 5
    }
}

Return results

false
true
false
true
true

Result analysis

Let's start with the difference between == and the equals() method.

  1. == behaves differently for primitive types and reference types. For primitive types, == compares values. For reference types, == compares the objects' memory addresses, that is, whether both references point to the same object.
  2. The equals() method is defined in the Object class, which is the direct or indirect parent of all classes:
public boolean equals(Object obj) {
     return (this == obj);
}

The equals() method is used in one of two ways:
  1. The class does not override equals(): comparing two objects of the class with equals() uses the default Object.equals() and is equivalent to comparing them with ==.
  2. The class overrides equals(): typically the override compares the attributes of the two objects and returns true if they are equal (that is, the objects are considered equal).
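
As a small illustration of these two cases, here is a sketch with two made-up classes, PlainPoint (no override) and ValuePoint (override that compares fields). Neither class comes from the original article, and hashCode() is overridden alongside equals() only to keep the two consistent.

```java
import java.util.Objects;

class EqualsOverrideSketch {
    // No equals() override: falls back to Object.equals(), i.e. reference comparison.
    static final class PlainPoint {
        final int x, y;
        PlainPoint(int x, int y) { this.x = x; this.y = y; }
    }

    // equals() overridden to compare attributes.
    static final class ValuePoint {
        final int x, y;
        ValuePoint(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof ValuePoint)) return false;
            ValuePoint other = (ValuePoint) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() { return Objects.hash(x, y); }
    }

    public static void main(String[] args) {
        int a = 1000, b = 1000;
        System.out.println(a == b);                                            // true: primitives compare by value
        System.out.println(new PlainPoint(1, 2).equals(new PlainPoint(1, 2))); // false: same as ==
        System.out.println(new ValuePoint(1, 2).equals(new ValuePoint(1, 2))); // true: attributes match
        System.out.println(new ValuePoint(1, 2) == new ValuePoint(1, 2));      // false: still two objects
    }
}
```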

  1. So the first result is false, because str1 and str2 are two distinct objects created with new. Since String overrides the equals() method to compare contents, the second result is true.
    Source of String's overridden equals() method:
    /**
     * Compares this string to the specified object.  The result is {@code
     * true} if and only if the argument is not {@code null} and is a {@code
     * String} object that represents the same sequence of characters as this
     * object.
     *
     * @param  anObject
     *         The object to compare this {@code String} against
     *
     * @return  {@code true} if the given object represents a {@code String}
     *          equivalent to this string, {@code false} otherwise
     *
     * @see  #compareTo(String)
     * @see  #equalsIgnoreCase(String)
     */
    public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }
        if (anObject instanceof String) {
            String anotherString = (String)anObject;
            int n = value.length;
            if (n == anotherString.value.length) {
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = 0;
                while (n-- != 0) {
                    if (v1[i] != v2[i])
                        return false;
                    i++;
                }
                return true;
            }
        }
        return false;
    }

String Constant Pool

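  1. JDK 1.8 runtime data area
    In JDK 1.8, HotSpot's permanent generation (its implementation of the method area) was removed completely. The removal began in JDK 1.7, when the string constant pool was moved into the heap. The permanent generation was replaced by Metaspace, which uses native (direct) memory.
    This is described in Deep Understanding Java Virtual Machines.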
  2. Constant folding
    During compilation, the javac compiler (referred to below simply as the compiler) performs a code optimization called constant folding. Deep Understanding Java Virtual Machines describes it as well:

Constant folding computes the value of a constant expression at compile time and embeds the result as a constant in the generated code. It is one of the few optimizations that javac applies to source code (almost all other code optimizations are done in the just-in-time compiler). For example, for String s = "a" + "bc"; the compiler generates the same code as for String s = "abc";. Not every expression is folded, only those whose values the compiler can determine at compile time:
  1. Literals of the primitive types (byte, boolean, short, char, int, float, long, double) and string literals
  2. final variables of primitive or String type that are initialized with a constant expression
  3. String concatenation with '+', arithmetic operations between primitive types (addition, subtraction, multiplication, division), and shift operations on primitive types (<<, >>, >>>)
Therefore, str, str3, and str4 in the test code all refer to objects in the string constant pool.
The value of a non-final reference is not known at compile time, so the compiler cannot fold it. In JDK 1.8, concatenating object references with '+' is actually implemented with a StringBuilder: append() is called for each operand, and then toString() is called to get a new String object. For str5 in the test code, this is equivalent to:

String str5 = new StringBuilder().append(str3).append(str4).toString();

Therefore, str5 is not an object in the string constant pool, but a new object on the heap.
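
The difference between folded and non-folded concatenation can be observed directly with ==. The following is a minimal sketch; the class ConstantFoldingSketch and its method names are invented for this illustration.

```java
class ConstantFoldingSketch {
    // Both operands are literals: the compiler folds the expression to "abc".
    static boolean literalsAreFolded() {
        String folded = "a" + "bc";
        return folded == "abc";
    }

    // final variables initialized with literals are constants, so a + bc is folded too.
    static boolean finalVariablesAreFolded() {
        final String a = "a";
        final String bc = "bc";
        String joined = a + bc;
        return joined == "abc";
    }

    // Non-final variables: the concatenation happens at run time and yields a new object.
    static boolean nonFinalVariablesAreFolded() {
        String a = "a";
        String bc = "bc";
        String joined = a + bc;
        return joined == "abc";
    }

    public static void main(String[] args) {
        System.out.println(literalsAreFolded());          // true
        System.out.println(finalVariablesAreFolded());    // true
        System.out.println(nonFinalVariablesAreFolded()); // false
    }
}
```

The third method has the same shape as str5 in the test code, which is why str == str5 does not hold there.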

  1. The analysis above explains why the third result is false: str refers to the pooled "abc", while str5 refers to a new object on the heap.

The intern() method

  1. Role of the intern() method: if the string constant pool already contains a string equal to the current one, the pooled string is returned. Otherwise, the current string is added to the pool and a reference to it is returned.

  2. Since the string constant pool lives in the heap in JDK 1.8, we can now explain why results 4 and 5 are printed.

  3. As shown in the figure above, the literal "abc" is already in the string constant pool, so str1.intern() and str2.intern() both return the unique pooled object that str also refers to. Results 4 and 5 are therefore both true.

Keywords: Java Back-end

Added by CKPD on Wed, 12 Jan 2022 19:58:27 +0200