hashCode() and equals() need to be overridden together

What is the purpose of the hashCode method?

(1) Foreword: to understand the purpose of hashCode, you first need to know about collections in Java.

Two of the most commonly used kinds of collection in Java are List and Set:

  • The elements of a List are ordered and may be repeated;

  • The elements of a Set are, in general, unordered and may not be repeated.

So how do we judge whether two elements are duplicates? With the equals method of Object.

In general, to find out whether a collection contains an object, you take out each element one by one and compare it with the object you are looking for. As soon as the equals method reports that an element is equal to that object, you stop the search and return a positive result; otherwise you return a negative result. If the collection holds many elements, say thousands, and the object is not among them, your program has to fetch thousands of elements and compare them one by one before it can reach a conclusion. Therefore:

  • Hashing was invented to make looking up elements in a collection more efficient;
  • The collection is divided into several storage areas;
  • A hash code can be computed for each object, and hash codes can be grouped;
  • Each group corresponds to one storage area;
  • From an object's hash code, the area where the object should be stored can be determined.

The hashCode method can be understood as follows:

  • By default (Object.hashCode()) it returns an integer that identifies the object. This value is often described as being converted from the object's memory address, although the exact scheme depends on the JVM.

  • When a collection wants to add a new element, it first calls the element's hashCode method, which immediately locates the position where the element should be placed.

    • If there is no element at that position, the new element is stored there directly, without any comparison;

    • If there is already an element at that position, its equals method is called to compare it with the new element:

      • If they are equal, the new element is not stored;
      • If they are different, the new element is stored at another position (HashMap, for example, chains it within the same bucket).
    • In this way the number of actual calls to the equals method is greatly reduced, often to just one or two.
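The lookup procedure described above can be sketched as a tiny bucket-based set. This is only an illustration under simplifying assumptions: the class TinyHashSet, its fixed bucket count and its method names are hypothetical, it does not accept null elements, and it is not how java.util.HashSet is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;

public class TinyHashSetDemo {

    // Hypothetical, simplified set: a fixed number of buckets, each a small list.
    static class TinyHashSet<E> {
        private final List<List<E>> buckets = new ArrayList<>();
        private int size = 0;

        TinyHashSet(int bucketCount) {
            for (int i = 0; i < bucketCount; i++) {
                buckets.add(new ArrayList<>());
            }
        }

        // The hash code selects the storage area; floorMod keeps the index non-negative.
        private List<E> bucketFor(Object o) {
            return buckets.get(Math.floorMod(o.hashCode(), buckets.size()));
        }

        boolean add(E element) {
            List<E> bucket = bucketFor(element);
            // equals() is only called on elements that share the same bucket.
            for (E existing : bucket) {
                if (existing.equals(element)) {
                    return false; // an equal element is already present: do not store it
                }
            }
            bucket.add(element); // empty bucket or no equal element: store it
            size++;
            return true;
        }

        int size() {
            return size;
        }
    }

    public static void main(String[] args) {
        TinyHashSet<Integer> set = new TinyHashSet<>(16);
        for (int i = 0; i < 1000; i++) {
            set.add(i);
        }
        boolean addedAgain = set.add(500);
        System.out.println(set.size()); // 1000
        System.out.println(addedAgain); // false
    }
}
```

With 16 buckets, each add only compares against roughly one sixteenth of the stored elements; real implementations also grow the number of buckets so that each one stays short.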

(2) First, both the equals() and hashCode() methods are inherited from the Object class.

equals()

The equals() method is defined in the Object class as follows:

public boolean equals(Object obj) {
    return (this == obj);
}

Clearly, this compares the addresses of the two objects (that is, it checks whether the two references point to the same object).

But we must be clear that String and the wrapper classes such as Integer and Double have overridden the equals() method of the Object class. For example, the String class in older JDKs (the versions that still had the offset and count fields) defines it as follows:

public boolean equals(Object anObject) {
    if (this == anObject) {
        return true;
    }
    if (anObject instanceof String) {
        String anotherString = (String) anObject;
        int n = count;
        if (n == anotherString.count) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = offset;
            int j = anotherString.offset;
            while (n-- != 0) {
                if (v1[i++] != v2[j++])
                    return false;
            }
            return true;
        }
    }
    return false;
}

Clearly, this compares contents rather than addresses. Likewise, Double, Integer and similar classes override the equals() method to compare contents.

Java's requirements for equals()

We should also note that the Java language places the following requirements on equals(), which must be followed:

  • 1) Symmetry: if x.equals(y) returns "true", then y.equals(x) should also return "true".

  • 2) Reflexivity: x.equals(x) must return "true".

  • 3) Transitivity: if x.equals(y) returns "true" and y.equals(z) returns "true", then x.equals(z) should also return "true".

  • 4) Consistency: if x.equals(y) returns "true", then as long as the contents of x and y remain unchanged, x.equals(y) returns "true" no matter how many times you repeat the call.

  • 5) For any non-null x, x.equals(null) always returns "false"; x.equals(an object of a different type from x) should normally return "false" as well.

The five points above are the rules to follow when overriding the equals() method. If you violate them, unexpected results will occur.
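As a rough illustration, the rules above can be spot-checked for a few concrete objects. The class EqualsContractDemo and the helper obeysContract are hypothetical names used only for this sketch; passing such a check for some sample values does not prove that an equals() implementation is correct in general.

```java
public class EqualsContractDemo {

    // Returns true if the three (non-null) references satisfy the rules listed above.
    static boolean obeysContract(Object x, Object y, Object z) {
        boolean reflexive = x.equals(x);
        boolean symmetric = x.equals(y) == y.equals(x);
        // Transitivity only constrains the case where x equals y and y equals z.
        boolean transitive = !(x.equals(y) && y.equals(z)) || x.equals(z);
        // Consistency: repeating the call on unchanged objects gives the same answer.
        boolean consistent = x.equals(y) == x.equals(y);
        boolean nullSafe = !x.equals(null);
        return reflexive && symmetric && transitive && consistent && nullSafe;
    }

    public static void main(String[] args) {
        String x = new String("abc");
        String y = new String("abc");
        String z = new String("abc");
        System.out.println(obeysContract(x, y, z));       // true
        System.out.println(x.equals(Integer.valueOf(1))); // false: different type
    }
}
```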

hashCode

(3) Secondly, the hashCode() method is defined in the Object class as follows:

public native int hashCode();

This shows that it is a native method, whose implementation is provided by the JVM on the local platform.

Of course, we can override the hashCode() method in our own classes,

  • For example, String, Integer, Double and other classes override the hashCode() method.

For example, the hashCode() method in the String class of older JDKs is defined as follows:

public int hashCode() {
    int h = hash;
    if (h == 0) {
        int off = offset;
        char val[] = value;
        int len = count;
        for (int i = 0; i < len; i++) {
            h = 31*h + val[off++];
        }
        hash = h;
    }
    return h;
}

The API documentation of String describes this computation as:

s[0]*31^(n-1) +
s[1]*31^(n-2) + ... +
s[n-1]

using int arithmetic, where

  • s[i] is the i-th character of the string,
  • n is the length of the string,
  • ^ denotes exponentiation. (The hash code of the empty string is 0.)
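To make the formula concrete, the following sketch evaluates it by hand and compares the result with String.hashCode(). The class StringHashDemo and the method polynomialHash are hypothetical names introduced for this illustration.

```java
public class StringHashDemo {

    // Evaluates s[0]*31^(n-1) + ... + s[n-1] with Horner's rule.
    // int overflow simply wraps around, which is what "int arithmetic" means here.
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(polynomialHash("ab")); // 97*31 + 98 = 3105
        System.out.println("ab".hashCode());      // 3105
        System.out.println(polynomialHash("zhangsan") == "zhangsan".hashCode()); // true
        System.out.println(polynomialHash(""));   // 0
    }
}
```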

hashCode() and equals()

(4) When it comes to hashCode() and equals(), we have to talk about

  • how HashSet, HashMap and Hashtable use them. Please see the following analysis:

HashSet implements the Set interface, which in turn extends the Collection interface; this is a hierarchical relationship. So how does HashSet store and retrieve objects?

  • Duplicate objects are not allowed in a HashSet, and the position of its elements is not fixed.

How does a HashSet determine whether elements are duplicates?

The rule for judging whether two objects are equal is:

  • 1) Check whether the hash codes of the two objects are equal.

    • If they are not equal, the two objects are considered not equal, and the check ends.
    • If they are equal, go to 2).
  • 2) Check whether the two objects are equal using equals().

    • If they are not equal, the two objects are considered not equal.

    • If they are equal, the two objects are considered equal

      • (equals() is the key to judging whether two objects are equal).

Why are there two criteria? Isn't the first one enough?

No, because

  • Two objects whose hashCode() values are equal may still be unequal according to equals(),
  • Therefore, the second criterion is needed to guarantee that only non-duplicate elements are added.
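A well-known concrete case of equal hash codes with unequal objects is the pair of strings "Aa" and "BB". The sketch below (the class name HashCollisionDemo and the helper distinctCount are made up for this example) shows that a HashSet still keeps both, because equals() tells them apart.

```java
import java.util.HashSet;
import java.util.Set;

public class HashCollisionDemo {

    // Adds both strings to a HashSet and reports how many distinct elements remain.
    static int distinctCount(String a, String b) {
        Set<String> set = new HashSet<>();
        set.add(a);
        set.add(b);
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode());           // 65*31 + 97 = 2112
        System.out.println("BB".hashCode());           // 66*31 + 66 = 2112
        System.out.println("Aa".equals("BB"));         // false
        System.out.println(distinctCount("Aa", "BB")); // 2
    }
}
```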

For example, consider the following code:

        String s1 = new String("zhangsan");
        String s2 = new String("zhangsan");

        System.out.println(s1 == s2); // false: two distinct objects

        System.out.println(s1.equals(s2)); // true: same contents

        // s1.hashCode() equals s2.hashCode(); both lines print -1432604556
        System.out.println(s1.hashCode());
        System.out.println(s2.hashCode());

        // true
        System.out.println(s1.equals(s2));

        Set<String> hashset = new HashSet<>();
        hashset.add(s1);
        hashset.add(s2);
        System.out.println(hashset.size()); // 1: s2 is treated as a duplicate of s1

Let's look at a few very simple examples that illustrate some very simple principles.

Example 1:

import java.util.ArrayList;
import java.util.Collection;

public class Point {

    private int x;
    private int y;

    public Point(int x, int y) {
        super();
        this.x = x;
        this.y = y;
    }

    public static void main(String[] args) {
        Point p1 = new Point(3, 3);
        Point p2 = new Point(5, 5);
        Point p3 = new Point(3, 3);

        Collection<Point> collection = new ArrayList<Point>();

        // If you switch to a HashSet, the size is 3:
        // Collection<Point> collection = new HashSet<>();
        // A HashSet does not store duplicates: each element is checked before it is added
        // and skipped if it is already present, so the second p1 is dropped. It is also unordered.

        collection.add(p1);
        collection.add(p2);
        collection.add(p3);
        collection.add(p1);

        // Prints 4, because a List may contain duplicate elements and keeps them in order.
        System.out.println(collection.size());
    }
}

Example 3 (if we need p1 and p3 to be equal, we must override the hashCode() and equals() methods):

Override hashCode() and equals()

@Override
public int hashCode() {
    final int prime = 31;
    int result = 1;
    // Combine both fields: result = 31 * (31 * 1 + x) + y,
    // so two points with the same x and y get the same hash code
    result = prime * result + x;
    result = prime * result + y;
    return result;
}

@Override
public boolean equals(Object obj) {
    // Same reference: return true directly
    if (this == obj) {
        return true;
    }
    // null is never equal to an object
    if (obj == null) {
        return false;
    }
    // Objects of a different class are not equal
    if (getClass() != obj.getClass()) {
        return false;
    }
    // Cast to Point
    final Point other = (Point) obj;
    // Different x: not equal
    if (x != other.x) {
        return false;
    }
    // Different y: not equal
    if (y != other.y) {
        return false;
    }
    // All fields match
    return true;
}

        Collection<Point> collection = new HashSet<>();
        // collection.size() now outputs 2, because p1 and p3 are equal
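For readers who want to run Example 3 as a whole, here is a compact, self-contained variant. It is a sketch rather than the original listing: the wrapper class PointSetDemo and the helper countDistinct are hypothetical, and Objects.hash(x, y) is used as a shorthand that computes the same 31-based combination as the handwritten hashCode() above.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Objects;

public class PointSetDemo {

    static final class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (obj == null || getClass() != obj.getClass()) return false;
            Point other = (Point) obj;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            // Same result as 31 * (31 * 1 + x) + y
            return Objects.hash(x, y);
        }
    }

    // Adds p1, p2, p3 and p1 again to a HashSet and returns the resulting size.
    static int countDistinct() {
        Point p1 = new Point(3, 3);
        Point p2 = new Point(5, 5);
        Point p3 = new Point(3, 3);

        Collection<Point> collection = new HashSet<>();
        collection.add(p1);
        collection.add(p2);
        collection.add(p3); // equal to p1, so it is not stored
        collection.add(p1); // already present
        return collection.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinct()); // 2
    }
}
```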

Example 4 (if we remove the hashCode() method, see below):

System.out.println(collection.size()); // Outputs 3: p1 and p3 are no longer treated as duplicates

// Reason: although p1.equals(p3) is true at this point,
// their hash codes are (almost certainly) different, so they are stored in different areas.
// Two equal objects end up in two different areas, because when p3 is added
// only the area selected by its own hash code is searched before it is put in.
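Example 4 can be reproduced with the following sketch, in which equals() is overridden but hashCode() deliberately is not. The names MissingHashCodeDemo and countDistinct are hypothetical. Because the inherited hashCode() is identity-based, the exact values vary from run to run, so the result is described rather than guaranteed.

```java
import java.util.HashSet;
import java.util.Set;

public class MissingHashCodeDemo {

    // equals() compares the fields, but hashCode() is still the one inherited from Object.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (obj == null || getClass() != obj.getClass()) return false;
            Point other = (Point) obj;
            return x == other.x && y == other.y;
        }
    }

    // Adds all points to a HashSet and returns the resulting size.
    static int countDistinct(Point... points) {
        Set<Point> set = new HashSet<>();
        for (Point p : points) {
            set.add(p);
        }
        return set.size();
    }

    public static void main(String[] args) {
        Point p1 = new Point(3, 3);
        Point p2 = new Point(5, 5);
        Point p3 = new Point(3, 3);

        System.out.println(p1.equals(p3)); // true
        // 3, unless the identity hash codes of p1 and p3 happen to coincide
        System.out.println(countDistinct(p1, p2, p3));
    }
}
```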

Note: to avoid the situation in Example 4, in general,

  • If two instances of a class are equal according to equals(), their hash codes must also be equal. The converse does not hold;
  • Of course, the hashCode method only matters when the object is stored in a hash-based collection. Its purpose is to ensure that equal objects are stored in the same location.

Summary:

(1) Strictly speaking, a class only needs to override the hashCode method when its instances are stored and retrieved by a hash-based collection,

  • But even if the program does not use the hashCode method of the class for the time being,

  • There is no harm in providing one; it may well be needed in the future,

  • Therefore, the hashCode method and the equals method are usually overridden together.

(2) Two objects that are equal according to equals() must have equal hashCode() values;

  • Two objects that are not equal according to equals() do not necessarily have different hashCode() values.

  • In other words, two objects that are not equal according to equals() may still have the same hashCode().

  • Conversely, if the hashCode() values differ, equals() must return false;

    • If the hashCode() values are equal, equals() may return either true or false.

Tips:

(1) Generally speaking, when two instances of a class are compared with the equals method and found equal,

  • Their hash codes must also be equal, but the converse is not true,
  • That is, objects that the equals method reports as unequal can have the same hash code,
  • In other words, two objects with the same hash code may still compare as unequal with equals.

(2) After an object has been stored in a HashSet, you must not modify the fields that take part in computing its hash value,

  • Otherwise, the object's new hash value differs from the hash value under which it was originally stored in the HashSet,

  • In this case, even if the contains method is called with

    • the current reference to that object as its argument,
    • it will report that the object cannot be found. The object also cannot be removed from the HashSet individually, which results in a memory leak,
      • A memory leak means that an object is no longer used but still occupies memory that is never released.
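The second tip can be demonstrated with a short sketch. The class MutablePoint is a hypothetical mutable type written only for this illustration; the behaviour shown is what the standard java.util.HashSet does when an element's hash code changes after insertion.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class MutatedKeyDemo {

    // Hypothetical mutable point whose hash code depends on x and y.
    static final class MutablePoint {
        int x;
        int y;

        MutablePoint(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (obj == null || getClass() != obj.getClass()) return false;
            MutablePoint other = (MutablePoint) obj;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    public static void main(String[] args) {
        Set<MutablePoint> set = new HashSet<>();
        MutablePoint p = new MutablePoint(3, 3);
        set.add(p);

        p.x = 4; // changes the hash code while p is still inside the set

        System.out.println(set.contains(p)); // false: the lookup uses the new hash code
        System.out.println(set.remove(p));   // false: the element cannot be removed this way
        System.out.println(set.size());      // 1: the object is still held by the set
    }
}
```

Making the fields that feed hashCode() final, as in an immutable value class, avoids this problem altogether.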

https://github.com/godmaybelieve

Keywords: Java hashcode HashSet equals

Added by MattMan on Tue, 21 Dec 2021 10:44:31 +0200