Why can't real numbers be the key of HashMap?

1. Causes

The reason why I pay attention to this is a question: max-points-on-a-line on niuke.com

The title is described as follows:

Given n points on a 2D plane, find the maximum number of points that lie on the same straight line.

The main idea is to give me the X and Y coordinates of some points, find the line with the most points, and output the number of points on this line.

So I typed out the following code:

import java.util.HashMap;
import java.util.Map;
//class Point {
//    int x;
//    int y;
//    Point(int a, int b) { x = a; y = b; }
//}

public class Solution {
    public int maxPoints(Point[] points) {
        if (points.length <= 2) {
            return points.length;
        }
        int max = 2;
        for (int i = 0; i < points.length - 1; i++) {
            Map<Float, Integer> map = new HashMap<>(16);
            // Record vertical points; The maximum number of points currently on the same line as Points[i]; Points perpendicular to Points[i]
            int ver = 0, cur, dup = 0;
            for (int j = i + 1; j < points.length; j++) {
                if (points[j].x == points[i].x) {
                    if (points[j].y != points[i].y) {
                        ver++;
                    } else {
                        dup++;
                    }
                } else {
                    float d = (float)((points[j].y - points[i].y) / (double) (points[j].x - points[i].x));
                    map.put(d, map.get(d) == null ? 1 : map.get(d) + 1);
                }
            }
            cur = ver;
            for (int v : map.values()) {
                cur = Math.max(v, cur);
            }
            max = Math.max(max, cur + dup + 1);
        }
        return max;
    }
}

In my naive opinion, this code has no problem, but there is no way. After a long time of troubleshooting, I wrote the following code and ran it in the above code

public static void main(String[] args) {
    int[][] vals = {{2,3},{3,3},{-5,3}};
    Point[] points = new Point[3];
    for (int i=0; i<vals.length; i++){
        points[i] = new Point(vals[i][0], vals[i][1]);
    }
    Solution solution = new Solution();
    System.out.println(solution.maxPoints(points));
}

Its output is actually 2

That is, it thinks that (3-3) / (3-2) and (3-3) / (- 5-2) are different? What the hell

After debug ging, it is found that the above results are 0.0 and -0.0 respectively

Isn't 0.0 equal to -0.0?

At this time, I was already in a deep groove, but I still wrote the verification code:

System.out.println(0.0 == -0.0);

The result is True, no problem, I'm messy

The underlying java code must be wrong! I'm right

After another debug, I found this statement:

map.put(d, map.get(d) == null ? 1 : map.get(d) + 1);

I think map Get () is very problematic. Its source code is as follows:

public V get(Object key) {
     Node<K, V> e;
     return (e = getNode(hash(key), key)) == null ? null : e.valu;
}

Well, get hash() first, right? Then I found its hash function:

static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

Again, here is to compare the hashCode of h and key, right? Let's look at the hashCode() function

public native int hashCode();

This is a local method. I can't see the source code. Well, I'll use it to have a look. Just test it. I wrote the following test code:

public static void main(String[] args) {
    System.out.println(0.0 == -0.0);
    System.out.println(new Float(0.0).hashCode() 
                       == new Float(-0.0).hashCode());
}

The results turned out to be True and False!!!

The source is finally found. The hashCode values of 0.0 and - 0.0 are different!

After some modification, I passed this question (in fact, there will be problems with accuracy. BigDecimal should be used, but the requirements of niuke.com are not so high. Later, I thought that the accuracy problem can be completely avoided only by writing the linear equation in the form of Ax+By+C=0)

Next, let's discuss what the hashCode() function of real numbers is like:

2. hashCode () of real number

  • During program execution, as long as the information used by the comparison operation of the equals method is not modified, the same object must be called multiple times, and the hashCode method must always return the same integer.
  • If the two objects are equal according to the equals method comparison, calling the hashCode method of the two objects must return the same integer result.
  • If the two objects are unequal according to the equals method, the hashCode method does not have to return different integers—— <effective java>

Let's see if 0.0 and - 0.0 call the equals method equally:

System.out.println(new Float(0.0).equals(0.0f));
System.out.println(new Float(0.0).equals((float) -0.0));

The output is True and False

Well, the two calls to the equals() method are not equal, which seems to meet the logic in the book

Let's see how the bottom equals function of Float is written:

public boolean equals(Object obj) {
   return (obj instanceof Float) 
          && (floatToIntBits(((Float) obj).value) == floatToIntBits(value));
}

Oh, it turned out that something wonderful happened when converting Float into Bits, so I found the source of everything:

/**
* Returns a representation of the specified floating-point value
* according to the IEEE 754 floating-point "single format" bit
* layout.
*
* <p>Bit 31 (the bit that is selected by the mask
* {@code 0x80000000}) represents the sign of the floating-point
* number.
* Bits 30-23 (the bits that are selected by the mask
* {@code 0x7f800000}) represent the exponent.
* Bits 22-0 (the bits that are selected by the mask
* {@code 0x007fffff}) represent the significand (sometimes called
* the mantissa) of the floating-point number.
*
* <p>If the argument is positive infinity, the result is
* {@code 0x7f800000}.
*
* <p>If the argument is negative infinity, the result is
* {@code 0xff800000}.
*
* <p>If the argument is NaN, the result is {@code 0x7fc00000}.
*
* <p>In all cases, the result is an integer that, when given to the
* {@link #intBitsToFloat(int)} method, will produce a floating-point
* value the same as the argument to {@code floatToIntBits}
* (except all NaN values are collapsed to a single
* "canonical" NaN value).
*
* @param   value   a floating-point number.
* @return the bits that represent the floating-point number.
*/
public static int floatToIntBits(float value) {
    int result = floatToRawIntBits(value);
    // Check for NaN based on values of bit fields, maximum
    // exponent and nonzero significand.
    if (((result & FloatConsts.EXP_BIT_MASK) ==
          FloatConsts.EXP_BIT_MASK) &&
         (result & FloatConsts.SIGNIF_BIT_MASK) != 0)
        result = 0x7fc00000;
    return result;
}

This document is very long. I checked other materials and finally understood it after reading it for a long time

That is, the semantics of Java floating-point numbers generally follow IEEE 754 binary floating-point arithmetic standard. The IEEE 754 standard provides definitions of floating point infinity, negative infinity, negative zero, and NaN (non numeric). In the process of using Java, some special floating-point numbers usually confuse everyone

When a floating-point operation produces a negative floating-point number very close to 0, it will produce "- 0.0", which cannot be represented normally

We can output a wave of 0.0 and -0.0 data:

System.out.println(Float.floatToIntBits((float) 0.0));
System.out.println(Float.floatToIntBits((float) -0.0));
System.out.println(Float.floatToRawIntBits(0.0f));
System.out.println(Float.floatToRawIntBits((float)-0.0));

result:

0
-2147483648
0
-2147483648

In other words, 0x80000000 is used to store - 0.0

This is also the familiar integer MIN\_ VALUE

3. Summary

The representation of floating-point numbers in java is complex, especially involving - 0.0, NaN, positive and negative infinity, so it is not suitable to be used as the key of Map, because it may be inconsistent with what we expected

Keywords: Java HashMap

Added by HTTP-404 on Sat, 15 Jan 2022 11:31:17 +0200