Differences in the use of HashSet, TreeSet and LinkedHashSet

We often deal with sets in daily life, but how to choose the corresponding data structure is often unclear. Today, let's take a brief look at the differences between HashSet, TreeSet and LinkedHashSet

We all know that Set sets do not contain duplicate elements, which is an important reason why we choose to use Set sets

Set set has three commonly used implementation classes: HashSet, TreeSet and LinkedHashSet

How to choose which set to use according to the scene is a headache

In short, what you need is a fast set. I suggest you use HashSet,

If you need a sort set, select TreeSet,

If you need a set that can store the insertion order, please use LinkedHashSet.


1. Set interface

The Set interface inherits the Collection interface Duplicate elements are not allowed in the Collection. You can simply add them, and the duplicate elements will be automatically removed.

 

2. HashSet vs. TreeSet vs. LinkedHashSet

HashSet is implemented using hash table, and the elements are unordered. The time complexity of adding and deleting operations is O(1).

The internal structure of TreeSet is a tree structure (red black tree). The elements are ordered. The time complexity of adding and deleting operations is O(log(n)). first(), last(), headSet(), tailSet() and other methods are provided to deal with ordered sets.

LinkedHashSet is between HashSet and TreeSet. It is internally a two-way linked list structure, so its insertion is orderly and its time complexity is O(1).


3. TreeSet example

TreeSet<Integer> tree = new TreeSet<Integer>();
tree.add(12);
tree.add(63);
tree.add(34);
tree.add(45);
 
Iterator<Integer> iterator = tree.iterator();
System.out.print("Tree set data: ");
while (iterator.hasNext()) {
    System.out.print(iterator.next() + " ");
}

The output is ordered:

Tree set data: 12 34 45 63

Now let's define a Dog class as follows:

class Dog {
    int size;
 
    public Dog(int s) {
        size = s;
    }
 
    public String toString() {
        return size + "";
    }
}

Add several dogs to TreeSet:

import java.util.Iterator;
import java.util.TreeSet;
 
public class TestTreeSet {
    public static void main(String[] args) {
        TreeSet<Dog> dset = new TreeSet<Dog>();
        dset.add(new Dog(2));
        dset.add(new Dog(1));
        dset.add(new Dog(3));
 
        Iterator<Dog> iterator = dset.iterator();
 
        while (iterator.hasNext()) {
            System.out.print(iterator.next() + " ");
        }
    }
}

Compilation passed, but an error occurred while running:

Exception in thread "main" java.lang.ClassCastException: collection.Dog cannot be cast to java.lang.Comparable
    at java.util.TreeMap.put(Unknown Source)
    at java.util.TreeSet.add(Unknown Source)
    at collection.TestTreeSet.main(TestTreeSet.java:22)

This is because TreeSet is ordered while Dog class is not. We need to implement Dog class into Comparable interface.

class Dog implements Comparable<Dog>{
    int size;
 
    public Dog(int s) {
        size = s;
    }
 
    public String toString() {
        return size + "";
    }
 
    @Override
    public int compareTo(Dog o) {
            return size - o.size;
    }
}

Output:

1 2 3 

Therefore, when we use TreeSet, the elements in it must be orderly, otherwise we should not choose TreeSet.

4. HashSet example

HashSet<Dog> dset = new HashSet<Dog>();
dset.add(new Dog(2));
dset.add(new Dog(1));
dset.add(new Dog(3));
dset.add(new Dog(5));
dset.add(new Dog(4));
Iterator<Dog> iterator = dset.iterator();
while (iterator.hasNext()) {
   System.out.print(iterator.next() + " ");
}

Output:

5 3 2 1 4 

5. LinkedHashSet example

LinkedHashSet<Dog> dset = new LinkedHashSet<Dog>();
dset.add(new Dog(2));
dset.add(new Dog(1));
dset.add(new Dog(3));
dset.add(new Dog(5));
dset.add(new Dog(4));
Iterator<Dog> iterator = dset.iterator();
while (iterator.hasNext()) {
    System.out.print(iterator.next() + " ");
}

Output in the order of insertion:

2 1 3 5 4 

6. Performance test

public static void main(String[] args) {
 
    Random r = new Random();
 
    HashSet<Dog> hashSet = new HashSet<Dog>();
    TreeSet<Dog> treeSet = new TreeSet<Dog>();
    LinkedHashSet<Dog> linkedSet = new LinkedHashSet<Dog>();
 
    // start time
    long startTime = System.nanoTime();
 
    for (int i = 0; i < 1000; i++) {
        int x = r.nextInt(1000 - 10) + 10;
        hashSet.add(new Dog(x));
    }
    // end time
    long endTime = System.nanoTime();
    long duration = endTime - startTime;
    System.out.println("HashSet: " + duration);
 
    // start time
    startTime = System.nanoTime();
    for (int i = 0; i < 1000; i++) {
        int x = r.nextInt(1000 - 10) + 10;
        treeSet.add(new Dog(x));
    }
    // end time
    endTime = System.nanoTime();
    duration = endTime - startTime;
    System.out.println("TreeSet: " + duration);
 
    // start time
    startTime = System.nanoTime();
    for (int i = 0; i < 1000; i++) {
        int x = r.nextInt(1000 - 10) + 10;
        linkedSet.add(new Dog(x));
    }
    // end time
    endTime = System.nanoTime();
    duration = endTime - startTime;
    System.out.println("LinkedHashSet: " + duration);
 
}

As can be seen from the following output results, HashSet is the fastest.

HashSet: 2244768
TreeSet: 3549314
LinkedHashSet: 2263320

Although the test is not accurate enough, it can be reflected that TreeSet is much slower because it is orderly.

Keywords: Java data structure

Added by dzekic on Sun, 30 Jan 2022 14:11:30 +0200