Java local cache framework series: Caffeine, part 1. Introduction and use

Caffeine is a high-performance local cache framework for Java 8 and later. Its structure and API closely follow Guava Cache, so replacing Guava Cache with it is usually straightforward. Caffeine builds on the design of Guava Cache and uses newer Java 8 features to improve performance in some scenarios.

In this chapter we raise a number of questions while walking through how Caffeine is used. Later chapters answer them by analyzing the source code, so that we understand how Caffeine works, can use and tune it better, and can apply the same ideas in our own code.

Let's take a look at the basic use of Caffeine. First, create a cache:

Limit cache size

Caffeine offers two ways to limit the cache size. The two configurations are mutually exclusive and cannot be set at the same time.

1. Create a limited capacity Cache

Cache<String, Object> cache = Caffeine
                            .newBuilder()
                            //Set the maximum number of cached Entries to 1000
                            .maximumSize(1000)
                            .build();

It should be noted that this limitation is not rigid in practice for performance reasons:

  • When the number of cached entries approaches the maximum, the eviction policy starts to run, so entries that are unlikely to be accessed again may be evicted before the maximum capacity is reached.
  • Sometimes more entries are put into the cache before a pending eviction task has completed, so the number of cached entries can exceed the limit for a short time.

If maximumSize is configured, the following maximumWeight and weigher cannot be configured.

2. Create a Cache with user-defined weight limit capacity

Cache<String, List<Object>> stringListCache = Caffeine.newBuilder()
    //The maximum total weight. When the combined weight of all entries approaches this limit, eviction starts and some entries are removed
    .maximumWeight(1000)
    //weight value of each Entry
    .weigher(new Weigher<String, List<Object>>() {
        @Override
        public @NonNegative int weigh(@NonNull String key, @NonNull List<Object> value) {
            return value.size();
        }
    })
    .build();

When the keys or values of your cache are large, you can use this method to control the cache size flexibly. The value above is a list, and the size of the list is the weight of the Entry. When the Weigher always returns 1, maximumWeight is equivalent to maximumSize. Again, for performance reasons, this limit is not rigid.

Here we ask the first question: how are entries stored, and how are they expired or evicted?
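To make the weight budget concrete, here is a minimal sketch in plain Java. It does not use Caffeine: WeightBudgetSketch and its methods are hypothetical names, used only to show how per-entry weights computed by a rule like value.size() add up to the total that is compared against maximumWeight.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Caffeine: it only illustrates how
// per-entry weights add up against a maximumWeight budget.
class WeightBudgetSketch {

    // Same rule as the weigher above: an entry weighs as much as its list is long.
    static int weigh(String key, List<Object> value) {
        return value.size();
    }

    // Sum of the weights of all entries, i.e. the figure a weight-bounded
    // cache keeps below its configured maximum.
    static long totalWeight(Map<String, List<Object>> entries) {
        long total = 0;
        for (Map.Entry<String, List<Object>> e : entries.entrySet()) {
            total += weigh(e.getKey(), e.getValue());
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, List<Object>> entries = new LinkedHashMap<>();
        entries.put("a", Arrays.<Object>asList(1, 2, 3)); // weight 3
        entries.put("b", Arrays.<Object>asList(4, 5));    // weight 2
        System.out.println(totalWeight(entries));
    }
}
```

With maximumWeight(1000) and this rule, the cache could hold, for example, a thousand one-element lists or ten hundred-element lists before eviction starts.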

3. Specify initial size

Cache<String, Object> cache = Caffeine.newBuilder()
    //Specify initial size
    .initialCapacity(1000)
    .build();

As with HashMap, specifying an initial size reduces the cost of later capacity expansion. Do not make this value too large, or memory is wasted.

Here we ask the second question: which storage parameters does this initial size affect?

4. Use non-strong reference types for keys and values

Cache<String, Object> cache = Caffeine.newBuilder()
    // Set key to WeakReference
    .weakKeys()
    .build();
cache = Caffeine.newBuilder()
    // Set key to WeakReference
    .weakKeys()
    // Set value to WeakReference
    .weakValues()
    .build();
cache = Caffeine.newBuilder()
    // Set key to WeakReference
    .weakKeys()
    // Set value to SoftReference
    .softValues()
    .build();

For StrongReference, WeakReference and SoftReference in Java, please refer to another article: JDK core JAVA source code analysis (3) - reference related. Here is a brief summary:

  1. StrongReference: strong references are the ordinary references in program code. Creating an object with new and assigning it to a variable produces a strong reference. As long as an object is reachable through a strong reference, the JVM will not reclaim it; even when memory is insufficient, the JVM would rather throw an OutOfMemoryError than reclaim the object.
  2. SoftReference: soft references describe objects that are useful but not essential. In Java they are represented by the java.lang.ref.SoftReference class. Objects that are only softly reachable are reclaimed before the JVM throws an OutOfMemoryError; if memory is still insufficient after that collection, the OutOfMemoryError is thrown.
  3. WeakReference: weak references also describe non-essential objects, but they are weaker than soft references. An object that is only weakly reachable survives only until the next garbage collection: when the collector runs, it is reclaimed regardless of whether memory is sufficient. In Java they are represented by the java.lang.ref.WeakReference class.
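The reference classes summarized above can be tried out with the JDK alone. The sketch below is only an illustration with hypothetical names (ReferenceStrengthSketch). Because garbage collection timing is not deterministic, it does not wait for the collector; it calls clear() and enqueue() by hand to show what a cleared reference looks like to code that polls a ReferenceQueue.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Hypothetical demo class; it uses only java.lang.ref and no cache library.
class ReferenceStrengthSketch {

    // While a strong reference to the referent exists, get() keeps returning it.
    static boolean reachableWhileStronglyHeld() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        SoftReference<Object> soft = new SoftReference<>(strong);
        return weak.get() == strong && soft.get() == strong;
    }

    // clear() mimics what the collector does to the reference, and enqueue()
    // mimics the notification that lets an owner drop the stale entry.
    static boolean clearedReferenceIsEnqueued() {
        Object referent = new Object(); // held strongly so the GC cannot interfere
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        WeakReference<Object> weak = new WeakReference<>(referent, queue);
        weak.clear();
        boolean enqueued = weak.enqueue();
        Reference<?> polled = queue.poll();
        boolean stillHeld = (referent != null);
        return stillHeld && weak.get() == null && enqueued && polled == weak;
    }

    public static void main(String[] args) {
        System.out.println(reachableWhileStronglyHeld());
        System.out.println(clearedReferenceIsEnqueued());
    }
}
```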

The Key in Caffeine can be held by a WeakReference, but at present it cannot be held by a SoftReference. This raises the third question: why can't the Key be a SoftReference, and why is SoftReference treated differently?

Setting the reference types of Key and Value is also a way to limit the size, but it comes with several restrictions:

  • You can't use Writer together with weakKeys (the fourth question: why can't the two be combined?)
  • You can't use the asynchronous cache (buildAsync) together with weakValues or softValues (the fifth question: why can't the asynchronous cache be combined with weakValues or softValues?)
Generally, we can meet our needs through maximumSize and maximumWeight.

Expiration time settings

1. Custom expiration

Cache<String, Order> cache = Caffeine.newBuilder()
    .expireAfter(new Expiry<String, Order>() {
        //Set expiration time after Entry creation
        //This is set to expire in 60s; the return value is in nanoseconds
        @Override
        public long expireAfterCreate(@NonNull String key, @NonNull Order value, long currentTime) {
            //TimeUnit avoids the int overflow of 1000 * 1000 * 1000 * 60
            return TimeUnit.SECONDS.toNanos(60);
        }

        //Set expiration time after Entry update
        //Return currentDuration here to keep the remaining expiration time unchanged
        @Override
        public long expireAfterUpdate(@NonNull String key, @NonNull Order value, long currentTime, @NonNegative long currentDuration) {
            return currentDuration;
        }

        //Set expiration time after Entry read
        //This is set to expire 60s after createTime of Order (createTime is assumed to be in epoch milliseconds)
        @Override
        public long expireAfterRead(@NonNull String key, @NonNull Order value, long currentTime, @NonNegative long currentDuration) {
            long elapsedNanos = TimeUnit.MILLISECONDS.toNanos(System.currentTimeMillis() - value.createTime());
            return Math.max(0L, TimeUnit.SECONDS.toNanos(60) - elapsedNanos);
        }
    })
    .build();

Set the expiration policy by implementing the Expiry interface. This interface has three methods:

  • Expiration time after Entry creation: the parameters are the Key and Value of the Entry and the current time (the time in the Ticker, not the current system time). Return how long the Entry should live after creation, in nanoseconds.
  • Expiration time after the Entry is updated: the parameters are the Key and Value of the Entry, the current time (the time in the Ticker, not the current system time; if you need the system time, get it yourself) and the current remaining expiration time. Return the new remaining expiration time of the Entry, in nanoseconds. Return currentDuration to leave the remaining time unchanged; to make the Entry effectively never expire, return a very long duration such as Long.MAX_VALUE.
  • Expiration time after the Entry is read: the parameters are the Key and Value of the Entry, the current time (the time in the Ticker, not the current system time; if you need the system time, get it yourself) and the current remaining expiration time. Return the new remaining expiration time of the Entry, in nanoseconds. Return currentDuration to leave the remaining time unchanged; to make the Entry effectively never expire, return a very long duration such as Long.MAX_VALUE.

This configuration is mutually exclusive with the following expireAfterWrite and expireAfterAccess, and cannot be configured at the same time.
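Because these methods return durations in nanoseconds, plain int arithmetic is an easy trap: an expression such as 1000 * 1000 * 1000 * 60 is evaluated with int operands and overflows before it is widened to long. The following sketch, which uses only the JDK, contrasts that with TimeUnit conversions. ExpiryNanosSketch and remainingNanos are hypothetical names, and the creation timestamp is assumed to be in epoch milliseconds.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper showing safe nanosecond arithmetic for Expiry return values.
class ExpiryNanosSketch {

    // All operands are int, so the product wraps around before widening to long.
    static long overflowingSixtySeconds() {
        return 1000 * 1000 * 1000 * 60;
    }

    // TimeUnit performs the conversion in long arithmetic.
    static long sixtySeconds() {
        return TimeUnit.SECONDS.toNanos(60);
    }

    // Remaining lifetime, in nanoseconds, of an entry that must expire
    // ttlSeconds after createdAtMillis; never negative.
    static long remainingNanos(long createdAtMillis, long nowMillis, long ttlSeconds) {
        long elapsedNanos = TimeUnit.MILLISECONDS.toNanos(nowMillis - createdAtMillis);
        return Math.max(0L, TimeUnit.SECONDS.toNanos(ttlSeconds) - elapsedNanos);
    }

    public static void main(String[] args) {
        System.out.println(overflowingSixtySeconds()); // not 60000000000
        System.out.println(sixtySeconds());            // 60000000000
        System.out.println(remainingNanos(0L, 15_000L, 60L)); // 45 seconds in nanoseconds
    }
}
```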

2. Expire after write or update

Cache<String, Object> cache = Caffeine.newBuilder()
    //After one minute of writing or updating, the cache expires and becomes invalid
    .expireAfterWrite(1, TimeUnit.MINUTES)
    .build();

This configuration and the above expireAfter are mutually exclusive, and cannot be configured at the same time

3. Expire after access

Cache<String, Object> cache = Caffeine.newBuilder()
    //After one minute of writing, updating or reading, the cache expires and becomes invalid
    .expireAfterAccess(1, TimeUnit.MINUTES)
    .build();

This configuration and the above expireAfter are mutually exclusive, and cannot be configured at the same time

LoadingCache related

1. Generate LoadingCache

LoadingCache<String, Object> cache = Caffeine.newBuilder()
    //Initialize with CacheLoader
    .build(key -> {
        return loadFromDB(key);
    });

When the Key does not exist or has expired, the CacheLoader is called to load the value for the Key. This raises some questions:

  1. Can the Key be null, and why?
  2. What happens if the CacheLoader throws an exception?
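Before digging into the source, it can help to see how the JDK's own load-through operation, ConcurrentHashMap.computeIfAbsent, handles the same two cases. This is only an analogy with hypothetical names (LoadThroughSketch), not Caffeine's implementation, although Caffeine's Javadoc for Cache.get(key, mappingFunction) describes similar behavior: a null key is rejected, and when the function throws an unchecked exception it is rethrown and no mapping is recorded.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo built on ConcurrentHashMap, used as an analogy for a loading cache.
class LoadThroughSketch {

    // A null key is rejected with a NullPointerException.
    static boolean nullKeyRejected() {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        try {
            map.computeIfAbsent(null, k -> "value");
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    // When the loading function throws, the exception reaches the caller
    // and nothing is stored for the key.
    static boolean failedLoadLeavesNoMapping() {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        boolean thrown = false;
        try {
            map.computeIfAbsent("k", k -> {
                throw new IllegalStateException("simulated database failure");
            });
        } catch (IllegalStateException expected) {
            thrown = true;
        }
        return thrown && !map.containsKey("k");
    }

    public static void main(String[] args) {
        System.out.println(nullKeyRejected());
        System.out.println(failedLoadLeavesNoMapping());
    }
}
```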

2. Set the refresh time

LoadingCache<String, Object> cache = Caffeine.newBuilder()
    //Make entries eligible for reload via CacheLoader 1 minute after write or update; the reload starts when such an entry is next read
    .refreshAfterWrite(1, TimeUnit.MINUTES)
    //Initialize with CacheLoader
    .build(key -> {
        return loadFromDB(key);
    });

Note that if this configuration is set, only a LoadingCache can be generated, through build(CacheLoader); an ordinary Cache cannot be built.

Additional configuration

1. Statistical records

Cache<String, Object> cache = Caffeine.newBuilder()
    //Enable statistics collection
    .recordStats().build();
Cache<String, Object> customStatsCache = Caffeine.newBuilder()
    //Custom statistics collector
    .recordStats(() -> new StatsCounter() {
        @Override
        public void recordHits(@NonNegative int count) {
            
        }
    
        @Override
        public void recordMisses(@NonNegative int count) {
    
        }
    
        @Override
        public void recordLoadSuccess(@NonNegative long loadTime) {
    
        }
    
        @Override
        public void recordLoadFailure(@NonNegative long loadTime) {
    
        }
    
        @Override
        public void recordEviction() {
    
        }
    
        @Override
        public void recordEviction(@NonNegative int weight) {
            
        }
    
        @Override
        public void recordEviction(@NonNegative int weight, RemovalCause cause) {
    
        }
    
        @Override
        public @NonNull CacheStats snapshot() {
            //Return a real snapshot of the counters here; an empty one is used as a placeholder
            return CacheStats.empty();
        }
    }).build();

Here we ask two questions:

  1. Does the default statistics collection affect performance?
  2. What data is collected?

2. Callback after an Entry expires and is removed

Cache<String, Object> cache = Caffeine
    .newBuilder()
    .removalListener((key, value, cause) -> {
        log.info("{}, {}, {}", key, value, cause);
    })
    .build();

The callback has three parameters: the Key of the Entry, the Value of the Entry and the reason for removal. The reason is an enumeration type:

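public enum RemovalCause {
    //The entry was manually removed by the user
    EXPLICIT {
        @Override
        public boolean wasEvicted() {
            return false;
        }
    },
    //The entry's value was replaced by the user
    REPLACED {
        @Override
        public boolean wasEvicted() {
            return false;
        }
    },
    //The entry's key or value was garbage collected
    COLLECTED {
        @Override
        public boolean wasEvicted() {
            return true;
        }
    },
    //The entry's expiration time has passed
    EXPIRED {
        @Override
        public boolean wasEvicted() {
            return true;
        }
    },
    //The entry was evicted because of the size limit
    SIZE {
        @Override
        public boolean wasEvicted() {
            return true;
        }
    };

    //Whether the removal was an automatic eviction rather than an explicit user action
    public abstract boolean wasEvicted();
}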

Here is another question: which API operations lead to each removal cause?

3. Cache actively updates other storage or resources

We can also set a Writer to propagate updates of the cache to other storage, such as a database:

Cache<String, Object> cache = Caffeine.newBuilder()
    .writer(new CacheWriter<String, Object>() {
        @Override
        public void write(@NonNull String key, @NonNull Object value) {
            //Called back here when the cache is updated (including creation and modification, excluding load)
            //Database update
            db.upsert(key, value);
        }

        @Override
        public void delete(@NonNull String key, @Nullable Object value, @NonNull RemovalCause cause) {
            //Called back here when an entry is removed from the cache (for any reason)
            //Database update
            db.markAsDeleted(key, value);
        }
    })
    .build();

This raises the following questions:

  • What happens if the callback throws an exception?
  • Which APIs trigger write and which trigger delete?

Asynchronous cache

1. Generate asynchronous cache

AsyncCache<String, Object> cache = Caffeine.newBuilder()
    //Generate asynchronous cache
    .buildAsync();

In this cache, the Value obtained is a CompletableFuture.

2. Generate asynchronous LoadingCache
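To show what working with future-valued entries looks like, here is a small JDK-only sketch. It stores CompletableFuture values in a ConcurrentHashMap so that concurrent callers of the same key share one in-flight load. FutureValueSketch and its load method are hypothetical and say nothing about how Caffeine implements AsyncCache; the point is only how a caller consumes a CompletableFuture value.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a cache whose values are futures.
class FutureValueSketch {
    private final ConcurrentMap<String, CompletableFuture<String>> futures = new ConcurrentHashMap<>();
    final AtomicInteger loads = new AtomicInteger();

    // Hypothetical slow lookup standing in for a database call.
    private String load(String key) {
        loads.incrementAndGet();
        return "value-of-" + key;
    }

    // Returns immediately; the value is computed on another thread and the
    // same future is handed to every caller asking for the same key.
    CompletableFuture<String> get(String key) {
        return futures.computeIfAbsent(key, k -> CompletableFuture.supplyAsync(() -> load(k)));
    }

    public static void main(String[] args) {
        FutureValueSketch cache = new FutureValueSketch();
        // Compose on the future without blocking, then block only at the end.
        CompletableFuture<Integer> length = cache.get("a").thenApply(String::length);
        System.out.println(length.join());
    }
}
```

Callers can either chain further stages with thenApply or thenCompose, or block with join() when they need the value.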

AsyncLoadingCache<String, Object> cache = Caffeine.newBuilder()
    //Generate asynchronous loading cache
    .buildAsync(key -> {
        return loadFromDB(key);
    });

3. Set asynchronous task thread pool

AsyncCache<String, Object> cache = Caffeine.newBuilder()
    .executor(new ForkJoinPool(10))
    //Generate asynchronous cache
    .buildAsync();

Here we ask the following questions:

  1. In an asynchronous cache, which operations are asynchronous?
  2. What is the default thread pool for these asynchronous tasks?
  3. How are exceptions in asynchronous tasks handled?

Now that we have covered creating caches, let's take a look at using them:

Cache<String, String> syncCache = Caffeine.newBuilder().build();
//Put an entry into the cache
syncCache.put(key, value);
//Put entries in bulk
syncCache.putAll(keyValueMap);
//Read from the cache. If the key is absent, run the mappingFunction and put its result into the cache
syncCache.get(key, k -> {
    return readFromOther(k);
});
//Bulk read; the mappingFunction is called with the keys that are absent
syncCache.getAll(keys, ks -> {
    return readFromOther(ks);
});
//Get the cache's configuration and policy information
Policy<String, String> policy = syncCache.policy();
//Get statistics; statistics recording must be turned on
CacheStats stats = syncCache.stats();
//Get the value for a key, or null if it does not exist
syncCache.getIfPresent(key);
//View the cache as a ConcurrentMap; changes to the map affect the cache
ConcurrentMap<@NonNull String, @NonNull String> map = syncCache.asMap();
//Invalidate a key
syncCache.invalidate(key);
//Invalidate all keys
syncCache.invalidateAll();
//Invalidate keys in bulk
syncCache.invalidateAll(keys);
//Estimated size
@NonNegative long estimatedSize = syncCache.estimatedSize();
//Run any pending maintenance such as expiration cleanup, leaving the cache in a stable state
syncCache.cleanUp();

Only the synchronous cache is covered here. The API of the asynchronous cache is similar, except that the values are wrapped in CompletableFuture.

In the next chapter, we will study the source code, implementation principles and ideas of Caffeine.


Added by paparanch on Fri, 24 Apr 2020 12:56:46 +0300