Pit of incr+expire of redis

background


The user needs to perform ocr identification. In order to prevent the interface from being brushed, there is a limit (the number of calls per minute cannot exceed xxx).
After investigation, we decided to use the incr and expire of redis to implement this function

Note: the following code is implemented using golang

First edition code

// Execute ocr call
func (o *ocrSvc)doOcr(ctx context.Context,uid int)(interface,err){
	// If the number of calls exceeds the specified limit, the request will be rejected directly
	ok,err := o.checkMinute(uid)
	if err != nil {
		return nil,err
	}
	if !ok {
		return nil,errors.News("frequently called")
	}
	// Execute third-party ocr calls (pseudo code)
	ocrRes,err := doOcrByThird()
	if err != nil {
		return nil,err
	}
	// If the call is successful, perform the incr operation
	if err := o.redis.Incr(ctx,buildUserOcrCountKey(uid));err!=nil{
	   return nil,err
	}
	return ocrRes,nil
}

// Verify that the number of calls per minute exceeds
func (o *ocrSvc)checkMinute (ctx context.Context,uid int) (bool, error) {
	minuteCount, err := o.redis.Get(ctx, buildUserOcrCountKey(uid))
	if err != nil && !errors.Is(err, eredis.Nil) {
		elog.Error("checkMinute: redis.Get failed", zap.Error(err))
		return false, constx.ErrServer
	}
	if errors.Is(err, eredis.Nil) {
		// Expired, or there is no record of the user's call times (set the initial value to 0 and the expiration time to 1 minute)
		o.redis.Set(ctx, buildUserOcrCountKey(uid),0,time.Minute)
		return true, nil
	}
	// The number of calls per minute has been exceeded
	if cast.ToInt(minuteCount) >= config.UserOcrMinuteCount() {
		elog.Warn("checkMinute: user FrequentlyCalled", zap.Int64("uid", uid), zap.String("minuteCount", minuteCount))
		return false, nil
	}
	return true, nil
}

Explain in detail

I won't say what problems exist in this version of code. You can YY it yourself first


explain:

  1. It is assumed that the current user does not exceed the number of calls during ocr identification. But ttl in redis has 1 second left
  2. Then call the third party ocr to identify.
  3. After successful identification, the number of calls is +1. There may be problems here. For example, when the key expires at incr, how does redis do it? It will set the value of the key to 1, ttl to - 1, ttl to - 1, and ttl to - 1 (say important things three times)
  4. At this time, the bug appears. The user's call times are rising and will not expire. When the critical value is reached, the user's request will be rejected

summary

The above code illustrates a problem, that is, incr and expire must be atomic. The first version of our code obviously does not meet the requirements under boundary conditions, which is likely to cause bug s and affect the user experience. It is strongly not recommended. Next, we introduce the modified code (lua script)

Second edition code

After suffering from the loss of the first version of the code, we decided to execute incr+expire in the lua script. There's not much nonsense. Go straight to the code

// Execute ocr call
func (o *ocrSvc)doOcr(ctx context.Context,uid int)(interface,err){
	// If the number of calls exceeds the specified limit, the request will be rejected directly
	ok,err := o.checkMinute(uid)
	if err != nil {
		return nil,err
	}
	if !ok {
		return nil,errors.News("frequently called")
	}
	// Execute third-party ocr calls (pseudo code)
	ocrRes,err := doOcrByThird()
	if err != nil {
		return nil,err
	}
	// If the call is successful, perform the incr operation
	if err := o.redis.Incr(ctx,buildUserOcrCountKey(uid));err!=nil{
	   return nil,err
	}
	return ocrRes,nil
}

func (b *baiduOcrSvc) incrCount(ctx context.Context, uid int64) error {
   /*
   The role of this lua script:
     The first step is to perform incr operation first
     local current = redis.call('incr',KEYS[1])
     Step 2: look at the ttl of the key
     local t = redis.call('ttl',KEYS[1]); 
     Step 3, if ttl is - 1 (never expires)
     if t == -1 then
         Then reset the expiration time to "one minute"
		 redis.call('expire',KEYS[1],ARGV[1])
	 end;
   */
	script := redis.NewScript(
		`local current = redis.call('incr',KEYS[1]);
		 local t = redis.call('ttl',KEYS[1]); 
		 if t == -1 then
		 	redis.call('expire',KEYS[1],ARGV[1])
		 end;
		 return current
	`)
	var (
		expireTime = 60 // 60 seconds
	)
	_, err := script.Run(ctx, b.redis.Client(), []string{buildUserOcrCountKey(uid)}, expireTime).Result()
	if err != nil {
		return err
	}
	return nil
}

// Verify that the number of calls per minute exceeds
func (o *ocrSvc)checkMinute (ctx context.Context,uid int) (bool, error) {
	minuteCount, err := o.redis.Get(ctx, buildUserOcrCountKey(uid))
	if err != nil && !errors.Is(err, eredis.Nil) {
		elog.Error("checkMinute: redis.Get failed", zap.Error(err))
		return false, constx.ErrServer
	}
	if errors.Is(err, eredis.Nil) {
	    // In the second version of the code, initialization is not performed during check
		// Expired, or there is no record of the user's call times (set the initial value to 0 and the expiration time to 1 minute)
		// o.redis.Set(ctx, buildUserOcrCountKey(uid),0,time.Minute)
		return true, nil
	}
	// The number of calls per minute has been exceeded
	if cast.ToInt(minuteCount) >= config.UserOcrMinuteCount() {
		elog.Warn("checkMinute: user FrequentlyCalled", zap.Int64("uid", uid), zap.String("minuteCount", minuteCount))
		return false, nil
	}
	return true, nil
}

summary

After some tossing and turning, it seems that the most difficult problem has been solved. Let me leave you a question. What other problems do you think exist in the second version of the code? Welcome to leave a message in the comment area

Writing is not easy. Please give me a compliment

Keywords: Go Database Redis lua

Added by aris1234 on Fri, 14 Jan 2022 16:30:01 +0200