Spring Cloud Hystrix circuit breaker

1. Overview

1.1 Problems faced by distributed systems
Applications in a complex distributed architecture have dozens of dependencies, and each dependency will inevitably fail at some point. This creates the possibility of a service avalanche. So what is a service avalanche?
When microservices call one another, suppose microservice A calls microservices B and C, which in turn call other microservices. This is the so-called "fan-out" (like an open folding fan). If a microservice on the fan-out link responds too slowly or becomes unavailable, calls to microservice A occupy more and more system resources until the system crashes. This is the so-called "avalanche effect": the high availability of the system is destroyed.
For high-traffic applications, a single failing back-end dependency can saturate all resources on all servers within seconds. Worse than outright failure, such a dependency can also increase latency between services and back up queues, threads and other system resources, causing further cascading failures across the whole system. All of this shows the need to isolate and manage failures and latency so that the failure of a single dependency cannot take down the entire application or system.
In short, when an instance of a module fails but the module keeps accepting traffic, and the faulty module in turn calls other modules, the result is a cascading failure, or avalanche. We address this problem with service degradation and service circuit breaking (also called service fusing).
1.2 What is Hystrix?
Hystrix is an open-source library for handling latency and fault tolerance in distributed systems. In a distributed system, calls to dependencies will inevitably fail at times, for example through timeouts or exceptions. Hystrix ensures that the failure of one dependency does not bring down the whole service, which avoids cascading failures and improves the resilience of the distributed system.
The "circuit breaker" itself is a kind of switching device. When a service unit fails, the circuit breaker's fault monitoring (similar to a blown fuse in an electrical circuit) returns an expected, manageable alternative response (fallback) to the caller, instead of making the caller wait for a long time or throwing an exception it cannot handle. This ensures that the caller's threads are not tied up unnecessarily for long periods, which prevents faults from spreading through the distributed system and turning into an avalanche.
1.3 What can Hystrix do?
Its main features are service degradation, service circuit breaking, near-real-time monitoring, flow limiting and isolation; see the official documentation for details. Hystrix is no longer under active development and is in maintenance mode. Although alternatives exist, learning Hystrix and the ideas behind it is still very worthwhile.

2. Key concepts of Hystrix

2.1 Service degradation (fallback)
Suppose service B, which microservice A needs to call, is unavailable. Service B must then provide a fallback rather than leave service A waiting indefinitely. Do not make the client wait: return a friendly prompt immediately, for example "the server is busy, please try again later". What conditions trigger service degradation?
Service degradation is triggered by a program exception, a timeout, an open circuit breaker, or a full thread pool or semaphore.
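The idea of degrading instead of failing can be sketched in plain Java without any Hystrix classes. This is only an illustration of the concept for the exception case; the class name FallbackSketch and the helper callWithFallback are hypothetical and are not part of Hystrix.

```java
import java.util.function.Supplier;

/** Minimal sketch of service degradation: answer with a fallback instead of propagating a failure. */
final class FallbackSketch {

    /** Runs the primary call; if it throws, returns the fallback value instead. */
    static <T> T callWithFallback(Supplier<T> primary, Supplier<T> fallback) {
        try {
            return primary.get();
        } catch (RuntimeException e) {
            // Degrade: the caller gets a prompt, friendly answer rather than an exception.
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        String ok = callWithFallback(
                () -> "payment ok",
                () -> "The server is busy, please try again later");
        String degraded = callWithFallback(
                () -> { throw new IllegalStateException("service B unavailable"); },
                () -> "The server is busy, please try again later");
        System.out.println(ok);
        System.out.println(degraded);
    }
}
```

Hystrix applies the same principle declaratively and additionally covers timeouts, an open circuit and a full thread pool or semaphore.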
2.2 Service circuit breaking (break)
Service circuit breaking is analogous to a fuse in an electrical circuit. When the service reaches its maximum load, access is denied directly, as if the fuse had blown; the service degradation method is then called and a friendly prompt is returned.
2.3 Service flow limiting (flow limit)
This applies to flash sales and other high-concurrency operations. Requests are not allowed to flood in all at once; they queue up and are processed in an orderly manner, N per second.
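The "N per second" idea can be illustrated with a small fixed-window counter. This is a generic sketch, not how Hystrix works internally (Hystrix limits concurrency through thread pools and semaphores rather than a per-second counter). The class FixedWindowRateLimiter and its injectable clock are assumptions made for this example.

```java
import java.util.function.LongSupplier;

/** Minimal fixed-window rate limiter: at most `limit` requests per one-second window. */
final class FixedWindowRateLimiter {
    private final int limit;
    private final LongSupplier clockMillis; // injectable clock so the behaviour can be tested
    private long windowStart;
    private int used;

    FixedWindowRateLimiter(int limit, LongSupplier clockMillis) {
        this.limit = limit;
        this.clockMillis = clockMillis;
        this.windowStart = clockMillis.getAsLong();
    }

    /** Returns true if the request may proceed, false if it should be rejected or queued. */
    synchronized boolean tryAcquire() {
        long now = clockMillis.getAsLong();
        if (now - windowStart >= 1000) { // a new one-second window begins
            windowStart = now;
            used = 0;
        }
        if (used < limit) {
            used++;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        FixedWindowRateLimiter limiter = new FixedWindowRateLimiter(2, System::currentTimeMillis);
        for (int i = 1; i <= 3; i++) {
            System.out.println("request " + i + " allowed: " + limiter.tryAcquire());
        }
    }
}
```

A rejected request would typically be answered by the service degradation method described in 2.1.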

3. Hystrix case practice

3.1 Construction
Create a new module, cloud-provider-hystrix-payment8001, as the service provider microservice. As with the earlier modules, it uses port 8001, but the Hystrix dependency must be added to the POM file:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <parent>
        <artifactId>cloud2020</artifactId>
        <groupId>com.atguigu.springcloud</groupId>
        <version>1.0-SNAPSHOT</version>
    </parent>
    <modelVersion>4.0.0</modelVersion>

    <artifactId>cloud-provider-hystrix-payment8001</artifactId>
    <dependencies>
        <!--hystrix-->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
        </dependency>
        <!--eureka client-->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
        </dependency>
        <dependency>
            <groupId>com.atguigu.springcloud</groupId>
            <artifactId>cloud-api-common</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <!--monitor-->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-actuator</artifactId>
        </dependency>
        <!--Hot deployment-->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-devtools</artifactId>
            <scope>runtime</scope>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>

Then write its configuration file, which sets the port to 8001 and the service name to cloud-provider-hystrix-payment and registers the service with the service registry:

server:
  port: 8001
spring:
  application:
    name: cloud-provider-hystrix-payment
eureka:
  client:
    register-with-eureka: true
    fetch-registry: true
    service-url:
      defaultZone: http://eureka7001.com:7001/eureka

Then write its main startup class:

package com.atguigu.springcloud;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.eureka.EnableEurekaClient;

/**
 * @create 2021-02-05 13:11
 */
@SpringBootApplication
@EnableEurekaClient
public class PaymentHystrixMain8001 {
    public static void main(String[] args) {
        SpringApplication.run(PaymentHystrixMain8001.class, args);
    }
}

Then write its business class. The service is written as follows:

package com.atguigu.springcloud.service;

import org.springframework.stereotype.Service;

import java.util.concurrent.TimeUnit;

/**
 * @create 2021-02-05 13:20
 */
@Service
public class PaymentService {

    /**
     * Normal access
     * @param id
     * @return
     */
    public String paymentInfo_OK(Long id) {
        return "Thread pool: " + Thread.currentThread().getName() + " paymentInfo_OK, id: " + id;
    }

    /**
     * It takes 3 seconds to simulate complex business
     * @param id
     * @return
     */
    public String paymentInfo_TimeOut(Long id) {
        int time = 3;
        //Pause the thread for a few seconds. There is no error in the program itself, that is, the simulation timeout
        try {
            TimeUnit.SECONDS.sleep(time);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return "Thread pool: " + Thread.currentThread().getName() + " paymentInfo_TimeOut, id: " + id;
    }

}

The Controller is written as follows:

package com.atguigu.springcloud.controller;

import com.atguigu.springcloud.service.PaymentService;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

/**
 * @create 2021-02-05 13:26
 */
@RestController
@Slf4j
public class PaymentController {

    @Autowired
    private PaymentService paymentService;

    @Value("${server.port}")
    private String serverPort;

    @GetMapping("/payment/hystrix/ok/{id}")
    public String paymentInfo_OK(@PathVariable("id") Long id) {
        String result = paymentService.paymentInfo_OK(id);
        log.info("========result:" + result);
        return result;
    }

    @GetMapping("/payment/hystrix/timeout/{id}")
    public String paymentInfo_TimeOut(@PathVariable("id") Long id) {
        String result = paymentService.paymentInfo_TimeOut(id);
        log.info("========result:" + result);
        return result;
    }
}

That is, the cloud-provider-hystrix-payment service provides two methods. The paymentInfo_OK method returns immediately, while paymentInfo_TimeOut simulates complex business logic by sleeping the thread so that the method takes 3 seconds to execute.
After starting the registry and the 8001 service, we access paymentInfo_OK (referred to as OK below) and paymentInfo_TimeOut (referred to as TO below) in turn. We find that
http://localhost:8001/payment/hystrix/ok/31 responds immediately, while each request to http://localhost:8001/payment/hystrix/timeout/31 takes about 3 seconds.
3.2 High-concurrency test
3.2.1 Stress test on the service provider itself
While the TO service, with its complex business logic, takes 3 seconds per request, the fast OK service can still be accessed normally. But under high concurrency, that is, when TO receives a large number of requests, can OK still be accessed normally? To find out, we use JMeter to run a high-concurrency stress test that sends 20000 requests to the TO service. In JMeter we create a new thread group named "Test Hystrix" to simulate the high-concurrency access.
We then add an HTTP request to this thread group that targets the TO service and start the stress test.
What happens when we access the OK service again?
Test result:
The OK service can no longer be accessed as quickly as before. Here we simulate only 20000 requests (we did not dare simulate more for fear of bringing the system down); in practice there may be far more, and with more requests the service may even hang. The reason is that Tomcat's default pool of worker threads is fully occupied, so no spare threads are left to absorb the load and process other requests.
The stress test above was run only against the service provider 8001 itself. If an external service consumer (80) accesses the service at this point, it can only wait. The consumer will clearly be dissatisfied with such waiting times, and the service provider is likely to be dragged down. We have seen that testing 8001 on its own already causes problems. What happens if we test through the service consumer?
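Why slow requests hurt an unrelated fast endpoint can be reproduced in miniature with a shared fixed-size thread pool. The sketch below uses only the JDK; the pool stands in for Tomcat's worker threads, and the class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Sketch: slow requests that occupy every worker thread delay an unrelated fast request. */
final class SharedPoolStarvation {

    /** Returns true if the fast task was still waiting while slow tasks held all workers. */
    static boolean fastTaskStarved(int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers); // stands in for Tomcat's workers
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < workers; i++) {
                // "TO" requests: each one blocks a worker until released
                pool.submit(() -> { release.await(); return "slow done"; });
            }
            Future<String> fast = pool.submit(() -> "ok"); // "OK" request, queued behind them
            boolean starved = !fast.isDone();
            release.countDown();           // the slow work finishes
            fast.get(5, TimeUnit.SECONDS); // only now can the fast request run
            return starved;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            release.countDown();
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("fast request starved: " + fastTaskStarved(2));
    }
}
```

Isolating slow dependencies in their own thread pool or behind a semaphore, as Hystrix does, keeps them from consuming the workers that other endpoints need.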
3.2.2 Stress test through the service consumer
Create a new module, cloud-consumer-feign-hystrix-order80, as the service consumer, and modify its POM:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <parent>
        <artifactId>cloud2020</artifactId>
        <groupId>com.atguigu.springcloud</groupId>
        <version>1.0-SNAPSHOT</version>
    </parent>
    <modelVersion>4.0.0</modelVersion>

    <artifactId>cloud-consumer-feign-hystrix-order80</artifactId>

    <dependencies>
        <!--hystrix-->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
        </dependency>
        <!--openfeign-->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-openfeign</artifactId>
        </dependency>
        <!--eureka client-->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
        </dependency>
        <dependency>
            <groupId>com.atguigu.springcloud</groupId>
            <artifactId>cloud-api-common</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <!--monitor-->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-actuator</artifactId>
        </dependency>
        <!--Hot deployment-->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-devtools</artifactId>
            <scope>runtime</scope>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>

</project>

YML:

server:
  port: 80
eureka:
  client:
    register-with-eureka: false
    fetch-registry: true
    service-url:
      defaultZone: http://eureka7001.com:7001/eureka

Main startup class:

package com.atguigu.springcloud;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.eureka.EnableEurekaClient;
import org.springframework.cloud.openfeign.EnableFeignClients;

/**
 * @create 2021-02-05 14:48
 */
@SpringBootApplication
@EnableEurekaClient
@EnableFeignClients
public class OrderHystrixMain80 {
    public static void main(String[] args) {
        SpringApplication.run(OrderHystrixMain80.class, args);
    }
}

The service consumer uses Feign to access the service offered by the provider. The corresponding service interface is written as follows:

package com.atguigu.springcloud.service;

import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.stereotype.Component;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;

/**
 * @create 2021-02-05 14:54
 */
@Component
@FeignClient("CLOUD-PROVIDER-HYSTRIX-PAYMENT")
public interface PaymentHystrixService {

    @GetMapping("/payment/hystrix/ok/{id}")
    public String paymentInfo_OK(@PathVariable("id") Long id);

    @GetMapping("/payment/hystrix/timeout/{id}")
    public String paymentInfo_TimeOut(@PathVariable("id") Long id);
}

Then write its Controller:

package com.atguigu.springcloud.controller;

import com.atguigu.springcloud.service.PaymentHystrixService;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

/**
 * @create 2021-02-05 14:57
 */
@RestController
@Slf4j
public class OrderHystrixController {

    @Autowired
    private PaymentHystrixService paymentHystrixService;

    @GetMapping("/consumer/payment/hystrix/ok/{id}")
    public String paymentInfo_OK(@PathVariable("id") Long id) {
        String result = paymentHystrixService.paymentInfo_OK(id);
        return result;
    }

    @GetMapping("/consumer/payment/hystrix/timeout/{id}")
    public String paymentInfo_TimeOut(@PathVariable("id") Long id) {
        String result = paymentHystrixService.paymentInfo_TimeOut(id);
        return result;
    }
}

Start the 80 service and access the provider's OK service through http://localhost/consumer/payment/hystrix/ok/1 , then run the stress test again. As before, the service can no longer be accessed quickly, and if the stress test uses more threads, a timeout error is likely to appear.
Cause of the fault: the other endpoints of 8001 are starved because the worker threads in the Tomcat thread pool are all occupied, so when 80 calls 8001 at this point, the client inevitably gets a slow response. It is precisely because of this behavior that we need techniques such as service degradation, fault tolerance and service flow limiting.
How do we solve this? First the requirements, then the solutions.
Requirements:
Timeouts slow the server down (the browser keeps spinning): stop waiting once a call times out
Errors (downtime or a runtime error in the program): there must be a clear fallback response
Solutions:
The other party's service (8001) times out: the caller (80) cannot wait forever, so there must be service degradation
The other party's service (8001) is down: the caller (80) cannot wait forever, so there must be service degradation
The other party's service (8001) is fine, but the caller (80) fails itself or has stricter requirements (it is willing to wait less time than the provider takes): the caller handles the degradation itself
3.3 Service degradation (fallback)
3.3.1 Service degradation on the service provider
Degradation is configured with the @HystrixCommand annotation. The service provider sets a timeout limit for its own calls: within the limit the method runs normally, and if the limit is exceeded, a fallback method handles the service degradation.
First, add @HystrixCommand to the business class of the service provider to define how failures are handled. Once the service method call fails, whether by timing out or by throwing an exception, the fallbackMethod named in @HystrixCommand is called automatically. In the provider's service class, we modify the TO method:

/**
 * Simulates complex business that takes 5 seconds
 * @HystrixCommand: normal logic if the call completes within 3 seconds, fallback logic if it takes longer
 * @param id
 * @return
 */
@HystrixCommand(fallbackMethod = "paymentInfo_TimeOutHandler",commandProperties = {
        @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "3000")
})
public String paymentInfo_TimeOut(Long id) {
    int time = 5;
    //Pause the thread for a few seconds. There is no error in the program itself, that is, the simulation timeout
    try {
        TimeUnit.SECONDS.sleep(time);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    return "Thread pool: " + Thread.currentThread().getName() + " paymentInfo_TimeOut, id: " + id;
}

// Fallback method (last-resort response)
public String paymentInfo_TimeOutHandler(Long id) {
    return "The system is busy, please try again later";
}

Then add the @EnableCircuitBreaker annotation to the main startup class to activate the circuit breaker:

package com.atguigu.springcloud;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.circuitbreaker.EnableCircuitBreaker;
import org.springframework.cloud.netflix.eureka.EnableEurekaClient;

/**
 * @create 2021-02-05 13:11
 */
@SpringBootApplication
@EnableEurekaClient
@EnableCircuitBreaker
public class PaymentHystrixMain8001 {
    public static void main(String[] args) {
        SpringApplication.run(PaymentHystrixMain8001.class, args);
    }
}

The TO service now takes 5 seconds, while the timeout limit configured with Hystrix is 3 seconds. When the service times out or fails, the fallbackMethod we configured is called instead. Accessing the TO service again at http://localhost:8001/payment/hystrix/timeout/31 , we find that the method executed is indeed the service degradation method.
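The timeout-then-fallback behavior can be imitated with the JDK alone, which may help when reasoning about what the annotation does. This is a simplified sketch under our own assumptions, not Hystrix's implementation; TimeoutFallbackSketch and callWithTimeout are hypothetical names, and the durations are shortened so the example runs quickly.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Sketch: run a call on a separate thread and fall back if it exceeds its time limit. */
final class TimeoutFallbackSketch {

    static String callWithTimeout(Callable<String> task, long timeoutMillis, String fallback) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        Future<String> future = worker.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);                // stop waiting and interrupt the slow call
            return fallback;
        } catch (ExecutionException e) {
            return fallback;                    // the call itself threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return fallback;
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) {
        String slow = callWithTimeout(() -> { Thread.sleep(500); return "paid"; }, 100,
                "The system is busy, please try again later");
        String fast = callWithTimeout(() -> "paid", 1000,
                "The system is busy, please try again later");
        System.out.println(slow);
        System.out.println(fast);
    }
}
```

As in the Hystrix configuration above, the caller's waiting time is bounded by the timeout rather than by how long the slow method actually takes.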

3.3.2 Service degradation on the service consumer (client)
Since the service provider can protect itself with degradation, the service consumer can protect itself in the same way. In other words, Hystrix service degradation can be placed on either the server (service provider) or the client (service consumer), but it is generally applied on the client. Next, we configure degradation protection on the service consumer, that is, the client. Modify the configuration file of the 80 consumer and add the following configuration to enable Hystrix support in Feign:

feign:
  hystrix:
    enabled: true

Add @EnableCircuitBreaker to the main startup class of the 80 consumer to activate Hystrix (@EnableHystrix, which is itself annotated with @EnableCircuitBreaker, works as well).

package com.atguigu.springcloud;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.circuitbreaker.EnableCircuitBreaker;
import org.springframework.cloud.netflix.eureka.EnableEurekaClient;
import org.springframework.cloud.openfeign.EnableFeignClients;

/**
 * @create 2021-02-05 14:48
 */
@SpringBootApplication
@EnableEurekaClient
@EnableFeignClients
@EnableCircuitBreaker
public class OrderHystrixMain80 {
    public static void main(String[] args) {
        SpringApplication.run(OrderHystrixMain80.class, args);
    }
}

Then add the @HystrixCommand annotation in the Controller of 80 to implement service degradation:

@GetMapping("/consumer/payment/hystrix/timeout/{id}")
@HystrixCommand(fallbackMethod = "paymentInfo_TimeOutHandler", commandProperties = {
        @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "1500")
})
public String paymentInfo_TimeOut(@PathVariable("id") Long id) {
    String result = paymentHystrixService.paymentInfo_TimeOut(id);
    return result;
}


/**
 * Fallback method
 * @param id
 * @return
 */
public String paymentInfo_TimeOutHandler(Long id) {
    return "80 The system is busy, please try again later";
}

In other words, if a call from the consumer to the service provider takes longer than 1.5 seconds, the consumer falls back to its own degradation method. Test it at http://localhost/consumer/payment/hystrix/timeout/31 .

3.3.3 Unified global service degradation method
The current approach has two problems. First, every business method has its own fallback method, which leads to code bloat; we should instead define a unified fallback method, separate from the custom ones. Second, the fallback methods are mixed in with the business logic, which makes the code confusing and the business logic unclear.
For the first problem, we can put the @DefaultProperties(defaultFallback = "") annotation on the controller class to configure a global service degradation method. Methods that declare their own fallback with @HystrixCommand(fallbackMethod = "") use that fallback; methods that do not use the global one configured by @DefaultProperties(defaultFallback = ""). This separates the general fallback from the dedicated ones, avoids code bloat and reduces the amount of code. Modify the Controller of the service consumer 80 as follows:

package com.atguigu.springcloud.controller;

import com.atguigu.springcloud.service.PaymentHystrixService;
import com.netflix.hystrix.contrib.javanica.annotation.DefaultProperties;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixProperty;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

/**
 * @create 2021-02-05 14:57
 */
@RestController
@Slf4j
@DefaultProperties(defaultFallback = "payment_Global_FallbackMethod") //Global service degradation
public class OrderHystrixController {

    @Autowired
    private PaymentHystrixService paymentHystrixService;

    @GetMapping("/consumer/payment/hystrix/ok/{id}")
    @HystrixCommand
    public String paymentInfo_OK(@PathVariable("id") Long id) {
        int i = 1/0; //Manual simulation error
        String result = paymentHystrixService.paymentInfo_OK(id);
        return result;
    }

    @GetMapping("/consumer/payment/hystrix/timeout/{id}")
    //Custom downgrade service
    @HystrixCommand(fallbackMethod = "paymentInfo_TimeOutHandler", commandProperties = {
            @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "1500")
    })
    public String paymentInfo_TimeOut(@PathVariable("id") Long id) {
        String result = paymentHystrixService.paymentInfo_TimeOut(id);
        return result;
    }


    /**
     * Customized service degradation method
     * @param id
     * @return
     */
    public String paymentInfo_TimeOutHandler(Long id) {
        return "80 The system is busy, please try again later";
    }


    /**
     * Global service degradation method
     * @return
     */
    public String payment_Global_FallbackMethod() {
        return "Global exception handling information";
    }

}

paymentInfo_TimeOut has a custom fallback, and its call to the service provider times out (the provider takes longer than 1.5 seconds to respond). In paymentInfo_OK, the line int i = 1/0; simulates an error. We access http://localhost/consumer/payment/hystrix/timeout/31 and http://localhost/consumer/payment/hystrix/ok/31 in turn.
Because paymentInfo_OK has no custom fallback, it uses the global service degradation method, while paymentInfo_TimeOut uses its custom one. Note that whether or not a custom fallback is configured, the @HystrixCommand annotation must be added to the method; otherwise service degradation does not apply to it. For example, without this annotation paymentInfo_OK would simply report a division-by-zero error.
For the second problem, we can add a fallback implementation class for the interface defined by the Feign client to achieve decoupling. Our 80 client already has the PaymentHystrixService interface. We create a new class, PaymentFallbackService, that implements the interface and provides a fallback for each of its methods, and we declare this class as the fallback in PaymentHystrixService:

package com.atguigu.springcloud.service;

import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.stereotype.Component;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;

/**
 * @create 2021-02-05 14:54
 */
@Component
//When an error occurs, go to the PaymentFallbackService class to find the service degradation method
@FeignClient(value = "CLOUD-PROVIDER-HYSTRIX-PAYMENT",fallback = PaymentFallbackService.class)
public interface PaymentHystrixService {

    @GetMapping("/payment/hystrix/ok/{id}")
    public String paymentInfo_OK(@PathVariable("id") Long id);

    @GetMapping("/payment/hystrix/timeout/{id}")
    public String paymentInfo_TimeOut(@PathVariable("id") Long id);
}

Implementation class:

package com.atguigu.springcloud.service;

import org.springframework.stereotype.Component;

/**
 * @create 2021-02-06 17:35
 */
@Component
public class PaymentFallbackService implements  PaymentHystrixService {
    @Override
    public String paymentInfo_OK(Long id) {
        return "paymentInfo_OK An exception occurred";
    }

    @Override
    public String paymentInfo_TimeOut(Long id) {
        return "paymentInfo_TimeOut An exception occurred";
    }
}

Then we shut down the 8001 service provider to simulate server downtime.
Test: http://localhost/consumer/payment/hystrix/ok/31 and http://localhost/consumer/payment/hystrix/timeout/31 . Because the global fallback settings in the Controller are still in effect, OK displays the global exception handling information.
Then we remove all the fallback code that was coupled into the Controller:

package com.atguigu.springcloud.controller;

import com.atguigu.springcloud.service.PaymentHystrixService;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

/**
 * @create 2021-02-05 14:57
 */
@RestController
@Slf4j
public class OrderHystrixController {

    @Autowired
    private PaymentHystrixService paymentHystrixService;

    @GetMapping("/consumer/payment/hystrix/ok/{id}")
    public String paymentInfo_OK(@PathVariable("id") Long id) {
        String result = paymentHystrixService.paymentInfo_OK(id);
        return result;
    }

    @GetMapping("/consumer/payment/hystrix/timeout/{id}")
    public String paymentInfo_TimeOut(@PathVariable("id") Long id) {
        String result = paymentHystrixService.paymentInfo_TimeOut(id);
        return result;
    }

}

Test again: when a service call fails, the fallback method in the PaymentFallbackService class we configured is invoked. This decouples the code and keeps the business logic clean.
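Conceptually, a fallback class works because the caller talks to a proxy of the interface, and the proxy redirects failed calls to a second implementation of that same interface. The sketch below shows the idea with a JDK dynamic proxy. It is not Feign's or Hystrix's actual code, and the PaymentClient interface and withFallback helper are hypothetical stand-ins.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

/** Sketch: route failed interface calls to a fallback implementation of the same interface. */
final class FallbackProxySketch {

    /** Hypothetical stand-in for the Feign client interface. */
    interface PaymentClient {
        String paymentInfoOk(Long id);
    }

    /** Wraps `remote` so that any exception it throws is answered by `fallback` instead. */
    @SuppressWarnings("unchecked")
    static <T> T withFallback(Class<T> api, T remote, T fallback) {
        return (T) Proxy.newProxyInstance(api.getClassLoader(), new Class<?>[] {api},
                (proxy, method, args) -> {
                    try {
                        return method.invoke(remote, args);
                    } catch (InvocationTargetException e) {
                        // The remote call failed: call the same method on the fallback object.
                        return method.invoke(fallback, args);
                    }
                });
    }

    public static void main(String[] args) {
        PaymentClient down = id -> { throw new IllegalStateException("8001 is down"); };
        PaymentClient fallback = id -> "paymentInfo_OK: an exception occurred, id: " + id;
        PaymentClient client = withFallback(PaymentClient.class, down, fallback);
        System.out.println(client.paymentInfoOk(31L));
    }
}
```

The controller only ever sees the interface, which is why the fallback logic can live entirely outside the business code.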

3.4 Service circuit breaking (break)
3.4.1 Overview of the circuit breaker mechanism
The circuit breaker mechanism is a microservice link protection mechanism for dealing with the avalanche effect. When a microservice on the fan-out link becomes unavailable because of errors, or responds too slowly, the service is degraded and calls to that microservice are short-circuited; that is, the circuit breaker degrades the service and quickly returns an error response. When the microservice is detected to be responding normally again, the call link is restored. In other words, access is allowed again once the service has recovered. In the Spring Cloud framework, the circuit breaker mechanism is implemented by Hystrix. Hystrix monitors the status of calls between microservices and opens the circuit when failed calls reach a threshold: by default, at least 20 requests within the 10-second rolling statistics window with 50% or more of them failing. After a sleep window (5 seconds by default), a trial request is let through to check whether the service has recovered. The annotation for the circuit breaker mechanism is @HystrixCommand. For details about the mechanism, see Martin Fowler's article "CircuitBreaker".
3.4.2 practical operation
Modify the service class of the cloud-provider-hystrix-payment8001 service provider and add the following code:

//======Service fuse

/**
 * fallbackMethod                               Service degradation (fallback) method
 * circuitBreaker.enabled                       Whether the circuit breaker is enabled
 * circuitBreaker.requestVolumeThreshold        Minimum number of requests in the statistics window before the breaker can trip
 * circuitBreaker.sleepWindowInMilliseconds     How long the circuit stays open before a trial request is allowed
 * circuitBreaker.errorThresholdPercentage      Failure rate (percent) at which the breaker trips
 * With the configuration below, the breaker trips once at least 10 requests have arrived in the statistics
 * window (10 seconds by default) and 60% or more of them have failed; it then stays open for 10 seconds
 * The property names used in @HystrixProperty can be found in the com.netflix.hystrix.HystrixCommandProperties class
 * @param id
 * @return
 */
@HystrixCommand(fallbackMethod = "paymentCircuitBreaker_fallback", commandProperties = {
        @HystrixProperty(name = "circuitBreaker.enabled", value = "true"),
        @HystrixProperty(name = "circuitBreaker.requestVolumeThreshold", value = "10"),
        @HystrixProperty(name = "circuitBreaker.sleepWindowInMilliseconds", value = "10000"),
        @HystrixProperty(name = "circuitBreaker.errorThresholdPercentage", value = "60")
})
public String paymentCircuitBreaker(@PathVariable("id") Long id) {
    if (id < 0) {
        throw new RuntimeException("id Cannot be negative");
    }
    String serialNumber = IdUtil.simpleUUID();
    return Thread.currentThread().getName() + " Call succeeded, serial number: " + serialNumber;
}

/**
 * Service degradation method triggered by service fuse
 * @param id
 * @return
 */
public String paymentCircuitBreaker_fallback(@PathVariable("id") Long id) {
    return "id Cannot be negative. Please try again later. id:" + id;
}

The parameters of the fuse mechanism are configured in the @HystrixCommand annotation. Their meanings are as follows:

Attribute name                            | Meaning                                                  | Default value
circuitBreaker.enabled                    | Whether the circuit breaker is enabled                   | true
circuitBreaker.requestVolumeThreshold     | Minimum number of requests in the statistics window      | 20
circuitBreaker.sleepWindowInMilliseconds  | Sleep window after the breaker trips, in milliseconds    | 5000
circuitBreaker.errorThresholdPercentage   | Failure rate (percent) at which the breaker trips        | 50

The property names and their default values can be looked up in the com.netflix.hystrix.HystrixCommandProperties class. The configuration in our service means that once at least 10 requests arrive within the statistics window (10 seconds by default) and at least 60% of them fail (6 out of 10), the fuse is triggered; the breaker then stays open for a 10-second sleep window before a trial request is let through.
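To make these numbers concrete, the trip condition can be sketched in plain Java. This is only an illustration of the rule described above, not Hystrix's source code; the class name TripRuleSketch and the method shouldTrip are hypothetical.

```java
// Hypothetical helper, not part of Hystrix: it only illustrates the trip rule.
class TripRuleSketch {

    /**
     * Returns true when the circuit should open.
     *
     * @param totalRequests          requests seen in the current statistics window
     * @param failedRequests         failed requests in the same window
     * @param requestVolumeThreshold minimum number of requests before tripping is considered
     * @param errorThresholdPercent  error percentage at or above which the circuit opens
     */
    static boolean shouldTrip(int totalRequests, int failedRequests,
                              int requestVolumeThreshold, int errorThresholdPercent) {
        if (totalRequests == 0 || totalRequests < requestVolumeThreshold) {
            return false; // too little traffic to judge
        }
        int errorPercent = failedRequests * 100 / totalRequests;
        return errorPercent >= errorThresholdPercent;
    }

    public static void main(String[] args) {
        // Values configured above: requestVolumeThreshold = 10, errorThresholdPercentage = 60
        System.out.println(shouldTrip(10, 6, 10, 60)); // true: 60% of 10 requests failed
        System.out.println(shouldTrip(9, 9, 10, 60));  // false: below the volume threshold
        System.out.println(shouldTrip(10, 5, 10, 60)); // false: only 50% failed
    }
}
```

Note that a burst of failures below the request volume threshold never opens the fuse, no matter how high the failure rate is.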
Add the corresponding endpoint to the Controller:

    @GetMapping("/payment/circuit/{id}")
    public String paymentCircuitBreaker(@PathVariable("id") Long id) {
        String result = paymentService.paymentCircuitBreaker(id);
        log.info("=========result:" + result);
        return result;
    }

3.4.3 test
According to our business logic, the service responds normally when the id is non-negative and throws an error when the id is negative. First visit http://localhost:8001/payment/circuit/1, which is a valid request; everything works as expected:

Then we send a large number of invalid requests to force the fuse to trigger, and afterwards send a valid request.

We find that once the invalid requests exceed our threshold, the fuse is triggered and even valid requests are answered by the fallback. After a certain time, however, valid requests go through again. This is the overall process of service fusing: after the fuse is triggered, the service is degraded first, and then the call link is gradually restored.
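Instead of clicking through the browser, the same test can be scripted. The following is a minimal sketch, assuming Java 11 or later (for java.net.http.HttpClient) and the payment provider from this section running on port 8001; the class CircuitTestDriver and the request count of 20 are choices made for this example.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical test driver; needs the provider from this section running on localhost:8001.
class CircuitTestDriver {

    // Builds the URL of the circuit breaker endpoint for a given id.
    static URI uriFor(long id) {
        return URI.create("http://localhost:8001/payment/circuit/" + id);
    }

    // Sends one GET request and returns the response body.
    static String get(HttpClient client, long id) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(uriFor(id)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // 1. Many failing calls: a negative id throws inside paymentCircuitBreaker.
        for (int i = 0; i < 20; i++) {
            get(client, -1);
        }
        // 2. A valid call right away; while the fuse is open the fallback answers it.
        System.out.println("immediately: " + get(client, 1));
        // 3. Wait longer than the configured 10-second sleep window, then try again.
        Thread.sleep(11_000);
        System.out.println("after sleep window: " + get(client, 1));
    }
}
```

If the fuse behaves as described, the first printed line shows the fallback message and the second shows a normal response with a serial number.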
3.4.4 summary
Combined with the description of the fusing mechanism on the official website, the fusing process can be summarized as follows.
The precise way in which the fuse opens and closes is:
1. Suppose the volume of requests through the circuit reaches a certain threshold (HystrixCommandProperties.circuitBreakerRequestVolumeThreshold())
2. And suppose the error percentage exceeds the threshold error percentage (HystrixCommandProperties.circuitBreakerErrorThresholdPercentage())
3. Then the circuit breaker changes from CLOSED to OPEN, triggering the fusing mechanism.
4. While it is open, it short-circuits all requests made against that circuit breaker.
5. After a period of time (HystrixCommandProperties.circuitBreakerSleepWindowInMilliseconds()), the next single request is let through (this is the HALF-OPEN state). If that request fails, the circuit breaker returns to the OPEN state for the duration of the sleep window. If the request succeeds, the circuit breaker switches to CLOSED and the logic in step 1 takes over again.
That is, in the fuse mechanism, the fuse has three states:

OPEN: requests no longer call the current service. An internal timer is set, generally to the MTTR (mean time to repair); once the fuse has been open for that long, it enters the half-open state (HALF-OPEN).
CLOSED: the fuse is off and requests call the service normally.
HALF-OPEN: some requests are allowed to call the current service according to the rules. If they succeed and meet the rules, the service is considered to have recovered and the fuse is closed.
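The three states and the transitions between them can be sketched as a small state machine. This is a simplified illustration, not Hystrix's implementation; the class BreakerStateSketch, its method names and the injectable clock are assumptions made for the example.

```java
import java.util.function.LongSupplier;

// Hypothetical, simplified breaker: shows the three states only, not Hystrix's implementation.
class BreakerStateSketch {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final long sleepWindowMillis;
    private final LongSupplier clock; // injectable time source, in milliseconds
    private State state = State.CLOSED;
    private long openedAt;

    BreakerStateSketch(long sleepWindowMillis, LongSupplier clock) {
        this.sleepWindowMillis = sleepWindowMillis;
        this.clock = clock;
    }

    /** Called when the statistics say the error threshold was reached. */
    synchronized void trip() {
        state = State.OPEN;
        openedAt = clock.getAsLong();
    }

    /** Decides whether the next request may reach the real service. */
    synchronized boolean allowRequest() {
        if (state == State.CLOSED) {
            return true;
        }
        if (state == State.OPEN && clock.getAsLong() - openedAt >= sleepWindowMillis) {
            state = State.HALF_OPEN; // let exactly one trial request through
            return true;
        }
        return false; // still inside the sleep window, or a trial is already in flight
    }

    /** Reports the outcome of the trial request made in HALF_OPEN. */
    synchronized void onTrialResult(boolean success) {
        if (state != State.HALF_OPEN) {
            return;
        }
        if (success) {
            state = State.CLOSED;
        } else {
            trip(); // back to OPEN, the sleep window starts over
        }
    }

    synchronized State state() {
        return state;
    }
}
```

Passing the clock in as a LongSupplier (for example System::currentTimeMillis in production code) makes the sleep window easy to exercise in a test without waiting.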

The following is the fuse flow chart on the official website:

So when does the fuse start to work?
Three important parameters are involved:
1. Snapshot time window (metrics.rollingStats.timeInMilliseconds): to decide whether to open, the fuse needs statistics on requests and errors. The time range of these statistics is the snapshot time window, which defaults to the most recent 10 seconds. It is distinct from circuitBreaker.sleepWindowInMilliseconds, which controls how long the fuse stays open before a trial request is let through;
2. Total request threshold (circuitBreaker.requestVolumeThreshold): within the snapshot time window, the total number of requests must reach this threshold before the fuse is eligible to trip. The default is 20, which means that if the Hystrix command is called fewer than 20 times within the snapshot time window, the fuse will not open even if all requests time out or fail for other reasons;
3. Error percentage threshold (circuitBreaker.errorThresholdPercentage): when the total number of requests within the snapshot time window reaches the threshold and the proportion of failed calls among them reaches the error percentage threshold, the fuse opens. For example, with the default threshold of 50%, 30 calls of which 15 fail will open the fuse.
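The role of the snapshot time window can be illustrated with a small sliding-window counter. This is a sketch only: Hystrix aggregates calls into time buckets, whereas this example keeps one entry per call for simplicity, and the class SnapshotWindowSketch is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sliding-window counter; it keeps one entry per call instead of time buckets.
class SnapshotWindowSketch {
    private final long windowMillis;
    // Each entry is {timestampMillis, failedFlag}, oldest first.
    private final Deque<long[]> calls = new ArrayDeque<>();

    SnapshotWindowSketch(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Records one call; timestamps are expected in non-decreasing order. */
    void record(long nowMillis, boolean failed) {
        calls.addLast(new long[] {nowMillis, failed ? 1 : 0});
    }

    /** Drops calls that have fallen out of the window. */
    private void evict(long nowMillis) {
        while (!calls.isEmpty() && nowMillis - calls.peekFirst()[0] >= windowMillis) {
            calls.removeFirst();
        }
    }

    /** Total number of calls inside the window ending at nowMillis. */
    int total(long nowMillis) {
        evict(nowMillis);
        return calls.size();
    }

    /** Number of failed calls inside the window ending at nowMillis. */
    int failures(long nowMillis) {
        evict(nowMillis);
        int count = 0;
        for (long[] call : calls) {
            count += (int) call[1];
        }
        return count;
    }
}
```

With a 10-second window, a failure recorded at time 0 still counts at 9.999 seconds but no longer counts at 10 seconds, which is why an old burst of errors eventually stops influencing the fuse.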
Once the fuse is open, incoming requests no longer call the main logic; the service degradation method is called directly. This achieves automatic error detection and switching from the main logic to the degradation logic, which reduces response latency.
After the fuse is opened, how is the original main logic restored?
After the fuse opens and the main logic is fused, Hystrix starts a sleep time window (5 seconds by default). During this window, the degradation logic temporarily acts as the main logic. When the sleep time window expires, the fuse enters the half-open state and lets one request through to the original main logic. If that request succeeds, the fuse enters the closed state and the main logic is restored. If that request still fails, the fuse remains open and the sleep time window starts over.
All configuration:



4. Hystrix workflow

The overall Hystrix workflow is as follows:

Chinese version:
Process description:
1. Create a new HystrixCommand for each call and wrap the dependent call in its run() method.
2. Call execute() or queue() to make a synchronous or asynchronous call.
3. If the result of the current call is already cached, return it directly; otherwise go to step 4.
4. Check whether the circuit breaker is open. If it is open, jump to step 8 and apply the degradation strategy; if it is closed, go to step 5.
5. Check whether the thread pool / queue / semaphore is full. If it is full, go to the degradation step 8; otherwise continue to step 6.
6. Call the run() method of HystrixCommand to run the dependency logic.
6.1. Did the call throw an exception? No: continue. Yes: go to step 8.
6.2. Did the call time out? No: return the call result. Yes: go to step 8.
7. Collect all outcomes (success, failure, rejection, timeout) of steps 5 and 6 and report them to the fuse, which uses the statistics to determine its state.
8. getFallback() degradation logic. Four cases trigger a getFallback() call (the sources of the arrows into step 8 in the figure): the circuit breaker is open (step 4), the thread pool / queue / semaphore is full (step 5), run() throws an exception (step 6.1), or run() times out (step 6.2).
9. Return the successful execution result.
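The order of these steps can be sketched with standard library classes only. This is a heavily simplified illustration, not the Hystrix API: the class WorkflowSketch, its execute signature, the manually toggled circuitOpen flag and the shared cache are all assumptions made for the example, and step 7 (statistics reporting) is left out.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical command runner; it mirrors the step order only, not Hystrix's API.
class WorkflowSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Semaphore permits;
    private final long timeoutMillis;
    // Daemon threads so that the sketch does not keep the JVM alive.
    private final ExecutorService pool = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });
    volatile boolean circuitOpen = false; // normally driven by the statistics, toggled by hand here

    WorkflowSketch(int maxConcurrent, long timeoutMillis) {
        this.permits = new Semaphore(maxConcurrent);
        this.timeoutMillis = timeoutMillis;
    }

    String execute(String cacheKey, Callable<String> run, Supplier<String> fallback) {
        String cached = cache.get(cacheKey);               // step 3: cached result?
        if (cached != null) {
            return cached;
        }
        if (circuitOpen) {                                 // step 4: breaker open -> step 8
            return fallback.get();
        }
        if (!permits.tryAcquire()) {                       // step 5: capacity full -> step 8
            return fallback.get();
        }
        Future<String> future = null;
        try {
            future = pool.submit(run);                     // step 6: run the dependency logic
            String result = future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            if (result != null) {
                cache.put(cacheKey, result);
            }
            return result;                                 // step 9: successful result
        } catch (Exception e) {                            // step 6.1 exception, step 6.2 timeout
            if (future != null) {
                future.cancel(true);
            }
            if (e instanceof InterruptedException) {
                Thread.currentThread().interrupt();
            }
            return fallback.get();                         // step 8: degradation logic
        } finally {
            permits.release();
        }
    }
}
```

A call such as execute("pay-1", () -> "ok", () -> "fallback") returns the result of run when everything is healthy and the fallback value in each of the four failure cases listed in step 8.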

5. Service monitoring: Hystrix Dashboard

In addition to isolating calls to dependent services, Hystrix also provides near-real-time call monitoring through the Hystrix Dashboard. Hystrix continuously records the execution information of all requests initiated through it and presents it to users as statistical reports and graphics, including how many requests are executed per second, how many succeed, how many fail, and so on. Spring Cloud also integrates the Hystrix Dashboard, turning the monitoring data into a visual interface.
5.1. Create a new Module, cloud-consumer-hystrix-dashboard9001, as the Hystrix Dashboard service
5.2. Add the Hystrix Dashboard dependency to the POM:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <parent>
        <artifactId>cloud2020</artifactId>
        <groupId>com.atguigu.springcloud</groupId>
        <version>1.0-SNAPSHOT</version>
    </parent>
    <modelVersion>4.0.0</modelVersion>

    <artifactId>cloud-consumer-hystrix-dashboard9001</artifactId>
    <description>hystrix monitor</description>
    
    <dependencies>
        <!--hystrix dashboard-->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-netflix-hystrix-dashboard</artifactId>
        </dependency>
        <!--monitor-->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-actuator</artifactId>
        </dependency>
        <!--Hot deployment-->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-devtools</artifactId>
            <scope>runtime</scope>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>

</project>

5.3. Write the configuration file application.yml and set the port:

server:
  port: 9001

5.4. Write the main startup class and add the @EnableHystrixDashboard annotation to it to enable the Hystrix Dashboard:

package com.atguigu.springcloud;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.hystrix.dashboard.EnableHystrixDashboard;

/**
 * @create 2021-02-07 14:16
 */
@SpringBootApplication
@EnableHystrixDashboard
public class HystrixDashboardMain9001 {
    public static void main(String[] args) {
        SpringApplication.run(HystrixDashboardMain9001.class, args);
    }
}

5.5. All service provider microservices to be monitored (such as our 8001 / 8002) need the monitoring dependency:

<!--actuator: exposes monitoring information-->
<dependency>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

Visit http://localhost:9001/hystrix to see the graphical interface of the Hystrix Dashboard.

For the provider's services to be monitored by the Hystrix Dashboard, the following configuration needs to be added to the provider's main startup class. Here we use cloud-provider-hystrix-payment8001:

    /**
     * This configuration is only for service monitoring and has nothing to do with service fault tolerance itself.
     * It works around a pitfall that appeared after a Spring Cloud upgrade: the default servlet path in
     * Spring Boot is not "/hystrix.stream", so the ServletRegistrationBean below registers it explicitly.
     * Just add the following servlet configuration to your own project.
     */
    @Bean
    public ServletRegistrationBean getServlet() {
        HystrixMetricsStreamServlet streamServlet = new HystrixMetricsStreamServlet();
        ServletRegistrationBean registrationBean = new ServletRegistrationBean(streamServlet);
        registrationBean.setLoadOnStartup(1);
        registrationBean.addUrlMappings("/hystrix.stream");
        registrationBean.setName("HystrixMetricsStreamServlet");
        return registrationBean;
    }

Enter the service provider to be monitored in the graphical interface of the Hystrix Dashboard: http://localhost:8001/hystrix.stream

Test address: http://localhost:8001/payment/circuit/1 , http://localhost:8001/payment/circuit/-1

Monitoring results, when successful: Closed status

Monitoring result, in case of failure: Open status

How to read the dashboard: 7 colors, 1 circle, 1 line.

Overall diagram descriptions 1 and 2:

Added by Sindarin on Mon, 17 Jan 2022 05:53:33 +0200