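Service avalanche: when multiple microservices call one another, suppose service A calls B and C, and B and C in turn call other services. This is known as fan-out. If one microservice on the fan-out chain responds too slowly or becomes unavailable, calls to service A tie up more and more resources until the system collapses. This is the so-called "avalanche effect".
For a high-traffic application, a single backend dependency can saturate all resources on all servers within seconds. Worse than outright failure, such a dependency can also increase latency between services, so that queues back up and threads and other resources come under pressure, triggering further cascading failures across the system. All of this means that failures and latency must be isolated and managed, so that the failure of a single dependency cannot take down the whole application or system.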
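1. Introduction to Hystrix
1. What is Hystrix
Distributed systems face an important problem: an application may have dozens of dependencies, and each of them will inevitably fail at some point, for example through timeouts or exceptions. Hystrix is an open-source library for latency and fault tolerance in distributed systems. When one dependency has a problem, Hystrix keeps the overall service from failing and prevents cascading failures, which improves the resilience of the distributed system.
A "circuit breaker" is itself a kind of switch. When a service unit fails, the breaker's fault monitoring (similar to a blown fuse) returns an expected, manageable alternative response (fallback) to the caller, instead of making it wait for a long time or throwing an exception the caller cannot handle. This ensures that the caller's threads are not held for long periods without need, and so stops the fault from spreading through the distributed system and turning into an avalanche.
Git repository: https://github.com/Netflix/Hystrix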
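2. What Hystrix does, and key concepts
Hystrix provides service degradation, circuit breaking, and near-real-time monitoring. Note, however, that according to the project page Hystrix is no longer under active development (it is in maintenance mode). Comparable alternatives are resilience4j and Sentinel.
Service degradation (fallback): when a service fails, the caller is not left waiting; a friendly response (the fallback) is returned instead. Degradation is triggered in the following cases: a runtime exception, a timeout, an open circuit breaker, or a full thread pool or semaphore.
Circuit breaking (break): by analogy with a fuse, once the service reaches its limit, further access is refused outright (the switch is pulled), the fallback method is invoked, and a friendly response is returned. The usual sequence is: degradation -> circuit break -> recovery of the call chain.
Rate limiting (flowlimit): restricts how often an endpoint may be called so that it cannot be invoked without bound, for example allowing an endpoint only 200 calls per second. If you implement this yourself, AOP at the Controller layer is one option.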
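To make the "200 calls per second" idea concrete, here is a minimal sketch of a fixed-window counter in plain Java. The class name `FixedWindowRateLimiter` and its injectable clock are assumptions made for illustration; they are not part of Hystrix, and a production limiter (for example in Sentinel) is considerably more sophisticated.

```java
import java.util.function.LongSupplier;

// Hypothetical fixed-window rate limiter, for illustration only (not a Hystrix API).
class FixedWindowRateLimiter {
    private final int maxCallsPerWindow;
    private final long windowMillis;
    private final LongSupplier clock; // injectable time source so the behavior can be tested
    private long windowStart;
    private int callsInWindow;

    FixedWindowRateLimiter(int maxCallsPerWindow, long windowMillis, LongSupplier clock) {
        this.maxCallsPerWindow = maxCallsPerWindow;
        this.windowMillis = windowMillis;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    /** Returns true if the call is allowed, false if the current window is already used up. */
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        if (now - windowStart >= windowMillis) {
            // A new window begins: reset the counter.
            windowStart = now;
            callsInWindow = 0;
        }
        if (callsInWindow < maxCallsPerWindow) {
            callsInWindow++;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // 200 calls per 1000 ms, as in the example above.
        FixedWindowRateLimiter limiter =
                new FixedWindowRateLimiter(200, 1000, System::currentTimeMillis);
        int allowed = 0;
        for (int i = 0; i < 250; i++) {
            if (limiter.tryAcquire()) {
                allowed++;
            }
        }
        System.out.println("allowed calls: " + allowed);
    }
}
```

In a Spring application, an aspect around the controller methods could call `tryAcquire()` and return a fallback response whenever it yields false.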
2. Usage
1. Create the payment service
1. Create a new project
2. Edit the pom
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <parent>
        <artifactId>cloud</artifactId>
        <groupId>cn.qz.cloud</groupId>
        <version>1.0-SNAPSHOT</version>
    </parent>
    <modelVersion>4.0.0</modelVersion>

    <artifactId>cloud-provider-hystrix-payment8081</artifactId>

    <dependencies>
        <!-- hystrix -->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
        </dependency>
        <!-- eureka-client -->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
        </dependency>
        <!-- shared utility module extracted from this project -->
        <dependency>
            <groupId>cn.qz.cloud</groupId>
            <artifactId>cloud-api-commons</artifactId>
            <version>${project.version}</version>
        </dependency>
        <!-- web -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-actuator</artifactId>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>
3. Edit the yml
server:
  port: 8081
spring:
  application:
    name: cloud-provider-hystrix-payment
eureka:
  client:
    register-with-eureka: true
    fetch-registry: true
    service-url:
      #defaultZone: http://eureka7001.com:7001/eureka,http://eureka7002.com:7002/eureka
      defaultZone: http://localhost:7001/eureka
4. Main class
package cn.qz.cloud;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.eureka.EnableEurekaClient;

/**
 * @Author: qlq
 * @Description
 * @Date: 22:08 2020/10/17
 */
@SpringBootApplication
@EnableEurekaClient
public class PaymentHystrixMain8081 {

    public static void main(String[] args) {
        SpringApplication.run(PaymentHystrixMain8081.class, args);
    }
}
5. Business classes:
Service (a plain class is used here, without an interface)
package cn.qz.cloud.service;

import org.springframework.stereotype.Service;

import java.util.concurrent.TimeUnit;

/**
 * @Author: qlq
 * @Description
 * @Date: 22:15 2020/10/17
 */
@Service
public class PaymentService {

    /**
     * Normal case: returns immediately.
     *
     * @param id
     * @return
     */
    public String success(Integer id) {
        // "线程池" in the message means "thread pool"
        return "success,线程池: " + Thread.currentThread().getName() + " success,id: " + id;
    }

    /**
     * Slow case: sleeps for 5 seconds to simulate a long-running request.
     */
    public String timeout(Integer id) {
        try {
            TimeUnit.SECONDS.sleep(5);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return "timeout,线程池: " + Thread.currentThread().getName() + " success,id: " + id;
    }
}
Controller
package cn.qz.cloud.controller;

import cn.qz.cloud.service.PaymentService;
import cn.qz.cloud.utils.JSONResultUtil;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import javax.annotation.Resource;

/**
 * @Author: qlq
 * @Description
 * @Date: 22:22 2020/10/17
 */
@RestController
@Slf4j
@RequestMapping("/hystrix/payment")
public class PaymentController {

    @Resource
    private PaymentService paymentService;

    @Value("${server.port}")
    private String serverPort;

    @GetMapping("/success/{id}")
    public JSONResultUtil<String> success(@PathVariable("id") Integer id) {
        String result = paymentService.success(id);
        log.info("*****result: " + result);
        return JSONResultUtil.successWithData(result);
    }

    @GetMapping("/timeout/{id}")
    public JSONResultUtil timeout(@PathVariable("id") Integer id) {
        String result = paymentService.timeout(id);
        log.info("*****result: " + result);
        return JSONResultUtil.successWithData(result);
    }
}
6. Verify that both endpoints work
2. Create the order service
1. Create the module
2. Edit the pom
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <parent>
        <artifactId>cloud</artifactId>
        <groupId>cn.qz.cloud</groupId>
        <version>1.0-SNAPSHOT</version>
    </parent>
    <modelVersion>4.0.0</modelVersion>

    <artifactId>cloud-consumer-feign-hystrix-order80</artifactId>

    <dependencies>
        <!-- hystrix -->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
        </dependency>
        <!-- openfeign -->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-openfeign</artifactId>
        </dependency>
        <!-- eureka-client -->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
        </dependency>
        <!-- shared utility module extracted from this project -->
        <dependency>
            <groupId>cn.qz.cloud</groupId>
            <artifactId>cloud-api-commons</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-actuator</artifactId>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>
3. Edit the yml file
server:
  port: 80
eureka:
  client:
    register-with-eureka: false
    service-url:
      defaultZone: http://localhost:7001/eureka/
feign:
  hystrix:
    enabled: true
4. Main class:
package cn.qz.cloud;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.hystrix.EnableHystrix;
import org.springframework.cloud.openfeign.EnableFeignClients;

/**
 * @Author: qlq
 * @Description
 * @Date: 14:25 2020/10/18
 */
@SpringBootApplication
@EnableFeignClients
@EnableHystrix
public class OrderHystrixMain80 {

    public static void main(String[] args) {
        SpringApplication.run(OrderHystrixMain80.class, args);
    }
}
5. Business classes
(1) Service
package cn.qz.cloud.service;

import cn.qz.cloud.utils.JSONResultUtil;
import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.stereotype.Component;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;

/**
 * @Author: qlq
 * @Description
 * @Date: 14:31 2020/10/18
 */
@Component
@FeignClient(value = "CLOUD-PROVIDER-HYSTRIX-PAYMENT")
public interface PaymentHystrixService {

    @GetMapping("/hystrix/payment/success/{id}")
    public JSONResultUtil<String> success(@PathVariable("id") Integer id);

    @GetMapping("/hystrix/payment/timeout/{id}")
    public JSONResultUtil<String> timeout(@PathVariable("id") Integer id);
}
(2) Controller
package cn.qz.cloud.controller;

import cn.qz.cloud.service.PaymentHystrixService;
import cn.qz.cloud.utils.JSONResultUtil;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

/**
 * @Author: qlq
 * @Description
 * @Date: 14:49 2020/10/18
 */
@RestController
@Slf4j
@RequestMapping("/consumer/hystrix/payment")
public class OrderHystirxController {

    @Autowired
    private PaymentHystrixService paymentHystrixService;

    @GetMapping("/success/{id}")
    public JSONResultUtil<String> success(@PathVariable("id") Integer id) {
        return paymentHystrixService.success(id);
    }

    @GetMapping("/timeout/{id}")
    public JSONResultUtil timeout(@PathVariable("id") Integer id) {
        return paymentHystrixService.timeout(id);
    }
}
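6. Test:
3. Load testing
(1) Use JMeter to send 20,000 concurrent requests to the timeout endpoint of the 8081 service.
1) Create a thread group:
2) Create an HTTP Request sampler
3) Add a listener -> View Results Tree
4) Run the JMeter test. This amounts to 20,000 requests hitting the timeout endpoint at the same time, and Tomcat on 8081 hands them to its worker thread pool.
The logs of the 8081 service confirm that the timeout requests are handled by threads from that pool.
5) Calling the normal success endpoint now fails, because no thread is available to handle the success request.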
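The effect can be reproduced in miniature with a plain `ThreadPoolExecutor`. The sketch below is an illustration under simplifying assumptions, not a model of Tomcat: it uses a tiny pool with no queue, so a request that finds no free thread is rejected at once, whereas a real Tomcat connector would first queue incoming connections. The class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo: slow requests occupy every worker, so a fast request cannot be served.
class PoolExhaustionDemo {

    /** Returns true if a fast request is rejected while slow requests hold all worker threads. */
    static boolean fastRequestRejected(int workerThreads) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                workerThreads, workerThreads, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<>()); // no queue: each request needs a free thread
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Fill the pool with "slow" requests that block until released.
            for (int i = 0; i < workerThreads; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // stands in for the slow timeout endpoint
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            try {
                // The "fast" request arrives while every thread is busy.
                pool.execute(() -> System.out.println("fast request handled"));
                return false;
            } catch (RejectedExecutionException e) {
                return true;
            }
        } finally {
            release.countDown(); // let the slow requests finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("fast request rejected: " + fastRequestRejected(2));
    }
}
```

The point carries over to the real service: the success endpoint is not broken, it simply shares a thread pool with an endpoint that is holding all of the threads.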
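4. Fixing the timeouts and errors above (service degradation)
Degradation can be applied on the consumer side or on the provider side; it is usually done on the consumer side.
There are three main cases to handle:
(1) The provider 8081 times out: the caller 80 must not hang waiting, so it needs a fallback.
(2) The provider 8081 is down: the caller 80 needs a fallback.
(3) The provider 8081 is fine, but the caller 80 fails itself or has stricter requirements (it is willing to wait less time than the provider needs): the caller needs a fallback.
1. Degradation on the provider 8081: after a timeout or exception, the specified fallback method is used
(1) Add this annotation to the main class:
@EnableCircuitBreaker
(2) Declare @HystrixCommand on the Service method to enable the fallback: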
package cn.qz.cloud.service;

import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixProperty;
import org.springframework.stereotype.Service;

import java.util.concurrent.TimeUnit;

/**
 * @Author: qlq
 * @Description
 * @Date: 22:15 2020/10/17
 */
@Service
public class PaymentService {

    /**
     * Normal case: returns immediately.
     *
     * @param id
     * @return
     */
    public String success(Integer id) {
        return "success,线程池: " + Thread.currentThread().getName() + " success,id: " + id;
    }

    // Fall back to timeOutHandler if the call fails or takes longer than 3000 ms.
    @HystrixCommand(fallbackMethod = "timeOutHandler", commandProperties = {
            @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "3000")
    })
    public String timeout(Integer id) {
        try {
            TimeUnit.SECONDS.sleep(5);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return "timeout,线程池: " + Thread.currentThread().getName() + " success,id: " + id;
    }

    // Fallback; the message means "system 8081 is busy or hit an error, please try again later".
    public String timeOutHandler(Integer id) {
        return "线程池: " + Thread.currentThread().getName() + " 8081系统繁忙或者运行报错,请稍后再试,id: " + id;
    }
}
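This configuration means that once the call exceeds 3 s, the timeOutHandler fallback method is used. The method sleeps for 5 s to simulate a request that takes five seconds.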
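What the annotation does can be approximated in plain Java: run the call on another thread, wait up to the timeout, and switch to the fallback on timeout or exception. The sketch below is only a conceptual illustration; `TimeoutFallbackSketch` and `callWithFallback` are hypothetical names, and Hystrix's real implementation (thread pools, metrics, RxJava) works differently in detail.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical helper illustrating "timeout or error -> fallback"; not a Hystrix API.
class TimeoutFallbackSketch {

    static <T> T callWithFallback(Supplier<T> primary, long timeoutMillis, Supplier<T> fallback) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Callable<T> task = primary::get;
        Future<T> future = executor.submit(task);
        try {
            // Wait for the primary call, but no longer than the timeout.
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            // Too slow, or the primary call threw: degrade to the fallback.
            future.cancel(true);
            return fallback.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback.get();
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // The primary call sleeps 5 s but the caller only waits 3 s, as in the example above.
        String result = callWithFallback(() -> {
            try {
                TimeUnit.SECONDS.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "primary result";
        }, 3000, () -> "fallback: system busy, please try again later");
        System.out.println(result);
    }
}
```

Note that the fallback here runs on the calling thread; in the Hystrix test output the fallback is reported on a Hystrix timer thread instead.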
(3) Test:
$ curl -X GET http://localhost:8081/hystrix/payment/timeout/1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   135    0   135    0     0     44      0 --:--:--  0:00:03 --:--:--    44
{"success":true,"code":"200","msg":"","data":"线程池: HystrixTimer-4 8081系统繁忙或者运行报错,请稍后再试,id: 1"}
The response arrives after about 3 s, and the thread name in it (HystrixTimer-4) shows that a Hystrix thread handled the fallback.
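(4) Change the timeout method to simulate a runtime error:
public String timeout(Integer id) {
    int i = 10 / 0; // throws ArithmeticException
//    try {
//        TimeUnit.SECONDS.sleep(5);
//    } catch (InterruptedException e) {
//        e.printStackTrace();
//    }
    return "timeout,线程池: " + Thread.currentThread().getName() + " success,id: " + id;
}
The request again ends up in the timeOutHandler method, so exceptions are covered as well as timeouts, which is what is needed in practice.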
(5) Restore the method. From here on, up to 5 seconds counts as a normal request, and the thread sleeps for 3 s to simulate a processing time of 3 s:
@HystrixCommand(fallbackMethod = "timeOutHandler", commandProperties = {
        @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "5000")
})
public String timeout(Integer id) {
    try {
        TimeUnit.SECONDS.sleep(3);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    return "timeout,线程池: " + Thread.currentThread().getName() + " success,id: " + id;
}
2. Degradation on the consumer 80
(1) Modify OrderHystirxController
package cn.qz.cloud.controller;

import cn.qz.cloud.service.PaymentHystrixService;
import cn.qz.cloud.utils.JSONResultUtil;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixProperty;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

/**
 * @Author: qlq
 * @Description
 * @Date: 14:49 2020/10/18
 */
@RestController
@Slf4j
@RequestMapping("/consumer/hystrix/payment")
public class OrderHystirxController {

    @Autowired
    private PaymentHystrixService paymentHystrixService;

    @GetMapping("/success/{id}")
    public JSONResultUtil<String> success(@PathVariable("id") Integer id) {
        return paymentHystrixService.success(id);
    }

    // The consumer waits at most 1500 ms before falling back.
    @GetMapping("/timeout/{id}")
    @HystrixCommand(fallbackMethod = "paymentTimeOutFallbackMethod", commandProperties = {
            @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "1500")
    })
    public JSONResultUtil timeout(@PathVariable("id") Integer id) {
        return paymentHystrixService.timeout(id);
    }

    // Fallback; the message means "consumer 80: system 8081 is busy or hit an error, please try again later".
    public JSONResultUtil paymentTimeOutFallbackMethod(@PathVariable("id") Integer id) {
        return JSONResultUtil.successWithData("消费者80,paymentTimeOutFallbackMethod, 线程池: " + Thread.currentThread().getName() + " 8081系统繁忙或者运行报错,请稍后再试,id: " + id);
    }
}
(2) Test: (a call to the provider's timeout endpoint ends up in paymentTimeOutFallbackMethod, because the provider needs 3 s while the consumer waits only 1.5 s)
$ curl http://localhost/consumer/hystrix/payment/timeout/1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   199    0   199    0     0    187      0 --:--:--  0:00:01 --:--:--   193
{"success":true,"code":"200","msg":"","data":"消费者80,paymentTimeOutFallbackMethod, 线程池: hystrix-OrderHystirxController-2 8081系统繁忙或者运行报错,请稍后再试,id: 1"}
3. Problems with the degradation approach above:
(1) Every method needs its own fallback method, which bloats the code.
(2) Fallback methods are mixed in with the business logic, which makes the code messy.
Solutions:
(1) defaultFallback solves problem 1 by providing a default fallback
Add a global default fallback to the controller, as follows:
package cn.qz.cloud.controller;

import cn.qz.cloud.service.PaymentHystrixService;
import cn.qz.cloud.utils.JSONResultUtil;
import com.netflix.hystrix.contrib.javanica.annotation.DefaultProperties;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

/**
 * @Author: qlq
 * @Description
 * @Date: 14:49 2020/10/18
 */
@RestController
@Slf4j
@RequestMapping("/consumer/hystrix/payment")
@DefaultProperties(defaultFallback = "paymentGlobalFallbackMethod")
public class OrderHystirxController {

    @Autowired
    private PaymentHystrixService paymentHystrixService;

    @GetMapping("/success/{id}")
    public JSONResultUtil<String> success(@PathVariable("id") Integer id) {
        return paymentHystrixService.success(id);
    }

    @GetMapping("/timeout/{id}")
//    @HystrixCommand(fallbackMethod = "paymentTimeOutFallbackMethod", commandProperties = {
//            @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "1500")
//    })
    // Use the default global fallback
    @HystrixCommand
    public JSONResultUtil timeout(@PathVariable("id") Integer id) {
        int i = 1 / 0; // simulate a runtime error
        return paymentHystrixService.timeout(id);
    }

    public JSONResultUtil paymentTimeOutFallbackMethod(@PathVariable("id") Integer id) {
        return JSONResultUtil.successWithData("消费者80,paymentTimeOutFallbackMethod, 线程池: " + Thread.currentThread().getName() + " 8081系统繁忙或者运行报错,请稍后再试,id: " + id);
    }

    // Global fallback method; the message means "consumer 80 global service degradation".
    public JSONResultUtil paymentGlobalFallbackMethod() {
        return JSONResultUtil.successWithData("消费者80全局服务降级,paymentGlobalFallbackMethod, 线程池: " + Thread.currentThread().getName());
    }
}
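Once defaultFallback is defined, a method only needs the @HystrixCommand annotation: if the method throws or the called service times out, the global fallback paymentGlobalFallbackMethod is invoked. A fallback can still be set on an individual method, and a method-level fallback takes precedence over the default. A method without the @HystrixCommand annotation is not degraded at all; errors and timeouts go straight to the error response.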
Test:
$ curl http://localhost/consumer/hystrix/payment/timeout/1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   155    0   155    0     0   3297      0 --:--:-- --:--:-- --:--:--  4843
{"success":true,"code":"200","msg":"","data":"消费者80全局服务降级,paymentGlobalFallbackMethod, 线程池: hystrix-OrderHystirxController-2"}
(2) Wildcard degradation with a Feign fallback: solves problem 2 above by separating each method's fallback from the business logic
Add the fallback attribute to PaymentHystrixService:
package cn.qz.cloud.service;

import cn.qz.cloud.utils.JSONResultUtil;
import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.stereotype.Component;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;

/**
 * @Author: qlq
 * @Description
 * @Date: 14:31 2020/10/18
 */
@Component
@FeignClient(value = "CLOUD-PROVIDER-HYSTRIX-PAYMENT", fallback = PaymentFallbackService.class)
public interface PaymentHystrixService {

    @GetMapping("/hystrix/payment/success/{id}")
    public JSONResultUtil<String> success(@PathVariable("id") Integer id);

    @GetMapping("/hystrix/payment/timeout/{id}")
    public JSONResultUtil<String> timeout(@PathVariable("id") Integer id);
}
Add the PaymentFallbackService class, which supplies the fallback for each method of the interface above:
package cn.qz.cloud.service;

import cn.qz.cloud.utils.JSONResultUtil;
import org.springframework.stereotype.Component;

/**
 * @Author: qlq
 * @Description
 * @Date: 20:57 2020/10/18
 */
@Component
public class PaymentFallbackService implements PaymentHystrixService {

    @Override
    public JSONResultUtil<String> success(Integer id) {
        return JSONResultUtil.successWithData("PaymentFallbackService fallback,success 方法, threadName: " + Thread.currentThread().getName() + " id: " + id);
    }

    @Override
    public JSONResultUtil<String> timeout(Integer id) {
        return JSONResultUtil.successWithData("PaymentFallbackService fallback,timeout 方法, threadName: " + Thread.currentThread().getName() + " id: " + id);
    }
}
Test:
$ curl http://localhost/consumer/hystrix/payment/timeout/1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   129    0   129    0     0     49      0 --:--:--  0:00:02 --:--:--    50
{"success":true,"code":"200","msg":"","data":"PaymentFallbackService fallback,timeout 方法, threadName: HystrixTimer-1 id: 1"}
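5. Circuit breaking
Circuit breaking is a protection mechanism for microservice call chains that guards against the avalanche effect. When a service on the fan-out chain fails, becomes unavailable, or responds too slowly, the call is degraded, calls to that node are then cut off, and an error response is returned quickly. Once the node is detected to be responding normally again, the call chain is restored.
In the Spring Cloud framework, circuit breaking is implemented by Hystrix. Hystrix monitors the calls between microservices and trips the breaker when failures reach a threshold. By default this requires at least 20 requests within a 10-second rolling window, with at least 50% of them failing. The annotation used for circuit breaking is @HystrixCommand.
Reference: https://martinfowler.com/bliki/CircuitBreaker.html
The three states of a circuit breaker:
Closed - while everything is normal, the breaker stays closed and all calls reach the service. When failures exceed the configured threshold, the breaker trips and enters the open state.
Open - the breaker returns an error (or fallback) for each call without executing the service.
Half-open - after a timeout, the breaker switches to half-open to test whether the problem persists. If a single call fails in this state, the breaker opens again; if it succeeds, the breaker resets to the normal closed state. To spell out the recovery process: after the breaker has been open for a while (5 s by default), it becomes half-open and lets one request through. If that request succeeds, the breaker closes; if it fails, the breaker stays open.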
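The three states and their transitions can be sketched as a small state machine. This is a deliberately simplified illustration: it trips after a number of consecutive failures, whereas Hystrix uses an error percentage over a rolling window, and the names `SimpleCircuitBreaker`, `call`, and `state` are hypothetical rather than taken from Hystrix.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical, simplified breaker for illustration; this is not Hystrix's implementation.
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;   // consecutive failures before tripping (a simplification)
    private final long sleepWindowMillis; // how long to stay OPEN before allowing a trial call
    private final LongSupplier clock;     // injectable time source so the behavior can be tested
    private State state = State.CLOSED;
    private int consecutiveFailures;
    private long openedAt;

    SimpleCircuitBreaker(int failureThreshold, long sleepWindowMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.sleepWindowMillis = sleepWindowMillis;
        this.clock = clock;
    }

    /** Current state; an OPEN breaker becomes HALF_OPEN once the sleep window has elapsed. */
    synchronized State state() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= sleepWindowMillis) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    /** Holding the lock during the call keeps the sketch simple and allows one trial call at a time. */
    synchronized <T> T call(Supplier<T> primary, Supplier<T> fallback) {
        if (state() == State.OPEN) {
            return fallback.get(); // short-circuit: the service is not invoked at all
        }
        try {
            T result = primary.get();
            // Success (including a successful trial call in HALF_OPEN) closes the breaker.
            consecutiveFailures = 0;
            state = State.CLOSED;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;
                openedAt = clock.getAsLong();
            }
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        SimpleCircuitBreaker breaker = new SimpleCircuitBreaker(3, 5000, System::currentTimeMillis);
        Supplier<String> failing = () -> { throw new IllegalStateException("service down"); };
        for (int i = 0; i < 3; i++) {
            breaker.call(failing, () -> "fallback");
        }
        System.out.println("state after failures: " + breaker.state());
        System.out.println("healthy call while open: " + breaker.call(() -> "ok", () -> "fallback"));
    }
}
```

While the breaker is open, even a call that would succeed receives the fallback, which matches the behavior observed in the circuit-breaker test later in this section.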
Configure circuit breaking in the payment service:
(1) Modify PaymentService to add the circuit breaker settings
package cn.qz.cloud.service;

import cn.hutool.core.util.IdUtil;
import cn.qz.cloud.utils.JSONResultUtil;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixProperty;
import org.springframework.stereotype.Service;

import java.util.concurrent.TimeUnit;

/**
 * @Author: qlq
 * @Description
 * @Date: 22:15 2020/10/17
 */
@Service
public class PaymentService {

    /**
     * Normal case: returns immediately.
     *
     * @param id
     * @return
     */
    public String success(Integer id) {
        return "success,线程池: " + Thread.currentThread().getName() + " success,id: " + id;
    }

    @HystrixCommand(fallbackMethod = "timeOutHandler", commandProperties = {
            @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "5000")
    })
    public String timeout(Integer id) {
        try {
            TimeUnit.SECONDS.sleep(3);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return "timeout,线程池: " + Thread.currentThread().getName() + " success,id: " + id;
    }

    public String timeOutHandler(Integer id) {
        return "线程池: " + Thread.currentThread().getName() + " 8081系统繁忙或者运行报错,请稍后再试,id: " + id;
    }

    //===== Circuit breaking
    @HystrixCommand(fallbackMethod = "paymentCircuitBreaker_fallback", commandProperties = {
            @HystrixProperty(name = "circuitBreaker.enabled", value = "true"), // enable the circuit breaker
            @HystrixProperty(name = "circuitBreaker.requestVolumeThreshold", value = "10"), // minimum number of requests in the window
            @HystrixProperty(name = "circuitBreaker.sleepWindowInMilliseconds", value = "10000"), // how long the breaker stays open before a trial request
            @HystrixProperty(name = "circuitBreaker.errorThresholdPercentage", value = "60"), // failure rate (%) at which the breaker trips
    })
    public JSONResultUtil<String> circuit(Integer id) {
        if (id < 0) {
            // message means "id must not be negative"
            throw new RuntimeException("******id 不能负数");
        }

        String serialNumber = IdUtil.simpleUUID();
        return JSONResultUtil.successWithData(Thread.currentThread().getName() + " 调用成功,id: " + id + " ,流水号: " + serialNumber);
    }

    // Fallback; the message means "degraded: call failed, id must not be negative".
    public JSONResultUtil<String> paymentCircuitBreaker_fallback(Integer id) {
        return JSONResultUtil.successWithData("paymentCircuitBreaker_fallback 降级处理, " + Thread.currentThread().getName() + " 调用失败, id 不能负数, id: " + id);
    }
}
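Three parameters govern when the breaker trips. For the full set of options, see the class HystrixCommandProperties.
Snapshot time window: to decide whether to open, the breaker collects request and error statistics over a time range called the snapshot window, which defaults to the most recent 10 seconds. (This is separate from circuitBreaker.sleepWindowInMilliseconds, which controls how long the breaker stays open before it lets a trial request through.)
Request volume threshold: the number of requests within the snapshot window must reach this value before the breaker is eligible to trip. The default is 20. If the total is not reached, the breaker does not open even if every request fails.
Error percentage threshold: the default is 50, meaning the breaker opens when half of the requests fail.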
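The way the two thresholds combine can be written as a short check. The sketch below follows the rule described above (enough requests in the window, then a high enough error percentage); the class and method names are hypothetical, and the integer percentage is a simplification that may round differently from Hystrix's own calculation.

```java
// Hypothetical helper showing how the request volume and error percentage thresholds combine.
class TripDecision {

    static boolean shouldTrip(int totalRequests, int failedRequests,
                              int requestVolumeThreshold, int errorThresholdPercentage) {
        // Not enough traffic in the window: never trip, even if every request failed.
        if (totalRequests < requestVolumeThreshold) {
            return false;
        }
        int errorPercentage = failedRequests * 100 / totalRequests;
        return errorPercentage >= errorThresholdPercentage;
    }

    public static void main(String[] args) {
        // Defaults described above: 20 requests, 50% errors.
        System.out.println(shouldTrip(19, 19, 20, 50)); // below the request volume threshold
        System.out.println(shouldTrip(20, 10, 20, 50)); // volume reached, half of the calls failed
        // Settings used in PaymentService.circuit: 10 requests, 60% errors.
        System.out.println(shouldTrip(10, 5, 10, 60));  // 50% errors
        System.out.println(shouldTrip(10, 6, 10, 60));  // 60% errors
    }
}
```

With the settings used in PaymentService.circuit, ten failing calls inside one window are therefore enough to open the breaker.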
(2) Add a method to the controller:
package cn.qz.cloud.controller;

import cn.qz.cloud.service.PaymentService;
import cn.qz.cloud.utils.JSONResultUtil;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import javax.annotation.Resource;

/**
 * @Author: qlq
 * @Description
 * @Date: 22:22 2020/10/17
 */
@RestController
@Slf4j
@RequestMapping("/hystrix/payment")
public class PaymentController {

    @Resource
    private PaymentService paymentService;

    @Value("${server.port}")
    private String serverPort;

    @GetMapping("/success/{id}")
    public JSONResultUtil<String> success(@PathVariable("id") Integer id) {
        String result = paymentService.success(id);
        log.info("*****result: " + result);
        return JSONResultUtil.successWithData(result);
    }

    @GetMapping("/timeout/{id}")
    public JSONResultUtil timeout(@PathVariable("id") Integer id) {
        String result = paymentService.timeout(id);
        log.info("*****result: " + result);
        return JSONResultUtil.successWithData(result);
    }

    //==== Circuit breaking
    @GetMapping("/circuit/{id}")
    public JSONResultUtil<String> circuit(@PathVariable("id") Integer id) {
        JSONResultUtil<String> result = paymentService.circuit(id);
        log.info("****result: " + result);
        return result;
    }
}
(3) Test:
- First check normal behavior: a valid id succeeds, a negative id falls back, and a valid id succeeds again:
liqiang@root MINGW64 ~/Desktop
$ curl http://localhost:8081/hystrix/payment/circuit/5
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   139    0   139    0     0   4483      0 --:--:-- --:--:-- --:--:--  8687
{"success":true,"code":"200","msg":"","data":"hystrix-PaymentService-10 调用成功,id: 5 ,流水号: 2f5966b1f6cd469ab8b05ef83067c156"}

liqiang@root MINGW64 ~/Desktop
$ curl http://localhost:8081/hystrix/payment/circuit/-5
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   158    0   158    0     0   3361      0 --:--:-- --:--:-- --:--:--  9875
{"success":true,"code":"200","msg":"","data":"paymentCircuitBreaker_fallback 降级处理, hystrix-PaymentService-10 调用失败, id 不能负数, id: -5"}

liqiang@root MINGW64 ~/Desktop
$ curl http://localhost:8081/hystrix/payment/circuit/5
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   139    0   139    0     0   4483      0 --:--:-- --:--:-- --:--:--  8687
{"success":true,"code":"200","msg":"","data":"hystrix-PaymentService-10 调用成功,id: 5 ,流水号: a528efccc26444cdac3219328616b491"}
- Then call http://localhost:8081/hystrix/payment/circuit/-5 repeatedly so that at least 10 requests arrive within 10 seconds and all of them fail. The breaker opens automatically and closes again after a while.
The output below shows that while the breaker is open, even a call with a valid id goes to the fallback method. Once the breaker closes again, the call chain recovers automatically:
liqiang@root MINGW64 ~/Desktop
$ curl http://localhost:8081/hystrix/payment/circuit/5
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   152    0   152    0     0   4903      0 --:--:-- --:--:-- --:--:-- 10133
{"success":true,"code":"200","msg":"","data":"paymentCircuitBreaker_fallback 降级处理, http-nio-8081-exec-3 调用失败, id 不能负数, id: 5"}

liqiang@root MINGW64 ~/Desktop
$ curl http://localhost:8081/hystrix/payment/circuit/5
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   152    0   152    0     0   4903      0 --:--:-- --:--:-- --:--:--  148k
{"success":true,"code":"200","msg":"","data":"paymentCircuitBreaker_fallback 降级处理, http-nio-8081-exec-4 调用失败, id 不能负数, id: 5"}

liqiang@root MINGW64 ~/Desktop
$ curl http://localhost:8081/hystrix/payment/circuit/5
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   139    0   139    0     0   2957      0 --:--:-- --:--:-- --:--:--  8687
{"success":true,"code":"200","msg":"","data":"hystrix-PaymentService-10 调用成功,id: 5 ,流水号: c6438748e5a44ca9a619161c84a8296b"}
6. Rate limiting
Not covered for now; Alibaba's Sentinel will be looked at later.
7. Hystrix workflow
The workflow consists mainly of the following steps:
1.Construct a HystrixCommand or HystrixObservableCommand Object
2.Execute the Command
3.Is the Response Cached?
4.Is the Circuit Open?
5.Is the Thread Pool/Queue/Semaphore Full?
6.HystrixObservableCommand.construct() or HystrixCommand.run()
7.Calculate Circuit Health
8.Get the Fallback
9.Return the Successful Response
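Addendum: Hystrix resource isolation strategies
1. Why resource isolation is needed:
Suppose the container (Tomcat) is configured with 1000 threads and hosts service A and service B, and service A has very high concurrency and needs 500 threads. If service A then hangs, those 500 threads are likely to be stuck. Only 500 threads remain for the other services, and as concurrency grows, the risk that the remaining services fail grows with it, until every service in the system is unavailable and the system goes down. This is the service avalanche effect.
Hystrix is designed to isolate resources. For example, when clients send requests to the server, service A is allotted 10 threads, and any request beyond that concurrency goes to the fallback. Even if service A hangs, at worst service A becomes unavailable and its 10 threads are tied up, but the other services in the system are not affected.
2. Hystrix has two resource isolation strategies: thread pool and semaphore.
(1) Thread pool isolation: requests are handed to a thread pool, which processes them with a timeout on the result, and requests that cannot be handled immediately accumulate in the pool's queue. This approach needs a thread pool for each dependency, which costs some resources. The benefit is that it can absorb bursts of traffic (when a spike arrives, work that cannot be processed right away waits in the pool's queue and is handled gradually).
(2) Semaphore isolation: an atomic counter (or semaphore) records how many threads are currently running. Each incoming request first checks the counter. If the configured maximum is exceeded, new requests of that type are dropped; otherwise the counter is incremented when the request starts and decremented when it returns. This approach strictly limits concurrency and returns immediately, so it cannot absorb bursts of traffic (when a spike arrives and the number of active threads exceeds the limit, the remaining requests return at once and do not call the dependency).
3. The isolation strategy can be switched; the default is the thread pool. It can be switched for a single method or globally. The global setting looks like this:
hystrix:
  command:
    default:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 1000
          strategy: SEMAPHORE # semaphore isolation
          # strategy: THREAD # thread pool isolation
          semaphore:
            maxConcurrentRequests: 100 # maximum number of concurrent requests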
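(1) The default is thread pool mode: @HystrixCommand methods, including degradation and circuit breaking, run entirely on Hystrix's own thread pools.
(2) In semaphore mode, the work runs on Tomcat's threads, and the number of permits controls the level of concurrency.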
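The semaphore strategy is easy to sketch with `java.util.concurrent.Semaphore`. The example below is an illustration with hypothetical names (`SemaphoreIsolationSketch`, `execute`), not Hystrix code: the call runs on the caller's thread, and a request that finds no free permit receives the fallback immediately instead of being queued.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical sketch of semaphore isolation; not a Hystrix API.
class SemaphoreIsolationSketch {
    private final Semaphore permits;

    SemaphoreIsolationSketch(int maxConcurrentRequests) {
        this.permits = new Semaphore(maxConcurrentRequests);
    }

    <T> T execute(Supplier<T> primary, Supplier<T> fallback) {
        // No free permit: reject immediately, without queueing and without calling the dependency.
        if (!permits.tryAcquire()) {
            return fallback.get();
        }
        try {
            return primary.get(); // runs on the caller's (for example Tomcat's) thread
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) {
        SemaphoreIsolationSketch isolation = new SemaphoreIsolationSketch(1);
        // The nested call stands in for a second request arriving while the first is still running.
        String result = isolation.<String>execute(
                () -> "outer ok, inner: " + isolation.<String>execute(() -> "ok", () -> "rejected"),
                () -> "rejected");
        System.out.println(result);
    }
}
```

Thread pool isolation differs in that the call is submitted to a dedicated bounded pool, which costs extra threads but allows the caller to time out and lets some work queue up during a burst.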
Reference: https://github.com/Netflix/Hystrix/wiki/How-it-Works
3. Service monitoring with Hystrix Dashboard
Hystrix exposes monitoring information about the state of microservice calls, but it must be used together with the spring-boot-actuator module. Hystrix Dashboard is a Hystrix component that provides a monitoring panel for circuit breakers, making it easier to monitor the state of services and clusters.
1. Create the monitoring microservice
1. Create the module cloud-consumer-hystrix-dashboard9001
2. Complete the pom
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <parent>
        <artifactId>cloud</artifactId>
        <groupId>cn.qz.cloud</groupId>
        <version>1.0-SNAPSHOT</version>
    </parent>
    <modelVersion>4.0.0</modelVersion>

    <artifactId>cloud-consumer-hystrix-dashboard9001</artifactId>

    <dependencies>
        <!-- shared utility module extracted from this project -->
        <dependency>
            <groupId>cn.qz.cloud</groupId>
            <artifactId>cloud-api-commons</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-netflix-hystrix-dashboard</artifactId>
        </dependency>
        <!-- monitoring -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-actuator</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
        </dependency>
        <!-- eureka client -->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
        </dependency>
        <!-- hot reload -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-devtools</artifactId>
            <scope>runtime</scope>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>
3. Edit the yml
server:
  port: 9001
4. Main class:
package cn.qz.cloud;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.hystrix.dashboard.EnableHystrixDashboard;

@SpringBootApplication
// Enables the Hystrix dashboard
@EnableHystrixDashboard
public class HystrixDashboardMain9001 {

    public static void main(String[] args) {
        SpringApplication.run(HystrixDashboardMain9001.class, args);
    }
}
2. Modify the existing cloud-provider-hystrix-payment8081 service:
Make sure the pom contains:
<!-- web -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
Modify the main class: (the getServlet bean below is required; without it the dashboard reports that it cannot connect)
package cn.qz.cloud;

import com.netflix.hystrix.contrib.metrics.eventstream.HystrixMetricsStreamServlet;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.web.servlet.ServletRegistrationBean;
import org.springframework.cloud.client.circuitbreaker.EnableCircuitBreaker;
import org.springframework.cloud.netflix.eureka.EnableEurekaClient;
import org.springframework.cloud.netflix.hystrix.EnableHystrix;
import org.springframework.context.annotation.Bean;

/**
 * @Author: qlq
 * @Description
 * @Date: 22:08 2020/10/17
 */
@SpringBootApplication
@EnableEurekaClient
@EnableCircuitBreaker
@EnableHystrix
public class PaymentHystrixMain8081 {

    public static void main(String[] args) {
        SpringApplication.run(PaymentHystrixMain8081.class, args);
    }

    /**
     * This bean exists only for monitoring and has nothing to do with fault tolerance itself.
     * It works around a pitfall introduced by a Spring Cloud upgrade: the default path in
     * Spring Boot is not "/hystrix.stream", so the servlet below has to be registered
     * in your own project.
     */
    @Bean
    public ServletRegistrationBean getServlet() {
        HystrixMetricsStreamServlet streamServlet = new HystrixMetricsStreamServlet();
        ServletRegistrationBean registrationBean = new ServletRegistrationBean(streamServlet);
        registrationBean.setLoadOnStartup(1);
        registrationBean.addUrlMappings("/hystrix.stream");
        registrationBean.setName("HystrixMetricsStreamServlet");
        return registrationBean;
    }
}
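Start the service after making these changes.
3. Start the hystrixdashboard service
(1) Open the home page:
(2) Call http://localhost:8081/hystrix/payment/circuit/-1 several times so that the breaker opens
(3) Enter the following address to view the stream: http://localhost:8081/hystrix.stream
(4) Open the monitor
(5) The effect of the test above is not very visible; JMeter can be used to generate requests in bulk
Addendum: you can set Hystrix's default execution timeout, after which the fallback is triggered. Keep in mind that Ribbon's connect timeout and read timeout also affect when a call times out.
hystrix:
  command:
    default:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 1000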
All of these configuration properties can be found in the classes HystrixPropertiesManager and HystrixCommandProperties.
Addendum: if a service is called through Feign and no fallback is configured, the error surfaces directly. For example, when service A calls service B and service B throws a division-by-zero exception, service A reports a Feign-related error like the following:
2020-11-26 14:54:17.603 ERROR 27404 --- [o-auto-1-exec-5] o.a.c.c.C.[.[.[/].[dispatcherServlet] : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is com.netflix.hystrix.exception.HystrixRuntimeException: PaymentHystrixService#error() failed and no fallback available.] with root cause

feign.FeignException$InternalServerError: status 500 reading PaymentHystrixService#error()
    at feign.FeignException.serverErrorStatus(FeignException.java:195) ~[feign-core-10.4.0.jar:na]
    at feign.FeignException.errorStatus(FeignException.java:144) ~[feign-core-10.4.0.jar:na]
    at feign.FeignException.errorStatus(FeignException.java:133) ~[feign-core-10.4.0.jar:na]
    at feign.codec.ErrorDecoder$Default.decode(ErrorDecoder.java:92) ~[feign-core-10.4.0.jar:na]
    at feign.SynchronousMethodHandler.executeAndDecode(SynchronousMethodHandler.java:151) ~[feign-core-10.4.0.jar:na]
    at feign.SynchronousMethodHandler.invoke(SynchronousMethodHandler.java:80) ~[feign-core-10.4.0.jar:na]
    at feign.hystrix.HystrixInvocationHandler$1.run(HystrixInvocationHandler.java:109) ~[feign-hystrix-10.4.0.jar:na]
    at com.netflix.hystrix.HystrixCommand$2.call(HystrixCommand.java:302) ~[hystrix-core-1.5.18.jar:1.5.18]
    at com.netflix.hystrix.HystrixCommand$2.call(HystrixCommand.java:298) ~[hystrix-core-1.5.18.jar:1.5.18]
    at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:46) ~[rxjava-1.3.8.jar:1.3.8]
    at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:35) ~[rxjava-1.3.8.jar:1.3.8]
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48) ~[rxjava-1.3.8.jar:1.3.8]
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30) ~[rxjava-1.3.8.jar:1.3.8]
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48) ~[rxjava-1.3.8.jar:1.3.8]
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30) ~[rxjava-1.3.8.jar:1.3.8]
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48) ~[rxjava-1.3.8.jar:1.3.8]
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30) ~[rxjava-1.3.8.jar:1.3.8]
    at rx.Observable.unsafeSubscribe(Observable.java:10327) ~[rxjava-1.3.8.jar:1.3.8]
    at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:51) ~[rxjava-1.3.8.jar:1.3.8]
    at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:35) ~[rxjava-1.3.8.jar:1.3.8]
    at rx.Observable.unsafeSubscribe(Observable.java:10327) ~[rxjava-1.3.8.jar:1.3.8]
    at rx.internal.operators.OnSubscribeDoOnEach.call(OnSubscribeDoOnEach.java:41) ~[rxjava-1.3.8.jar:1.3.8]
    at rx.internal.operators.OnSubscribeDoOnEach.call(OnSubscribeDoOnEach.java:30) ~[rxjava-1.3.8.jar:1.3.8]
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48) ~[rxjava-1.3.8.jar:1.3.8]
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30) ~[rxjava-1.3.8.jar:1.3.8]
    at rx.Observable.unsafeSubscribe(Observable.java:10327) ~[rxjava-1.3.8.jar:1.3.8]
    at rx.internal.operators.OperatorSubscribeOn$SubscribeOnSubscriber.call(OperatorSubscribeOn.java:100) ~[rxjava-1.3.8.jar:1.3.8]
    at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction$1.call(HystrixContexSchedulerAction.java:56) ~[hystrix-core-1.5.18.jar:1.5.18]
    at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction$1.call(HystrixContexSchedulerAction.java:47) ~[hystrix-core-1.5.18.jar:1.5.18]
    at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction.call(HystrixContexSchedulerAction.java:69) ~[hystrix-core-1.5.18.jar:1.5.18]
    at rx.internal.schedulers.ScheduledAction.run(ScheduledAction.java:55) ~[rxjava-1.3.8.jar:1.3.8]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_171]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_171]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_171]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_171]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_171]
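Addendum: each application can also use a global exception handler to guard against some errors, catching the exception and returning the error information to the calling service.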