文章目录
- SpringCloud Hystrix 熔断器、服务降级防止服务雪崩
- 需求背景
- 引入依赖
- 启动类加Hystrix注解
- 接口配置熔断
- 常规配置
- 超时断开
- 错误率熔断
- 请求数熔断限流
- 可配置项
- HystrixCommand.Setter参数
- Command Properties
- 服务降级
SpringCloud Hystrix 熔断器、服务降级防止服务雪崩
Hystrix,英文意思是豪猪,全身是刺,刺是一种保护机制。Hystrix也是Netflflix公司的一款组件。
Hystrix是什么?
在分布式环境中,许多服务依赖项中的部分服务必然有概率出现失败。Hystrix是一个库,通过添加延迟和容错逻辑,来帮助你控制这些分布式服务之间的交互。Hystrix通过隔离服务之间的访问点阻止级联失败,通过提供回退选项来实现防止级联出错。提高了系统的整体弹性。与Ribbon并列,也几乎存在于每个Spring Cloud构建的微服务和基础设施中。
Hystrix被设计的目标是:
对通过第三方客户端库访问的依赖项(通常是通过网络)的延迟和故障进行保护和控制。
在复杂的分布式系统中阻止雪崩效应。
快速失败,快速恢复。
回退,尽可能优雅地降级。
需求背景
项目有慢接口,多个渠道都需要这个接口,一旦有突发大流量过来会导致这个接口变慢,甚至超时。引发第三方重试不断调用我们接口,这个接口又会调用其他接口,在突发大流量时会引发雪崩,且雪崩积攒的重试流量会把刚滚动启动的实例(实例有探针发现持续两次不响应ping就会滚动重启实例,起一个新的,回收旧实例)又打垮,因此需要加入服务熔断、服务降级,让系统将无法消化的流量直接熔断返回报错,让服务不被阻塞住。
引入依赖
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
</dependency>
启动类加Hystrix注解
启动类加@EnableHystrix 或@EnableCircuitBreaker 注解都可以,建议用@EnableHystrix ,因为后一个在后续版本被弃用,@EnableHystrix是包含了@EnableCircuitBreaker的
@Target(ElementType.TYPE)
@Retention(RetentionPolicy.RUNTIME)
@Documented
@Inherited
@EnableCircuitBreaker
public @interface EnableHystrix {
}
@SpringCloudApplication
@EnableHystrix
public class ManageApplication{
}
接口配置熔断
熔断效果测试可以使用jmeter这样的工具来实现
https://jmeter.apache.org/download_jmeter.cgi
常规配置
配置的值自己根据自己需求去改,这里是测试用设的值
@HystrixCommand(commandProperties = {
// 隔离策略,这个配置项默认是THREAD(线程),另外一个选项是SEMAPHORE(信号量),在信号量策略下无法使用超时时间设置。
@HystrixProperty(name = HystrixPropertiesManager.EXECUTION_ISOLATION_STRATEGY, value = "THREAD"),
// 单实例vivo接口最大并发请求数150,多处的直接拒绝
@HystrixProperty(name = HystrixPropertiesManager.EXECUTION_ISOLATION_SEMAPHORE_MAX_CONCURRENT_REQUESTS, value = "1"),
// 线程超时时间,10秒
@HystrixProperty(name = HystrixPropertiesManager.EXECUTION_ISOLATION_THREAD_TIMEOUT_IN_MILLISECONDS, value = "100"),
//错误百分比条件,达到熔断器最小请求数后错误率达到百分之多少后打开熔断器
@HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_ERROR_THRESHOLD_PERCENTAGE, value = "10"),
//断容器最小请求数,达到这个值过后才开始计算是否打开熔断器
@HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_REQUEST_VOLUME_THRESHOLD, value = "3"),
// 默认5秒; 熔断器打开后多少秒后 熔断状态变成半熔断状态(对该微服务进行一次请求尝试,不成功则状态改成熔断,成功则关闭熔断器
@HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_SLEEP_WINDOW_IN_MILLISECONDS, value = "5000")
})
EXECUTION_ISOLATION_STRATEGY隔离策略,这个配置项默认是THREAD(线程),另外一个选项是SEMAPHORE(信号量),在信号量策略下无法使用超时时间设置。
超时断开
@RequestMapping("testTimeout")
@HystrixCommand(commandProperties = {
// 线程超时时间,10秒
@HystrixProperty(name = HystrixPropertiesManager.EXECUTION_ISOLATION_THREAD_TIMEOUT_IN_MILLISECONDS, value = "100")
})
public String testTimeout() throws InterruptedException {
Thread.sleep(5000);
return "test";
}
错误率熔断
请求报错率达到多少之后开启熔断
@RequestMapping("test")
@HystrixCommand(groupKey = "vivoApi", commandProperties = {
//错误百分比条件,达到熔断器最小请求数后错误率达到百分之多少后打开熔断器
@HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_ERROR_THRESHOLD_PERCENTAGE, value = "10")
})
public String test() throws Exception {
Thread.sleep(50);
return "test";
}
请求数熔断限流
处理调用的线程池核心线程一个,任务队列长度为10,超出的请求会被熔断打回,在执行和队列里的请求没事,正常执行。
@HystrixCommand(groupKey = "vivoApi", threadPoolProperties = {
@HystrixProperty(name = HystrixPropertiesManager.CORE_SIZE, value = "1"),
@HystrixProperty(name = HystrixPropertiesManager.MAX_QUEUE_SIZE, value = "10")
})
public String test() throws Exception {
Thread.sleep(50);
return "test";
}
可配置项
配置的中文翻译 Hystrix配置中文文档
关键配置类 HystrixCommandProperties.java、HystrixPropertiesManager.java
HystrixCommand.Setter参数
HystrixCommandGroupKey:区分一组服务,一般以接口为粒度。
HystrixCommandKey:区分一个方法,一般以方法为粒度。
HystrixThreadPoolKey:一个HystrixThreadPoolKey下的所有方法共用一个线程池。
HystrixCommandProperties:基本配置
Command Properties
hystrix.command.default.execution.isolation.strategy 隔离策略,默认是Thread,可选Thread|Semaphore。thread用于线程池的隔离,一般适用于同步请求。semaphore是信号量模式,适用于异步请求
hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds 命令执行超时时间,默认1000ms
hystrix.command.default.execution.timeout.enabled 执行是否启用超时,默认启用true
hystrix.command.default.execution.isolation.thread.interruptOnTimeout 发生超时是是否中断,默认true
hystrix.command.default.execution.isolation.semaphore.maxConcurrentRequests 最大并发请求数,默认10,该参数当使用ExecutionIsolationStrategy.SEMAPHORE策略时才有效。如果达到最大并发请求数,请求会被拒绝。理论上选择semaphore size的原则和选择thread size一致,但选用semaphore时每次执行的单元要比较小且执行速度快(ms级别),否则的话应该用thread。
hystrix.command.default.execution.isolation.thread.interruptOnCancel
public abstract class HystrixCommandProperties {
private static final Logger logger = LoggerFactory.getLogger(HystrixCommandProperties.class);
/* defaults */
/* package */ static final Integer default_metricsRollingStatisticalWindow = 10000;// default => statisticalWindow: 10000 = 10 seconds (and default of 10 buckets so each bucket is 1 second)
private static final Integer default_metricsRollingStatisticalWindowBuckets = 10;// default => statisticalWindowBuckets: 10 = 10 buckets in a 10 second window so each bucket is 1 second
private static final Integer default_circuitBreakerRequestVolumeThreshold = 20;// default => statisticalWindowVolumeThreshold: 20 requests in 10 seconds must occur before statistics matter
private static final Integer default_circuitBreakerSleepWindowInMilliseconds = 5000;// default => sleepWindow: 5000 = 5 seconds that we will sleep before trying again after tripping the circuit
private static final Integer default_circuitBreakerErrorThresholdPercentage = 50;// default => errorThresholdPercentage = 50 = if 50%+ of requests in 10 seconds are failures or latent then we will trip the circuit
private static final Boolean default_circuitBreakerForceOpen = false;// default => forceCircuitOpen = false (we want to allow traffic)
/* package */ static final Boolean default_circuitBreakerForceClosed = false;// default => ignoreErrors = false
private static final Integer default_executionTimeoutInMilliseconds = 1000; // default => executionTimeoutInMilliseconds: 1000 = 1 second
private static final Boolean default_executionTimeoutEnabled = true;
private static final ExecutionIsolationStrategy default_executionIsolationStrategy = ExecutionIsolationStrategy.THREAD;
private static final Boolean default_executionIsolationThreadInterruptOnTimeout = true;
private static final Boolean default_executionIsolationThreadInterruptOnFutureCancel = false;
private static final Boolean default_metricsRollingPercentileEnabled = true;
private static final Boolean default_requestCacheEnabled = true;
private static final Integer default_fallbackIsolationSemaphoreMaxConcurrentRequests = 10;
private static final Boolean default_fallbackEnabled = true;
private static final Integer default_executionIsolationSemaphoreMaxConcurrentRequests = 10;
private static final Boolean default_requestLogEnabled = true;
private static final Boolean default_circuitBreakerEnabled = true;
private static final Integer default_metricsRollingPercentileWindow = 60000; // default to 1 minute for RollingPercentile
private static final Integer default_metricsRollingPercentileWindowBuckets = 6; // default to 6 buckets (10 seconds each in 60 second window)
private static final Integer default_metricsRollingPercentileBucketSize = 100; // default to 100 values max per bucket
private static final Integer default_metricsHealthSnapshotIntervalInMilliseconds = 500; // default to 500ms as max frequency between allowing snapshots of health (error percentage etc)
@SuppressWarnings("unused") private final HystrixCommandKey key;
private final HystrixProperty<Integer> circuitBreakerRequestVolumeThreshold; // number of requests that must be made within a statisticalWindow before open/close decisions are made using stats
private final HystrixProperty<Integer> circuitBreakerSleepWindowInMilliseconds; // milliseconds after tripping circuit before allowing retry
private final HystrixProperty<Boolean> circuitBreakerEnabled; // Whether circuit breaker should be enabled.
private final HystrixProperty<Integer> circuitBreakerErrorThresholdPercentage; // % of 'marks' that must be failed to trip the circuit
private final HystrixProperty<Boolean> circuitBreakerForceOpen; // a property to allow forcing the circuit open (stopping all requests)
private final HystrixProperty<Boolean> circuitBreakerForceClosed; // a property to allow ignoring errors and therefore never trip 'open' (ie. allow all traffic through)
private final HystrixProperty<ExecutionIsolationStrategy> executionIsolationStrategy; // Whether a command should be executed in a separate thread or not.
private final HystrixProperty<Integer> executionTimeoutInMilliseconds; // Timeout value in milliseconds for a command
private final HystrixProperty<Boolean> executionTimeoutEnabled; //Whether timeout should be triggered
private final HystrixProperty<String> executionIsolationThreadPoolKeyOverride; // What thread-pool this command should run in (if running on a separate thread).
private final HystrixProperty<Integer> executionIsolationSemaphoreMaxConcurrentRequests; // Number of permits for execution semaphore
private final HystrixProperty<Integer> fallbackIsolationSemaphoreMaxConcurrentRequests; // Number of permits for fallback semaphore
private final HystrixProperty<Boolean> fallbackEnabled; // Whether fallback should be attempted.
private final HystrixProperty<Boolean> executionIsolationThreadInterruptOnTimeout; // Whether an underlying Future/Thread (when runInSeparateThread == true) should be interrupted after a timeout
private final HystrixProperty<Boolean> executionIsolationThreadInterruptOnFutureCancel; // Whether canceling an underlying Future/Thread (when runInSeparateThread == true) should interrupt the execution thread
private final HystrixProperty<Integer> metricsRollingStatisticalWindowInMilliseconds; // milliseconds back that will be tracked
private final HystrixProperty<Integer> metricsRollingStatisticalWindowBuckets; // number of buckets in the statisticalWindow
private final HystrixProperty<Boolean> metricsRollingPercentileEnabled; // Whether monitoring should be enabled (SLA and Tracers).
private final HystrixProperty<Integer> metricsRollingPercentileWindowInMilliseconds; // number of milliseconds that will be tracked in RollingPercentile
private final HystrixProperty<Integer> metricsRollingPercentileWindowBuckets; // number of buckets percentileWindow will be divided into
private final HystrixProperty<Integer> metricsRollingPercentileBucketSize; // how many values will be stored in each percentileWindowBucket
private final HystrixProperty<Integer> metricsHealthSnapshotIntervalInMilliseconds; // time between health snapshots
private final HystrixProperty<Boolean> requestLogEnabled; // whether command request logging is enabled.
private final HystrixProperty<Boolean> requestCacheEnabled; // Whether request caching is enabled.
/**
* Isolation strategy to use when executing a {@link HystrixCommand}.
* <p>
* <ul>
* <li>THREAD: Execute the {@link HystrixCommand#run()} method on a separate thread and restrict concurrent executions using the thread-pool size.</li>
* <li>SEMAPHORE: Execute the {@link HystrixCommand#run()} method on the calling thread and restrict concurrent executions using the semaphore permit count.</li>
* </ul>
*/
public static enum ExecutionIsolationStrategy {
THREAD, SEMAPHORE
}
服务降级
简单来说也就是在@HystrixCommand注解上配置fallbackMethod参数指定降级后调用的方法名即可。当接口不可用、超时、错误率高于阈值时自动调用fallbackMethod指定的方法处理请求,称为服务降级,fallbackMethod方法可以直接返回错误,或者提供带缓存的高性能的实现等,如果fallbackMethod也不可用,会发生熔断报错返回。
@RestController
@RequestMapping("hystrix")
public class HystrixController {
@Autowired
private HystrixService hystrixService;
// 熔断降级
@GetMapping("{num}")
@HystrixCommand(fallbackMethod="circuitBreakerFallback", commandProperties = {
@HystrixProperty(name=HystrixPropertiesManager.CIRCUIT_BREAKER_ENABLED, value = "true"),
// 是否开启熔断器
@HystrixProperty(name=HystrixPropertiesManager.CIRCUIT_BREAKER_REQUEST_VOLUME_THRESHOLD,value = "20"), // 统计时间窗内请求次数
@HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_ERROR_THRESHOLD_PERCENTAGE, value = "50"),// 在统计时间窗内,失败率达到50%进入熔断状态
@HystrixProperty(name = HystrixPropertiesManager.CIRCUIT_BREAKER_SLEEP_WINDOW_IN_MILLISECONDS, value = "5000"), // 休眠时间窗口
@HystrixProperty(name = HystrixPropertiesManager.METRICS_ROLLING_STATS_TIME_IN_MILLISECONDS, value = "10000") // 统计时间窗
})
public String testCircuitBreaker(@PathVariable Integer num, @RequestParam String name) {
if (num % 2 == 0) {
return "请求成功";
} else {
throw RunTimeException("");
}
}
// fallback方法的参数个数、参数类型、返回值类型要与原方法对应,fallback方法的参数多加个Throwable
public String circuitBreakerFallback(Integer num, String name) {
return "请求失败,请稍后重试";
}
// 超时降级
@GetMapping
@HystrixCommand(fallbackMethod = "timeoutFallback", commandProperties = {
@HystrixProperty(name = HystrixPropertiesManager.EXECUTION_TIMEOUT_ENABLED, value = "true"),
// 是否开启超时降级
@HystrixProperty(name = HystrixPropertiesManager.EXECUTION_ISOLATION_THREAD_TIMEOUT_IN_MILLISECONDS, value = "10000"),
// 请求的超时时间,默认10000
@HystrixProperty(name = HystrixPropertiesManager.EXECUTION_ISOLATION_THREAD_INTERRUPT_ON_TIMEOUT, value = "true")
// 当请求超时时,是否中断线程,默认true
})
public String testTimeout(@RequestParam String name) throws InterruptedException{
Thread.sleep(200)
return "success";
}
public String timeoutFallback(String name) {
return "请求超时,请稍后重试";
}
// 资源隔离(线程池)触发降级
@GetMapping("isolation/threadpool")
@HystrixCommand(fallbackMethod = "isolationFallback",
commandProperties = {
@HystrixProperty(name = HystrixPropertiesManager.EXECUTION_ISOLATION_STRATEGY, value = "THREAD")
},
threadPoolProperties = {
@HystrixProperty(name = HystrixPropertiesManager.CORE_SIZE, value = "10"),
@HystrixProperty(name = HystrixPropertiesManager.MAX_QUEUE_SIZE, value = "-1"),
@HystrixProperty(name = HystrixPropertiesManager.QUEUE_SIZE_REJECTION_THRESHOLD, value = "2"),
@HystrixProperty(name = HystrixPropertiesManager.KEEP_ALIVE_TIME_MINUTES, value = "1"),
}
)
public String testThreadPoolIsolation(@RequestParam String name) throws InterruptedException {
Thread.sleep(200)
return "success";
}
public String isolationFallback(String name) {
return "资源隔离拒绝,请稍后重试";
}
// 信号量资源隔离
@GetMapping("isolation/semaphore")
@HystrixCommand(fallbackMethod = "isolationFallback",
commandProperties = {
@HystrixProperty(name = HystrixPropertiesManager.EXECUTION_ISOLATION_STRATEGY, value = "SEMAPHORE"),
@HystrixProperty(name = HystrixPropertiesManager.EXECUTION_ISOLATION_SEMAPHORE_MAX_CONCURRENT_REQUESTS, value = "2")
}
)
public String testSemaphoreIsolation(@RequestParam String name) throws InterruptedException {
Thread.sleep(200)
return "success";
}
public String isolationFallback(String name) {
return "资源隔离拒绝,请稍后重试";
}
}