Error message
The Nacos cluster logs the following error:
2021-07-13 16:21:56,363 ERROR [IP-DEAD] failed to delete ip automatically, ip: {"instanceId":"10.200.13.135#3012#DEFAULT#DEFAULT_GROUP@@service-ecs","ip":"10.200.13.135","port":3012,"weight":1.0,"healthy":false,"enabled":true,"ephemeral":true,"clusterName":"DEFAULT","serviceName":"DEFAULT_GROUP@@cloudproduct-service-ecs","metadata":{"":"SPRING_CLOUD"},"lastBeat":1625552931192,"marked":false,"app":"unknown","instanceHeartBeatInterval":5000,"instanceHeartBeatTimeOut":15000,"ipDeleteTimeout":30000}, error: {}
: Request cannot be executed; I/O reactor status: STOPPED
at (:46)
at (:90)
at (:123)
at (:75)
at (:108)
at (:92)
at (:52)
at (:364)
at (:105)
at (:193)
at (:157)
at (:142)
at (:120)
at $RunnableAdapter.call(:511)
at (:308)
at $ScheduledFutureTask.access$301(:180)
at $ScheduledFutureTask.run(:294)
at (:1149)
at $Worker.run(:624)
at (:748)
Cause
- The stack trace shows that this error is raised by the HTTP client that Apache HttpComponents uses to send asynchronous requests. The jar dependency is:
<dependency>
<groupId></groupId>
<artifactId>httpasyncclient</artifactId>
</dependency>
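- Looking at the source code that raises the error makes the problem clear. The class that throws this exception is CloseableHttpAsyncClientBase. Its source is shown below; I have trimmed it so the logic is easier to follow: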
abstract class CloseableHttpAsyncClientBase extends CloseableHttpPipeliningClient {
    private final Log log = LogFactory.getLog(getClass());
    // Thread status enum
    static enum Status {INACTIVE, ACTIVE, STOPPED}
    private final NHttpClientConnectionManager connmgr;
    // Thread that executes the asynchronous tasks
    private final Thread reactorThread;
    // Tracks the thread's status
    private final AtomicReference<Status> status;
    // Constructor: initializes the thread that executes the asynchronous tasks
    public CloseableHttpAsyncClientBase(
            final NHttpClientConnectionManager connmgr,
            final ThreadFactory threadFactory,
            final NHttpClientEventHandler handler) {
        super();
        this.connmgr = connmgr;
        if (threadFactory != null && handler != null) {
            // Create the thread
            this.reactorThread = threadFactory.newThread(new Runnable() {
                @Override
                public void run() {
                    try {
                        final IOEventDispatch ioEventDispatch = new InternalIODispatch(handler);
                        connmgr.execute(ioEventDispatch);
                    } catch (final Exception ex) {
                        log.error("I/O reactor terminated abnormally", ex);
                    } finally {
                        // Whenever the thread exits, normally or not, the status becomes STOPPED
                        status.set(Status.STOPPED);
                    }
                }
            });
        } else {
            this.reactorThread = null;
        }
        this.status = new AtomicReference<Status>(Status.INACTIVE);
    }
    // Start the thread
    @Override
    public void start() {
        if (this.status.compareAndSet(Status.INACTIVE, Status.ACTIVE)) {
            if (this.reactorThread != null) {
                this.reactorThread.start();
            }
        }
    }
    // Check the thread's status before executing a task; throw if the thread has stopped
    protected void ensureRunning() {
        final Status currentStatus = this.status.get();
        Asserts.check(currentStatus == Status.ACTIVE, "Request cannot be executed; " +
                "I/O reactor status: %s", currentStatus);
    }
    // Shut down the thread
    @Override
    public void close() {
        if (this.status.compareAndSet(Status.ACTIVE, Status.STOPPED)) {
            if (this.reactorThread != null) {
                try {
                    this.connmgr.shutdown();
                } catch (final IOException ex) {
                    this.log.error("I/O error shutting down connection manager", ex);
                }
                try {
                    this.reactorThread.join();
                } catch (final InterruptedException ex) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }
}
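To make the lifecycle above easier to try out, here is a minimal sketch that uses only the JDK. It is not the HttpComponents implementation: MiniAsyncClient, ReactorStatusDemo, execute, and awaitReactorExit are hypothetical names invented for illustration, and the "event loop" is just a Runnable that fails immediately. It only mirrors the status handling: the reactor thread's finally block flips the status to STOPPED, and every later request is rejected.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the real client; it mirrors only the status handling.
class MiniAsyncClient {
    enum Status { INACTIVE, ACTIVE, STOPPED }

    private final AtomicReference<Status> status = new AtomicReference<>(Status.INACTIVE);
    private final Thread reactorThread;

    MiniAsyncClient(Runnable eventLoop) {
        this.reactorThread = new Thread(() -> {
            try {
                eventLoop.run();
            } catch (RuntimeException ex) {
                System.err.println("I/O reactor terminated abnormally: " + ex.getMessage());
            } finally {
                // However the loop ends, the client is marked STOPPED for good.
                status.set(Status.STOPPED);
            }
        }, "mini-reactor");
    }

    void start() {
        if (status.compareAndSet(Status.INACTIVE, Status.ACTIVE)) {
            reactorThread.start();
        }
    }

    // Counterpart of ensureRunning(): reject work unless the status is ACTIVE.
    String execute(String request) {
        Status current = status.get();
        if (current != Status.ACTIVE) {
            throw new IllegalStateException(
                    "Request cannot be executed; I/O reactor status: " + current);
        }
        return "accepted: " + request;
    }

    // Waits for the reactor thread to exit so the demo is deterministic.
    void awaitReactorExit() {
        try {
            reactorThread.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
    }
}

class ReactorStatusDemo {
    static String runScenario() {
        // The simulated event loop crashes as soon as it starts.
        MiniAsyncClient client = new MiniAsyncClient(() -> {
            throw new IllegalStateException("simulated reactor failure");
        });
        client.start();
        client.awaitReactorExit();
        try {
            return client.execute("GET /health");
        } catch (IllegalStateException ex) {
            return ex.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(runScenario());
    }
}
```

Running main prints "Request cannot be executed; I/O reactor status: STOPPED", the same message that appears in the Nacos log. Note that nothing in the sketch ever moves the status back from STOPPED to ACTIVE, which matches the trimmed source above.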
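- The code shows why the "I/O reactor status: STOPPED" exception is thrown: the thread that executes asynchronous tasks ended prematurely, so when another task is submitted afterwards, it cannot be accepted and the exception is raised.
- It is also worth pointing out a flaw in how Nacos designed the NacosAsyncRestTemplate class. Its main job is to send asynchronous HTTP requests, and Nacos uses it as a singleton. If the thread that executes the asynchronous requests dies, every feature in the project that relies on NacosAsyncRestTemplate stops working.
- The Nacos cluster uses this class when synchronizing data. Once the asynchronous thread has stopped, the Nacos log keeps printing the "I/O reactor status: STOPPED" exception.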
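The effect of sharing one instance can be illustrated with a small JDK-only sketch. Everything in it is hypothetical and invented for this example (SharedAsyncTemplate, SingletonFailureDemo, and the two "feature" methods); it is not the Nacos class and does not send real HTTP requests. It only shows that once the shared instance's reactor is gone, unrelated callers fail in the same way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a shared async HTTP template; not the Nacos class.
final class SharedAsyncTemplate {
    private static final SharedAsyncTemplate INSTANCE = new SharedAsyncTemplate();
    private final AtomicBoolean reactorAlive = new AtomicBoolean(true);

    private SharedAsyncTemplate() { }

    static SharedAsyncTemplate getInstance() {
        return INSTANCE;
    }

    // Simulates the reactor thread dying; nothing in this class revives it.
    void simulateReactorDeath() {
        reactorAlive.set(false);
    }

    String send(String request) {
        if (!reactorAlive.get()) {
            throw new IllegalStateException(
                    "Request cannot be executed; I/O reactor status: STOPPED");
        }
        return "sent: " + request;
    }
}

class SingletonFailureDemo {
    // Two unrelated, hypothetical features that both go through the singleton.
    static String syncClusterData() {
        return attempt("sync cluster data");
    }

    static String deleteDeadInstance() {
        return attempt("delete dead instance");
    }

    private static String attempt(String request) {
        try {
            return SharedAsyncTemplate.getInstance().send(request);
        } catch (IllegalStateException ex) {
            return "failed: " + ex.getMessage();
        }
    }

    static List<String> runScenario() {
        List<String> results = new ArrayList<>();
        results.add(syncClusterData());      // before the reactor dies
        SharedAsyncTemplate.getInstance().simulateReactorDeath();
        results.add(syncClusterData());      // same feature, now failing
        results.add(deleteDeadInstance());   // a different feature fails too
        return results;
    }

    public static void main(String[] args) {
        runScenario().forEach(System.out::println);
    }
}
```

Because the instance lives for the whole process and is never replaced, the failure persists until the process itself is restarted.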
Solution
The fix is simple. Once the asynchronous thread has died, Nacos does not bring it back up on its own, so we have to restart Nacos.