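Ungraceful shutdown: if a process still has running threads and you kill it outright with kill -9 pid, those threads are cut off mid-execution, much like a machine losing power while it is running. Consumers calling the service then get errors, and the provider may be left with dirty data.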
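1 What is graceful shutdown?
Graceful shutdown: when a process still has unfinished threads or resources waiting to be released, let those threads finish and release the resources first, and only then close the process.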
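2 Implementing graceful shutdown in Java
We can register a hook with Runtime.getRuntime().addShutdownHook() so that the program exits smoothly. When the JVM shuts down (on a normal exit or kill pid, but not kill -9 pid), it runs every hook that was registered through addShutdownHook, and the JVM only exits once all of them have finished. The hooks are therefore the place to release resources, destroy objects, and do other cleanup at shutdown.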
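Runtime.getRuntime().addShutdownHook(
    // This thread runs after kill pid
    new Thread(new Runnable() {
        @Override
        public void run() {
            // Release resources, check for unfinished threads, etc.
            System.out.println("hook running...");
        }
    }));
3 How Dubbo implements graceful shutdown
1) Unregister all services that were registered with the registry (ZooKeeper)
2) Close the servers
3) Close the clients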
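The three steps can be pictured as an ordered list of close operations in which a failure in one step is logged and does not prevent the later steps from running. The following is only a minimal sketch of that idea: the Step interface and the step names are hypothetical and are not part of Dubbo.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OrderedShutdownSketch {

    /** Hypothetical stand-in for anything that must be closed during shutdown. */
    interface Step {
        String name();
        void close() throws Exception;
    }

    /** Closes the steps in the given order and returns the names of those that closed cleanly. */
    static List<String> shutdown(List<Step> steps) {
        List<String> closed = new ArrayList<>();
        for (Step step : steps) {
            try {
                step.close();
                closed.add(step.name());
            } catch (Exception e) {
                // Log and continue, so one failing step cannot block the rest of the shutdown.
                System.err.println("failed to close " + step.name() + ": " + e.getMessage());
            }
        }
        return closed;
    }

    /** Builds a hypothetical step that optionally fails, for demonstration purposes. */
    static Step step(final String name, final boolean fails) {
        return new Step() {
            public String name() {
                return name;
            }
            public void close() throws Exception {
                if (fails) {
                    throw new IllegalStateException("simulated failure");
                }
            }
        };
    }

    public static void main(String[] args) {
        // Same order as the list above: registry first, then servers, then clients.
        List<String> closed = shutdown(Arrays.asList(
                step("registry", false),
                step("server", false),
                step("client", false)));
        System.out.println(closed);
    }
}
```

Unregistering from the registry first means consumers stop being routed to this node before its servers stop accepting work.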
4 Code entry point for graceful shutdown:
The JVM shutdown hook is registered in the class com.alibaba.dubbo.config.AbstractConfig:
static {
    Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
        public void run() {
            if (logger.isInfoEnabled()) {
                logger.info("Run shutdown hook now.");
            }
            ProtocolConfig.destroyAll(); // Implementation shown below
        }
    }, "DubboShutdownHook"));
}
Implementation of the destroyAll method in com.alibaba.dubbo.config.ProtocolConfig:
public static void destroyAll() {
    if (!destroyed.compareAndSet(false, true)) {
        return;
    }
    AbstractRegistryFactory.destroyAll(); // Unregister the services registered in ZooKeeper
    ExtensionLoader<Protocol> loader = ExtensionLoader.getExtensionLoader(Protocol.class);
    for (String protocolName : loader.getLoadedExtensions()) {
        try {
            Protocol protocol = loader.getLoadedExtension(protocolName);
            if (protocol != null) {
                protocol.destroy();
            }
        } catch (Throwable t) {
            logger.warn(t.getMessage(), t);
        }
    }
}
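The compareAndSet guard at the top of destroyAll is what makes the method safe to call more than once, for example from the shutdown hook and from application code at the same time. A minimal sketch of that run-once pattern follows; the class name and the cleanupRuns counter are made up for the illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class DestroyOnceSketch {

    private static final AtomicBoolean destroyed = new AtomicBoolean(false);

    /** Counts how often the cleanup body actually ran (illustration only). */
    static final AtomicInteger cleanupRuns = new AtomicInteger();

    static void destroyAll() {
        // Only the first caller flips false -> true; every later caller returns immediately.
        if (!destroyed.compareAndSet(false, true)) {
            return;
        }
        cleanupRuns.incrementAndGet(); // stands in for the real cleanup work
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(DestroyOnceSketch::destroyAll);
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("cleanup ran " + cleanupRuns.get() + " time(s)");
    }
}
```

However many threads race into destroyAll, the cleanup body runs exactly once.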
5 Unregistering all services from the registry (ZooKeeper)
The first step in the destroyAll method of com.alibaba.dubbo.config.ProtocolConfig (listed in section 4) is the call that unregisters the services from ZooKeeper:
    AbstractRegistryFactory.destroyAll(); // -----> unregisters the services from ZooKeeper
Implementation of AbstractRegistryFactory.destroyAll():
public static void destroyAll() {
    if (LOGGER.isInfoEnabled()) {
        LOGGER.info("Close all registries " + getRegistries());
    }
    // Lock the registry shutdown process
    LOCK.lock();
    try {
        for (Registry registry : getRegistries()) {
            try {
                registry.destroy();
            } catch (Throwable e) {
                LOGGER.error(e.getMessage(), e);
            }
        }
        REGISTRIES.clear();
    } finally {
        // Release the lock
        LOCK.unlock();
    }
}
registry.destroy() unregisters each URL that the registry has registered. For ZooKeeper this ends in doUnregister, which deletes the service's node:
Implementation of com.alibaba.dubbo.registry.zookeeper.ZookeeperRegistry.doUnregister(URL url):
protected void doUnregister(URL url) {
    try {
        zkClient.delete(toUrlPath(url));
    } catch (Throwable e) {
        throw new RpcException("Failed to unregister " + url + " to zookeeper " + getUrl() + ", cause: " + e.getMessage(), e);
    }
}
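To see why deleting the node matters to consumers, here is a minimal in-memory sketch of a registry. It is not Dubbo's or ZooKeeper's API: the class, its methods, the service name, and the address are all hypothetical. Once the provider's address is removed, a lookup for the service fails with a "no provider" error instead of routing to the node that is shutting down.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class InMemoryRegistrySketch {

    /** Service name -> provider addresses; a stand-in for the nodes kept in the registry. */
    private final Map<String, List<String>> providers = new HashMap<>();

    synchronized void register(String service, String address) {
        List<String> addresses = providers.get(service);
        if (addresses == null) {
            addresses = new ArrayList<>();
            providers.put(service, addresses);
        }
        addresses.add(address);
    }

    /** Plays the role of deleting the provider's node from the registry. */
    synchronized void unregister(String service, String address) {
        List<String> addresses = providers.get(service);
        if (addresses != null) {
            addresses.remove(address);
        }
    }

    /** What a consumer does before a call: pick a provider, or fail if none is left. */
    synchronized String select(String service) {
        List<String> addresses = providers.get(service);
        if (addresses == null || addresses.isEmpty()) {
            throw new IllegalStateException("No provider available for service " + service);
        }
        return addresses.get(0);
    }

    public static void main(String[] args) {
        InMemoryRegistrySketch registry = new InMemoryRegistrySketch();
        registry.register("ProviderDemoService", "10.0.0.1:20880"); // hypothetical address
        System.out.println("before: " + registry.select("ProviderDemoService"));

        registry.unregister("ProviderDemoService", "10.0.0.1:20880");
        try {
            registry.select("ProviderDemoService");
        } catch (IllegalStateException e) {
            System.out.println("after: " + e.getMessage());
        }
    }
}
```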
6 Closing the servers:
The main classes and methods involved in closing a server:
com.alibaba.dubbo.remoting.transport.netty.NettyServer
|
com.alibaba.dubbo.remoting.transport.AbstractServer
The main closing method is AbstractServer's public void close(int timeout):
public void close(int timeout) {
ExecutorUtil.gracefulShutdown(executor, timeout);
close();
}
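Here executor is a member variable of AbstractServer: the thread pool in which the server executes its tasks.
Dubbo's method for gracefully shutting down a thread pool is as follows: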
/**
 * @param executor the server's thread pool
 * @param timeout  shutdown timeout, in milliseconds
 */
public static void gracefulShutdown(Executor executor, int timeout) {
    if (!(executor instanceof ExecutorService) || isShutdown(executor)) {
        return;
    }
    final ExecutorService es = (ExecutorService) executor;
    try {
        es.shutdown(); // Disallow submission of new tasks
    } catch (SecurityException ex2) {
        return;
    } catch (NullPointerException ex2) {
        return;
    }
    try {
        if (!es.awaitTermination(timeout, TimeUnit.MILLISECONDS)) {
            es.shutdownNow();
        }
    } catch (InterruptedException ex) {
        es.shutdownNow();
        Thread.currentThread().interrupt();
    }
    if (!isShutdown(es)) {
        newThreadToCloseExecutor(es);
    }
}
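The same shutdown, awaitTermination, shutdownNow sequence can be tried with nothing but the JDK. The sketch below is a simplified stand-in rather than Dubbo's code: the class name, the boolean return value, and the 200 ms task are choices made for the demo.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class GracefulExecutorSketch {

    /** Returns true if every submitted task finished within the timeout. */
    static boolean shutdownGracefully(ExecutorService es, long timeoutMillis) {
        es.shutdown(); // stop accepting new tasks; tasks already submitted keep running
        try {
            if (es.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
            es.shutdownNow(); // timed out: interrupt whatever is still running
            return false;
        } catch (InterruptedException e) {
            es.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status for the caller
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService es = Executors.newFixedThreadPool(2);
        final AtomicBoolean finished = new AtomicBoolean(false);
        es.execute(() -> {
            try {
                Thread.sleep(200); // simulate an in-flight request
                finished.set(true);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        boolean clean = shutdownGracefully(es, 2000);
        System.out.println("clean=" + clean + ", finished=" + finished.get());

        try {
            es.execute(() -> { });
        } catch (RejectedExecutionException e) {
            System.out.println("new task rejected after shutdown");
        }
    }
}
```

The in-flight task is allowed to finish as long as it fits in the timeout, while anything submitted after shutdown() is rejected.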
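7 Demo: testing graceful shutdown in dubbox and dubbo
Test goals: 1 After the service is shut down, consumers can no longer reach the service on that node.
2 When the service is shut down, the provider waits for threads that have not finished and closes only after they complete, so the client receives a normal response rather than an exception or some other error response.
Code: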
// Provider interface
public interface ProviderDemoService {
    public String sayHello();
}
// Provider implementation: on receiving a request it triggers a shutdown (as kill pid would) while the task thread sleeps for a long time
package com.zhanglang.dubbo.a_provider.service.impl;

import java.text.SimpleDateFormat;
import java.util.Date;

import org.springframework.stereotype.Component;

import com.alibaba.dubbo.config.annotation.Service;
import com.zhanglang.dubbo.a_provider.service.ProviderDemoService;

@Service
@Component
public class ProviderDemoServiceImpl implements ProviderDemoService {

    public String sayHello() {
        System.out.println(new SimpleDateFormat("[yyyy-MM-dd HH:mm:ss]").format(new Date()) + " method is called.");
        Thread exitsThread = new Thread(new Runnable() {
            public void run() {
                System.exit(0); // Triggers the shutdown hooks, like kill pid
            }
        });
        exitsThread.start();
        try {
            Thread.sleep(20000); // Simulate a long-running operation
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
        System.out.println(new SimpleDateFormat("[yyyy-MM-dd HH:mm:ss]").format(new Date()) + " thread has woken up!!! ");
        return " hello Word ";
    }

    // Empty placeholder; not used by the demo
    public class ShutDownHook implements Runnable {
        public void run() {
        }
    }

    public static void main(String[] args) {
        Runtime.getRuntime().addShutdownHook(
            // This thread runs after kill pid
            new Thread(new Runnable() {
                @Override
                public void run() {
                    // Release resources, check for unfinished threads, etc.
                    System.out.println("hook running...");
                }
            }));
        System.exit(0); // Triggers the shutdown hooks, like kill pid
    }
}
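// Client implementation
Design: the client starts two threads one after the other and checks whether the second thread can still reach the provider that the first thread reached.
If the second thread's call fails with a "No provider available" error, the provider has already gone offline, which means the shutdown is graceful as seen from the consumer side.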
import java.text.SimpleDateFormat;
import java.util.Date;
import org.springframework.beans.factory.InitializingBean;
import org.springframework.stereotype.Component;
import com.alibaba.dubbo.config.annotation.Reference;
import com.zhanglang.dubbo.a_provider.service.ProviderDemoService;
/**
* Hello world!
*
*/
@Component
public class ConsumerService implements InitializingBean{
@Reference
private ProviderDemoService providerDemoService;
public void firstCall(){
System.out.println(new SimpleDateFormat("[yyyy-MM-dd HH:mm:ss]").format(new Date())+" first call start ");
String response = providerDemoService.sayHello();
System.out.println(new SimpleDateFormat("[yyyy-MM-dd HH:mm:ss]").format(new Date())+" first call response : " + response);
}
public void secondCall(){
System.out.println(new SimpleDateFormat("[yyyy-MM-dd HH:mm:ss]").format(new Date())+" second call start ");
String response = null;
try{
response = providerDemoService.sayHello();
}catch(Exception e){
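// The two string literals below mean "Exception message:" and "second call failed, exception message:"
System.out.println("异常信息:"+e.getMessage());
response =" 第二次访问异常 ,异常信息:"+e.getMessage();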
}
System.out.println(new SimpleDateFormat("[yyyy-MM-dd HH:mm:ss]").format(new Date()) + " second call response : " + response);
}
public void setProviderDemoService(ProviderDemoService providerDemoService) {
this.providerDemoService = providerDemoService;
}
public void afterPropertiesSet() throws Exception {
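/**
* The client starts two threads one after the other and checks whether the second thread can still reach the provider that the first thread reached.
* If the second thread's call fails with a "No provider available" error, the provider has already gone offline, which means the shutdown is graceful as seen from the consumer side.
*/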
Thread firstCallThread = new Thread(new Runnable() {
public void run() {
firstCall();
}
},"firstCallThread");
firstCallThread.start();
// Pause for 1 second, then check whether the provider at the original IP can still be reached
Thread.sleep(1000);
Thread secondCallThread = new Thread(new Runnable() {
public void run() {
secondCall();
}
},"secondCallThread");
secondCallThread.start();
}
}
Dubbox-2.8.4 console log:
[2017-12-14 16:08:04] first call start
[2017-12-14 16:08:05] second call start
异常信息:Forbid consumer 192.168.2.104 access service com.zhanglang.dubbo.a_provider.service.ProviderDemoService from registry 127.0.0.1:2181 use dubbo version 2.8.4, Please check registry access list (whitelist/blacklist).
[2017-12-14 16:08:05] second call response : 第二次访问异常 ,异常信息:Forbid consumer 192.168.2.104 access service com.zhanglang.dubbo.a_provider.service.ProviderDemoService from registry 127.0.0.1:2181 use dubbo version 2.8.4, Please check registry access list (whitelist/blacklist).
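The first request came back after 25 seconds with a timeout: Waiting server-side response timeout by scan timer. start time: 2017-12-14 16:08:04.715, end time: 2017-12-14 16:08:29.735, client elapsed: 49 ms, server elapsed: 24971 ms, timeout: 25000 ms
Result: dubbox 2.8.4 does not shut down gracefully. The in-flight first call timed out instead of receiving its response.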
Dubbo-2.5.6 console log:
[2017-12-14 16:14:46] first call start
[2017-12-14 16:14:47] second call start
[2017-12-14 16:14:47] second call response : 第二次访问异常 ,异常信息:No provider available from registry 127.0.0.1:2181 for service com.zhanglang.dubbo.a_provider.service.ProviderDemoService on consumer 192.168.2.104
[2017-12-14 16:14:51] first call response : hello Word
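Result: dubbo 2.5.6 shuts down gracefully. The second call found no provider, and the in-flight first call still received its normal response.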
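The shutdown wait time (in milliseconds) can be set through the dubbo.service.shutdown.wait system property:
System.setProperty("dubbo.service.shutdown.wait", "25000");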
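For illustration, a timeout like this could be read back from the system property with a fallback when it is missing or malformed. This is only a sketch and not Dubbo's own code: the class name is made up, and the 10000 ms fallback is an assumption chosen for the example.

```java
class ShutdownWaitSketch {

    static final String SHUTDOWN_WAIT_KEY = "dubbo.service.shutdown.wait";

    /** Fallback used by this sketch when the property is absent or invalid (an assumption). */
    static final int ASSUMED_DEFAULT_MILLIS = 10000;

    /** Reads the wait time from the system property, falling back to the assumed default. */
    static int shutdownWaitMillis() {
        String raw = System.getProperty(SHUTDOWN_WAIT_KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return ASSUMED_DEFAULT_MILLIS;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? value : ASSUMED_DEFAULT_MILLIS;
        } catch (NumberFormatException e) {
            return ASSUMED_DEFAULT_MILLIS;
        }
    }

    public static void main(String[] args) {
        System.setProperty(SHUTDOWN_WAIT_KEY, "25000");
        System.out.println(shutdownWaitMillis()); // prints 25000
    }
}
```

The property has to be set before shutdown begins, for example at the start of main or with -Ddubbo.service.shutdown.wait=25000 on the command line.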