HttpClient Source Code Analysis Series, Part 2: The Minimal Implementation

Date: 2024-04-06 07:21:47
The minimal implementation: the original model of the core architecture

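    As the name suggests, MinimalHttpClient is the smallest usable version of the client and the original model of the core design, so we start the analysis from this most stripped-down implementation.
    It has only three core elements: a parameter object (params), an executor (requestExecutor), and a connection manager (connManager).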
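HttpClientConnectionManager (@since 4.3):
The connection pool component, which manages the whole life cycle of a connection. Connections are created, reused, and removed inside the pool. The connection manager encapsulates the concrete pool operations, such as leasing a connection from the pool and releasing it back.
A newly created connection sits idle under the pool's management. When it is used, the manager checks whether it is open and, if not, connects it. Connecting means creating the socket that matches the scheme (mainly http and https, i.e. plain or SSL) and binding the HTTP connection to that socket. A connection can also be closed because of a heartbeat check or expiry and become stale; it then stays in the pool until it is cleaned out, either the next time it is fetched or when the pool is full.
The pool can also limit the number of connections, both in total and per route.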
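The lease/release cycle and the two limits described above can be sketched with a toy pool. This is only an illustration under simplifying assumptions: the names ToyConnectionPool and Conn are hypothetical, a route is reduced to a string, the limits count leased connections only, and an exhausted pool throws instead of blocking with a timeout as a real connection manager would.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Toy pool with hypothetical names; not HttpClient's actual connection manager. */
final class ToyConnectionPool {
    /** Stand-in for a pooled connection; "open" models a connected socket. */
    static final class Conn {
        final String route;
        boolean open;
        Conn(String route) { this.route = route; }
    }

    private final int maxTotal;
    private final int maxPerRoute;
    private final Map<String, Deque<Conn>> idle = new HashMap<>();
    private final Map<String, Integer> leasedPerRoute = new HashMap<>();
    private int leasedTotal;

    ToyConnectionPool(int maxTotal, int maxPerRoute) {
        this.maxTotal = maxTotal;
        this.maxPerRoute = maxPerRoute;
    }

    /** Lease: reuse an idle connection for the route, else create one if the limits allow. */
    synchronized Conn lease(String route) {
        int leasedForRoute = leasedPerRoute.getOrDefault(route, 0);
        if (leasedTotal >= maxTotal || leasedForRoute >= maxPerRoute) {
            throw new IllegalStateException("connection limit reached for " + route);
        }
        Deque<Conn> free = idle.get(route);
        Conn conn = (free != null && !free.isEmpty()) ? free.pop() : new Conn(route);
        if (!conn.open) {
            conn.open = true; // a real manager would connect a plain or SSL socket here
        }
        leasedPerRoute.put(route, leasedForRoute + 1);
        leasedTotal++;
        return conn;
    }

    /** Release: keep a reusable connection idle in the pool, otherwise close and drop it. */
    synchronized void release(Conn conn, boolean reusable) {
        leasedPerRoute.merge(conn.route, -1, Integer::sum);
        leasedTotal--;
        if (reusable) {
            idle.computeIfAbsent(conn.route, r -> new ArrayDeque<>()).push(conn);
        } else {
            conn.open = false;
        }
    }
}

final class ToyPoolDemo {
    public static void main(String[] args) {
        ToyConnectionPool pool = new ToyConnectionPool(2, 1);
        ToyConnectionPool.Conn first = pool.lease("http://example.com:80");
        pool.release(first, true);
        ToyConnectionPool.Conn second = pool.lease("http://example.com:80");
        System.out.println("reused: " + (first == second)); // prints "reused: true"
    }
}
```

With a per-route limit of 1, a second lease for the same route fails until the first connection is released, which is the throttling behaviour in miniature.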

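ClientExecChain:
Represents one complete request execution. It is an interface whose implementations wrap one another as decorators, much like the java.io stream classes: each wrapper handles one specific concern, and nesting several layers yields the full behavior. For example, one layer handles protocol-level matters such as cookies and message parsing, another handles automatic retries, and so on.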
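The decorator idea can be shown with a small stand-alone sketch. Everything here is hypothetical: the ExecChain interface and the three classes only mimic the shape of the pattern, requests and responses are plain strings, and unchecked exceptions are used to keep the example short.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for one element of an execution chain. */
interface ExecChain {
    String execute(String request);
}

/** Innermost element: produces the response (and fails twice first, to exercise retries). */
final class FlakyMainExec implements ExecChain {
    int calls;

    @Override
    public String execute(String request) {
        calls++;
        if (calls < 3) {
            throw new UncheckedIOException(new IOException("simulated I/O failure #" + calls));
        }
        return "200 OK for " + request;
    }
}

/** Decorator: adds a protocol-level concern, here a header-like suffix on the request. */
final class ProtocolExec implements ExecChain {
    private final ExecChain next;

    ProtocolExec(ExecChain next) { this.next = next; }

    @Override
    public String execute(String request) {
        return next.execute(request + " [User-Agent: toy]");
    }
}

/** Decorator: retries the wrapped element on I/O failure, up to maxRetries extra attempts. */
final class RetryExec implements ExecChain {
    private final ExecChain next;
    private final int maxRetries;

    RetryExec(ExecChain next, int maxRetries) {
        this.next = next;
        this.maxRetries = maxRetries;
    }

    @Override
    public String execute(String request) {
        for (int attempt = 0; ; attempt++) {
            try {
                return next.execute(request);
            } catch (UncheckedIOException ex) {
                if (attempt >= maxRetries) {
                    throw ex; // retries exhausted, give up
                }
            }
        }
    }
}

final class ExecChainDemo {
    public static void main(String[] args) {
        ExecChain chain = new RetryExec(new ProtocolExec(new FlakyMainExec()), 2);
        System.out.println(chain.execute("GET /"));
    }
}
```

Each layer knows only the element it wraps, so a minimal client can use the innermost element alone while a full-featured client stacks more layers around it.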

Now for the core execution method:
    @Override
    protected CloseableHttpResponse doExecute(
            final HttpHost target,
            final HttpRequest request,
            final HttpContext context) throws IOException, ClientProtocolException {
        Args.notNull(target, "Target host");
        Args.notNull(request, "HTTP request");
        HttpExecutionAware execAware = null;
        if (request instanceof HttpExecutionAware) {
            execAware = (HttpExecutionAware) request;
        }
        try {
            final HttpRequestWrapper wrapper = HttpRequestWrapper.wrap(request);
            final HttpClientContext localcontext = HttpClientContext.adapt(
                context != null ? context : new BasicHttpContext());
            final HttpRoute route = new HttpRoute(target);
            RequestConfig config = null;
            if (request instanceof Configurable) {
                config = ((Configurable) request).getConfig();
            }
            if (config != null) {
                localcontext.setRequestConfig(config);
            }
            return this.requestExecutor.execute(route, wrapper, localcontext, execAware);
        } catch (final HttpException httpException) {
            throw new ClientProtocolException(httpException);
        }
    }
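HttpExecutionAware: an interface for objects (typically requests) that may be involved in blocking I/O. A class that implements it is told about blocking I/O operations that can be cancelled, so that aborting the request can cancel the operation in flight.
HttpRequestWrapper: a wrapper around HttpRequest that lets the request be modified during execution without changing the original request object.
HttpRoute: identifies the target server. An HttpRoutePlanner is normally used to create an HttpRoute (here it is constructed directly from the target host). The route represents the remote end of a client request and mainly carries the target host and proxy information.
RequestConfig: basic settings related to the HTTP request.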
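Why wrapping protects the original request can be illustrated with a toy model. ToyRequest and ToyRequestWrapper are hypothetical classes, not the library's types; the sketch only assumes that the wrapper copies the mutable parts of the request and keeps a reference to the untouched original.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for the request object the caller created. */
final class ToyRequest {
    final String method;
    final String uri;
    final Map<String, String> headers = new LinkedHashMap<>();

    ToyRequest(String method, String uri) {
        this.method = method;
        this.uri = uri;
    }
}

/** Hypothetical wrapper: all changes made during execution land on the copy. */
final class ToyRequestWrapper {
    private final ToyRequest original;
    private final Map<String, String> headers;
    private String uri;

    private ToyRequestWrapper(ToyRequest original) {
        this.original = original;
        this.uri = original.uri;
        this.headers = new LinkedHashMap<>(original.headers); // defensive copy
    }

    static ToyRequestWrapper wrap(ToyRequest request) {
        return new ToyRequestWrapper(request);
    }

    void setHeader(String name, String value) { headers.put(name, value); }
    String getHeader(String name) { return headers.get(name); }
    void setUri(String uri) { this.uri = uri; }
    String getUri() { return uri; }
    ToyRequest getOriginal() { return original; }
}

final class WrapperDemo {
    public static void main(String[] args) {
        ToyRequest original = new ToyRequest("GET", "/index.html");
        ToyRequestWrapper wrapper = ToyRequestWrapper.wrap(original);
        wrapper.setHeader("Host", "example.com"); // e.g. added by a protocol interceptor
        System.out.println("original headers: " + original.headers); // prints "original headers: {}"
    }
}
```

Because interceptors only ever see the wrapper, the caller can safely reuse or re-send the same request object later.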
In the end, execution is handed to requestExecutor, which is created in the constructor:
public MinimalHttpClient(
            final HttpClientConnectionManager connManager) {
        super();
        this.connManager = Args.notNull(connManager, "HTTP connection manager");
        this.requestExecutor = new MinimalClientExec(
                new HttpRequestExecutor(),
                connManager,
                DefaultConnectionReuseStrategy.INSTANCE,
                DefaultConnectionKeepAliveStrategy.INSTANCE);
        this.params = new BasicHttpParams();
    }

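    MinimalClientExec: an implementation of ClientExecChain that covers only the most basic HTTP exchange and offers the most direct client-server interaction. It does not support proxies, and it does not retry in any situation (redirects, authentication challenges, I/O errors, and so on).
    So we go one level down and look at how MinimalClientExec executes a request. The code is long, so the non-essential parts are replaced with ellipses.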
    private final HttpRequestExecutor requestExecutor;
    private final HttpClientConnectionManager connManager;
    private final ConnectionReuseStrategy reuseStrategy;
    private final ConnectionKeepAliveStrategy keepAliveStrategy;
    private final HttpProcessor httpProcessor;

    public MinimalClientExec(
            final HttpRequestExecutor requestExecutor,
            final HttpClientConnectionManager connManager,
            final ConnectionReuseStrategy reuseStrategy,
            final ConnectionKeepAliveStrategy keepAliveStrategy) {
        Args.notNull(requestExecutor, "HTTP request executor");
        Args.notNull(connManager, "Client connection manager");
        Args.notNull(reuseStrategy, "Connection reuse strategy");
        Args.notNull(keepAliveStrategy, "Connection keep alive strategy");
        this.httpProcessor = new ImmutableHttpProcessor(
                new RequestContent(),
                new RequestTargetHost(),
                new RequestClientConnControl(),
                new RequestUserAgent(VersionInfo.getUserAgent(
                        "Apache-HttpClient", "org.apache.http.client", getClass())));
        this.requestExecutor    = requestExecutor;
        this.connManager        = connManager;
        this.reuseStrategy      = reuseStrategy;
        this.keepAliveStrategy  = keepAliveStrategy;
    }
    public CloseableHttpResponse execute(/* ... */) {
        // ...
        final ConnectionRequest connRequest = connManager.requestConnection(route, null);
        // ...
        final HttpClientConnection managedConn = connRequest.get(timeout > 0 ? timeout : 0, TimeUnit.MILLISECONDS);
        // ...
            context.setAttribute(HttpCoreContext.HTTP_TARGET_HOST, target);
            context.setAttribute(HttpCoreContext.HTTP_REQUEST, request);
            context.setAttribute(HttpCoreContext.HTTP_CONNECTION, managedConn);
            context.setAttribute(HttpClientContext.HTTP_ROUTE, route);
            // ...
            httpProcessor.process(request, context);
            final HttpResponse response = requestExecutor.execute(request, managedConn, context);
            httpProcessor.process(response, context);
            // ...
            final HttpEntity entity = response.getEntity();
        // ...
    }
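ConnectionRequest: a pending request for a connection, handled by the connection manager; calling get(...) on it yields the actual HttpClientConnection.
HttpClientConnection: the HTTP connection, used to send requests and receive responses.
HttpContext: stores key-value data, mainly to share state between the logical steps of handling one HTTP request.
ImmutableHttpProcessor: an implementation of HttpProcessor which, as the interface name suggests, processes the HTTP protocol. It holds several protocol interceptors, each handling a different part of the protocol, which makes it a textbook use of the chain-of-responsibility pattern. Concretely, an ImmutableHttpProcessor is just a list of HttpRequestInterceptor and HttpResponseInterceptor instances that are invoked in order to process the request or the response.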
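The interplay of interceptors and the shared context described in the list above can be sketched as follows. The types are hypothetical simplifications: a request is reduced to a header map, the context to a plain Map, and the three lambdas only imitate the role of interceptors such as RequestTargetHost, RequestUserAgent and RequestClientConnControl; the attribute key "target" and the header values are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical interceptor: handles one aspect of the outgoing request. */
interface RequestInterceptor {
    void process(Map<String, String> headers, Map<String, Object> context);
}

/** Hypothetical immutable processor: a fixed list of interceptors run in order. */
final class ToyProcessor {
    private final List<RequestInterceptor> interceptors;

    ToyProcessor(RequestInterceptor... interceptors) {
        this.interceptors =
                Collections.unmodifiableList(new ArrayList<>(Arrays.asList(interceptors)));
    }

    void process(Map<String, String> headers, Map<String, Object> context) {
        for (RequestInterceptor interceptor : interceptors) {
            interceptor.process(headers, context);
        }
    }

    /** Roughly mirrors the kind of interceptor set a minimal executor might install. */
    static ToyProcessor minimalDefaults() {
        return new ToyProcessor(
                // reads the target host that an earlier step stored in the context
                (headers, ctx) -> headers.put("Host", (String) ctx.get("target")),
                // supplies a default only if the caller did not set one
                (headers, ctx) -> headers.putIfAbsent("User-Agent", "toy-client/0.1"),
                (headers, ctx) -> headers.putIfAbsent("Connection", "Keep-Alive"));
    }
}

final class ProcessorDemo {
    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        Map<String, Object> context = new HashMap<>();
        context.put("target", "example.com");
        ToyProcessor.minimalDefaults().process(headers, context);
        System.out.println(headers);
    }
}
```

The context is what lets the Host interceptor see a value that a different step of the execution stored earlier, without the two being coupled directly.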

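    Apart from the surrounding management and processing described above, the heart of the matter is how the HTTP request is actually sent and how the raw response data comes back. That is the job of HttpRequestExecutor, whose core flow is built on the blocking (classic) I/O model. As before, we skip the non-essential parts and keep only the core code.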
    public HttpResponse execute(/* ... */) {
        // ...
            HttpResponse response = doSendRequest(request, conn, context);
            if (response == null) {
                response = doReceiveResponse(request, conn, context);
            }
        // ...
    }
    protected HttpResponse doSendRequest(
            final HttpRequest request,
            final HttpClientConnection conn,
            final HttpContext context) throws IOException, HttpException {
        // ...
        conn.sendRequestHeader(request);
        if (request instanceof HttpEntityEnclosingRequest) {
            // Check for expect-continue handshake. We have to flush the
            // headers and wait for an 100-continue response to handle it.
            // If we get a different response, we must not send the entity.
            boolean sendentity = true;
            final ProtocolVersion ver =
                request.getRequestLine().getProtocolVersion();
            if (((HttpEntityEnclosingRequest) request).expectContinue() &&
                !ver.lessEquals(HttpVersion.HTTP_1_0)) {

                conn.flush();
                // As suggested by RFC 2616 section 8.2.3, we don't wait for a
                // 100-continue response forever. On timeout, send the entity.
                if (conn.isResponseAvailable(this.waitForContinue)) {
                    response = conn.receiveResponseHeader();
                    if (canResponseHaveBody(request, response)) {
                        conn.receiveResponseEntity(response);
                    }
                    final int status = response.getStatusLine().getStatusCode();
                    if (status < 200) {
                        if (status != HttpStatus.SC_CONTINUE) {
                            throw new ProtocolException(
                                    "Unexpected response: " + response.getStatusLine());
                        }
                        // discard 100-continue
                        response = null;
                    } else {
                        sendentity = false;
                    }
                }
            }
            if (sendentity) {
                conn.sendRequestEntity((HttpEntityEnclosingRequest) request);
            }
        }
        conn.flush();
        context.setAttribute(HttpCoreContext.HTTP_REQ_SENT, Boolean.TRUE);
        return response;
    }
    protected HttpResponse doReceiveResponse(
            final HttpRequest request,
            final HttpClientConnection conn,
            final HttpContext context) throws HttpException, IOException {
        // ...
        while (response == null || statusCode < HttpStatus.SC_OK) {
            response = conn.receiveResponseHeader();
            if (canResponseHaveBody(request, response)) {
                conn.receiveResponseEntity(response);
            }
            statusCode = response.getStatusLine().getStatusCode();
        } // while intermediate response
        return response;
    }
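The code above shows that the actual sending of the request and receiving of the response is done by HttpClientConnection. The executor merely wraps that process and takes care of the surrounding handling. The design of HttpClientConnection and its underlying socket implementation will be discussed later.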
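The control flow of the two methods can be condensed into a toy model that runs without a network. The assumptions are loud ones: ToyConnection and ToyRequestExecutor are hypothetical, a response is reduced to its status code, the scripted queue of status codes stands in for what the server sends, and "a response is available" replaces the timed wait for 100-continue.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

/** Hypothetical connection that replays a scripted sequence of status codes. */
final class ToyConnection {
    private final Deque<Integer> statuses;
    boolean entitySent;

    ToyConnection(Integer... statuses) {
        this.statuses = new ArrayDeque<>(Arrays.asList(statuses));
    }

    boolean isResponseAvailable() { return !statuses.isEmpty(); }
    int receiveStatus() { return statuses.pop(); }
    void sendEntity() { entitySent = true; }
}

/** Hypothetical executor mirroring the send-then-receive structure shown above. */
final class ToyRequestExecutor {
    static int execute(ToyConnection conn, boolean expectContinue) {
        Integer response = doSend(conn, expectContinue);
        if (response == null) {
            response = doReceive(conn);
        }
        return response;
    }

    /** Returns a final status if one arrived during the handshake, else null. */
    static Integer doSend(ToyConnection conn, boolean expectContinue) {
        Integer response = null;
        boolean sendEntity = true;
        if (expectContinue && conn.isResponseAvailable()) {
            int status = conn.receiveStatus();
            if (status < 200) {
                if (status != 100) {
                    throw new IllegalStateException("Unexpected response: " + status);
                }
                // 100 Continue: discard it and go on to send the body
            } else {
                response = status;  // the server answered early with a final response
                sendEntity = false; // so the body must not be sent
            }
        }
        if (sendEntity) {
            conn.sendEntity();
        }
        return response;
    }

    /** Skips intermediate (1xx) responses until a final status arrives. */
    static int doReceive(ToyConnection conn) {
        int status;
        do {
            status = conn.receiveStatus();
        } while (status < 200);
        return status;
    }
}

final class ExecutorDemo {
    public static void main(String[] args) {
        ToyConnection conn = new ToyConnection(100, 200);
        System.out.println(ToyRequestExecutor.execute(conn, true)); // prints 200
    }
}
```

If the scripted server rejects the request up front (for example with 417), execute returns that status and entitySent stays false, which matches the rule in the original comment that the entity must not be sent after a non-100 response.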

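    At this point the most basic (Minimal) version of how an HTTP request is sent, received, processed, and managed should be quite clear. To finish, here are the core classes most closely related to MinimalHttpClient:
    [Figure: the core classes related to MinimalHttpClient]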
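    As you can see, a simple send-and-receive exchange is split across many classes. It looks like a lot, but it really comes down to three steps: managing the connection (ConnectionManager), executing the request (RequestExecutor), and processing the protocol (HttpProcessor).
    Each step has plenty of related detail, but the main line of logic is very clear. That largely covers the core execution chain; the following parts discuss the remaining topics one by one.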