How to scale NodeJS linearly across multiple cores?

Time: 2021-01-21 20:12:51

I am doing a quick performance test for NodeJS vs. Java. The simple use case chosen is querying a single table in MySQL database. The initial results were as follows:

Platform                      | DB Connections | CPU Usage | Memory Usage  | Requests/second
==============================|================|===========|===============|================
Node 0.10/MySQL               | 20             |  34%      |  57M          | 1295
JBoss EAP 6.2/JPA             | 20             | 100%      | 525M          | 4622
Spring 3.2.6/JDBC/Tomcat 7.0  | 20             | 100%      | 860M          | 4275

Note that Node's CPU and memory usage are far lower than Java's, but its throughput is also only about a third! Then I realized that Java was utilizing all four cores on my CPU, whereas Node was running on only one. So I changed the Node code to use the cluster module, and now it utilizes all four cores. Here are the new results:

Platform                      | DB Connections | CPU Usage | Memory Usage  | Requests/second
==============================|================|===========|===============|================
Node 0.10/MySQL (quad core)   | 20 (5 x 4)     | 100%      | 228M (57 x 4) | 2213

Note that CPU and memory usage have now gone up proportionately, but throughput has only gone up by 70%. I was expecting a fourfold increase, exceeding the Java throughput. How can I account for the discrepancy? What can I do to increase the throughput linearly?
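
To put numbers on the gap, here is a small sketch of the arithmetic, using the requests/second figures from the two tables above. The function name scalingReport is only illustrative.

```javascript
// Compare measured throughput before and after clustering.
// The inputs are the requests/second values from the tables above.
function scalingReport(singleCoreRps, clusteredRps, cores) {
    var speedup = clusteredRps / singleCoreRps; // measured speedup
    var efficiency = speedup / cores;           // 1.0 would be perfectly linear
    return { speedup: speedup, efficiency: efficiency };
}

var report = scalingReport(1295, 2213, 4);
console.log('speedup: ' + report.speedup.toFixed(2) + 'x');
console.log('efficiency: ' + (report.efficiency * 100).toFixed(1) + '%');
```

A speedup of about 1.71x on four cores works out to roughly 43% of linear scaling, which is the shortfall the question is about.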

Here's the code for utilizing multiple cores:

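var Cluster = require('cluster');
var Express = require('express');
var Http = require('http');
// enableCORS and OrderResource are defined elsewhere in the project (not shown)

if (Cluster.isMaster) {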
    var numCPUs = require("os").cpus().length;
    for (var i = 0; i < numCPUs; i++) {
        Cluster.fork();
    }

    Cluster.on("exit", function(worker, code, signal) {
        Cluster.fork();
    });
}
else {
    // Create an express app
    var app = Express();
    app.use(Express.json());
    app.use(enableCORS);
    app.use(Express.urlencoded());

    // Add routes

    // GET /orders
    app.get('/orders', OrderResource.findAll);

    // Create an http server and give it the
    // express app to handle http requests
    var server = Http.createServer(app);
    server.listen(8080, function() {
        console.log('Listening on port 8080');
    });
}

I am using the node-mysql driver for querying the database. The connection pool is set to 5 connections per core, however that makes no difference. If I set this number to 1 or 20, I get approximately the same throughput!

var Mysql = require('mysql');

var pool = Mysql.createPool({
    host: 'localhost',
    user: 'bfoms_javaee',
    password: 'bfoms_javaee',
    database: 'bfoms_javaee',
    connectionLimit: 5
});

exports.findAll = function(req, res) {
    pool.query('SELECT * FROM orders WHERE symbol="GOOG"', function(err, rows, fields) {
        if (err) throw err;
        res.send(rows);
    });
};

2 answers

#1


2  

From what I can see, you aren't comparing just the platforms but also the frameworks. You probably want to remove the framework effect and implement a plain HTTP server. For instance, every middleware in the Express app adds latency. Also, have you made sure the Java libraries aren't caching the frequently requested data? That would significantly improve their performance.
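
A minimal sketch of what a framework-free version of the /orders endpoint might look like, using only the built-in http module. The findAllOrders function is a hypothetical stand-in for the real pool.query call, and startServer is just an illustrative name; neither comes from the question.

```javascript
var http = require('http');

// Hypothetical stand-in for the database lookup; replace the body
// with the real pool.query(...) call when benchmarking.
function findAllOrders(callback) {
    callback(null, [{ symbol: 'GOOG' }]);
}

// Plain request handler: no Express, no middleware chain.
function handleRequest(req, res) {
    if (req.method === 'GET' && req.url === '/orders') {
        findAllOrders(function (err, rows) {
            if (err) {
                res.writeHead(500, { 'Content-Type': 'text/plain' });
                return res.end('Internal Server Error');
            }
            res.writeHead(200, { 'Content-Type': 'application/json' });
            res.end(JSON.stringify(rows));
        });
    } else {
        res.writeHead(404, { 'Content-Type': 'text/plain' });
        res.end('Not Found');
    }
}

// Creates the server and starts listening; call this from each worker.
function startServer(port) {
    var server = http.createServer(handleRequest);
    server.listen(port);
    return server;
}
```

Calling startServer(8080) in the worker branch instead of building the Express app should show how much of the gap comes from the framework rather than from Node itself.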

Another thing to consider: the built-in http module in Node (and therefore any library built on top of it) maintains an internal connection pool via the Agent class (not to be confused with the MySQL connection pool) so that it can use HTTP keep-alive. This improves performance when you send many requests to the same server, because TCP connections are reused instead of being opened and closed for every request. Note that this only affects outgoing HTTP requests; node-mysql talks to MySQL over its own TCP connections rather than through the http module, so the Agent does not govern database connections.

In Node 0.10 the HTTP Agent opens at most 5 simultaneous connections to a single host by default. If your app makes outgoing HTTP requests, you can raise the limit as follows:

var http = require('http');
http.globalAgent.maxSockets = 20;

Apply these changes and see what improvement you get.

Another idea is to verify that the MySQL connection pool is used properly by checking the MySQL logs to see when connections are opened and closed. If they are opened often, you may need to increase the idle timeout value in node-mysql.
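
Besides reading the server logs, the same check can be sketched on the Node side. As far as I know, a node-mysql pool emits a 'connection' event each time it opens a new connection, so counting those events shows whether connections are being reused. The helper name trackNewConnections is made up for this example, and the dry run below uses a plain EventEmitter in place of a real pool.

```javascript
var EventEmitter = require('events').EventEmitter;

// Counts how many times the pool creates a brand-new connection.
// `pool` is expected to emit 'connection' for each new connection
// (assumed behaviour of a node-mysql pool); any EventEmitter works
// for a dry run.
function trackNewConnections(pool) {
    var stats = { created: 0 };
    pool.on('connection', function () {
        stats.created += 1;
    });
    return stats;
}

// Dry run with a stand-in emitter instead of a real MySQL pool.
var fakePool = new EventEmitter();
var stats = trackNewConnections(fakePool);
fakePool.emit('connection', {});
fakePool.emit('connection', {});
console.log('new connections so far: ' + stats.created);
```

With a real pool, a count that keeps climbing during a benchmark run would suggest connections are being re-created rather than reused.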

#2


1  

Try setting the environment variable with export NODE_CLUSTER_SCHED_POLICY="rr", as per this blog post.
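
The same policy can also be selected in code rather than through the environment. This is a sketch that assumes a Node version whose cluster module supports round-robin scheduling; to my knowledge that arrived after the 0.10 release used in the question. The forkWorkers helper is illustrative and is not called here.

```javascript
var cluster = require('cluster');
var os = require('os');

// Ask the master to hand out connections round-robin.
// This must be set before the first worker is forked.
cluster.schedulingPolicy = cluster.SCHED_RR;

// Illustrative helper: fork one worker per CPU core.
function forkWorkers() {
    var numCPUs = os.cpus().length;
    for (var i = 0; i < numCPUs; i++) {
        cluster.fork();
    }
}
```

In the question's code, the assignment would go at the top of the Cluster.isMaster branch, before the fork loop.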
