
Time: 2022-07-23 02:48:03

I have written a Node.js application that writes lots of records to a PostgreSQL 9.6 database. Unfortunately, it feels quite slow. To be able to test things I have created a short but complete program that reproduces the scenario:


'use strict';

const async = require('async'),
      pg = require('pg'),
      uuid = require('uuidv4');

const pool = new pg.Pool({
  protocol: 'pg',
  user: 'golo',
  host: 'localhost',
  port: 5432,
  database: 'golo'
});

// Test data: 10,000 records with unique ids.
const records = [];

for (let i = 0; i < 10000; i++) {
  records.push({ id: uuid(), revision: i, data: { foo: 'bar', bar: 'baz' }, flag: true });
}

pool.connect((err, database, close) => {
  if (err) {
    /* eslint-disable no-console */
    return console.log(err);
    /* eslint-enable no-console */
  }

  database.query(`
    CREATE TABLE IF NOT EXISTS "foo" (
      "position" bigserial NOT NULL,
      "id" uuid NOT NULL,
      "revision" integer NOT NULL,
      "data" jsonb NOT NULL,
      "flag" boolean NOT NULL,

      CONSTRAINT "foo_pk" PRIMARY KEY("position"),
      CONSTRAINT "foo_index_id_revision" UNIQUE ("id", "revision")
    );
  `, errQuery => {
    if (errQuery) {
      /* eslint-disable no-console */
      return console.log(errQuery);
      /* eslint-enable no-console */
    }

    async.series({
      beginTransaction (done) {
        // Start the timer right before the transaction begins.
        /* eslint-disable no-console */
        console.time('insert');
        /* eslint-enable no-console */
        database.query('BEGIN', done);
      },
      saveRecords (done) {
        // One prepared INSERT per record, executed sequentially.
        async.eachSeries(records, (record, doneEach) => {
          database.query({
            name: 'save',
            text: `
              INSERT INTO "foo"
                ("id", "revision", "data", "flag")
              VALUES
                ($1, $2, $3, $4) RETURNING position;
            `,
            values: [ record.id, record.revision, record.data, record.flag ]
          }, (errQuery2, result) => {
            if (errQuery2) {
              return doneEach(errQuery2);
            }

            record.position = Number(result.rows[0].position);
            doneEach(null);
          });
        }, done);
      },
      commitTransaction (done) {
        database.query('COMMIT', done);
      }
    }, errSeries => {
      /* eslint-disable no-console */
      console.timeEnd('insert');
      /* eslint-enable no-console */
      if (errSeries) {
        return database.query('ROLLBACK', errRollback => {
          close();

          if (errRollback) {
            /* eslint-disable no-console */
            return console.log(errRollback);
            /* eslint-enable no-console */
          }
          /* eslint-disable no-console */
          console.log(errSeries);
          /* eslint-enable no-console */
        });
      }

      close();
      /* eslint-disable no-console */
      console.log('Done!');
      /* eslint-enable no-console */
    });
  });
});

Inserting 10,000 rows takes 2.5 seconds. This is not bad, but also not great. What can I do to improve speed?


Some thoughts that I had so far:


  • Use prepared statements. As you can see, I have done this; it sped things up by ~30 %.
  • Insert multiple rows at once using a single INSERT command. Unfortunately, this is not possible: in reality, the number of records that need to be written varies from call to call, and a varying number of arguments makes it impossible to use prepared statements.
  • Use COPY instead of INSERT: I can't use this, since this happens at runtime, not at initialization time.
  • Use text instead of jsonb: Didn't change a thing.
  • Use json instead of jsonb: Didn't change a thing either.

A few more notes on what the data looks like in reality:


  • The revision is not necessarily increasing. It is just a number.
  • The flag is not always true; it can be either true or false.
  • Of course the data field contains different data, too.

So in the end it comes down to:


  • What possibilities are there to significantly speed up multiple single calls to INSERT?

1 Solution



"Insert multiple rows at once using a single INSERT command. Unfortunately, this is not possible, as in reality, the number of records that need to be written varies from call to call and a varying number of arguments makes it impossible to use prepared statements."


This is the right answer, followed by an invalid counter-argument.


You can generate your multi-row inserts in a loop, with roughly 1,000 to 10,000 records per query, depending on the size of the records.
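
As a rough illustration of generating multi-row inserts in a loop, the sketch below splits the records into chunks and builds one INSERT per chunk. It is not the code from the linked article: it uses numbered placeholders with plain node-postgres instead of generating the full SQL string, and the names chunk, buildInsert and saveRecords are hypothetical helpers. It assumes the "foo" table from the question and a pg version whose query() returns a promise when no callback is passed.

```javascript
'use strict';

// Hypothetical helper: split an array into chunks of at most `size` items.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Hypothetical helper: build one parameterized multi-row INSERT for "foo".
// Each record contributes four placeholders: ($1, $2, $3, $4), ($5, ...), ...
function buildInsert(records) {
  const values = [];
  const rows = records.map((record, index) => {
    const base = index * 4;
    values.push(record.id, record.revision, JSON.stringify(record.data), record.flag);
    return `($${base + 1}, $${base + 2}, $${base + 3}, $${base + 4})`;
  });

  return {
    text: `INSERT INTO "foo" ("id", "revision", "data", "flag") ` +
          `VALUES ${rows.join(', ')} RETURNING "position"`,
    values
  };
}

// `client` is assumed to be a connected pg client, e.g. obtained from pool.connect().
async function saveRecords(client, records, chunkSize = 1000) {
  await client.query('BEGIN');
  try {
    for (const part of chunk(records, chunkSize)) {
      const result = await client.query(buildInsert(part));
      // Assumption: RETURNING rows come back in VALUES order. This holds in
      // practice for a plain INSERT ... VALUES, but it is not a documented guarantee.
      part.forEach((record, i) => {
        record.position = Number(result.rows[i].position);
      });
    }
    await client.query('COMMIT');
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  }
}
```

With four columns per row, the chunk size has to stay below about 16,000 rows, because a single statement can carry at most 65,535 bind parameters in the PostgreSQL wire protocol.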


And you do not need prepared statements for this at all.
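
If a prepared statement is still preferred, one possible workaround (not part of this answer, and only a sketch) is to keep the parameter count fixed by passing one array per column and expanding them with UNNEST, so the statement always takes four parameters no matter how many records are written. The helper name buildUnnestInsert and the statement name 'save-batch' are hypothetical; the sketch assumes the "foo" table from the question and relies on node-postgres serializing JavaScript arrays as PostgreSQL array literals.

```javascript
'use strict';

// Hypothetical helper: one array per column, so the statement text and the
// number of parameters stay the same for any number of records.
function buildUnnestInsert(records) {
  return {
    name: 'save-batch',
    text: `
      INSERT INTO "foo" ("id", "revision", "data", "flag")
      SELECT t.id, t.revision, t.data::jsonb, t.flag
      FROM UNNEST($1::uuid[], $2::integer[], $3::text[], $4::boolean[])
        AS t(id, revision, data, flag)
      RETURNING "position"`,
    values: [
      records.map(record => record.id),
      records.map(record => record.revision),
      // JSON is sent as text and cast to jsonb inside the statement.
      records.map(record => JSON.stringify(record.data)),
      records.map(record => record.flag)
    ]
  };
}
```

The returned object can be passed to query() in place of the single-row 'save' statement from the question, for example database.query(buildUnnestInsert(records), callback).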


See this article I wrote about the same issues: Performance Boost.


Following the article, my code was able to insert 10,000 records in under 50ms.


A related question: Multi-row insert with pg-promise.




