How do I fetch 8 million Google Cloud records using the BigQuery API in Node.js?

Date: 2021-05-10 15:25:04

I am querying Google Cloud data with BigQuery.

When I run the query, it returns about 8 million rows, but it throws this error:

Response too large to return

How can I get all 8 million records? Can anybody help?

2 answers

#1


2  

1. What is the maximum size of a BigQuery response?

As stated in the quota policy for queries, the maximum response size is 128 MB compressed (unlimited when returning large query results).

2. How do we select all the records in a query request, rather than using the export method?

If you plan to run a query that might return larger results, you can set allowLargeResults to true in your job configuration.

Queries that return large results will take longer to execute, even if the result set is small, and are subject to additional limitations:

  • You must specify a destination table.
  • You can't specify a top-level ORDER BY, TOP or LIMIT clause. Doing so negates the benefit of using allowLargeResults, because the query output can no longer be computed in parallel.
  • Window functions can return large query results only if used in conjunction with a PARTITION BY clause.
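As a rough illustration of the job configuration described above, here is a minimal Node.js sketch. The helper name buildLargeResultQueryOptions is hypothetical, as are the dataset and table names in the usage comment. It assumes the @google-cloud/bigquery client, whose createQueryJob accepts a destination table object alongside the query. Note that allowLargeResults only applies to legacy SQL; with standard SQL, setting a destination table is what permits large results, so the sketch sets the flag only in the legacy case.

```javascript
// Sketch only: builds the options for a query job that writes its output
// to a destination table. The function name and defaults are hypothetical.
function buildLargeResultQueryOptions(query, destinationTable, { useLegacySql = false } = {}) {
  if (!destinationTable) {
    // Large results cannot be returned inline; they must land in a table.
    throw new Error('A destination table is required for large results.');
  }
  const options = { query, destination: destinationTable, useLegacySql };
  // The flag is only meaningful for legacy SQL queries.
  if (useLegacySql) {
    options.allowLargeResults = true;
  }
  return options;
}

// Usage (needs the @google-cloud/bigquery package and credentials;
// 'my_dataset' and 'my_results' are placeholder names):
//
// const { BigQuery } = require('@google-cloud/bigquery');
// const bigquery = new BigQuery();
// const destination = bigquery.dataset('my_dataset').table('my_results');
// const [job] = await bigquery.createQueryJob(
//   buildLargeResultQueryOptions(sql, destination, { useLegacySql: true })
// );
```

Once the job finishes, the rows live in the destination table and can be read back in pages or exported, instead of being returned in a single response.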

Read more about how to paginate through the results here, and see the BigQuery Analytics book, starting at page 200, which explains how Jobs::getQueryResults works together with the maxResults parameter and its blocking mode.
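To make the pagination idea concrete for Node.js, here is a minimal sketch. It assumes a job object that behaves like a Job from the @google-cloud/bigquery client, where getQueryResults called with autoPaginate set to false resolves to the rows of one page plus a query object for the next page. The names forEachPage and onPage, and the default page size, are hypothetical.

```javascript
// Sketch only: reads a query job page by page so that 8 million rows are
// never held in memory at once. `job` is expected to behave like a Job
// from @google-cloud/bigquery; `onPage` is a caller-supplied callback.
async function forEachPage(job, onPage, pageSize = 10000) {
  let pageQuery = { autoPaginate: false, maxResults: pageSize };
  let total = 0;
  while (pageQuery) {
    // With autoPaginate disabled, the call resolves to
    // [rows, nextQuery, apiResponse]; nextQuery is null after the last page.
    const [rows, nextQuery] = await job.getQueryResults(pageQuery);
    await onPage(rows);
    total += rows.length;
    // Re-apply autoPaginate: false so every request stays a single page.
    pageQuery = nextQuery ? { ...nextQuery, autoPaginate: false } : null;
  }
  return total;
}

// Usage (assuming `job` came from bigquery.createQueryJob):
//
// const count = await forEachPage(job, async (rows) => {
//   // write rows to a file, a queue, another store, and so on
// });
```

Processing each page inside the callback keeps memory use bounded by the page size rather than by the full result set.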

Update:

Query Result Size Limitations

When you run a normal query in BigQuery, the response size is limited to 128 MB of compressed data. Sometimes, it is hard to know what 128 MB of compressed data means. Does it get compressed 2x? 10x? The results are compressed within their respective columns, which means the compression ratio tends to be very good. For example, if you have one column that is the name of a country, there will likely be only a few different values. With only a few distinct values there isn't a lot of unique information, so the column will generally compress well. If you return encrypted blobs of data, they will likely not compress well because they will be mostly random. (This is explained on page 220 of the book linked above.)
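The difference between repetitive and random data can be seen with a small Node.js experiment. This is only an illustration: gzip from the standard zlib module stands in for whatever compression BigQuery applies internally, and the sample country names and row count are made up.

```javascript
const zlib = require('node:zlib');
const crypto = require('node:crypto');

// Returns the original size divided by the gzip-compressed size.
function gzipRatio(buffer) {
  return buffer.length / zlib.gzipSync(buffer).length;
}

// A "column" with only three distinct values (hypothetical sample data).
const countries = ['Germany', 'Japan', 'Brazil'];
const lowCardinality = Buffer.from(
  Array.from({ length: 100000 }, (_, i) => countries[i % countries.length]).join('\n')
);

// Random bytes of the same size stand in for encrypted blobs.
const randomBlob = crypto.randomBytes(lowCardinality.length);

console.log('repetitive column ratio:', gzipRatio(lowCardinality).toFixed(1));
console.log('random bytes ratio:', gzipRatio(randomBlob).toFixed(2));
```

The repetitive column shrinks by a large factor, while the random bytes do not shrink at all, which is why the same 128 MB compressed limit can correspond to very different amounts of raw data.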

#2


1  

Try this:

Under the query window there is a 'Show Options' button. Click it and you will see some options:

  1. Select or create a new destination table.
  2. Click 'Allow Large Results'.

Run your query and see whether it works.
