Running multiple queries back to back with Google BigQuery

Time: 2022-11-24 15:34:30

I'm currently working on a project where I'm using Google BigQuery to pull data from spreadsheets. I'm VERY new to SQL, so I apologize. I'm currently using the following code:

Select *
From my_data
Where T1 > 1000
And T2 > 2000

Keeping the Select and From the same, I want to run multiple queries in which I only change the threshold values for T1 and T2, around 50 different values in total. I'd like BigQuery to run through these 50 values back to back. Is there a way to do this? Thanks!

3 answers

#1


2  

I'm VERY new to SQL

... and, I assume, to BigQuery as well, so

Below is one option for new users who are not yet familiar with the BigQuery API and/or clients other than the BigQuery Web UI.

BigQuery Mate adds a parameters feature to the BigQuery Web UI.

What you need to do is:

  1. Save your query using the Save Query button. Note the <var_t1> and <var_t2> placeholders in the saved query: these are the parameters that BigQuery Mate recognizes.

  2. Set those parameters: click BQ Mate and then Parameters to open the parameters form.

  3. Enter whatever values you want to run with and click OK in the Replace Parameters form. The values will then appear in the editor, and you can run your query.

  4. To run another round with new parameters, load your saved query into the editor again by clicking the Edit Query button, then repeat the parameter-setting steps, and so on.

BigQuery Mate is available as a Chrome extension.

Disclaimer: I am the author and the only developer of this tool.

#2


1  

You may be interested in running parameterized queries. The idea is to have a single query string, e.g.:

SELECT *
FROM YourTable
WHERE t1 > @t1_min AND
  t2 > @t2_min;

You would execute this multiple times, binding different values to the t1_min and t2_min parameters each time. The exact logic depends on the API or client library you are using; the BigQuery documentation on running parameterized queries includes language-specific examples.
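As a rough illustration of the binding loop described above, here is a minimal Python sketch using the google-cloud-bigquery client library. The table name `my_project.my_dataset.my_data`, the helper names `threshold_pairs` and `run_for_thresholds`, and the choice to form the runs as every combination of two value lists are all assumptions made for the example, not something stated in the question.

```python
from itertools import product

# Hypothetical table name; replace with your own project, dataset and table.
QUERY = """
SELECT *
FROM `my_project.my_dataset.my_data`
WHERE T1 > @t1_min
  AND T2 > @t2_min
"""


def threshold_pairs(t1_values, t2_values):
    """Return every (t1_min, t2_min) combination to run (an assumption
    about how the ~50 runs are formed; pass your own list if they differ)."""
    return list(product(t1_values, t2_values))


def run_for_thresholds(client, pairs, query=QUERY):
    """Run the same query once per pair, binding @t1_min and @t2_min each time."""
    # Imported here so threshold_pairs can be used without the library installed.
    from google.cloud import bigquery

    results = {}
    for t1_min, t2_min in pairs:
        job_config = bigquery.QueryJobConfig(
            query_parameters=[
                bigquery.ScalarQueryParameter("t1_min", "INT64", t1_min),
                bigquery.ScalarQueryParameter("t2_min", "INT64", t2_min),
            ]
        )
        # .result() waits for the job to finish before the next one starts,
        # so the queries run back to back.
        rows = client.query(query, job_config=job_config).result()
        results[(t1_min, t2_min)] = list(rows)
    return results
```

With credentials configured, something like `run_for_thresholds(bigquery.Client(), threshold_pairs([1000, 1500], [2000, 2500]))` would issue four queries in sequence. Because the values are bound as typed parameters rather than pasted into the SQL text, this approach does not carry the injection risk of string substitution.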

#3


0  

If you are not concerned about SQL injection and just want to iteratively swap out parameters in queries, you might want to look into the mustache templating language (available in R as 'whisker').
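For readers outside R, the same substitute-and-run idea can be sketched in Python. This is only an illustration: it uses the standard library's `string.Template` rather than mustache, and the names `QUERY_TEMPLATE`, `render_query` and `render_all` are made up for the example. The `my_data`, `T1` and `T2` names come from the question.

```python
from string import Template

# $t1_min and $t2_min play the role of the {{{mustache}}} placeholders.
QUERY_TEMPLATE = Template(
    "SELECT * FROM my_data WHERE T1 > $t1_min AND T2 > $t2_min"
)


def render_query(t1_min, t2_min):
    """Substitute the two thresholds into the query text."""
    # int() raises ValueError for non-numeric strings, which limits the
    # injection risk that plain text substitution otherwise carries.
    return QUERY_TEMPLATE.substitute(t1_min=int(t1_min), t2_min=int(t2_min))


def render_all(pairs):
    """Render one query string per (t1_min, t2_min) pair."""
    return [render_query(t1, t2) for t1, t2 in pairs]
```

Each rendered string can then be submitted through whichever client you use. Text substitution like this is only reasonable when the values come from a trusted source or are validated first, as the `int()` conversion does here.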

If you are using R, you can automate this type of query with the condusco R package. Here is a complete R script that performs this kind of iterative query using both whisker and condusco:

library(bigrquery)
library(condusco)
library(whisker)

# create a simple function that will create a query
# using {{{mustache}}} placeholders for any parameters

create_results_table <- function(params){

  destination_table <- '{{{dataset_id}}}.{{{table_prefix}}}_results_{{{year_low}}}_{{{year_high}}}'

  query <- '
    SELECT *
    FROM `bigquery-public-data.samples.gsod`
    WHERE year > {{{year_low}}}
      AND year <= {{{year_high}}}
  '


  # use whisker to swap out {{{mustache}}} placeholders with parameters
  query_exec(
    whisker.render(query,params),
    project=whisker.render('{{{project}}}', params),
    destination_table = whisker.render(destination_table,params),
    use_legacy_sql = FALSE
  )

}

# create an invocation query to provide sets of parameters to create_results_table

invocation_query <- '
  SELECT
    "<YOUR PROJECT HERE>" as project,
    "<YOUR DATASET_ID HERE>" as dataset_id,
    "<YOUR TABLE PREFIX HERE>" as table_prefix,
    num as year_low,
    num+1 as year_high
  FROM `bigquery-public-data.common_us.num_999999`
  WHERE num BETWEEN 1992 AND 1995
'

# call condusco's run_pipeline_gbq to iteratively run create_results_table over invocation_query's results

run_pipeline_gbq(
  create_results_table,
  invocation_query,
  project = '<YOUR PROJECT HERE>',
  use_legacy_sql = FALSE
)
