Transposing rows into columns in BigQuery (pivot implementation)

Date: 2021-06-22 15:33:32

I want to generate a new table in BigQuery from a set of key-value pairs, using the keys as column names and the values as the corresponding cell values.

Example:

**Key**                  **Value**
channel_title           Mahendra Guru    
youtube_id              ugEGMG4-MdA  
channel_id              UCiDKcjKocimAO1tV    
examId                  72975611-4a5e-11e5   
postId                  1189e340-b08f 

channel_title           Ab Live  
youtube_id              3TNbtTwLY0U  
channel_id              UCODeKM_D6JLf8jJt    
examId                  72975611-4a5e-11e5   
postId                  0c3e6590-afeb

I want to convert it to:

**channel_title   youtube_id   channel_id         examId               postId**
Mahendra Guru   ugEGMG4-MdA  UCiDKcjKocimAO1tV  72975611-4a5e-11e5   1189e340-b08f
Ab Live         3TNbtTwLY0U  UCODeKM_D6JLf8jJt  72975611-4a5e-11e5   0c3e6590-afeb

How can I do this in BigQuery?

1 Answer

#1


9  

BigQuery legacy SQL has no built-in pivot function (standard SQL has since added a PIVOT operator).
You can still pivot in BigQuery using the approach below.

But first: in addition to the two columns in the input data (key and value), you need one more column that identifies which groups of input rows should be combined into a single output row.

So I assume your input table (yourTable) looks like this:

**id**  **Key**                  **Value**
   1    channel_title           Mahendra Guru    
   1    youtube_id              ugEGMG4-MdA  
   1    channel_id              UCiDKcjKocimAO1tV    
   1    examId                  72975611-4a5e-11e5   
   1    postId                  1189e340-b08f 

   2    channel_title           Ab Live  
   2    youtube_id              3TNbtTwLY0U  
   2    channel_id              UCODeKM_D6JLf8jJt    
   2    examId                  72975611-4a5e-11e5   
   2    postId                  0c3e6590-afeb  
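
If the source data has no such column, it has to be derived before the data is loaded. Below is a minimal sketch in Python, under two assumptions that are not stated in the question: the pairs are read in their original order (for example from a file, since rows in a BigQuery table have no guaranteed order), and every group starts with the same key, here `channel_title`. The names `assign_group_ids` and `first_key` are hypothetical.

```python
from typing import Iterable, List, Tuple


def assign_group_ids(
    pairs: Iterable[Tuple[str, str]],
    first_key: str = "channel_title",  # assumption: this key opens every group
) -> List[Tuple[int, str, str]]:
    """Prefix each (key, value) pair with an id that increases whenever first_key appears."""
    rows = []
    group_id = 0
    for key, value in pairs:
        if key == first_key:
            group_id += 1  # a new output row starts here
        rows.append((group_id, key, value))
    return rows


# A shortened version of the example data, in source order.
pairs = [
    ("channel_title", "Mahendra Guru"),
    ("youtube_id", "ugEGMG4-MdA"),
    ("channel_title", "Ab Live"),
    ("youtube_id", "3TNbtTwLY0U"),
]

for row in assign_group_ids(pairs):
    print(row)
```

Pairs that appear before the first `channel_title` would get id 0 in this sketch, so real data may need a stricter check.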

Step 1. First, run the query below (it uses legacy SQL syntax):

SELECT 'SELECT id, ' + 
   GROUP_CONCAT_UNQUOTED(
      'MAX(IF(key = "' + key + '", value, NULL)) as [' + key + ']'
   ) 
   + ' FROM yourTable GROUP BY id ORDER BY id'
FROM (
  SELECT key 
  FROM yourTable
  GROUP BY key
  ORDER BY key
) 

The result of this query is a string that, once formatted, looks like this:

SELECT 
  id, 
  MAX(IF(key = "channel_id", value, NULL)) AS [channel_id],
  MAX(IF(key = "channel_title", value, NULL)) AS [channel_title],
  MAX(IF(key = "examId", value, NULL)) AS [examId],
  MAX(IF(key = "postId", value, NULL)) AS [postId],
  MAX(IF(key = "youtube_id", value, NULL)) AS [youtube_id] 
FROM yourTable 
GROUP BY id 
ORDER BY id
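
The same query text can also be assembled outside BigQuery when the list of keys is already known. The following is only a sketch in Python, not part of the original answer: `build_pivot_query` is a hypothetical helper, and it assumes the keys are trusted identifiers, because they are interpolated into the query without escaping.

```python
from typing import Iterable


def build_pivot_query(keys: Iterable[str], table: str = "yourTable") -> str:
    """Return legacy SQL pivot query text with one MAX(IF(...)) column per distinct key."""
    columns = ", ".join(
        f'MAX(IF(key = "{k}", value, NULL)) AS [{k}]'
        for k in sorted(set(keys))  # de-duplicate and sort, like GROUP BY key ORDER BY key
    )
    return f"SELECT id, {columns} FROM {table} GROUP BY id ORDER BY id"


query = build_pivot_query(
    ["youtube_id", "channel_title", "channel_id", "examId", "postId"]
)
print(query)
```

The printed text is the unformatted, single-line form of the query shown above.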

Step 2. Now copy that result and run it as a normal query (you don't need to format it; I did that only for presentation).

The result is what you would expect:

id  channel_id          channel_title   examId              postId          youtube_id   
1   UCiDKcjKocimAO1tV   Mahendra Guru   72975611-4a5e-11e5  1189e340-b08f   ugEGMG4-MdA  
2   UCODeKM_D6JLf8jJt   Ab Live         72975611-4a5e-11e5  0c3e6590-afeb   3TNbtTwLY0U  
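
To try the technique without a BigQuery project, the sketch below reproduces both steps locally with Python's built-in `sqlite3` module. It is an illustration under assumptions, not the original method: SQLite has no `IF(...)` function or `[bracket]` aliases in this role, so the conditional aggregate is written as `MAX(CASE WHEN ... END)` with double-quoted aliases, and the keys are assumed to be trusted because they are interpolated into the SQL text.

```python
import sqlite3

# The example input, including the extra id column that groups rows.
rows = [
    (1, "channel_title", "Mahendra Guru"),
    (1, "youtube_id", "ugEGMG4-MdA"),
    (1, "channel_id", "UCiDKcjKocimAO1tV"),
    (1, "examId", "72975611-4a5e-11e5"),
    (1, "postId", "1189e340-b08f"),
    (2, "channel_title", "Ab Live"),
    (2, "youtube_id", "3TNbtTwLY0U"),
    (2, "channel_id", "UCODeKM_D6JLf8jJt"),
    (2, "examId", "72975611-4a5e-11e5"),
    (2, "postId", "0c3e6590-afeb"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (id INTEGER, key TEXT, value TEXT)")
conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?)", rows)

# Step 1: collect the distinct keys and build one conditional aggregate per key.
keys = [r[0] for r in conn.execute("SELECT DISTINCT key FROM yourTable ORDER BY key")]
columns = ", ".join(
    f"MAX(CASE WHEN key = '{k}' THEN value END) AS \"{k}\"" for k in keys
)
pivot_sql = f"SELECT id, {columns} FROM yourTable GROUP BY id ORDER BY id"

# Step 2: run the generated query; each id becomes one output row.
result = conn.execute(pivot_sql).fetchall()
for row in result:
    print(row)
```

Each printed tuple holds the id followed by the values in alphabetical key order, matching the columns of the result table above.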

Please note: you can skip Step 1 if you can write the Step 2 query yourself, which is practical when the number of fields is small and constant, or when this is a one-time job. Step 1 is just a helper that builds the query for you, so you can regenerate it quickly at any time.

If you are interested, you can read more about pivoting in my other posts:

How to scale Pivoting in BigQuery?
Please note: there is a limit of 10K columns per table, so you are limited to 10K organizations.
You can also look at the simplified examples below (if the one above is too complex or verbose):
How to transpose rows to columns with large amount of the data in BigQuery/SQL?
How to create dummy variable columns for thousands of categories in Google BigQuery?
Pivot Repeated fields in BigQuery
