Redshift - SQL script to extract values from key-value pairs

Time: 2022-08-10 23:04:43

I have a column containing JSON data, as shown below. I am trying to extract the value corresponding to each key in the column. Could anyone advise how I could do this using SQL?

 [{"id": 101, "id1": {"key": "SaleId", "type": "identifier", "regex": null}, "id2": {"key": Name, "type": "identifier", "regex": null}, "id3": {"key": null, "type": "identifier", "regex": null}}]

The keys are id1, id2 and id3.

Expected output:

id1 : SaleId
id2 : Name
id3 : null

I am using Redshift. Thanks.

3 solutions

#1



I don't know anything about Redshift, so this might not work there. It works in JavaScript:

/"(id\d)":\s\{"key": "?(\w+)"?/g

You then need to extract group 1, which contains the id, and group 2, which contains the key.

The regex starts by matching a double quote, then captures the word 'id' followed by a single digit as group 1. It then matches a closing double quote, a colon, one whitespace character, a left curly brace, a double quote, the word 'key', a double quote, a colon, a space and an optional double quote. Finally, it captures one or more word characters as group 2, followed by an optional double quote.

As I said, I don't know Redshift; you might, for instance, have to escape the double quotes.
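To make the group extraction concrete, here is a minimal JavaScript sketch that runs the regex above against the sample value from the question. The names raw, pattern and extractKeys are made up for this illustration, and nothing here has been tried on Redshift.

```javascript
// Sample value from the question, copied verbatim (Name is unquoted there).
const raw = '[{"id": 101, "id1": {"key": "SaleId", "type": "identifier", "regex": null}, "id2": {"key": Name, "type": "identifier", "regex": null}, "id3": {"key": null, "type": "identifier", "regex": null}}]';

// The pattern from this answer; the g flag is required by matchAll.
const pattern = /"(id\d)":\s\{"key": "?(\w+)"?/g;

// Hypothetical helper: returns [id, key] pairs built from group 1 and group 2 of each match.
function extractKeys(text) {
  return Array.from(text.matchAll(pattern), (m) => [m[1], m[2]]);
}

const pairs = extractKeys(raw);
for (const [id, key] of pairs) {
  console.log(`${id} : ${key}`);
}
// prints:
// id1 : SaleId
// id2 : Name
// id3 : null
```

Because the quotes around the value are optional, the pattern also copes with the unquoted Name, but it cannot tell a JSON null apart from the string "null". Since \w+ stops at the first non-word character, a key containing spaces or hyphens would be cut short.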

#2



You can do what you need like this:

-- Sample row from the question, with Name and the null values quoted as strings
with t as
    (
    select '[{"id": 101, ' ||
           '"id1": {"key": "SaleId", "type": "identifier", "regex": "null"}, ' ||
           '"id2": {"key": "Name", "type": "identifier", "regex": "null"}, ' ||
           '"id3": {"key": "null", "type": "identifier", "regex": "null"}}]' as str
    )
-- substring(str,2,length(str)-2) drops the enclosing [ and ] so that the
-- remaining object can be passed to json_extract_path_text
select 'id1:' || json_extract_path_text(substring(str,2,length(str)-2),'id1','key'),
       'id2:' || json_extract_path_text(substring(str,2,length(str)-2),'id2','key'),
       'id3:' || json_extract_path_text(substring(str,2,length(str)-2),'id3','key')
from t;

#3



The JSON string in your example is invalid because Name is not enclosed in double quotes.

Assuming this is a typo and the column is meant to hold a valid JSON string, you can use the JSON functions to extract the values you need from the column.

Example (I have added quotes around "Name"):


-- Temporary table holding the sample JSON string
create temp table jsontest (myjsonstring varchar(1000))
;
insert into jsontest(myjsonstring) 
    values ('[{"id": 101, "id1": {"key": "SaleId", "type": "identifier", "regex": null}, "id2": {"key": "Name", "type": "identifier", "regex": null}, "id3": {"key": null, "type": "identifier", "regex": null}}]')
;
-- json_extract_array_element_text(myjsonstring, 0) returns the first array element;
-- json_extract_path_text then follows the path idN -> key inside that object
select 'id1', json_extract_path_text(json_extract_array_element_text(myjsonstring, 0) , 'id1', 'key') from jsontest
union all
select 'id2', json_extract_path_text(json_extract_array_element_text(myjsonstring, 0) , 'id2', 'key') from jsontest
union all
select 'id3', json_extract_path_text(json_extract_array_element_text(myjsonstring, 0) , 'id3', 'key') from jsontest
;
