How can I use a regular expression in Python to return these numbers?

Time: 2021-05-03 20:15:22

I have the following code stored as a string variable in Python. How can I use a regex with re.findall('', text) to extract the five 9-digit numbers (all starting with "305") listed under "attributeLookup" in the code below?

var PRO_META_JSON = {
    "attributeDefinition":{
        "defaultSku":305557121,
        "attributeListing":[{ 
            "label":"Finish",
                    "defaultIndex":0,
                    "options":[
                        "White::f33b4086",
                        "Beige::8e0900fa",
                        "Blue::3c3a4707",
                        "Orange::1d8cb503",
                        "Spring Green::dd5e599a"
                     ]
            }],
            "attributeLookup":[
            [0,305557121],
            [1,305557187],
            [2,305557696],
            [3,305557344],
            [4,305696435]
            ]
        }
    };

4 answers

#1


Here is one way to do it. First extract the JSON object from your string (everything inside the outermost braces). Then decode it with the json module and access what you need.

astr = '''var PRO_META_JSON = {
    "attributeDefinition":{
        "defaultSku":305557121,
        "attributeListing":[{ 
            "label":"Finish",
                    "defaultIndex":0,
                    "options":[
                        "White::f33b4086",
                        "Beige::8e0900fa",
                        "Blue::3c3a4707",
                        "Orange::1d8cb503",
                        "Spring Green::dd5e599a"
                     ]
            }],
            "attributeLookup":[
            [0,305557121],
            [1,305557187],
            [2,305557696],
            [3,305557344],
            [4,305696435]
            ]
        }
    };'''

import re
import json
pat = re.compile(r'^[^\{]*(\{.*\});.*$', re.MULTILINE|re.DOTALL)
json_str = pat.match(astr).group(1)
d = json.loads(json_str)

for x in d['attributeDefinition']['attributeLookup']:
    print(x[1])
# 305557121
# 305557187
# 305557696
# 305557344
# 305696435
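
The pattern in this answer is dense, so here is a rough annotated sketch of the same idea using re.VERBOSE. The `sample` string is a shortened, hypothetical stand-in for the real input, not the original data.

```python
import json
import re

# Same pattern as above, spelled out with re.VERBOSE so each part can be commented.
pat = re.compile(r"""
    ^[^{]*      # everything before the first opening brace (the "var ... =" part)
    (\{.*\})    # group 1: greedy match from the first brace to the last closing brace
    ;           # the semicolon that ends the JavaScript statement
    .*$         # anything left over
""", re.VERBOSE | re.MULTILINE | re.DOTALL)

# Hypothetical, shortened input for illustration only.
sample = 'var PRO_META_JSON = {"attributeDefinition": {"attributeLookup": [[0, 305557121]]}};'
json_str = pat.match(sample).group(1)
lookup = json.loads(json_str)["attributeDefinition"]["attributeLookup"]
print(lookup)
```

Because the capture is greedy, it runs to the last `}` that is followed by `;`, which is what keeps the nested braces inside the object intact.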

#2


You can just use the built-in json library to parse it. I've assumed you have already stripped the JavaScript wrapper:

import json

input = """{
"attributeDefinition":{
    "defaultSku":305557121,
    "attributeListing":[{ 
        "label":"Finish",
                "defaultIndex":0,
                "options":[
                    "White::f33b4086",
                    "Beige::8e0900fa",
                    "Blue::3c3a4707",
                    "Orange::1d8cb503",
                    "Spring Green::dd5e599a"
                 ]
        }],
        "attributeLookup":[
        [0,305557121],
        [1,305557187],
        [2,305557696],
        [3,305557344],
        [4,305696435]
        ]
    }
}"""

data = json.loads(input)

# Get a list you can do stuff with. This gives you:
# [[0, 305557121], [1, 305557187], [2, 305557696], [3, 305557344], [4, 305696435]]
els = data['attributeDefinition']['attributeLookup']

for el in els:
    # Each el looks like: [0, 305557121]
    print(el[1])
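
If the JavaScript wrapper is still in place, one possible way to skip it without a regex is json.JSONDecoder.raw_decode, which parses a single JSON value starting at a given index and leaves any trailing text alone. This is only a sketch: the name `raw` and the shortened input are assumptions for illustration.

```python
import json

# Shortened stand-in for the page source; `raw` is a hypothetical name.
raw = '''var PRO_META_JSON = {
    "attributeDefinition":{
        "defaultSku":305557121,
        "attributeLookup":[[0,305557121],[1,305557187]]
    }
};'''

# raw_decode parses one JSON value beginning at `start` and returns it
# together with the index where parsing stopped, so the trailing ';' is ignored.
start = raw.index('{')
data, end = json.JSONDecoder().raw_decode(raw, start)

numbers = [pair[1] for pair in data['attributeDefinition']['attributeLookup']]
print(numbers)
```

Note that raw.index('{') raises ValueError if there is no opening brace, so real input may need a guard around it.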

#3


string = '''var PRO_META_JSON = {
    "attributeDefinition":{
        "defaultSku":305557121,
        "attributeListing":[{ 
            "label":"Finish",
                    "defaultIndex":0,
                    "options":[
                        "White::f33b4086",
                        "Beige::8e0900fa",
                        "Blue::3c3a4707",
                        "Orange::1d8cb503",
                        "Spring Green::dd5e599a"
                     ]
            }],
            "attributeLookup":[
            [0,305557121],
            [1,305557187],
            [2,305557696],
            [3,305557344],
            [4,305696435]
            ]
        }
    };'''

import json
data = json.loads(string.split('=', 1)[1].strip(';'))
for d in data['attributeDefinition']['attributeLookup']:
    print(d[1])

Don't know why you want to use regex. Do you also take your car to visit your neighbour?

#4


In the findall call you want to match nine consecutive digits (0 to 9), like this. It would still be better to parse the data with the json module than to pick values out of the raw string.

A really useful tester for Python regex can be found here:

http://pythex.org/

import re
re.findall(r'[0-9]{9}', PRO_META_JSON.split('attributeLookup')[1])
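
For a regex-only route, as the question asks, a slightly stricter sketch is to isolate the "attributeLookup" array first and then capture only the second element of each pair, so a stray 9-digit number elsewhere in the string would not be picked up. The variable name `text` and the shortened input are assumptions; the results come back as strings, not integers.

```python
import re

# Shortened stand-in for the question's string; `text` is an assumed name.
text = '''var PRO_META_JSON = {
    "attributeDefinition":{
        "defaultSku":305557121,
        "attributeLookup":[
        [0,305557121],
        [1,305557187],
        [2,305557696],
        [3,305557344],
        [4,305696435]
        ]
    }
};'''

# Step 1: isolate the body of the "attributeLookup" array. The lazy (.*?)
# stops at the first ']' that is followed by optional whitespace and another ']'.
block = re.search(r'"attributeLookup"\s*:\s*\[(.*?)\]\s*\]', text, re.DOTALL)

# Step 2: inside that body, capture the 9-digit number of each [index, number] pair.
skus = re.findall(r'\[\s*\d+\s*,\s*(\d{9})\b', block.group(1)) if block else []
print(skus)
```

Wrap each value in int() if numbers are needed rather than strings.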
