解析json并搜索它

I have this code

我有这段代码

import json
from pprint import pprint
json_data=open('bookmarks.json')
jdata = json.load(json_data)
pprint (jdata)
json_data.close()

How can I search through it for u'uri': u'http:?

如何搜索u'uri': u'http:?

5 个解决方案

#1

As json.loads simply returns a dict, you can use the operators that apply to dicts:

为json。加载只返回一个命令，您可以使用适用于命令的操作符:

>>> jdata = json.load('{"uri": "http:", "foo", "bar"}')
>>> 'uri' in jdata       # Check if 'uri' is in jdata's keys
True
>>> jdata['uri']         # Will return the value belonging to the key 'uri'
u'http:'

Edit: to give an idea regarding how to loop through the data, consider the following example:

编辑:关于如何循环数据，请考虑以下示例:

>>> import json
>>> jdata = json.loads(open ('bookmarks.json').read())
>>> for c in jdata['children'][0]['children']:
...     print 'Title: {}, URI: {}'.format(c.get('title', 'No title'),
                                          c.get('uri', 'No uri'))
...
Title: Recently Bookmarked, URI: place:folder=BOOKMARKS_MENU(...)
Title: Recent Tags, URI: place:sort=14&type=6&maxResults=10&queryType=1
Title: , URI: No uri
Title: Mozilla Firefox, URI: No uri

Inspecting the jdata data structure will allow you to navigate it as you wish. The pprint call you already have is a good starting point for this.

检查jdata数据结构将允许您按自己的意愿导航它。pprint调用已经是一个很好的起点。

Edit2: Another attempt. This gets the file you mentioned in a list of dictionaries. With this, I think you should be able to adapt it to your needs.

Edit2:另一个尝试。这将获取您在字典列表中提到的文件。有了这个，我认为你应该能够适应你的需要。

>>> def build_structure(data, d=[]):
...     if 'children' in data:
...         for c in data['children']:
...             d.append({'title': c.get('title', 'No title'),
...                                      'uri': c.get('uri', None)})
...             build_structure(c, d)
...     return d
...
>>> pprint.pprint(build_structure(jdata))
[{'title': u'Bookmarks Menu', 'uri': None},
 {'title': u'Recently Bookmarked',
  'uri':   u'place:folder=BOOKMARKS_MENU&folder=UNFILED_BOOKMARKS&(...)'},
 {'title': u'Recent Tags',
  'uri':   u'place:sort=14&type=6&maxResults=10&queryType=1'},
 {'title': u'', 'uri': None},
 {'title': u'Mozilla Firefox', 'uri': None},
 {'title': u'Help and Tutorials',
  'uri':   u'http://www.mozilla.com/en-US/firefox/help/'},
 (...)
}]

To then "search through it for u'uri': u'http:'", do something like this:

然后“搜索u'uri': u'http:'”，执行如下操作:

for c in build_structure(jdata):
    if c['uri'].startswith('http:'):
        print 'Started with http'

#2

ObjectPath is a library that provides ability to query JSON and nested structures of dicts and lists. For example, you can search for all attributes called "foo" regardless how deep they are by using $..foo.

ObjectPath是一个库，它提供了查询JSON和dicts和list嵌套结构的能力。例如，您可以使用$. foo搜索所有被称为“foo”的属性，不管它们有多深。

While the documentation focuses on the command line interface, you can perform the queries programmatically by using the package's Python internals. The example below assumes you've already loaded the data into Python data structures (dicts & lists). If you're starting with a JSON file or string you just need to use load or loads from the json module first.

虽然文档主要关注命令行接口，但您可以通过使用包的Python内部元素以编程方式执行查询。下面的示例假设您已经将数据加载到Python数据结构中(dicts & list)。如果您从JSON文件或字符串开始，首先需要从JSON模块使用加载或加载。

import objectpath

data = [
    {'foo': 1, 'bar': 'a'},
    {'foo': 2, 'bar': 'b'},
    {'NoFooHere': 2, 'bar': 'c'},
    {'foo': 3, 'bar': 'd'},
]

tree_obj = objectpath.Tree(data)

tuple(tree_obj.execute('$..foo'))
# returns: (1, 2, 3)

Notice that it just skipped elements that lacked a "foo" attribute, such as the third item in the list. You can also do much more complex queries, which makes ObjectPath handy for deeply nested structures (e.g. finding where x has y that has z: $.x.y.z). I refer you to the documentation and tutorial for more information.

注意，它只是跳过了缺少“foo”属性的元素，比如列表中的第三个项目。您还可以执行更复杂的查询，这使得ObjectPath对于深度嵌套的结构非常方便(例如，找到x有y的地方，它有z: $.x.y.z)。有关更多信息，请参阅文档和教程。

#3

Seems there's a typo (missing colon) in the JSON dict provided by jro.

看起来在jro提供的JSON中有一个错码(缺少冒号)。

The correct syntax would be: jdata = json.load('{"uri": "http:", "foo": "bar"}')

正确的语法应该是:jdata = json。负载(' {“uri”:“http:”、“foo”:“酒吧”}”)

This cleared it up for me when playing with the code.

这让我在玩代码的时候清楚了。

#4

Functions to search through and print dicts, like JSON. *made in python 3

搜索和打印命令的函数，如JSON。python 3 *

Search:

搜索:

def pretty_search(dict_or_list, key_to_search, search_for_first_only=False):
    """
    Give it a dict or a list of dicts and a dict key (to get values of),
    it will search through it and all containing dicts and arrays
    for all values of dict key you gave, and will return you set of them
    unless you wont specify search_for_first_only=True

    :param dict_or_list: 
    :param key_to_search: 
    :param search_for_first_only: 
    :return: 
    """
    search_result = set()
    if isinstance(dict_or_list, dict):
        for key in dict_or_list:
            key_value = dict_or_list[key]
            if key == key_to_search:
                if search_for_first_only:
                    return key_value
                else:
                    search_result.add(key_value)
            if isinstance(key_value, dict) or isinstance(key_value, list) or isinstance(key_value, set):
                _search_result = pretty_search(key_value, key_to_search, search_for_first_only)
                if _search_result and search_for_first_only:
                    return _search_result
                elif _search_result:
                    for result in _search_result:
                        search_result.add(result)
    elif isinstance(dict_or_list, list) or isinstance(dict_or_list, set):
        for element in dict_or_list:
            if isinstance(element, list) or isinstance(element, set) or isinstance(element, dict):
                _search_result = pretty_search(element, key_to_search, search_result)
                if _search_result and search_for_first_only:
                    return _search_result
                elif _search_result:
                    for result in _search_result:
                        search_result.add(result)
    return search_result if search_result else None

Print:

打印:

def pretty_print(dict_or_list, print_spaces=0):
    """
    Give it a dict key (to get values of),
    it will return you a pretty for print version
    of a dict or a list of dicts you gave.

    :param dict_or_list: 
    :param print_spaces: 
    :return: 
    """
    pretty_text = ""
    if isinstance(dict_or_list, dict):
        for key in dict_or_list:
            key_value = dict_or_list[key]
            if isinstance(key_value, dict):
                key_value = pretty_print(key_value, print_spaces + 1)
                pretty_text += "\t" * print_spaces + "{}:\n{}\n".format(key, key_value)
            elif isinstance(key_value, list) or isinstance(key_value, set):
                pretty_text += "\t" * print_spaces + "{}:\n".format(key)
                for element in key_value:
                    if isinstance(element, dict) or isinstance(element, list) or isinstance(element, set):
                        pretty_text += pretty_print(element, print_spaces + 1)
                    else:
                        pretty_text += "\t" * (print_spaces + 1) + "{}\n".format(element)
            else:
                pretty_text += "\t" * print_spaces + "{}: {}\n".format(key, key_value)
    elif isinstance(dict_or_list, list) or isinstance(dict_or_list, set):
        for element in dict_or_list:
            if isinstance(element, dict) or isinstance(element, list) or isinstance(element, set):
                pretty_text += pretty_print(element, print_spaces + 1)
            else:
                pretty_text += "\t" * print_spaces + "{}\n".format(element)
    else:
        pretty_text += str(dict_or_list)
    if print_spaces == 0:
        print(pretty_text)
    return pretty_text

#5

You can use jsonpipe if you just need the output (and more comfortable with command line):

如果只需要输出(更适合命令行)，可以使用jsonpipe:

cat bookmarks.json | jsonpipe |grep uri

#1