I'm attempting to parse a JSON response from Foursquare. It's nested in a way that I cannot figure out. Here's a copy of the entire JSON.
Here's a snippet of the JSON:
{
"meta": {
"code": 200,
"requestId": "58cab8bc4434b959e2f68a69"
},
"response": {
"categories": [
{
"categories": [
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/default_",
"suffix": ".png"
},
"id": "56aa371be4b08b9a8d5734db",
"name": "Amphitheater",
"pluralName": "Amphitheaters",
"shortName": "Amphitheater"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/aquarium_",
"suffix": ".png"
},
"id": "4fceea171983d5d06c3e9823",
"name": "Aquarium",
"pluralName": "Aquariums",
"shortName": "Aquarium"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/arcade_",
"suffix": ".png"
},
"id": "4bf58dd8d48988d1e1931735",
"name": "Arcade",
"pluralName": "Arcades",
"shortName": "Arcade"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/artgallery_",
"suffix": ".png"
},
"id": "4bf58dd8d48988d1e2931735",
"name": "Art Gallery",
"pluralName": "Art Galleries",
"shortName": "Art Gallery"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/bowling_",
"suffix": ".png"
},
"id": "4bf58dd8d48988d1e4931735",
"name": "Bowling Alley",
"pluralName": "Bowling Alleys",
"shortName": "Bowling Alley"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/casino_",
"suffix": ".png"
},
"id": "4bf58dd8d48988d17c941735",
"name": "Casino",
"pluralName": "Casinos",
"shortName": "Casino"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/default_",
"suffix": ".png"
},
"id": "52e81612bcbc57f1066b79e7",
"name": "Circus",
"pluralName": "Circuses",
"shortName": "Circus"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/comedyclub_",
"suffix": ".png"
},
"id": "4bf58dd8d48988d18e941735",
"name": "Comedy Club",
"pluralName": "Comedy Clubs",
"shortName": "Comedy Club"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/musicvenue_",
"suffix": ".png"
},
"id": "5032792091d4c4b30a586d5c",
"name": "Concert Hall",
"pluralName": "Concert Halls",
"shortName": "Concert Hall"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/performingarts_dancestudio_",
"suffix": ".png"
},
"id": "52e81612bcbc57f1066b79ef",
"name": "Country Dance Club",
"pluralName": "Country Dance Clubs",
"shortName": "Country Dance Club"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/default_",
"suffix": ".png"
},
"id": "52e81612bcbc57f1066b79e8",
"name": "Disc Golf",
"pluralName": "Disc Golf Courses",
"shortName": "Disc Golf"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/default_",
"suffix": ".png"
},
"id": "56aa371be4b08b9a8d573532",
"name": "Exhibit",
"pluralName": "Exhibits",
"shortName": "Exhibit"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/default_",
"suffix": ".png"
},
"id": "4bf58dd8d48988d1f1931735",
"name": "General Entertainment",
"pluralName": "General Entertainment",
"shortName": "Entertainment"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/racetrack_",
"suffix": ".png"
},
"id": "52e81612bcbc57f1066b79ea",
"name": "Go Kart Track",
"pluralName": "Go Kart Tracks",
"shortName": "Go Kart"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/historicsite_",
"suffix": ".png"
},
"id": "4deefb944765f83613cdba6e",
"name": "Historic Site",
"pluralName": "Historic Sites",
"shortName": "Historic Site"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/nightlife/karaoke_",
"suffix": ".png"
},
"id": "5744ccdfe4b0c0459246b4bb",
"name": "Karaoke Box",
"pluralName": "Karaoke Boxes",
"shortName": "Karaoke"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/default_",
"suffix": ".png"
},
"id": "52e81612bcbc57f1066b79e6",
"name": "Laser Tag",
"pluralName": "Laser Tag Places",
"shortName": "Laser Tag"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/historicsite_",
"suffix": ".png"
},
"id": "5642206c498e4bfca532186c",
"name": "Memorial Site",
"pluralName": "Memorial Sites",
"shortName": "Memorial Site"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/parks_outdoors/golfcourse_",
"suffix": ".png"
},
"id": "52e81612bcbc57f1066b79eb",
"name": "Mini Golf",
"pluralName": "Mini Golf Courses",
"shortName": "Mini Golf"
},
{
"categories": [
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/movietheater_",
"suffix": ".png"
},
"id": "56aa371be4b08b9a8d5734de",
"name": "Drive-in Theater",
"pluralName": "Drive-in Theaters",
"shortName": "Drive-in Theater"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/movietheater_",
"suffix": ".png"
},
"id": "4bf58dd8d48988d17e941735",
"name": "Indie Movie Theater",
"pluralName": "Indie Movie Theaters",
"shortName": "Indie Movies"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/movietheater_",
"suffix": ".png"
},
"id": "4bf58dd8d48988d180941735",
"name": "Multiplex",
"pluralName": "Multiplexes",
"shortName": "Cineplex"
}
],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/movietheater_",
"suffix": ".png"
},
"id": "4bf58dd8d48988d17f941735",
"name": "Movie Theater",
"pluralName": "Movie Theaters",
"shortName": "Movie Theater"
},
{
"categories": [
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/museum_art_",
"suffix": ".png"
},
"id": "4bf58dd8d48988d18f941735",
"name": "Art Museum",
"pluralName": "Art Museums",
"shortName": "Art Museum"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/nightlife/stripclub_",
"suffix": ".png"
},
"id": "559acbe0498e472f1a53fa23",
"name": "Erotic Museum",
"pluralName": "Erotic Museums",
"shortName": "Erotic Museum"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/museum_history_",
"suffix": ".png"
},
"id": "4bf58dd8d48988d190941735",
"name": "History Museum",
"pluralName": "History Museums",
"shortName": "History Museum"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/museum_planetarium_",
"suffix": ".png"
},
"id": "4bf58dd8d48988d192941735",
"name": "Planetarium",
"pluralName": "Planetariums",
"shortName": "Planetarium"
},
{
"categories": [],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/museum_science_",
"suffix": ".png"
},
"id": "4bf58dd8d48988d191941735",
"name": "Science Museum",
"pluralName": "Science Museums",
"shortName": "Science Museum"
}
],
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/museum_",
"suffix": ".png"
},
"id": "4bf58dd8d48988d181941735",
"name": "Museum",
"pluralName": "Museums",
"shortName": "Museum"
},
"icon": {
"prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/default_",
"suffix": ".png"
},
"id": "4d4b7104d754a06370d81259",
"name": "Arts & Entertainment",
"pluralName": "Arts & Entertainment",
"shortName": "Arts & Entertainment"
},
My code pulls the first level of the hierarchy. In the JSON, each top-level category's name is always listed below its sub-categories.
import urllib.request
import json
import sqlite3
from key import ID, SECRET
CLIENT_ID = ID
CLIENT_SECRET = SECRET
v = '20170315'
# Build the request URL from the credentials and the API version date.
url = 'https://api.foursquare.com/v2/venues/categories?client_id=' + CLIENT_ID + '&client_secret=' + CLIENT_SECRET + '&v=' + v
contents = urllib.request.urlopen(url).read()
parsed = json.loads(contents)
# This list holds only the top-level categories.
clean = parsed['response']['categories']
my_list = [i['name'] for i in clean]
print(my_list)
Output:
['Arts & Entertainment', 'College & University', 'Event', 'Food', 'Nightlife Spot', 'Outdoors & Recreation', 'Professional & Other Places', 'Residence', 'Shop & Service', 'Travel & Transport']
I'm having trouble parsing to get the sub-categories. I'm trying to pull id and name for all categories, sub or not.
1 Answer
If a data structure is recursively nested, a recursive function is often the easiest way to parse it:
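def get_categories(data):
    """Return a flat {id: name} dict for every category in data, at any depth."""
    result = {}
    for cat in data:
        result[cat['id']] = cat['name']
        # Recurse only when this category has subcategories.
        if cat['categories']:
            result.update(get_categories(cat['categories']))
    return result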
This returns a dictionary of id: name key/value pairs, recursively calling itself and updating result with any subcategories it finds along the way.
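To see the recursion on something small, here is a minimal sketch that runs the same function on a hand-made sample. The sample list is a trimmed, hypothetical stand-in for parsed['response']['categories'] that keeps only the keys the function reads; its ids and names are copied from the snippet in the question.

```python
def get_categories(data):
    result = {}
    for cat in data:
        result[cat['id']] = cat['name']
        if cat['categories']:
            result.update(get_categories(cat['categories']))
    return result

# Hypothetical, trimmed stand-in for parsed['response']['categories'].
sample = [
    {
        "id": "4d4b7104d754a06370d81259",
        "name": "Arts & Entertainment",
        "categories": [
            {"id": "4bf58dd8d48988d1e1931735", "name": "Arcade", "categories": []},
            {
                "id": "4bf58dd8d48988d17f941735",
                "name": "Movie Theater",
                "categories": [
                    {"id": "4bf58dd8d48988d180941735", "name": "Multiplex", "categories": []},
                ],
            },
        ],
    },
]

flat = get_categories(sample)
for cat_id, name in flat.items():
    print(cat_id, name)
```

All four categories end up in one flat dictionary, regardless of how deeply each one was nested.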
The if check is not strictly necessary, since calling the function with an empty list would simply return an empty dictionary. It does, however, avoid one pointless recursive call for every category that has no subcategories, so it ought to improve performance.
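The effect of the check can be illustrated with a small sketch. The guard and counter parameters below are not part of the function above; they are hypothetical extras added only to switch the check off and count calls. The sample data is made up as well.

```python
def get_categories(data, guard=True, counter=None):
    # `guard` and `counter` are illustrative additions, not part of the answer.
    if counter is not None:
        counter[0] += 1  # count every call, including calls on empty lists
    result = {}
    for cat in data:
        result[cat['id']] = cat['name']
        if not guard or cat['categories']:
            result.update(get_categories(cat['categories'], guard, counter))
    return result

# Made-up data: one parent with two leaf categories.
sample = [
    {"id": "a", "name": "Parent", "categories": [
        {"id": "b", "name": "Leaf 1", "categories": []},
        {"id": "c", "name": "Leaf 2", "categories": []},
    ]},
]

with_guard, without_guard = [0], [0]
same = (get_categories(sample, True, with_guard)
        == get_categories(sample, False, without_guard))
print(same, with_guard[0], without_guard[0])  # prints: True 2 4
```

Both variants return the same dictionary; without the check, each leaf category costs one extra call on an empty list.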
Here's how you'd use it:
categories = get_categories(parsed['response']['categories'])
… and here's the result:
>>> from pprint import pprint
>>> pprint(categories)
{'4bf58dd8d48988d100941735': 'Meeting Room',
'4bf58dd8d48988d100951735': 'Pet Store',
'4bf58dd8d48988d101941735': 'Martial Arts Dojo',
# ...
'57558b36e4b065ecebd306da': 'Savoyard Restaurant',
'57558b36e4b065ecebd306dd': 'Truck Stop',
'589ddde98ae3635c072819ee': 'Duty-free Shop'}
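If recursion is not wanted, the same flattening can be done with an explicit stack. This is an alternative sketch, not the method from the answer above; the name get_categories_iterative and the sample data are made up for illustration. It should produce the same id: name pairs, though possibly in a different insertion order.

```python
def get_categories_iterative(data):
    """Flatten nested categories into {id: name} without recursion."""
    result = {}
    stack = list(data)  # copy, so the caller's list is not modified
    while stack:
        cat = stack.pop()
        result[cat['id']] = cat['name']
        # Queue the subcategories; an empty list adds nothing.
        stack.extend(cat['categories'])
    return result

# Made-up data with two levels of nesting.
sample = [
    {"id": "a", "name": "Parent", "categories": [
        {"id": "b", "name": "Child", "categories": [
            {"id": "c", "name": "Grandchild", "categories": []},
        ]},
    ]},
    {"id": "d", "name": "Other", "categories": []},
]

categories = get_categories_iterative(sample)
print(sorted(categories.items()))
```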