I have the following .json file, in which some elements have lists as values:
{
  "paciente": [
    {
      "id": 1234,
      "nombre": "Pablo",
      "sesion": [
        {
          "id": 12345,
          "juego": [
            {
              "nombre": "bonzo",
              "nivel": [
                {
                  "id": 1234,
                  "nombre": "caida libre"
                }
              ],
              "___léeme___": "El array 'iteraciones' contiene las vitorias o derrotas con el tiempo en segundos de cada iteración",
              "iteraciones": [
                {
                  "victoria": true,
                  "tiempo": 120
                },
                {
                  "victoria": false,
                  "tiempo": 232
                }
              ]
            }
          ],
          "segmento": [
            {
              "id": 12345,
              "nombre": "Hombro",
              "movimiento": [
                {
                  "id": 12,
                  "nombre": "flexion",
                  "metricas": [
                    {
                      "min": 12,
                      "max": 34,
                      "media": 23,
                      "moda": 20
                    }
                  ]
                }
              ]
            }
          ]
        }
      ]
    },
    {
      "id": 156,
      "nombre": "Bernardo",
      "sesion": [
        {
          "id": 456,
          "juego": [
            {
              "nombre": "Rita",
              "nivel": [
                {
                  "id": 1,
                  "nombre": "NAVEGANDO"
                }
              ],
              "___léeme___": "El array 'iteraciones' contiene las vitorias o derrotas con el tiempo en segundos de cada iteración",
              "iteraciones": [
                {
                  "victoria": true,
                  "tiempo": 120
                },
                {
                  "victoria": false,
                  "tiempo": 232
                }
              ]
            }
          ],
          "segmento": [
            {
              "id": 12345,
              "nombre": "Escapula",
              "movimiento": [
                {
                  "id": 12,
                  "nombre": "Protracción",
                  "metricas": [
                    {
                      "min": 12,
                      "max": 34,
                      "media": 23,
                      "moda": 20
                    }
                  ]
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}
In my script, I want to walk through its nested elements to get specific information:
import json

with open('myfile.json') as data_file:
    data = json.loads(data_file.read())

patient_id = data["paciente"][0]["id"]
patient_name = data["paciente"][0]["nombre"]
id_session = data["paciente"][0]["sesion"][0]["id"]
game_session = data["paciente"][0]["sesion"][0]["juego"][0]["nombre"]
level_game = data["paciente"][0]["sesion"][0]["juego"][0]["nivel"][0]["nombre"]
iterations = data["paciente"][0]["sesion"][0]["juego"][0]["iteraciones"]
iterations_victory = data["paciente"][0]["sesion"][0]["juego"][0]["iteraciones"][0]["victoria"]
iterations_time = data["paciente"][0]["sesion"][0]["juego"][0]["iteraciones"][0]["tiempo"]
iterations_victory1 = data["paciente"][0]["sesion"][0]["juego"][0]["iteraciones"][1]["victoria"]
iterations_time1 = data["paciente"][0]["sesion"][0]["juego"][0]["iteraciones"][1]["tiempo"]
segment = data["paciente"][0]["sesion"][0]["segmento"][0]["nombre"]
movement = data["paciente"][0]["sesion"][0]["segmento"][0]["movimiento"][0]["nombre"]
#metrics = data["paciente"][0]["sesion"][0]["segmento"][0]["movimiento"][0]["metricas"]
metric_min = data["paciente"][0]["sesion"][0]["segmento"][0]["movimiento"][0]["metricas"][0]["min"]
metric_max = data["paciente"][0]["sesion"][0]["segmento"][0]["movimiento"][0]["metricas"][0]["max"]
metric_average = data["paciente"][0]["sesion"][0]["segmento"][0]["movimiento"][0]["metricas"][0]["media"]
metric_moda = data["paciente"][0]["sesion"][0]["segmento"][0]["movimiento"][0]["metricas"][0]["moda"]

print(
    'Patient ID:', patient_id,'\n',
    'Patient Name:', patient_name, '\n',
    'Session:','\n',
    ' Id Session:',id_session,'\n',
    ' Game:', game_session, '\n',
    ' Level:', level_game, '\n',
    ' Iterations:', len(iterations),'\n',
    ' Victory:', iterations_victory, '\n',
    ' Time:', iterations_time, '\n',
    ' Victory:', iterations_victory1, '\n',
    ' Time:', iterations_time1, '\n',
    ' Affected Segment:', segment, '\n',
    ' Movement:', movement, '\n',
    ' Metrics:','\n',
    ' Minimum:', metric_min, '\n'
    ' Maximum:', metric_max, '\n'
    ' Average:', metric_average, '\n'
    ' Moda/Trend:', metric_moda, '\n'
)
This is my output:
Patient ID: 1234
Patient Name: Pablo
Session:
Id Session: 12345
Game: bonzo
Level: caida libre
Iterations: 2
Victory: True
Time: 120
Victory: False
Time: 232
Affected Segment: Hombro
Movement: flexion
Metrics:
Minimum: 12
Maximum: 34
Average: 23
Moda/Trend: 20
[Finished in 0.0s]
Is it possible to optimize this code? How can I make it more readable or shorter?
I am especially interested in the case where the lists/arrays (segments, movements, iterations, games, etc.) contain more than one element, if that ever happens.
Any guidance is welcome.
2 Solutions
#1
1
Note that you are omitting the second patient record in your data (Bernardo), and that you assume there are always exactly two iterations. This might not always be true.
In terms of speed, your code is close to the best you can get. For the reasons above, though, you would do well to add some checks and loops so that you cover all of the data, and no more.
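As a rough illustration of the loops mentioned above, here is a minimal sketch that visits every patient, session, game, iteration, segment, movement and metrics entry instead of hard-coding index 0 or 1. The `describe` function, the `SAMPLE` string (a trimmed copy of the question's data) and the `METRIC_LABELS` list are names made up for this example, and the indentation of the report lines is an arbitrary choice.

```python
import json

# Trimmed copy of the question's data (one patient); assumed here so the
# sketch runs on its own. In practice, load 'myfile.json' with json.load.
SAMPLE = """
{"paciente": [{"id": 1234, "nombre": "Pablo", "sesion": [{"id": 12345,
  "juego": [{"nombre": "bonzo",
             "nivel": [{"id": 1234, "nombre": "caida libre"}],
             "iteraciones": [{"victoria": true, "tiempo": 120},
                             {"victoria": false, "tiempo": 232}]}],
  "segmento": [{"id": 12345, "nombre": "Hombro",
                "movimiento": [{"id": 12, "nombre": "flexion",
                                "metricas": [{"min": 12, "max": 34,
                                              "media": 23, "moda": 20}]}]}]
}]}]}
"""

# (JSON key, printed label) pairs for one metrics object.
METRIC_LABELS = [("min", "Minimum"), ("max", "Maximum"),
                 ("media", "Average"), ("moda", "Moda/Trend")]


def describe(data):
    """Return report lines covering every element of every nested list."""
    lines = []
    for patient in data.get("paciente", []):
        lines.append("Patient ID: {}".format(patient["id"]))
        lines.append("Patient Name: {}".format(patient["nombre"]))
        for session in patient.get("sesion", []):
            lines.append("  Id Session: {}".format(session["id"]))
            for game in session.get("juego", []):
                lines.append("    Game: {}".format(game["nombre"]))
                for level in game.get("nivel", []):
                    lines.append("    Level: {}".format(level["nombre"]))
                iterations = game.get("iteraciones", [])
                lines.append("    Iterations: {}".format(len(iterations)))
                for iteration in iterations:
                    lines.append("      Victory: {}".format(iteration["victoria"]))
                    lines.append("      Time: {}".format(iteration["tiempo"]))
            for segment in session.get("segmento", []):
                lines.append("    Affected Segment: {}".format(segment["nombre"]))
                for movement in segment.get("movimiento", []):
                    lines.append("    Movement: {}".format(movement["nombre"]))
                    for metrics in movement.get("metricas", []):
                        for key, label in METRIC_LABELS:
                            lines.append("      {}: {}".format(label, metrics[key]))
    return lines


for line in describe(json.loads(SAMPLE)):
    print(line)
```

Using `.get(key, [])` for the list-valued keys means a missing or empty list is simply skipped, so the same code handles zero, one or many elements at each level.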
Here is a function you could use to print the data in your format, based on a template you pass it. The template lists all labels you want to use for the keys you want to print the values for. In order to avoid ambiguity, the template needs both the key and the parent key of the elements of interest.
As the function needs to visit the keys in order, OrderedDict is used instead of dict:
import json
from collections import OrderedDict

# Read the file from the question, keeping each object's keys in document order.
with open('myfile.json') as data_file:
    data = json.load(data_file, object_pairs_hook=OrderedDict)

def pretty(template, item, parentName='', name='', indent=0):
    # Look up the label for this parent/key pair; keys not in the template print nothing.
    label = template.get(parentName + '/' + name)
    if label:
        label = ' ' * indent + label + ': '
        if isinstance(item, list):
            # For a list, show how many elements it holds.
            label += str(len(item))
        elif not isinstance(item, OrderedDict):
            label += str(item)
        print(label)
    if isinstance(item, list):
        # Mark list elements with '[]' so they do not match the list's own label again.
        for value in item:
            pretty(template, value, parentName + '[]', name, indent)
    elif isinstance(item, OrderedDict):
        for key, value in item.items():
            pretty(template, value, name, key, indent+1)

template = {
    "paciente/id": "Patient ID",
    "paciente/nombre": "Patient Name",
    "paciente/sesion": "Sessions",
    "sesion/id": "Id Session",
    "juego/nombre": "Game",
    "nivel/nombre": "Level",
    "juego/iteraciones": "Iterations",
    "iteraciones/victoria": "Victory",
    "iteraciones/tiempo": "Time",
    "segmento/nombre": "Affected Segment",
    "movimiento/nombre": "Movement",
    "movimiento/metricas": "Metrics",
    "metricas/min": "Minimum",
    "metricas/max": "Maximum",
    "metricas/media": "Average",
    "metricas/moda": "Moda/Trend"
}

pretty(template, data)
The output is:
Patient ID: 1234
Patient Name: Pablo
Sessions: 1
Id Session: 12345
Game: bonzo
Level: caida libre
Iterations: 2
Victory: True
Time: 120
Victory: False
Time: 232
Affected Segment: Hombro
Movement: flexion
Metrics: 1
Minimum: 12
Maximum: 34
Average: 23
Moda/Trend: 20
Patient ID: 156
Patient Name: Bernardo
Sessions: 1
Id Session: 456
Game: Rita
Level: NAVEGANDO
Iterations: 2
Victory: True
Time: 120
Victory: False
Time: 232
Affected Segment: Escapula
Movement: Protracción
Metrics: 1
Minimum: 12
Maximum: 34
Average: 23
Moda/Trend: 20
#2
1
Depending on what else your program is doing, speeding this code up may or may not matter. You should use the profile or cProfile module to find out where your script spends its time, and work on those parts.
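For reference, a minimal sketch of profiling with cProfile from inside a script might look like the following. The `report` function and the `SAMPLE` string are stand-ins invented for this example; only the `cProfile` and `pstats` calls are the standard-library API.

```python
import cProfile
import io
import json
import pstats

# Tiny stand-in for the real input; assumed for this example only.
SAMPLE = '{"paciente": [{"id": 1234, "nombre": "Pablo"}]}'


def report(raw):
    """Stand-in for the real script: parse the JSON and pull out two fields."""
    data = json.loads(raw)
    patient = data["paciente"][0]
    return patient["id"], patient["nombre"]


# Record every function call made while report() runs.
profiler = cProfile.Profile()
profiler.enable()
result = report(SAMPLE)
profiler.disable()

# Format the five entries with the largest cumulative time into a string.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

A whole script can also be profiled without changing it by running `python -m cProfile -s cumulative myscript.py`, where `myscript.py` is a placeholder for your own file.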
Regardless, you could save some processing time by removing the redundant indexing operations, using temporary variables to hold the intermediate results. You can think of this simply as factoring out common prefixes. It's relatively easy if you've got a good code editor.
Although the resulting code may not be shorter or more readable, it will likely execute faster (although there is some overhead involved).
Here's what I'm describing:
import json

with open('myfile.json') as data_file:
    data = json.loads(data_file.read())

# Each temporary variable holds one shared prefix of the original index chains.
patient0_data = data["paciente"][0]
patient_id = patient0_data["id"]
patient_name = patient0_data["nombre"]
patient0_data_sesion0 = patient0_data["sesion"][0]
id_session = patient0_data_sesion0["id"]
patient0_data_sesion0_juego0 = patient0_data_sesion0["juego"][0]
game_session = patient0_data_sesion0_juego0["nombre"]
level_game = patient0_data_sesion0_juego0["nivel"][0]["nombre"]
patient0_data_sesion0_juego0_iteraciones = patient0_data_sesion0_juego0["iteraciones"]
iterations = patient0_data_sesion0_juego0_iteraciones
iterations_victory = patient0_data_sesion0_juego0_iteraciones[0]["victoria"]
iterations_time = patient0_data_sesion0_juego0_iteraciones[0]["tiempo"]
iterations_victory1 = patient0_data_sesion0_juego0_iteraciones[1]["victoria"]
iterations_time1 = patient0_data_sesion0_juego0_iteraciones[1]["tiempo"]
patient0_data_sesion0_segmento0 = patient0_data_sesion0["segmento"][0]
segment = patient0_data_sesion0_segmento0["nombre"]
patient0_data_sesion0_segmento0_movimiento0 = (
    patient0_data_sesion0_segmento0["movimiento"][0])
movement = patient0_data_sesion0_segmento0_movimiento0["nombre"]
#metrics = patient0_data_sesion0_segmento0_movimiento0["metricas"]
# Reuse the movement variable rather than indexing from the segment again.
patient0_data_sesion0_segmento0_movimiento0_metricas0 = (
    patient0_data_sesion0_segmento0_movimiento0["metricas"][0])
metric_min = patient0_data_sesion0_segmento0_movimiento0_metricas0["min"]
metric_max = patient0_data_sesion0_segmento0_movimiento0_metricas0["max"]
metric_average = patient0_data_sesion0_segmento0_movimiento0_metricas0["media"]
metric_moda = patient0_data_sesion0_segmento0_movimiento0_metricas0["moda"]

print(
    'Patient ID:', patient_id,'\n',
    'Patient Name:', patient_name, '\n',
    'Session:','\n',
    ' Id Session:',id_session,'\n',
    ' Game:', game_session, '\n',
    ' Level:', level_game, '\n',
    ' Iterations:', len(iterations),'\n',
    ' Victory:', iterations_victory, '\n',
    ' Time:', iterations_time, '\n',
    ' Victory:', iterations_victory1, '\n',
    ' Time:', iterations_time1, '\n',
    ' Affected Segment:', segment, '\n',
    ' Movement:', movement, '\n',
    ' Metrics:','\n',
    ' Minimum:', metric_min, '\n'
    ' Maximum:', metric_max, '\n'
    ' Average:', metric_average, '\n'
    ' Moda/Trend:', metric_moda, '\n'
)
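Whether the temporary variables actually pay off is easy to measure rather than guess. The sketch below times both styles with the standard `timeit` module on a cut-down dictionary; the function names `repeated_indexing` and `with_temporary` and the reduced `data` literal are assumptions made for this example, and the timings will vary by machine and Python version.

```python
import timeit

# Cut-down version of the question's structure, just deep enough to time.
data = {"paciente": [{"sesion": [{"segmento": [{"movimiento": [{"metricas": [
    {"min": 12, "max": 34, "media": 23, "moda": 20}]}]}]}]}]}


def repeated_indexing():
    """Walk the full chain from the root for every value."""
    return (
        data["paciente"][0]["sesion"][0]["segmento"][0]["movimiento"][0]["metricas"][0]["min"],
        data["paciente"][0]["sesion"][0]["segmento"][0]["movimiento"][0]["metricas"][0]["max"],
        data["paciente"][0]["sesion"][0]["segmento"][0]["movimiento"][0]["metricas"][0]["media"],
        data["paciente"][0]["sesion"][0]["segmento"][0]["movimiento"][0]["metricas"][0]["moda"],
    )


def with_temporary():
    """Walk the chain once, then read the four values from the saved object."""
    metrics = data["paciente"][0]["sesion"][0]["segmento"][0]["movimiento"][0]["metricas"][0]
    return (metrics["min"], metrics["max"], metrics["media"], metrics["moda"])


# Both styles must return the same values before comparing their speed.
assert repeated_indexing() == with_temporary()

t_repeated = timeit.timeit(repeated_indexing, number=20000)
t_temporary = timeit.timeit(with_temporary, number=20000)
print("repeated indexing: {:.4f}s".format(t_repeated))
print("with temporary:    {:.4f}s".format(t_temporary))
```

For a script that reads one small file and prints a handful of values, either figure is likely to be negligible next to the file I/O, which is why profiling first is worthwhile.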