I have a data frame in this format:
   vid  points
0    1  [[0,1], [0,2], [0,3]]
1    2  [[1,2], [1,4], [1,9], [1,7]]
2    3  [[2,1], [2,3], [2,8]]
3    4  [[3,2], [3,4], [3,5], [3,6]]
Each row is trajectory data, and I have to find the distance between trajectories with a function func_dist, like this:
x = df.iloc[0]["points"].tolist()
y = df.iloc[3]["points"].tolist()
func_dist(x, y)
I have a list l of indices for the trajectories of interest:
l = [0,1,3]
I must find the distance between all the possible pairs of trajectories; in the case above, this is 0-1, 0-3, and 1-3. I know how to generate a list of pairs using
pairsets = list(itertools.combinations(l, 2))
which returns
[(0,1), (0,3), (1,3)]
Since the list may have over 100 indices, I am trying to automate this process and store the distances calculated between each pair in a new_df data frame.
I tried the following code for distance computation:
for pair in pairsets:
    a, b = [m[0] for m in pairsets], [n[1] for n in pairsets]
    for i in a:
        x = df.iloc[i]["points"].tolist()
    for j in b:
        y = df.iloc[j]["points"].tolist()
    dist = func_dist(x, y)
But it calculates only the last pair, 1-3. How can I calculate all of the pairs and create a new data frame like this:
traj1  traj2  distance
0      1      some_val
0      3      some_val
1      3      some_val
1 solution
#1
This is simply a matter of handling your indices properly. For each pair, you grab the two indices, assign your data sets, and compute the distance.
dist_table = []
for pair in pairsets:
    i, j = pair
    x = df.iloc[i]["points"].tolist()
    y = df.iloc[j]["points"].tolist()
    dist = func_dist(x, y)
    dist_table.append([i, j, dist])
You can combine the first two lines of the loop:
for i, j in pairsets:
The resulting dist_table is a 2D list, which you should be able to convert to a new data frame with a simple pandas call.
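For what it's worth, here is a minimal end-to-end sketch of that conversion. Several things in it are assumptions made only so the example runs on its own: func_dist is a placeholder (Euclidean distance between trajectory centroids), not your real distance function, and the points column holds plain Python lists, so the .tolist() call is dropped. The column names are taken from the desired output in the question.

```python
import itertools
import math

import pandas as pd


def func_dist(x, y):
    """Placeholder only: Euclidean distance between trajectory centroids."""
    cx = [sum(coord) / len(x) for coord in zip(*x)]
    cy = [sum(coord) / len(y) for coord in zip(*y)]
    return math.dist(cx, cy)


# Sample frame mirroring the question; points are plain lists (an assumption).
df = pd.DataFrame({
    "vid": [1, 2, 3, 4],
    "points": [
        [[0, 1], [0, 2], [0, 3]],
        [[1, 2], [1, 4], [1, 9], [1, 7]],
        [[2, 1], [2, 3], [2, 8]],
        [[3, 2], [3, 4], [3, 5], [3, 6]],
    ],
})

l = [0, 1, 3]

# One [i, j, dist] row per unordered pair of indices.
dist_table = []
for i, j in itertools.combinations(l, 2):
    x = df.iloc[i]["points"]
    y = df.iloc[j]["points"]
    dist_table.append([i, j, func_dist(x, y)])

# The 2D list maps directly onto rows; only the column names need supplying.
new_df = pd.DataFrame(dist_table, columns=["traj1", "traj2", "distance"])
print(new_df)
```

If your points cells are NumPy arrays, as the .tolist() calls in the question suggest, keep those calls and swap in your own func_dist; the pd.DataFrame(...) line stays the same.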
Does that get you moving?
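As for why the original attempt only ever reports 1-3, it may help to look at what a and b hold. This small sketch assumes the two inner loops in the question run one after the other rather than nested (the indentation was lost in the post, so that is a guess): each loop finishes before func_dist is called, so only the last i and the last j survive.

```python
import itertools

pairsets = list(itertools.combinations([0, 1, 3], 2))

# Splitting the pairs into two separate lists throws away the pairing.
a = [m[0] for m in pairsets]  # first index of every pair
b = [n[1] for n in pairsets]  # second index of every pair
print(a, b)  # [0, 0, 1] [1, 3, 3]

# Each loop runs to completion on its own, so only the final values remain.
for i in a:
    pass
for j in b:
    pass
print(i, j)  # 1 3
```

Unpacking each pair directly, as in the loop above, keeps i and j together.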