How to loop over a pandas DataFrame?

Date: 2022-01-20 18:17:31

I have a Python function that works on a sequence of coordinates (trajectory data). It requires the data to be in the following format.

#items = [Item(x1, y1), Item(x2, y2), Item(x3, y3), Item(x4, y4)]
items = [Item(0.5, 0.5), Item(-0.5, 0.5), Item(-0.5, -0.5), Item(0.5, -0.5)]

It is also necessary to find xmin, ymin, xmax, ymax from the above items and pass them as the bounding box, as below.

 spindex = pyqtree.Index(bbox=[-1, -1, 1, 1])
                        #bbox = [xmin,ymin,xmax,ymax]
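For a single sequence, the bounding box could be computed from the items rather than typed by hand. The sketch below is only an illustration: the Item class is a hypothetical stand-in (the real one is not shown in the question) and overall_bbox is a helper name made up here. It returns the tight box, whereas the hand-written [-1, -1, 1, 1] above leaves some margin.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical stand-in for the Item class used in the question."""
    x: float
    y: float

    @property
    def bbox(self):
        # A single point has a degenerate box: (xmin, ymin, xmax, ymax)
        return (self.x, self.y, self.x, self.y)


def overall_bbox(items):
    """Return [xmin, ymin, xmax, ymax] covering every item (hypothetical helper)."""
    xs = [item.x for item in items]
    ys = [item.y for item in items]
    return [min(xs), min(ys), max(xs), max(ys)]


items = [Item(0.5, 0.5), Item(-0.5, 0.5), Item(-0.5, -0.5), Item(0.5, -0.5)]
bbox = overall_bbox(items)  # tight bounds of the four points
```

The resulting list could then be passed as the bbox argument when the index is created.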

Now, the items are inserted as below.

 #Inserting items
 for item in items:
     spindex.insert(item, item.bbox)

As we can see, all the above operations are currently performed on a single sequence of coordinates specified in items. I need to perform the above steps on a data frame with multiple trajectories, each consisting of a sequence of points and identified by an id, vid.

The sample df is as follows:

   vid       x         y
0  1         2         3
1  1         3         4
2  1         5         6
3  2         7         8 
4  2         9        10
5  3         11       12
6  3         13       14
7  3         15       16
8  3         17       18

In the above data frame, x and y are the coordinate data, and all the points belonging to the same "vid" form one separate trajectory. So rows 0-2, belonging to voyage id (vid) = 1, are one trajectory, the points belonging to vid = 2 are another trajectory, and so on.

The above data can be transformed as the following df too (only if required):

    vid        (x,y)
0   1          [ (2,3),(3,4), (5,6) ]
1   2          [ (7,8),(9,10) ]
2   3          [ (11,12),(13,14),(15,16),(17,18) ]
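One possible way to produce this reshaped frame is sketched below, assuming the sample df shown earlier. The names points and trajectories are chosen here for illustration only. Each row is turned into an (x, y) tuple, and the tuples are then collected into one list per vid.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "vid": [1, 1, 1, 2, 2, 3, 3, 3, 3],
    "x": [2, 3, 5, 7, 9, 11, 13, 15, 17],
    "y": [3, 4, 6, 8, 10, 12, 14, 16, 18],
})

# One (x, y) tuple per row, kept aligned with df's index
points = pd.Series(list(zip(df["x"].tolist(), df["y"].tolist())), index=df.index)

# Collect the tuples of each trajectory into a list, one row per vid
trajectories = points.groupby(df["vid"]).apply(list).rename("(x,y)").reset_index()
```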

I want a way to loop over the df, perhaps grouping by vid, to get all the coordinates of each trajectory as items, find xmin, ymin, xmax, ymax, and insert them as shown above for each trajectory in the df.

I have code something like this, but it doesn't work:

for group in df.groupby('vid'):
    bbox = [ group['x'].min(), group['y'].min(), group['x'].max(), group['y'].max() ]
    spindex.insert(group['vid'][0], bbox)

Please help.

1 answer

#1


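Iterating over df.groupby('vid') yields (group key, grouped DataFrame) tuples, so each element has to be unpacked before its columns can be accessed.
Modify your code as follows: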

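# Each iteration yields the vid and the sub-DataFrame of that trajectory
for vid, g_df in df.groupby('vid'):
    # bbox = [xmin, ymin, xmax, ymax] of this trajectory
    bbox = [ g_df['x'].min(), g_df['y'].min(), g_df['x'].max(), g_df['y'].max() ]
    spindex.insert(vid, bbox)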
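As a possible alternative to looping, the per-trajectory bounds could be computed in one pass with named aggregation. This is only a sketch: it assumes the sample df from the question and a pandas version that supports named aggregation (0.25 or later), and the names bounds and bboxes are made up here. The pyqtree insert is left out so the snippet runs without that library.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "vid": [1, 1, 1, 2, 2, 3, 3, 3, 3],
    "x": [2, 3, 5, 7, 9, 11, 13, 15, 17],
    "y": [3, 4, 6, 8, 10, 12, 14, 16, 18],
})

# One row per vid holding the four bounding-box values
bounds = df.groupby("vid").agg(
    xmin=("x", "min"), ymin=("y", "min"),
    xmax=("x", "max"), ymax=("y", "max"),
)

# Map each vid to [xmin, ymin, xmax, ymax] as plain Python numbers
bboxes = dict(zip(bounds.index.tolist(), bounds.to_numpy().tolist()))
```

Each entry of bboxes could then be inserted in the same way as in the loop above, for example spindex.insert(vid, bbox) for every vid, bbox pair.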