Movement data analysis in R; flights and temporal subsampling

I want to analyse angles in the movement of animals. I have tracking data with 10 recordings per second. Each recording consists of the position (x,y) of the animal, the angle and distance relative to the previous recording, and the speed and acceleration. I want to analyse the speed an animal has while making a particular angle. However, since the temporal resolution of my data is so high, each turn consists of a number of minute angles.

I figured there are two possible ways to work around this problem, but I do not know how to do either of them in R. Help would be greatly appreciated.

The first: Reducing my temporal resolution by a certain factor. However, this brings the disadvantage of losing possibly important parts of the data. Despite this, how would I be able to automatically subsample for example every 3rd or 10th recording of my data set?
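Keeping every k-th recording can be sketched in a few lines of R. This is only a minimal sketch: the data frame `track`, its column names, and the helper `subsample_every` are made up for illustration and are not from the original data.

```r
# Hypothetical tracking data: 10 recordings per second for 3 seconds.
track <- data.frame(
  time = seq(0, by = 0.1, length.out = 30),
  x    = cumsum(rep(0.5, 30)),
  y    = sin(seq(0, by = 0.1, length.out = 30))
)

# Keep every k-th recording, starting from the first one.
# (Illustrative helper, not part of any package.)
subsample_every <- function(df, k) {
  idx <- seq_len(nrow(df))
  df[(idx - 1) %% k == 0, , drop = FALSE]
}

every_3rd  <- subsample_every(track, 3)   # rows 1, 4, 7, ...
every_10th <- subsample_every(track, 10)  # rows 1, 11, 21
```

Note that any column defined relative to the previous recording (angle, distance, and anything derived from them) still refers to the dropped points after subsampling, so it would need to be recomputed from the remaining x,y positions.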

The second: By converting straight movement into so called 'flights'; rule based aggregation of steps in approximately the same direction, separated by acute turns (see the figure). A flight between two points ends when the perpendicular distance from the main direction of that flight is larger than x, a value that can be arbitrarily set. Does anyone have any idea how to do that with the xy coordinate positional data that I have?

1 solution

#1

It sounds like there are three potential things you might want help with: the algorithm, the math, or R syntax.

The algorithm you need may depend on the specifics of your data. For example, how much data do you have? What format is it in? Is it in 2D or 3D? One possibility is to iterate through your data set. With each new point, you need to check all the previous points in the current flight to see if they fall within your desired column. If the data set is large, however, this might be really slow. In the worst case, all the data points are in a single flight segment, meaning you would check the first point as many times as you have data points, the second point one time fewer, and so on. That means n + (n-1) + (n-2) + ... + 1 = n(n+1)/2 operations. That's O(n^2): the running time could grow quadratically with the size of your data set. Hence, you may need something more sophisticated.

The math to check whether a point is within your desired column of x is pretty straightforward, although maybe more sophisticated math could help inform a better algorithm. One approach would be to use vector arithmetic. To take an example, suppose you have points A, B, and C. Your goal is to see if B falls in a column of width x around the vector from A to C. To do this, find the unit vector v orthogonal to the vector from A to C, then check whether the magnitude of the scalar projection of the vector from A to B onto v is less than x. There is plenty of literature available to help with this sort of thing; here is one example.
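As a rough illustration of that check, here is a small R sketch. The function names `perp_distance` and `within_column` are invented for this example, and it treats x as the maximum allowed perpendicular distance from the line through A and C, which is an assumption about how the column is defined.

```r
# Perpendicular distance of point p_b from the line through p_a and p_c.
# Each point is a numeric vector of length 2: c(x, y).
perp_distance <- function(p_a, p_b, p_c) {
  d   <- p_c - p_a                 # vector from A to C
  len <- sqrt(sum(d^2))
  if (len == 0) {
    # A and C coincide, so there is no direction; use the distance from A.
    return(sqrt(sum((p_b - p_a)^2)))
  }
  v <- c(-d[2], d[1]) / len        # unit vector orthogonal to A -> C
  abs(sum((p_b - p_a) * v))        # magnitude of the scalar projection onto v
}

# TRUE if p_b lies closer than x to the line from p_a to p_c.
within_column <- function(p_a, p_b, p_c, x) {
  perp_distance(p_a, p_b, p_c) < x
}

perp_distance(c(0, 0), c(1, 1), c(2, 0))        # 1
within_column(c(0, 0), c(1, 0.5), c(2, 0), 1)   # TRUE
```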

I think this is where I might start (with a boolean function for an individual point), since it seems like an R function to determine this would be convenient. Then another function that takes a set of points and calculates the vector v and calls the first function for each point in the set. Then run some data and see how long it takes.
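One way those two functions could fit together is sketched below. It is a forward, greedy version of the idea and only one possible reading of the flight rule: `find_flights`, `perp_dist`, and `max_dev` are hypothetical names, and `max_dev` plays the role of x as a maximum perpendicular distance.

```r
# Perpendicular distances of the rows of pts (an n x 2 matrix)
# from the line through points a and b.
perp_dist <- function(pts, a, b) {
  d   <- b - a
  len <- sqrt(sum(d^2))
  rel <- sweep(pts, 2, a)                       # each row minus a
  if (len == 0) return(sqrt(rowSums(rel^2)))    # a and b coincide
  abs(rel[, 1] * d[2] - rel[, 2] * d[1]) / len
}

# Returns the indices of the points where flights start and end.
find_flights <- function(x, y, max_dev) {
  xy <- cbind(x, y)
  n  <- nrow(xy)
  stopifnot(n >= 2)
  breaks <- 1
  start  <- 1
  end    <- 2
  while (end <= n) {
    pts <- xy[start:end, , drop = FALSE]
    if (any(perp_dist(pts, xy[start, ], xy[end, ]) > max_dev)) {
      # Extending the flight to `end` pushes a point outside the column,
      # so the flight ends at the previous point and a new one starts there.
      breaks <- c(breaks, end - 1)
      start  <- end - 1
    } else {
      end <- end + 1
    }
  }
  c(breaks, n)
}

# An L-shaped path: east for three steps, then north for three steps.
x <- c(0, 1, 2, 3, 3, 3, 3)
y <- c(0, 0, 0, 0, 1, 2, 3)
flights <- find_flights(x, y, max_dev = 0.5)
print(flights)   # 1 4 7
```

Each new point triggers a re-check of every point in the current flight, so a single long flight gives the quadratic worst case described earlier. Timing it on one animal's track would show whether that matters in practice.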

I'm afraid I won't be of much help with R syntax, although it is on my list of things I'd like to learn. I checked out the manual for R last night and it had plenty of useful examples. I believe this is very doable, even for an R novice like myself. It might be kind of slow if you have a big data set. However, with something that works, it might also be easier to acquire help from people with more knowledge and experience to optimize it.

Two quick clarifying points in case they are helpful:

The above suggestion is just to start with the data for a single animal, so when I talk about growth of data I'm talking about the average data sample size for a single animal. If that is slow, you'll probably need to fix that first. Afterwards, you may need to analyze and optimize an algorithm for processing multiple animals.

I'm implicitly assuming that the definition of a flight segment is the largest subset of contiguous data points where no "sub" flight segment violates the column rule. That is to say, I think I could come up with an example where a set of points satisfies your rule of falling within a column of width x around the vector to the last point, but if you looked at the column of width x around the vector to the second-to-last point, one point wouldn't meet the criteria anymore. Depending on how you define the flight segment (e.g. if you want it to be the largest possible set of points that meet your condition and don't care about what happens inside), you may need something different (e.g. work backwards instead of forwards).
