如何有效地计算移动标准差?

时间:2021-12-15 12:38:14

Below you can see my C# method to calculate Bollinger Bands for each point (moving average, up band, down band).

在下面,您可以看到我的c#方法,用于计算每个点(移动平均、上带、下带)的带。

As you can see this method uses 2 for loops to calculate the moving standard deviation using the moving average. It used to contain an additional loop to calculate the moving average over the last n periods. This one I could remove by adding the new point value to total_average at the beginning of the loop and removing the i - n point value at the end of the loop.

正如你所看到的,这个方法使用2个for循环使用移动平均来计算移动标准差。它过去包含一个额外的循环来计算过去n个周期的移动平均。我可以通过在循环开始时将新的点值添加到total_average并在循环结束时删除I - n点值来删除它。

My question now is basically: Can I remove the remaining inner loop in a similar way I managed with the moving average?

我现在的问题基本上是:我能以与移动平均线类似的方式移除剩余的内部循环吗?

    public static void AddBollingerBands(SortedList<DateTime, Dictionary<string, double>> data, int period, int factor)
    {
        double total_average = 0;

        for (int i = 0; i < data.Count(); i++)
        {
            total_average += data.Values[i]["close"];

            if (i >= period - 1)
            {
                double total_bollinger = 0;
                double average = total_average / period;

                for (int x = i; x > (i - period); x--)
                {
                    total_bollinger += Math.Pow(data.Values[x]["close"] - average, 2);
                }

                double stdev = Math.Sqrt(total_bollinger / period);

                data.Values[i]["bollinger_average"] = average;
                data.Values[i]["bollinger_top"] = average + factor * stdev;
                data.Values[i]["bollinger_bottom"] = average - factor * stdev;

                total_average -= data.Values[i - period + 1]["close"];
            }
        }
    }

4 个解决方案

#1


21  

The answer is yes, you can. In the mid-80's I developed just such an algorithm (probably not original) in FORTRAN for a process monitoring and control application. Unfortunately, that was over 25 years ago and I do not remember the exact formulas, but the technique was an extension of the one for moving averages, with second order calculations instead of just linear ones.

答案是肯定的,你可以。在80年代中期,我在FORTRAN上开发了一种用于过程监控应用程序的算法(可能不是原创的)。不幸的是,那是25年前的事了,我不记得确切的公式,但这个技术是移动平均线的延伸,用二阶计算而不是线性计算。


After looking at your code some, I am think that I can suss out how I did it back then. Notice how your inner loop is making a Sum of Squares?:

看了你的代码之后,我想我可以知道我当时是怎么做的。注意到你的内部循环是如何产生平方和的吗?

            for (int x = i; x > (i - period); x--)
            {
                total_bollinger += Math.Pow(data.Values[x]["close"] - average, 2);
            }

in much the same way that your average must have originally had a Sum of Values? The only two differences are the order (its power 2 instead of 1) and that you are subtracting the average each value before you square it. Now that might look inseparable, but in fact they can be separated:

在很大程度上,你的平均值最初一定有一个值的和?唯一的两个区别是顺序(它的幂是2而不是1),在平方之前你要减去每个值的平均值。这看起来是不可分割的,但实际上它们是可以分离的:

SUM(i=1; n){ (v[i] - k)^2 }

is

SUM(i=1..n){v[i]^2 -2*v[i]*k + k^2}

which becomes

这就变成了

SUM(i=1..n){v[i]^2 -2*v[i]*k} + k^2*n

which is

这是

SUM(i=1..n){v[i]^2} + SUM(i=1..n){-2*v[i]*k} + k^2*n

which is also

这也是

SUM(i=1..n){v[i]^2} + SUM(i=1..n){-2*v[i]}*k + k^2*n

Now the first term is just a Sum of Squares, you handle that in the same way that you do the sum of Values for the average. The last term (k^2*n) is just the average squared times the period. Since you divide the result by the period anyway, you can just add the new average squared without the extra loop.

第一项是平方的和,处理它的方法和平均值的值的和一样。最后一学期(k ^ 2 * n)只是平均平方乘以。既然你把结果除以周期,你只需要加上新的平均值的平方而不需要额外的循环。

Finally, in the second term (SUM(-2*v[i]) * k), since SUM(v[i]) = total = k*n you can then change it into this:

最后,在第二项(-2*v[i]) * k中,由于SUM(v[i]) = total = k*n,则可以将其改为:

-2 * k * k * n

or just -2*k^2*n, which is -2 times the average squared, once the period (n) is divided out again. So the final combined formula is:

还是2 * k ^ 2 * n,2倍的平均平方,一旦期(n)分裂出来。所以最终的组合公式是:

SUM(i=1..n){v[i]^2} - n*k^2

or

SUM(i=1..n){values[i]^2} - period*(average^2)

(be sure to check the validity of this, since I am deriving it off the top of my head)

(一定要检查它的正确性,因为我是从头顶上推导出来的)

And incorporating into your code should look something like this:

在代码中合并应该是这样的:

public static void AddBollingerBands(ref SortedList<DateTime, Dictionary<string, double>> data, int period, int factor)
{
    double total_average = 0;
    double total_squares = 0;

    for (int i = 0; i < data.Count(); i++)
    {
        total_average += data.Values[i]["close"];
        total_squares += Math.Pow(data.Values[i]["close"], 2);

        if (i >= period - 1)
        {
            double total_bollinger = 0;
            double average = total_average / period;

            double stdev = Math.Sqrt((total_squares - Math.Pow(total_average,2)/period) / period);
            data.Values[i]["bollinger_average"] = average;
            data.Values[i]["bollinger_top"] = average + factor * stdev;
            data.Values[i]["bollinger_bottom"] = average - factor * stdev;

            total_average -= data.Values[i - period + 1]["close"];
            total_squares -= Math.Pow(data.Values[i - period + 1]["close"], 2);
        }
    }
}

#2


26  

The problem with approaches that calculate the sum of squares is that it and the square of sums can get quite large, and the calculation of their difference may introduce a very large error, so let's think of something better. For why this is needed, see the Wikipedia article on Algorithms for computing variance and John Cook on Theoretical explanation for numerical results)

计算平方和的方法的问题是它和和的平方可以变得很大,计算它们的差异可能会引入一个很大的误差,所以让我们想个更好的方法。要了解为什么需要这样做,请参阅Wikipedia关于计算方差算法的文章,以及关于数值结果的理论解释的John Cook)

First, instead of calculating the stddev let's focus on the variance. Once we have the variance, stddev is just the square root of the variance.

首先,我们不计算stddev,而是关注方差。一旦有了方差,stddev就是方差的平方根。

Suppose the data are in an array called x; rolling an n-sized window by one can be thought of as removing the value of x[0] and adding the value of x[n]. Let's denote the averages of x[0]..x[n-1] and x[1]..x[n] by µ and µ’ respectively. The difference between the variances of x[0]..x[n-1] and x[1]..x[n] is, after canceling out some terms and applying (a²-b²) = (a+b)(a-b):

假设数据在一个名为x的数组中;将一个n大小的窗口逐个滚动可以被认为是删除x[0]的值并添加x的值[n]。我们来表示x[0]的平均值。x[n]和[1]. .分别由µ和µx[n]。x[0]的方差之差。x[n]和[1]. .x[n],之后取消了一些术语和应用(²- b²)=(a + b)(a - b):

Var[x[1],..,x[n]] - Var[x[0],..,x[n-1]] 
= (\sum_1^n x[i]² - n µ’²)/(n-1) - (\sum_0^{n-1} x[i]² - n µ²)/(n-1)
= (x[n]² - x[0]² - n(µ’² - µ²))/(n-1) 
= (x[n]-µ’ + x[0]-µ)(x[n]-x[0])/(n-1)

Therefore the variance is perturbed by something that doesn't require you to maintain the sum of squares, which is better for numerical accuracy.

因此方差会被一些不需要保持平方和的东西扰乱,这对数值精度更好。

You can calculate the mean and variance once in the beginning with a proper algorithm (Welford's method). After that, every time you have to replace a value in the window x[0] by another x[n] you update the average and variance like this:

您可以在开始时使用合适的算法(Welford方法)计算平均值和方差一次。之后,每次需要将窗口x[0]中的值替换为另一个x[n]时,就会更新平均值和方差,如下所示:

new_Avg = Avg + (x[n]-x[0])/n
new_Var = Var + (x[n]-new_Avg + x[0]-Avg)(x[n] - x[0])/(n-1)
new_StdDev = sqrt(new_Var)

#3


1  

I've used commons-math (and contributed to that library!) for something very similar to this. It's open-source, porting to C# should be easy as store-bought pie (have you tried making a pie from scratch!?). Check it out: http://commons.apache.org/math/api-3.1.1/index.html. They have a StandardDeviation class. Go to town!

我使用了公共数学(并为那个图书馆做出了贡献!)它是开源的,移植到c#应该很容易,就像商店里买的馅饼(你试过从头开始做馅饼吗?)检查一下:http://commons.apache.org/math/api-3.1.1/index.html。他们有一个标准偏差等级。去小镇!

#4


0  

Most important information has already been given above --- but maybe this is still of general interest.

上面已经给出了最重要的信息——但也许这仍然是大家感兴趣的。

A tiny Java library to calculate moving average and standard deviation is available here: https://github.com/tools4j/meanvar

这里有一个计算移动平均和标准偏差的小Java库:https://github.com/tools4j/meanvar

The implementation is based on a variant of Welford's method mentioned above. Methods to remove and replace values have been derived that can be used for moving value windows.

该实现基于上述Welford方法的一个变体。提取和替换值的方法可以用于移动值窗口。

#1


21  

The answer is yes, you can. In the mid-80's I developed just such an algorithm (probably not original) in FORTRAN for a process monitoring and control application. Unfortunately, that was over 25 years ago and I do not remember the exact formulas, but the technique was an extension of the one for moving averages, with second order calculations instead of just linear ones.

答案是肯定的,你可以。在80年代中期,我在FORTRAN上开发了一种用于过程监控应用程序的算法(可能不是原创的)。不幸的是,那是25年前的事了,我不记得确切的公式,但这个技术是移动平均线的延伸,用二阶计算而不是线性计算。


After looking at your code some, I am think that I can suss out how I did it back then. Notice how your inner loop is making a Sum of Squares?:

看了你的代码之后,我想我可以知道我当时是怎么做的。注意到你的内部循环是如何产生平方和的吗?

            for (int x = i; x > (i - period); x--)
            {
                total_bollinger += Math.Pow(data.Values[x]["close"] - average, 2);
            }

in much the same way that your average must have originally had a Sum of Values? The only two differences are the order (its power 2 instead of 1) and that you are subtracting the average each value before you square it. Now that might look inseparable, but in fact they can be separated:

在很大程度上,你的平均值最初一定有一个值的和?唯一的两个区别是顺序(它的幂是2而不是1),在平方之前你要减去每个值的平均值。这看起来是不可分割的,但实际上它们是可以分离的:

SUM(i=1; n){ (v[i] - k)^2 }

is

SUM(i=1..n){v[i]^2 -2*v[i]*k + k^2}

which becomes

这就变成了

SUM(i=1..n){v[i]^2 -2*v[i]*k} + k^2*n

which is

这是

SUM(i=1..n){v[i]^2} + SUM(i=1..n){-2*v[i]*k} + k^2*n

which is also

这也是

SUM(i=1..n){v[i]^2} + SUM(i=1..n){-2*v[i]}*k + k^2*n

Now the first term is just a Sum of Squares, you handle that in the same way that you do the sum of Values for the average. The last term (k^2*n) is just the average squared times the period. Since you divide the result by the period anyway, you can just add the new average squared without the extra loop.

第一项是平方的和,处理它的方法和平均值的值的和一样。最后一学期(k ^ 2 * n)只是平均平方乘以。既然你把结果除以周期,你只需要加上新的平均值的平方而不需要额外的循环。

Finally, in the second term (SUM(-2*v[i]) * k), since SUM(v[i]) = total = k*n you can then change it into this:

最后,在第二项(-2*v[i]) * k中,由于SUM(v[i]) = total = k*n,则可以将其改为:

-2 * k * k * n

or just -2*k^2*n, which is -2 times the average squared, once the period (n) is divided out again. So the final combined formula is:

还是2 * k ^ 2 * n,2倍的平均平方,一旦期(n)分裂出来。所以最终的组合公式是:

SUM(i=1..n){v[i]^2} - n*k^2

or

SUM(i=1..n){values[i]^2} - period*(average^2)

(be sure to check the validity of this, since I am deriving it off the top of my head)

(一定要检查它的正确性,因为我是从头顶上推导出来的)

And incorporating into your code should look something like this:

在代码中合并应该是这样的:

public static void AddBollingerBands(ref SortedList<DateTime, Dictionary<string, double>> data, int period, int factor)
{
    double total_average = 0;
    double total_squares = 0;

    for (int i = 0; i < data.Count(); i++)
    {
        total_average += data.Values[i]["close"];
        total_squares += Math.Pow(data.Values[i]["close"], 2);

        if (i >= period - 1)
        {
            double total_bollinger = 0;
            double average = total_average / period;

            double stdev = Math.Sqrt((total_squares - Math.Pow(total_average,2)/period) / period);
            data.Values[i]["bollinger_average"] = average;
            data.Values[i]["bollinger_top"] = average + factor * stdev;
            data.Values[i]["bollinger_bottom"] = average - factor * stdev;

            total_average -= data.Values[i - period + 1]["close"];
            total_squares -= Math.Pow(data.Values[i - period + 1]["close"], 2);
        }
    }
}

#2


26  

The problem with approaches that calculate the sum of squares is that it and the square of sums can get quite large, and the calculation of their difference may introduce a very large error, so let's think of something better. For why this is needed, see the Wikipedia article on Algorithms for computing variance and John Cook on Theoretical explanation for numerical results)

计算平方和的方法的问题是它和和的平方可以变得很大,计算它们的差异可能会引入一个很大的误差,所以让我们想个更好的方法。要了解为什么需要这样做,请参阅Wikipedia关于计算方差算法的文章,以及关于数值结果的理论解释的John Cook)

First, instead of calculating the stddev let's focus on the variance. Once we have the variance, stddev is just the square root of the variance.

首先,我们不计算stddev,而是关注方差。一旦有了方差,stddev就是方差的平方根。

Suppose the data are in an array called x; rolling an n-sized window by one can be thought of as removing the value of x[0] and adding the value of x[n]. Let's denote the averages of x[0]..x[n-1] and x[1]..x[n] by µ and µ’ respectively. The difference between the variances of x[0]..x[n-1] and x[1]..x[n] is, after canceling out some terms and applying (a²-b²) = (a+b)(a-b):

假设数据在一个名为x的数组中;将一个n大小的窗口逐个滚动可以被认为是删除x[0]的值并添加x的值[n]。我们来表示x[0]的平均值。x[n]和[1]. .分别由µ和µx[n]。x[0]的方差之差。x[n]和[1]. .x[n],之后取消了一些术语和应用(²- b²)=(a + b)(a - b):

Var[x[1],..,x[n]] - Var[x[0],..,x[n-1]] 
= (\sum_1^n x[i]² - n µ’²)/(n-1) - (\sum_0^{n-1} x[i]² - n µ²)/(n-1)
= (x[n]² - x[0]² - n(µ’² - µ²))/(n-1) 
= (x[n]-µ’ + x[0]-µ)(x[n]-x[0])/(n-1)

Therefore the variance is perturbed by something that doesn't require you to maintain the sum of squares, which is better for numerical accuracy.

因此方差会被一些不需要保持平方和的东西扰乱,这对数值精度更好。

You can calculate the mean and variance once in the beginning with a proper algorithm (Welford's method). After that, every time you have to replace a value in the window x[0] by another x[n] you update the average and variance like this:

您可以在开始时使用合适的算法(Welford方法)计算平均值和方差一次。之后,每次需要将窗口x[0]中的值替换为另一个x[n]时,就会更新平均值和方差,如下所示:

new_Avg = Avg + (x[n]-x[0])/n
new_Var = Var + (x[n]-new_Avg + x[0]-Avg)(x[n] - x[0])/(n-1)
new_StdDev = sqrt(new_Var)

#3


1  

I've used commons-math (and contributed to that library!) for something very similar to this. It's open-source, porting to C# should be easy as store-bought pie (have you tried making a pie from scratch!?). Check it out: http://commons.apache.org/math/api-3.1.1/index.html. They have a StandardDeviation class. Go to town!

我使用了公共数学(并为那个图书馆做出了贡献!)它是开源的,移植到c#应该很容易,就像商店里买的馅饼(你试过从头开始做馅饼吗?)检查一下:http://commons.apache.org/math/api-3.1.1/index.html。他们有一个标准偏差等级。去小镇!

#4


0  

Most important information has already been given above --- but maybe this is still of general interest.

上面已经给出了最重要的信息——但也许这仍然是大家感兴趣的。

A tiny Java library to calculate moving average and standard deviation is available here: https://github.com/tools4j/meanvar

这里有一个计算移动平均和标准偏差的小Java库:https://github.com/tools4j/meanvar

The implementation is based on a variant of Welford's method mentioned above. Methods to remove and replace values have been derived that can be used for moving value windows.

该实现基于上述Welford方法的一个变体。提取和替换值的方法可以用于移动值窗口。