What does the following warning:
Warning: overflow encountered in exp
in scipy/numpy using Python generally mean? I'm computing a ratio in log form, i.e. log(a) + log(b), then taking the exponent of the result using exp, and normalizing by a sum computed with logsumexp, as follows:
c = log(a) + log(b)
c = c - logsumexp(c)
Some values in the array b are intentionally set to 0. Their log will be -Inf.
What could be the cause of this warning? Thanks.
5 answers
#1
22
In your case, it means that b is very small somewhere in your array, and you're getting a number (a/b or exp(log(a) - log(b))) that is too large for the dtype (float32, float64, etc.) of the array you're using to store the output.
Numpy can be configured to
- Ignore these sorts of errors ('ignore'),
- Print the error directly to stdout ('print'),
- Log the error ('log'),
- Issue a RuntimeWarning without stopping execution ('warn', the default for overflow),
- Raise an error, a FloatingPointError ('raise'),
- Call a user-defined function ('call').
See numpy.seterr to control how it handles under/overflows, etc. in floating-point arrays.
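As a rough illustration of these settings, the sketch below uses numpy.errstate, the context-manager form of numpy.seterr, so the change only applies inside each with block. The array big and the callback record are made-up names for this example, not anything from the question.

```python
import numpy as np

# Hypothetical input: exp(1000) is far beyond the float64 range.
big = np.array([1000.0])

# 'ignore': compute silently; the overflowed entry simply becomes inf.
with np.errstate(over="ignore"):
    silent = np.exp(big)

# 'raise': turn the overflow into a FloatingPointError.
try:
    with np.errstate(over="raise"):
        np.exp(big)
    raised = False
except FloatingPointError:
    raised = True

# 'call': hand the event to a user-defined function.
events = []

def record(kind, flag):
    # kind is a short description of the floating-point event.
    events.append(kind)

with np.errstate(over="call", call=record):
    np.exp(big)
```

Silencing the warning does not change the result: the value is still inf, so this is mostly useful once you have decided that inf is acceptable at that point in the computation.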
#2
8
When you deal with exponentials, you quickly run into underflow or overflow because the function grows so fast. A typical case is statistics, where summing exponentials of widely varying magnitude is quite common. Since the numbers are very big or very small, one generally takes the log to stay in a "reasonable" range, the so-called log domain:
exp(-a) + exp(-b) -> log(exp(-a) + exp(-b))
Problems still arise because exp(-a) can still underflow. For example, exp(-1000) is already below the smallest number you can represent as a double. So for example:
log(exp(-1000) + exp(-1000))
gives -inf (log(0 + 0)), even though by hand you would expect something close to -1000 (namely -1000 + log(2)). The function logsumexp does better by extracting the max m of the set of numbers and taking it out of the log:
log(exp(a) + exp(b)) = m + log(exp(a-m) + exp(b-m))
It does not avoid underflow entirely (for example, if a and b are vastly different), but it avoids most precision issues in the final result.
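The max-shifting identity above can be sketched in a few lines of numpy. This is only a minimal illustration, not the library's actual implementation; logsumexp_sketch is a made-up name, and in real code scipy.special.logsumexp is the function to reach for.

```python
import numpy as np

def logsumexp_sketch(x):
    """log(sum(exp(x))) computed by factoring out the maximum m."""
    x = np.asarray(x, dtype=float)
    m = np.max(x)
    if not np.isfinite(m):
        # m is -inf, +inf, or nan; subtracting it would only produce nan,
        # so return it unchanged.
        return m
    return m + np.log(np.sum(np.exp(x - m)))

x = np.array([-1000.0, -1000.0])

# Naive version: both exponentials underflow to 0, so the log is -inf.
with np.errstate(divide="ignore"):
    naive = np.log(np.sum(np.exp(x)))

# Shifted version: the largest term becomes exp(0) = 1, so nothing is lost.
stable = logsumexp_sketch(x)
```

Entries equal to -inf, such as the log of values deliberately set to 0, are harmless here as long as at least one entry is finite: they contribute exp(-inf) = 0 to the sum.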
#3
3
I think you can solve this problem by normalizing your data.
That is how I overcame it. Before normalizing, the accuracy of my classifier was 86%. After normalizing, it was 96%. It's great!
These are common methods to implement normalization:
- first: Min-Max scaling
- second: Z-score standardization
I use the first method, with one change: I rescale so that the maximum of the result is 10. Then exp(-10) stays well within floating-point range.
I hope my answer will help you! (^_^)
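For reference, the two normalization methods named above might look like the following in numpy. This is a sketch under assumptions: the helper names, the sample data, and the choice of 10 as the upper bound are illustrative guesses at what this answer describes, not its actual code.

```python
import numpy as np

def min_max_scale(x, upper=1.0):
    """Rescale x linearly onto [0, upper]."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        # Constant input: every value maps to 0.
        return np.zeros_like(x)
    return (x - x.min()) / span * upper

def z_score(x):
    """Shift to zero mean and scale to unit standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

# Hypothetical feature values, large enough that exp() overflows float64.
data = np.array([0.0, 250.0, 500.0, 1000.0])

scaled = min_max_scale(data, upper=10.0)
standardized = z_score(data)

with np.errstate(over="ignore"):
    before = np.exp(data)   # the last entry overflows to inf
after = np.exp(scaled)      # every entry is finite
```

Note that rescaling changes the quantity being computed, so it suits cases like feature preprocessing for a classifier rather than cases where the exact value of the exponential matters.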
#4
2
Isn't exp(log(a) - log(b)) the same as exp(log(a/b)), which is the same as a/b?
>>> from math import exp, log
>>> exp(log(100) - log(10))
10.000000000000002
>>> exp(log(1000) - log(10))
99.999999999999957
2010-12-07: If it is true that "some values in the array b are intentionally set to 0", then you are essentially dividing by 0. That sounds like a problem.
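To make the division-by-zero point concrete, here is a small numpy sketch with made-up arrays a and b, where one entry of b is 0. It follows the a/b form discussed in this answer; the warnings are suppressed only to keep the demonstration quiet.

```python
import numpy as np

# Hypothetical inputs: the second entry of b is exactly 0.
a = np.array([2.0, 3.0])
b = np.array([4.0, 0.0])

with np.errstate(divide="ignore", over="ignore"):
    direct = a / b                            # 3.0 / 0.0 gives inf
    via_logs = np.exp(np.log(a) - np.log(b))  # log(0) is -inf, so the exponent is +inf
```

Both routes agree: the first entry is 0.5 (up to rounding) and the second is inf, so working in the log domain does not by itself remove the division by zero.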
#5
0
In my case, it was due to large values in the data. I had to normalize (divide by 255, because my data was image data) to scale the values down.
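A sketch of how this can happen with image data, assuming the pixels are stored as float32 (the dtype and the sample values are assumptions, not details from this answer): float32 overflows once the argument of exp exceeds roughly 88, so raw pixel values up to 255 are already too large.

```python
import numpy as np

# Hypothetical pixel intensities in the usual 0..255 range.
pixels = np.array([0, 128, 255], dtype=np.float32)

# Unscaled: exp(255) does not fit in float32, so the result is inf.
with np.errstate(over="ignore"):
    raw = np.exp(pixels)

# Scaled to [0, 1]: exp stays between 1 and e.
scaled = np.exp(pixels / 255)
```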