I want to replace outliers in a list. To do that, I define an upper and a lower bound: every value above upper_bound is replaced with upper_bound, and every value below lower_bound is replaced with lower_bound. My approach was to do this in two steps using a numpy array.
Now I wonder if it's possible to do this in one step, as I guess it could improve performance and readability.
Is there a shorter way to do this?
import numpy as np
lowerBound, upperBound = 3, 7
arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
arr[arr > upperBound] = upperBound
arr[arr < lowerBound] = lowerBound
print(arr)  # [3 3 3 3 4 5 6 7 7 7]
2 Answers
#1
31
You can use numpy.clip:
In [1]: import numpy as np
In [2]: arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [3]: lowerBound, upperBound = 3, 7
In [4]: np.clip(arr, lowerBound, upperBound, out=arr)
Out[4]: array([3, 3, 3, 3, 4, 5, 6, 7, 7, 7])
In [5]: arr
Out[5]: array([3, 3, 3, 3, 4, 5, 6, 7, 7, 7])
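The session above clips in place through out=arr. As a rough sketch of two variations (the snake_case names and the `capped` variable are my own, not from the answer): leaving out `out` returns a new array and keeps the original untouched, and passing None for one bound clips on one side only.

```python
import numpy as np

arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
lower_bound, upper_bound = 3, 7  # same bounds as in the question

# Without out=, np.clip returns a new array and leaves arr unchanged.
clipped = np.clip(arr, lower_bound, upper_bound)

# Passing None for the lower bound applies only the upper bound.
capped = np.clip(arr, None, upper_bound)

print(clipped)
print(capped)
print(arr)
```

Whether to clip in place or into a new array mostly depends on whether the original values are still needed afterwards.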
#2
13
For an alternative that doesn't rely on numpy, you could always do
arr = [max(lower_bound, min(x, upper_bound)) for x in arr]
If you just wanted to set an upper bound, you could of course write arr = [min(x, upper_bound) for x in arr]. Or similarly, if you just wanted a lower bound, you'd use max instead.
Here, I've just applied both operations, written together.
Edit: Here's a slightly more in-depth explanation:
Given an element x of the array (and assuming that your upper_bound is at least as big as your lower_bound!), you'll have one of three cases:
i) x < lower_bound
ii) x > upper_bound
iii) lower_bound <= x <= upper_bound.
In case (i), the inner min(x, upper_bound) returns x, so the max/min expression first evaluates to max(lower_bound, x), which then resolves to lower_bound.
In case (ii), the inner min(x, upper_bound) returns upper_bound, so the expression first becomes max(lower_bound, upper_bound), which then becomes upper_bound.
In case (iii), the inner min(x, upper_bound) returns x, so we get max(lower_bound, x), which resolves to just x.
In all three cases, the output is what we want.
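To make the three cases concrete, here is a small sketch that wraps the expression in a helper and tries one value per case. The function name `clamp` and the sample values are my own choice for illustration, and the bounds are the ones from the question.

```python
def clamp(x, lower_bound, upper_bound):
    """Limit x to [lower_bound, upper_bound]; assumes lower_bound <= upper_bound."""
    return max(lower_bound, min(x, upper_bound))


lower_bound, upper_bound = 3, 7

below = clamp(1, lower_bound, upper_bound)   # case (i): x < lower_bound
above = clamp(9, lower_bound, upper_bound)   # case (ii): x > upper_bound
inside = clamp(5, lower_bound, upper_bound)  # case (iii): x already in range

# Applying the helper to the whole list from the question.
arr = [clamp(x, lower_bound, upper_bound) for x in range(10)]

print(below, above, inside)
print(arr)
```

If lower_bound were larger than upper_bound, this expression would return lower_bound for every input, which is why the assumption above matters.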