I have two numpy masked arrays:
>>> x
masked_array(data = [1 2 -- 4],
mask = [False False True False],
fill_value = 999999)
>>> y
masked_array(data = [4 -- 0 4],
mask = [False True False False],
fill_value = 999999)
If I try to divide x by y, the division is not actually performed wherever one of the operands is masked, so I don't get a divide-by-zero error.
>>> x/y
masked_array(data = [0.25 -- -- 1.0],
mask = [False True True False],
fill_value = 1e+20)
This even works if I define my own division function, div:
>>> def div(a,b):
...     return a/b
...
>>> div(x, y)
masked_array(data = [0.25 -- -- 1.0],
mask = [False True True False],
fill_value = 1e+20)
However, if I wrap my function with vectorize, the function is called on masked values and I get an error:
>>> np.vectorize(div)(x, y)
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "/usr/lib64/python3.4/site-packages/numpy/lib/function_base.py", line 1811, in __call__
return self._vectorize_call(func=func, args=vargs)
File "/usr/lib64/python3.4/site-packages/numpy/lib/function_base.py", line 1880, in _vectorize_call
outputs = ufunc(*inputs)
File "<input>", line 2, in div
ZeroDivisionError: division by zero
Is there a way I can call a function with array arguments, and have the function only be executed when all of the arguments are unmasked?
1 Answer
The problem
Calling the function directly worked because, when you call div(x,y), div's arguments a and b become the MaskedArrays x and y, so a/b is evaluated as x.__div__(y) (or x.__truediv__(y) on Python 3).
Now, since x is a MaskedArray, it knows how to divide itself by another MaskedArray while following the masking rules.
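As a small illustration of those rules (a sketch with arrays similar to the question's; the data stored under the masked slots is a placeholder chosen here), masked division masks every element that is masked in either operand, and also every element whose divisor is zero:

```python
import numpy as np

x = np.ma.masked_array([1, 2, 0, 4], mask=[False, False, True, False])
y = np.ma.masked_array([4, 0, 0, 4], mask=[False, True, False, False])

# MaskedArray division raises nothing: the result mask combines both
# input masks with the positions where the divisor is zero.
q = x / y
print(q.mask.tolist())

# Even with no masked inputs, a zero divisor ends up masked in the result.
r = np.ma.masked_array([1, 2]) / np.ma.masked_array([0, 1])
print(np.ma.getmaskarray(r).tolist())
```

Note that the third element of y is an unmasked zero, which is exactly the element that later trips up the vectorized version.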
However, when you vectorize it, your div function is not going to see any MaskedArrays, just scalars: a couple of ints in this case. So, when it evaluates a/b for the third item, it divides 'something' by zero, and you get the error.
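One way to check this yourself is to record what the wrapped function actually receives. This is only a sketch: spy_div and seen are names made up for the illustration, and the divisor data is chosen to be non-zero everywhere so that nothing raises.

```python
import numpy as np

x = np.ma.masked_array([1, 2, 3, 4], mask=[False, False, True, False])
y = np.ma.masked_array([4, 5, 6, 4], mask=[False, True, False, False])

seen = []  # (type of a, type of b) for every call made by vectorize

def spy_div(a, b):
    seen.append((type(a), type(b)))
    return a / b

# otypes is given so that vectorize does not make an extra trial call
# just to work out the output type.
np.vectorize(spy_div, otypes=[float])(x, y)

# Number of calls, and whether any argument was a MaskedArray.
print(len(seen))
print(any(issubclass(t, np.ma.MaskedArray) for pair in seen for t in pair))
```

If the function is called once per element, masked or not, and never with a MaskedArray, that matches the traceback in the question: the mask is simply not consulted before the call.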
MaskedArray's implementation seems to be based on re-implementing much of NumPy specifically for MaskedArrays. See, for example, that you have both numpy.log and numpy.ma.log. Compare running both of them on a MaskedArray that contains negative values. Both actually return a proper MaskedArray, but the plain numpy version also emits warnings about dividing by zero and invalid values:
In [116]: x = masked_array(data = [-1, 2, 0, 4],
...: mask = [False, False, True, False],
...: fill_value = 999999)
In [117]: numpy.log(x)
/usr/bin/ipython:1: RuntimeWarning: divide by zero encountered in log
#!/usr/bin/python3
/usr/bin/ipython:1: RuntimeWarning: invalid value encountered in log
#!/usr/bin/python3
Out[117]:
masked_array(data = [-- 0.6931471805599453 -- 1.3862943611198906],
mask = [ True False True False],
fill_value = 999999)
In [118]: numpy.ma.log(x)
Out[118]:
masked_array(data = [-- 0.6931471805599453 -- 1.3862943611198906],
mask = [ True False True False],
fill_value = 999999)
If you run numpy.log on a plain list, it returns nan and -inf for invalid values instead of raising an error like the ZeroDivisionError you're getting.
In [138]: a = [1,-1,0]
In [139]: numpy.log(a)
/usr/bin/ipython:1: RuntimeWarning: divide by zero encountered in log
#!/usr/bin/python3
/usr/bin/ipython:1: RuntimeWarning: invalid value encountered in log
#!/usr/bin/python3
Out[139]: array([ 0., nan, -inf])
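The same comparison can be written as a short script. This is a sketch on a plain float array; numpy.errstate is used here only to silence the RuntimeWarnings shown above.

```python
import numpy as np

a = np.array([1.0, -1.0, 0.0])

# Plain ufunc: out-of-domain inputs give nan / -inf (warnings silenced here).
with np.errstate(divide="ignore", invalid="ignore"):
    plain = np.log(a)

# Masked version: elements outside log's domain are masked instead.
masked = np.ma.log(a)

print(plain)
print(np.ma.getmaskarray(masked).tolist())
```

Neither version raises; they differ only in how the invalid elements are reported.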
Simpler solution
With that, I see two alternatives. First, for the simpler case you listed, you could replace the bad values with a value that makes the operation a no-op: 1 in div's case (note that the data is slightly different from yours, as there is a zero you didn't mark as masked):
x = masked_array(data = [1, 2, 0, 4],
mask = [False, False, True, False],
fill_value = 999999)
y = masked_array(data = [4, 0, 0, 4],
mask = [False, True, True, False],
fill_value = 999999)
In [153]: numpy.vectorize(div)(x,y.filled(1))
Out[153]:
masked_array(data = [0.25 2.0 -- 1.0],
mask = [False False True False],
fill_value = 999999)
The problem with that approach is that the filled values show up as non-masked in the result, which is probably not what you want.
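One possible way around that, sketched here as an assumption rather than something from the original session, is to fill both arguments with the placeholder, run the vectorized function on plain arrays, and then put the combined mask back on the result:

```python
import numpy as np

def div(a, b):
    return a / b

x = np.ma.masked_array([1, 2, 0, 4], mask=[False, False, True, False])
y = np.ma.masked_array([4, 0, 0, 4], mask=[False, True, True, False])

# Elements masked in either argument should stay masked in the result.
combined = np.ma.getmaskarray(x) | np.ma.getmaskarray(y)

# Run the vectorized function on plain ndarrays with harmless placeholders...
raw = np.vectorize(div, otypes=[float])(x.filled(1), y.filled(1))

# ...then put the mask back on top of the placeholder results.
result = np.ma.masked_array(raw, mask=combined)
print(result)
```

The function is still called on the placeholder values, so this only helps when a safe placeholder exists for every argument.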
Better solution
Now, div was probably just an example, and you probably want more complex behavior for which there is no 'no-op' argument. In this case, you can do as NumPy did for log: avoid throwing an exception and return a specific value instead, here numpy.ma.masked. div's implementation becomes this:
In [154]: def div(a,b):
     ...:     try:
     ...:         return a/b
     ...:     except Exception as e:
     ...:         warnings.warn(str(e))
     ...:         return numpy.ma.masked
     ...:
In [155]: numpy.vectorize(div)(x,y)
/usr/bin/ipython:5: UserWarning: division by zero
start_ipython()
/usr/lib/python3.6/site-packages/numpy/lib/function_base.py:2813: UserWarning: Warning: converting a masked element to nan.
res = array(outputs, copy=False, subok=True, dtype=otypes[0])
Out[155]:
masked_array(data = [0.25 -- -- 1.0],
mask = [False True True False],
fill_value = 999999)
More generic solution
But perhaps you already have the function and do not want to change it, or it is third-party. In that case, you could use a higher-order function:
In [164]: def div(a,b):
     ...:     return a/b
     ...:
In [165]: def masked_instead_of_error(f):
     ...:     def wrapper(*args, **kwargs):
     ...:         try:
     ...:             return f(*args, **kwargs)
     ...:         except Exception:
     ...:             return numpy.ma.masked
     ...:     return wrapper
     ...:
In [166]: numpy.vectorize(masked_instead_of_error(div))(x,y)
/usr/lib/python3.6/site-packages/numpy/lib/function_base.py:2813: UserWarning: Warning: converting a masked element to nan.
res = array(outputs, copy=False, subok=True, dtype=otypes[0])
Out[166]:
masked_array(data = [0.25 -- -- 1.0],
mask = [False True True False],
fill_value = 999999)
In the implementations above, using warnings might or might not be a good idea. You may also want to restrict the types of exceptions you catch before returning numpy.ma.masked.
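A minimal sketch of such a restriction, assuming you know in advance which exceptions should be treated as "no result" (masked_on and safe_div are hypothetical names, not part of NumPy):

```python
import numpy as np
from functools import wraps

def masked_on(*exc_types):
    """Return a decorator that maps the given exceptions to numpy.ma.masked."""
    def decorator(f):
        @wraps(f)
        def wrapper(*args, **kwargs):
            try:
                return f(*args, **kwargs)
            except exc_types:
                # Only the listed exception types are swallowed.
                return np.ma.masked
        return wrapper
    return decorator

@masked_on(ZeroDivisionError)
def safe_div(a, b):
    return a / b

print(safe_div(1, 4))                   # ordinary result
print(safe_div(1, 0) is np.ma.masked)   # division by zero is swallowed
```

Any other exception, for example a TypeError from passing a string, still propagates, which makes real bugs easier to spot than with a blanket except clause.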
Note also that masked_instead_of_error is ready to be used as a decorator on your own functions, so you do not need to wrap them explicitly at every call site.
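If you would rather take the question literally and never call the function on masked elements at all, an explicit loop over the unmasked positions is another option. The helper below is a sketch under stated assumptions: apply_where_unmasked is a made-up name, all arguments must have the same shape and at least one dimension, and the function is assumed to return something convertible to the given dtype.

```python
import numpy as np

def apply_where_unmasked(func, *arrays, dtype=float):
    """Call func element-wise, skipping positions masked in any argument."""
    arrays = [np.ma.asarray(a) for a in arrays]
    shape = arrays[0].shape
    if any(a.shape != shape for a in arrays):
        raise ValueError("all arguments must have the same shape")

    # A position is skipped if it is masked in at least one argument.
    skip = np.zeros(shape, dtype=bool)
    for a in arrays:
        skip |= np.ma.getmaskarray(a)

    out = np.zeros(shape, dtype=dtype)
    for idx in zip(*np.nonzero(~skip)):
        # Only unmasked scalars ever reach func.
        out[idx] = func(*(a.data[idx] for a in arrays))
    return np.ma.masked_array(out, mask=skip)

def div(a, b):
    return a / b

x = np.ma.masked_array([1, 2, 0, 4], mask=[False, False, True, False])
y = np.ma.masked_array([4, 0, 0, 4], mask=[False, True, False, False])
result = apply_where_unmasked(div, x, y)
print(result)
```

This trades speed for predictability: it is a plain Python loop, much like vectorize itself, but the function never sees a masked value and the result mask is exactly the union of the input masks.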