Remove nan and the corresponding elements from two arrays of the same length

Date: 2021-01-03 12:47:36

I have two lists of the same length that I can convert into arrays to use with scipy.stats.pearsonr. Some elements of these lists are nan and therefore cannot be used with that function. The best thing to do in my case is to remove those elements, along with the corresponding elements in the other list. Is there a practical and Pythonic way to do it? Example: I have

[1 2 nan 4 5 6 ] and [1 nan 3 nan 5 6]

and in the end I need

[1 5 6 ]

[1 5 6 ]

(Here the numbers represent positions/indices, not the actual values I am dealing with.) EDIT: The tricky part is that a position must be dropped from both lists/arrays whenever either one has a nan there: the nans in one array, the elements corresponding to them in the other, and vice versa. Although it can certainly be done by manipulating the arrays, I am sure there is a clear, not overcomplicated, Pythonic way to do it.

1 Answer

#1


5  

The accepted answer to the proposed duplicate gets you halfway there. Since you're using NumPy already, you should make these into NumPy arrays. Then generate an indexing expression and use it to index both arrays. Here indices will be a new array of bool of the same shape, where each element is True iff not (the respective element in x is nan or the respective element in y is nan):

>>> import numpy as np
>>> x
array([  1.,   2.,  nan,   4.,   5.,   6.])
>>> y
array([  1.,  nan,   3.,  nan,   5.,   6.])
>>> indices = np.logical_not(np.logical_or(np.isnan(x), np.isnan(y)))
>>> x = x[indices]
>>> y = y[indices]
>>> x
array([ 1.,  5.,  6.])
>>> y
array([ 1.,  5.,  6.])

Notably, this works for any two arrays of the same shape.
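
As a minimal sketch, the steps above can be wrapped in a small helper that starts from plain lists, as in the question. The name drop_nan_pairs is hypothetical (it is not a NumPy function), and np.corrcoef stands in for the pearsonr call mentioned in the question only so the example needs nothing beyond NumPy.

```python
import numpy as np


def drop_nan_pairs(x, y):
    """Hypothetical helper: drop every position where x or y is nan."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("x and y must have the same shape")
    # True where BOTH arrays hold a usable (non-nan) value.
    keep = np.logical_not(np.logical_or(np.isnan(x), np.isnan(y)))
    return x[keep], y[keep]


nan = float("nan")
x_clean, y_clean = drop_nan_pairs([1, 2, nan, 4, 5, 6], [1, nan, 3, nan, 5, 6])
print(x_clean)  # [1. 5. 6.]
print(y_clean)  # [1. 5. 6.]

# Assumption: np.corrcoef is used here in place of pearsonr to stay NumPy-only.
r = np.corrcoef(x_clean, y_clean)[0, 1]
```

Note that indexing with a boolean mask always returns a one-dimensional array, so for inputs with more than one dimension the surviving elements come back flattened.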

P.S., if you know that the element type in the operand arrays is boolean, as is the case for arrays returned from isnan here, you can use ~ instead of logical_not and | instead of logical_or: indices = ~(np.isnan(x) | np.isnan(y))
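
To illustrate why the "boolean" condition matters, here is a short sketch comparing the two spellings. The variable names (mask_functions, mask_operators, int_flags) are made up for this example. On boolean arrays the operators and the logical_* functions agree, but on an integer array ~ is a bitwise NOT and gives a different result.

```python
import numpy as np

x = np.array([1.0, 2.0, np.nan, 4.0, 5.0, 6.0])
y = np.array([1.0, np.nan, 3.0, np.nan, 5.0, 6.0])

# np.isnan returns boolean arrays, so both spellings build the same mask.
mask_functions = np.logical_not(np.logical_or(np.isnan(x), np.isnan(y)))
mask_operators = ~(np.isnan(x) | np.isnan(y))
print(np.array_equal(mask_functions, mask_operators))  # True

# On a non-boolean array the two are NOT interchangeable:
int_flags = np.array([0, 1])
print(np.logical_not(int_flags))  # [ True False]
print(~int_flags)                 # [-1 -2]  (bitwise NOT on integers)
```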
