Suppose I have an N*M*X-dimensional array "data", where N and M are fixed, but X is variable for each entry data[n][m].
(Edit: To clarify, I just used np.array() on the 3D python list which I used for reading in the data, so the numpy array is of dimensions N*M and its entries are variable-length lists)
I'd now like to compute the average over the X-dimension, so that I'm left with an N*M-dimensional array. Using np.average/np.mean with the axis argument doesn't work, so right now I'm just iterating over N and M and appending the manually computed average to a new list, but that just doesn't feel very "Pythonic":
avgData=[]
for n in data:
    temp=[]
    for m in n:
        temp.append(np.average(m))
    avgData.append(temp)
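For comparison, the same nested loop can be collapsed into a nested list comprehension. This is only a sketch: the sample `data` is made up for illustration, and it assumes Python 3 (true division) and that no cell is empty.

```python
# Hypothetical sample data: a 2 x 2 grid whose cells are lists of varying length.
data = [[[1, 2, 3], [0, 3, 2, 4]],
        [[0, 2], [1]]]

# One average per cell; the outer comprehension walks N, the inner one walks M.
avg_data = [[sum(cell) / len(cell) for cell in row] for row in data]
print(avg_data)
```

This does the same amount of work as the explicit loop; it just avoids the temporary list and the two append calls.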
Am I missing something obvious here? I'm trying to freshen up my python skills while I'm at it, so interesting/varied responses are more than welcome! :)
Thanks!
2 solutions
#1
3
What about using np.vectorize:
do_avg = np.vectorize(np.average)
data_2d = do_avg(data)
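A self-contained version of this idea might look like the sketch below. The sample cells and the way the object array is filled are assumptions for illustration; `otypes=[float]` is added so that np.vectorize does not have to call the function once up front just to guess the output type.

```python
import numpy as np

# Hypothetical sample: build a 2 x 2 object array and store one list per cell.
# Filling an np.empty(..., dtype=object) array keeps the N*M shape even if
# all lists happen to have the same length.
data = np.empty((2, 2), dtype=object)
cells = [[1, 2, 3], [0, 3, 2, 4], [0, 2], [1]]
for index, cell in zip(np.ndindex(data.shape), cells):
    data[index] = cell

# np.vectorize calls np.average once per cell; it is a convenience wrapper,
# not a compiled loop.
do_avg = np.vectorize(np.average, otypes=[float])
data_2d = do_avg(data)
print(data_2d)
```

The result is an ordinary float array with the same N*M shape as the input.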
#2
1
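import numpy as np
# newer NumPy versions reject ragged input unless dtype=object is given
data = np.array([[1,2,3],[0,3,2,4],[0,2],[1]], dtype=object).reshape(2,2)
avg = np.zeros(data.shape)
avg.flat = [np.average(x) for x in data.flat]
print(avg)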
#array([[ 2. , 2.25],
# [ 1. , 1. ]])
This still iterates over the elements of data (nothing un-Pythonic about that). But since there's nothing special about the shape or axes of data, I'm just using data.flat. While appending to a Python list works, with numpy it is better to assign values to the elements of an existing array.
There are fast numeric methods to work with numpy arrays, but most (if not all) work with simple numeric dtypes. Here the array elements are objects (either list or array), so numpy has to resort to the usual Python iteration and list operations.
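To illustrate the point about numeric dtypes, one possible workaround (not part of either answer) is to pad the ragged cells with NaN into a regular float array and let np.nanmean reduce along the last axis. The sample data and variable names below are assumptions, and the sketch presumes the real values contain no NaN and no cell is empty.

```python
import numpy as np

# Hypothetical ragged input: N = 2, M = 2, variable-length cells.
ragged = [[[1, 2, 3], [0, 3, 2, 4]],
          [[0, 2], [1]]]

n, m = len(ragged), len(ragged[0])
max_len = max(len(cell) for row in ragged for cell in row)

# Regular float array, pre-filled with NaN as the padding value.
padded = np.full((n, m, max_len), np.nan)
for i, row in enumerate(ragged):
    for j, cell in enumerate(row):
        padded[i, j, :len(cell)] = cell

# nanmean ignores the NaN padding, so each cell is averaged over its own length.
avg = np.nanmean(padded, axis=2)
print(avg)
```

Filling the padded array still loops in Python, and the padding costs memory when cell lengths vary widely, so this only pays off if several reductions are run on the same data.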
For this small example, this solution is a bit faster than Zwicker's vectorize. For larger data the two solutions take about the same time.
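The timing claim can be checked with something like the sketch below. The helper names, array size, and repeat count are all made up for illustration, and the numbers will vary by machine and NumPy version, so no expected timings are shown.

```python
import timeit
import numpy as np

def make_data(n, m, max_len, seed=0):
    """Hypothetical helper: an n x m object array of random-length lists."""
    rng = np.random.default_rng(seed)
    data = np.empty((n, m), dtype=object)
    for index in np.ndindex(n, m):
        length = rng.integers(1, max_len + 1)
        data[index] = rng.random(length).tolist()
    return data

def with_vectorize(data):
    return np.vectorize(np.average, otypes=[float])(data)

def with_flat(data):
    avg = np.zeros(data.shape)
    avg.flat = [np.average(x) for x in data.flat]
    return avg

data = make_data(50, 40, 20)

# Both approaches should agree before their timings are compared.
assert np.allclose(with_vectorize(data), with_flat(data))

for func in (with_vectorize, with_flat):
    seconds = timeit.timeit(lambda: func(data), number=20)
    print(f"{func.__name__}: {seconds:.4f} s for 20 runs")
```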