Initializing multiple NumPy arrays (multiple assignment) - like MATLAB deal()

Time: 2022-08-12 21:26:47

I was unable to find anything describing how to do this, which leads me to believe I'm not doing this in the proper idiomatic Python way. Advice on the 'proper' Python way to do this would also be appreciated.


I have a bunch of variables for a datalogger I'm writing (arbitrary logging length, with a known maximum length). In MATLAB, I would initialize them all as 1-D arrays of zeros of length n, n bigger than the number of entries I would ever see, assign each individual element variable(measurement_no) = data_point in the logging loop, and trim off the extraneous zeros when the measurement was over. The initialization would look like this:

[dData gData cTotalEnergy cResFinal etc] = deal(zeros(n,1));

Is there a way to do this in Python/NumPy so I don't have to put each variable on its own line:

dData = np.zeros(n)
gData = np.zeros(n)

I would also prefer not to just make one big matrix, because keeping track of which column is which variable is unpleasant. Perhaps the solution is to make the (length x numvars) matrix, and assign the column slices out to individual variables?

EDIT: Assume I'm going to have a lot of vectors of the same length by the time this is over; e.g., my post-processing takes each log file, calculates a bunch of separate metrics (>50), stores them, and repeats until the logs are all processed. Then I generate histograms, means/maxes/sigmas/etc. for all the various metrics I computed. Since initializing 50+ vectors is clearly not easy in Python, what's the best (cleanest code and decent performance) way of doing this?

3 answers



If you're really motivated to do this in a one-liner you could create an (n_vars, ...) array of zeros, then unpack it along the first dimension:


a, b, c = np.zeros((3, 5))
print(a is b)
# False
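As a small sketch of what that unpacking produces (the names `block`, `n_vars` and `n` are only illustrative): each unpacked name is a distinct view onto its own row of one shared allocation, so writing through one name does not touch the others, but the values do land in the parent array.

```python
import numpy as np

n_vars, n = 3, 5
block = np.zeros((n_vars, n))   # one allocation holding every log
a, b, c = block                 # unpacking iterates over the first axis

a[0] = 1.0                      # writes through to block[0, 0]

print(a.base is block)          # True: a is a view, not a copy
print(np.shares_memory(a, b))   # False: the rows do not overlap
print(block[0, 0], b[0])        # 1.0 0.0
```

Keeping a reference to `block` is optional, but it gives you the "one big matrix" for bulk operations while the individual names stay readable.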

Another option is to use a list comprehension or a generator expression:


a, b, c = [np.zeros(5) for _ in range(3)]   # list comprehension
d, e, f = (np.zeros(5) for _ in range(3))   # generator expression
print(a is b, d is e)
# False False

Be careful, though! You might think that using the * operator on a list or tuple containing your call to np.zeros() would achieve the same thing, but it doesn't:


h, i, j = (np.zeros(5),) * 3
print(h is i)
# True

This is because the expression inside the tuple gets evaluated first. np.zeros(5) therefore only gets called once, and each element in the repeated tuple ends up being a reference to the same array. This is the same reason why you can't just use a = b = c = np.zeros(5).
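A short illustration of the consequence, reusing the names from the snippets above: with the repeated tuple a write through one name shows up under all of them, whereas the generator expression gives three independent arrays.

```python
import numpy as np

h, i, j = (np.zeros(5),) * 3    # one array, three names
h[0] = 42.0
print(i[0], j[0])               # 42.0 42.0

x, y, z = (np.zeros(5) for _ in range(3))   # three separate arrays
x[0] = 42.0
print(y[0], z[0])               # 0.0 0.0
```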

Unless you really need to assign a large number of empty array variables and you really care deeply about making your code compact (!), I would recommend initialising them on separate lines for readability.




There is nothing wrong or un-Pythonic with


dData = np.zeros(n)
gData = np.zeros(n)

You could put them on one line, but there's no particular reason to do so.


dData, gData = np.zeros(n), np.zeros(n)

Don't try dData = gData = np.zeros(n), because a change to dData changes gData (they point to the same object). For the same reason you usually don't want to use x = y = [].

The deal function in MATLAB is a convenience, but isn't magical. Here's how Octave implements it:


function [varargout] = deal (varargin)
  if (nargin == 0)
    print_usage ();
  elseif (nargin == 1 || nargin == nargout)
    varargout(1:nargout) = varargin;
  else
    error ("deal: nargin > 1 and nargin != nargout");
  endif
endfunction
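For comparison, a rough Python analogue of the single-input case might look like the sketch below. The helper name `deal` and its `count` parameter are hypothetical; nothing like it ships with NumPy. Python cannot infer the number of outputs the way `nargout` does, so the count is passed explicitly, and each output is an explicit copy.

```python
import copy
import numpy as np

def deal(value, count):
    """Hypothetical helper: return `count` independent copies of `value`."""
    return tuple(copy.deepcopy(value) for _ in range(count))

n = 4
dData, gData, cTotalEnergy = deal(np.zeros(n), 3)
dData[0] = 1.0
print(gData[0])   # 0.0
```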


In contrast to Python, in Octave (and presumably MATLAB)



assigns different objects to the 3 variables.


Notice also how MATLAB talks about deal as a way of assigning contents of cells and structure arrays. http://www.mathworks.com/company/newsletters/articles/whats-the-big-deal.html



If you put your data in a collections.defaultdict you won't need to do any explicit initialization. Everything will be initialized the first time it is used.


import numpy as np
import collections
n = 100
data = collections.defaultdict(lambda: np.zeros(n))
for i in range(1, n):
    data['g'][i] = data['d'][i - 1]
    # ...
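One thing to keep in mind with this approach: a defaultdict creates an entry on any lookup of a missing key, including a plain read, so a misspelled name silently becomes a new array of zeros. The sketch below shows that, and also one way to trim the unused tail once logging is finished. The counter `n_logged`, the key names and the sample values are made up for illustration.

```python
import collections
import numpy as np

n = 100                                   # known maximum log length
data = collections.defaultdict(lambda: np.zeros(n))

n_logged = 0
for value in (0.5, 1.5, 2.5):             # stand-in for real measurements
    data['dData'][n_logged] = value
    n_logged += 1

_ = data['typo']                          # a mere lookup creates this entry
print(sorted(data))                       # ['dData', 'typo']

# Trim every log to the number of points actually recorded.
trimmed = {name: arr[:n_logged] for name, arr in data.items()}
print(trimmed['dData'].mean())            # 1.5
```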



