I was unable to find anything describing how to do this, which leads me to believe I'm not doing this in the proper, idiomatic Python way. Advice on the 'proper' Python way to do this would also be appreciated.
I have a bunch of variables for a datalogger I'm writing (arbitrary logging length, with a known maximum length). In MATLAB, I would initialize them all as 1-D arrays of zeros of length n, with n larger than the number of entries I would ever see, assign each element with variable(measurement_no) = data_point in the logging loop, and trim off the extraneous zeros when the measurement was over. The initialization would look like this:
[dData gData cTotalEnergy cResFinal etc] = deal(zeros(n,1));
Is there a way to do this in Python/NumPy so that I don't have to put each variable on its own line:
dData = np.zeros(n)
gData = np.zeros(n)
etc.
I would also prefer not to just make one big matrix, because keeping track of which column is which variable is unpleasant. Perhaps the solution is to make the (length x numvars) matrix and assign the column slices out to individual variables?
EDIT: Assume I'm going to have a lot of vectors of the same length by the time this is over; e.g., my post-processing takes each log file, calculates a bunch of separate metrics (>50), stores them, and repeats until the logs are all processed. Then I generate histograms, means/maxes/sigmas/etc. for all the various metrics I computed. Since initializing 50+ vectors is clearly not easy in Python, what's the best (cleanest code and decent performance) way of doing this?
3 solutions
#1
11
If you're really motivated to do this in a one-liner, you could create an (n_vars, ...) array of zeros, then unpack it along the first dimension:
a, b, c = np.zeros((3, 5))
print(a is b)
# False
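One caveat worth knowing about this approach: the unpacked names are row views into a single 2-D array rather than separately allocated arrays. The rows do not overlap, so writing to one does not affect the others. A minimal sketch (the variable names are only for illustration):

```python
import numpy as np

# Unpacking iterates over the first axis, so each name is a view of one row.
a, b, c = np.zeros((3, 5))

a[0] = 1.0                        # modify only the first row

print(a.base is b.base)           # both views come from the same 2-D array
print(np.shares_memory(a, b))     # but the rows themselves do not overlap
print(b[0])                       # so b is unaffected by the write to a
```

If each variable should own its memory outright, one array per call to np.zeros is the safer choice.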
Another option is to use a list comprehension or a generator expression:
a, b, c = [np.zeros(5) for _ in range(3)] # list comprehension
d, e, f = (np.zeros(5) for _ in range(3)) # generator expression
print(a is b, d is e)
# False False
Be careful, though! You might think that using the * operator on a list or tuple containing your call to np.zeros() would achieve the same thing, but it doesn't:
h, i, j = (np.zeros(5),) * 3
print(h is i)
# True
This is because the expression inside the tuple gets evaluated first. np.zeros(5) therefore only gets called once, and each element in the repeated tuple ends up being a reference to the same array. This is the same reason why you can't just use a = b = c = np.zeros(5).
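To make the consequence concrete, here is a small sketch contrasting the two forms. With tuple repetition a write through one name shows up through the others, whereas the generator expression calls np.zeros once per name:

```python
import numpy as np

# Tuple repetition copies the reference, not the array.
h, i, j = (np.zeros(5),) * 3
h[0] = 42.0
print(i[0], j[0])   # the write through h is visible through i and j

# A generator expression calls np.zeros once per name: independent arrays.
p, q, r = (np.zeros(5) for _ in range(3))
p[0] = 42.0
print(q[0], r[0])   # q and r are untouched
```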
Unless you really need to assign a large number of empty array variables and you really care deeply about making your code compact (!), I would recommend initialising them on separate lines for readability.
#2
5
There is nothing wrong or un-Pythonic with:
dData = np.zeros(n)
gData = np.zeros(n)
etc.
You could put them on one line, but there's no particular reason to do so.
dData, gData = np.zeros(n), np.zeros(n)
Don't try dData = gData = np.zeros(n), because an in-place change to dData also changes gData (both names point to the same object). For the same reason you usually don't want to use x = y = [].
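A short sketch of what "the same object" means in practice. In-place writes and in-place operators are seen through both names, while an expression that builds a new array only rebinds the name on the left (n = 4 is an arbitrary length chosen for illustration):

```python
import numpy as np

n = 4  # illustrative length
dData = gData = np.zeros(n)   # both names refer to one array

dData[0] = 1.0                # in-place write: visible through gData too
print(gData[0])

dData += 1.0                  # in-place operator: still the same shared array
print(gData is dData)

dData = dData + 1.0           # builds a new array and rebinds dData only
print(gData is dData)
```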
The deal function in MATLAB is a convenience, but isn't magical. Here's how Octave implements it:
function [varargout] = deal (varargin)
  if (nargin == 0)
    print_usage ();
  elseif (nargin == 1 || nargin == nargout)
    varargout(1:nargout) = varargin;
  else
    error ("deal: nargin > 1 and nargin != nargout");
  endif
endfunction
In contrast to Python, in Octave (and presumably MATLAB)
one=two=three=zeros(1,3)
assigns different objects to the 3 variables.
Notice also how MATLAB talks about deal as a way of assigning contents of cells and structure arrays: http://www.mathworks.com/company/newsletters/articles/whats-the-big-deal.html
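If a deal-like call is still wanted on the Python side, a rough equivalent is easy to write. The helper below is hypothetical (it is not part of NumPy or the standard library) and takes a factory function rather than a value, so that every output gets its own array:

```python
import numpy as np

def deal(factory, count):
    """Hypothetical helper: call factory() once per output and return a tuple."""
    return tuple(factory() for _ in range(count))

n = 10  # illustrative length
dData, gData, cTotalEnergy, cResFinal = deal(lambda: np.zeros(n), 4)

dData[0] = 1.0
print(gData[0])   # independent arrays, so gData is unchanged
```

Passing a factory instead of an already-built array is the design choice that matters here: handing the same array to every output would reintroduce the aliasing problem described above.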
#3
0
If you put your data in a collections.defaultdict, you won't need to do any explicit initialization. Everything will be initialized the first time it is used.
import numpy as np
import collections
n = 100
data = collections.defaultdict(lambda: np.zeros(n))
for i in range(1, n):
    data['g'][i] = data['d'][i - 1]
    # ...
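One thing to keep in mind with defaultdict is that merely reading a missing key creates it, so a misspelled name silently yields a fresh array of zeros. As an alternative sketch for the 50+ metric case from the question, a plain dict built from an explicit list of names raises KeyError on typos and can be trimmed in one pass when logging ends. The name list, n, and n_logged below are placeholders for illustration, not values from the original post:

```python
import numpy as np

n = 100                                   # assumed maximum number of entries
names = ["dData", "gData", "cTotalEnergy", "cResFinal"]   # placeholder names

# One independent, preallocated array per metric.
data = {name: np.zeros(n) for name in names}

# Logging loop stand-in: fill only the first n_logged entries.
n_logged = 7
for k in range(n_logged):
    data["dData"][k] = k
    data["gData"][k] = 2 * k

# Trim the unused tail of every array once the measurement is over.
data = {name: values[:n_logged].copy() for name, values in data.items()}

print(data["gData"])
```

Keeping the arrays in a dict also makes the post-processing step straightforward, since the histograms and summary statistics can be computed by looping over data.items().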