I have a numpy array cols2:
print(type(cols2))
print(cols2.shape)
<class 'numpy.ndarray'>
(97, 2)
I was trying to get the first column of this 2D NumPy array using the first snippet below, but I got a 1D vector instead of the single column of data I wanted. The second snippet seems to give me the result I want, but I am confused: what does the second snippet do by adding brackets around the zero?
print(type(cols2[:,0]))
print(cols2[:,0].shape)
<class 'numpy.ndarray'>
(97,)
print(type(cols2[:,[0]]))
print(cols2[:,[0]].shape)
<class 'numpy.ndarray'>
(97, 1)
2 Answers
#1
3
cols2[:, 0] specifies that you want to slice out a 1D vector of length 97 from a 2D array. cols2[:, [0]] specifies that you want to slice out a 2D sub-array of shape (97, 1) from the 2D array. The square brackets [] make all the difference here.
import numpy as np
v = np.arange(6).reshape(-1, 2)
v[:, 0]
array([0, 2, 4])
v[:, [0]]
array([[0],
       [2],
       [4]])
The fundamental difference is the extra dimension in the latter command (as you've noted). This is intended behaviour, as implemented in numpy.ndarray.__getitem__ and numpy.ndarray.__setitem__ and codified in the NumPy documentation.
You can also specify cols2[:, 0:1] to the same effect: a column sub-slice.
v[:, 0:1]
array([[0],
       [2],
       [4]])
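One practical difference between the slice form and the list form is worth noting. The sketch below reuses the v array from above (the names col_1d, col_slice and col_list are only illustrative) and uses np.shares_memory to check which results are views of v and which are copies:

```python
import numpy as np

v = np.arange(6).reshape(-1, 2)

col_1d = v[:, 0]       # integer index: the second axis is dropped
col_slice = v[:, 0:1]  # slice: the second axis is kept
col_list = v[:, [0]]   # list index (advanced indexing): the second axis is kept

print(col_1d.shape, col_slice.shape, col_list.shape)  # (3,) (3, 1) (3, 1)

# Basic indexing (integers and slices) returns a view of v,
# while advanced indexing (a list of indices) returns a copy.
print(np.shares_memory(v, col_slice))  # True
print(np.shares_memory(v, col_list))   # False
```

So if you intend to modify the column in place and have the change show up in the original array, the slice form 0:1 is probably the one you want.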
For more information, look at the notes on Advanced Indexing in the NumPy docs.
#2
0
The extra square brackets around 0 in cols2[:, [0]] add an extra dimension to the result.
This becomes clearer when you print the results of your code:
A = np.array([[1, 2],
              [3, 4],
              [5, 6]])
A.shape        # (3, 2)
A[:, 0].shape  # (3,)
A[:, 0]        # array([1, 3, 5])
A[:, [0]]
# array([[1],
#        [3],
#        [5]])
An n-D numpy array can only use n integers to represent its shape. Therefore, a 1D array is represented by only a single integer. There is no concept of "rows" or "columns" of a 1D array.
You should resist the urge to think of NumPy arrays as having rows and columns, and instead consider them as having dimensions and a shape. This is a fundamental difference between numpy.array and numpy.matrix. In almost all cases, numpy.array is sufficient.
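To make the dimensions-and-shape view concrete, here is a small sketch using the same A as above. The names col, as_column_a, as_column_b and M are just for illustration. It shows the 1D result, two common ways to turn it back into a (3, 1) array, and how numpy.matrix differs by always staying 2D:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4],
              [5, 6]])

col = A[:, 0]
print(col.ndim, col.shape)  # 1 (3,)

# Two common ways to restore the second axis on a 1D array
as_column_a = col.reshape(-1, 1)
as_column_b = col[:, np.newaxis]
print(as_column_a.shape, as_column_b.shape)  # (3, 1) (3, 1)

# A numpy.matrix is always 2D, so integer indexing keeps both axes
M = np.matrix(A)
print(M[:, 0].shape)  # (3, 1)
```

The number of entries in .shape always equals .ndim, which is why a 1D array has no separate notion of "row" or "column".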