numpy数组的索引和切片

时间:2024-10-08 15:03:32

numpy数组的索引和切片

基本切片操作

>>> import numpy as np
>>> arr=np.arange(10)
>>> arr
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> arr[5]
5
>>> arr[5:8]
array([5, 6, 7])

切片赋值操作

1.切片赋一个值对应原来数组中的值也会变

>>> arr[5:8]=12
>>> arr
array([ 0, 1, 2, 3, 4, 12, 12, 12, 8, 9])
>>> import numpy as np
>>> arr=np.arange(10)
>>> arr_slice=arr[5:8]
>>> arr_slice[0]=-1
>>> arr_slice
array([-1, 6, 7])
>>> arr
array([ 0, 1, 2, 3, 4, -1, 6, 7, 8, 9])

2.给数组中所有元素赋值

>>> arr[:]=-1
>>> arr
array([-1, -1, -1, -1, -1, -1, -1, -1, -1, -1])

3.如果想使用复制的方法,使用copy方法

>>> arr_copy=arr[:].copy()
>>> arr_copy
array([-1, -1, -1, -1, -1, -1, -1, -1, -1, -1])
>>> arr_copy[:]=0
>>> arr_copy
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
>>> arr
array([-1, -1, -1, -1, -1, -1, -1, -1, -1, -1])

高阶数组索引

>>> import numpy as np
>>> arr2d=np.array([[1,2,3],[4,5,6],[7,8,9]])
>>> arr2d[2]
array([7, 8, 9])
>>> arr2d[0][2]
3
>>> arr2d[0,2]
3

numpy数组的索引和切片

>>> import numpy as np
>>> arr2d=np.array([[1,2,3],[4,5,6],[7,8,9]])
>>> arr2d[2]
array([7, 8, 9])
>>> arr2d[0][2]
3
>>> arr2d[0,2]
3
>>> arr3d=np.array([[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]]])
>>> arr3d
array([[[ 1, 2, 3],
[ 4, 5, 6]], [[ 7, 8, 9],
[10, 11, 12]]])
>>> arr3d[0]
array([[1, 2, 3],
[4, 5, 6]])
>>> old_values=arr3d[0].copy()
>>> arr3d[0]=42
>>> arr3d
array([[[42, 42, 42],
[42, 42, 42]], [[ 7, 8, 9],
[10, 11, 12]]])
>>> arr3d[1,0]
array([7, 8, 9])
>>> x=arr3d[1]
>>> x
array([[ 7, 8, 9],
[10, 11, 12]])
>>> x[0]
array([7, 8, 9])

高维数组切片

>>> arr2d[:2]
array([[1, 2, 3],
[4, 5, 6]])
>>> arr2d[:2,1:]
array([[2, 3],
[5, 6]])
>>> arr2d[1,:2]
array([4, 5])
>>> arr2d[:2,2]
array([3, 6])
>>> arr2d[:,:1]
array([[1],
[4],
[7]])

numpy数组的索引和切片

布尔型索引

1.假设我们有一个用于存储数据的数组以及一个存储姓名的数组(含有重复项)。在这里,我将使用numpy.random中的randn函数生成一些正态分布的随机数据:

>>> import numpy as np
>>> names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
>>> data=np.random.randn(7,4)#7行4列正太分布随机数组
>>> names
array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'], dtype='<U4')
>>> data
array([[ 0.24724057, 2.86939948, -0.82061782, -0.65745818],
[-0.98602372, -0.69305692, -1.44431904, -0.85490816],
[-0.73613349, 0.12700976, -1.00588979, 1.10646269],
[ 1.59110894, 1.68597758, 0.39414277, 2.02308399],
[-1.05607115, -0.50354292, -0.65820553, -0.77610316],
[ 1.72237936, -0.07726577, 1.63462647, -0.41943148],
[ 0.66744687, -1.01756773, -0.59254343, 0.19080575]])

2.假设每个名字都对应data数组中的一行,而我们想要选出对应于名字"Bob"的所有行。跟算术运算一样,数组的比较运算(如==)也是矢量化的。因此,对names和字符串"Bob"的比较运算将会产生一个布尔型数组:

>>> names=='Bob'
array([ True, False, False, True, False, False, False])

3.布尔数组可以用于数组的索引

获取等于'Bob'的行

>>> data[names=='Bob']
array([[ 0.24724057, 2.86939948, -0.82061782, -0.65745818],
[ 1.59110894, 1.68597758, 0.39414277, 2.02308399]])

获取不同于'Bob'的行

>>> data[names!='Bob']
array([[-0.98602372, -0.69305692, -1.44431904, -0.85490816],
[-0.73613349, 0.12700976, -1.00588979, 1.10646269],
[-1.05607115, -0.50354292, -0.65820553, -0.77610316],
[ 1.72237936, -0.07726577, 1.63462647, -0.41943148],
[ 0.66744687, -1.01756773, -0.59254343, 0.19080575]])

4.对布尔索引进行列索引

>>> data[names=='Bob',2:]
array([[-0.82061782, -0.65745818],
[ 0.39414277, 2.02308399]])
>>> data[names=='Bob',3]
array([-0.65745818, 2.02308399])

5.反转条件符

>>> cond=names=='Will'
>>> cond
array([False, False, True, False, True, False, False])
>>> data[~cond]
array([[ 0.24724057, 2.86939948, -0.82061782, -0.65745818],
[-0.98602372, -0.69305692, -1.44431904, -0.85490816],
[ 1.59110894, 1.68597758, 0.39414277, 2.02308399],
[ 1.72237936, -0.07726577, 1.63462647, -0.41943148],
[ 0.66744687, -1.01756773, -0.59254343, 0.19080575]])

6.布尔条件的运算

除此之外,连接符还有|、&之类

>>> mask=(names=='Bob')|(names=='Will')
>>> mask
array([ True, False, True, True, True, False, False])
>>> data[mask]
array([[ 0.24724057, 2.86939948, -0.82061782, -0.65745818],
[-0.73613349, 0.12700976, -1.00588979, 1.10646269],
[ 1.59110894, 1.68597758, 0.39414277, 2.02308399],
[-1.05607115, -0.50354292, -0.65820553, -0.77610316]])

7.条件选取

普通条件选取

>>> data[data<0]=0
>>> data
array([[0.24724057, 2.86939948, 0. , 0. ],
[0. , 0. , 0. , 0. ],
[0. , 0.12700976, 0. , 1.10646269],
[1.59110894, 1.68597758, 0.39414277, 2.02308399],
[0. , 0. , 0. , 0. ],
[1.72237936, 0. , 1.63462647, 0. ],
[0.66744687, 0. , 0. , 0.19080575]])

布尔条件选取

>>> import numpy as np
>>> names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
>>> data=np.random.randn(7,4)#7行4列正太分布随机数组
>>> data
array([[-1.24077681, -0.48320904, 1.22145611, 0.00666619],
[-0.65078721, -0.03482355, 1.74232625, 0.2979584 ],
[-1.51669752, 2.04245014, 0.09453898, -0.85531867],
[-1.51334497, 0.36947066, -0.87016919, 1.35107873],
[-1.11285867, -2.20906849, 0.38269412, 1.85375798],
[ 0.95132554, -1.54193589, 1.98741745, -0.60608077],
[ 0.78902133, 1.41593836, 0.09430052, -0.25057659]])
>>> data[names!='Joe']=7
>>> data
array([[ 7. , 7. , 7. , 7. ],
[-0.65078721, -0.03482355, 1.74232625, 0.2979584 ],
[ 7. , 7. , 7. , 7. ],
[ 7. , 7. , 7. , 7. ],
[ 7. , 7. , 7. , 7. ],
[ 0.95132554, -1.54193589, 1.98741745, -0.60608077],
[ 0.78902133, 1.41593836, 0.09430052, -0.25057659]])
>>>

花式索引

1.传入单个索引数组

>>> import numpy as np
>>> arr=np.empty((8,4))#创建8行4列内容为随机值的数组
>>> arr
array([[2.65577744e-260, 7.70858946e+218, 6.01334668e-154,
4.47593816e-091],
[7.01413727e-009, 2.96905203e+222, 2.11672643e+214,
4.56532297e-085],
[4.78409596e+180, 2.44001263e-152, 2.45981714e-154,
6.83528875e+212],
[6.14829725e-071, 1.05161522e-153, 1.05135742e-153,
2.43902457e-154],
[4.83245960e+276, 6.03103052e-154, 7.06652000e-096,
2.65862875e-260],
[1.76380220e+241, 2.30576063e-310, 9.80013217e+040,
1.55850644e-312],
[1.33360318e+241, 4.09842267e-310, 2.48721655e-075,
1.04922745e-312],
[1.91217285e-309, 1.18182126e-125, 6.57144273e-299,
5.54240979e-302]])
>>> for i in range(8):
arr[i]=i >>> arr
array([[0., 0., 0., 0.],
[1., 1., 1., 1.],
[2., 2., 2., 2.],
[3., 3., 3., 3.],
[4., 4., 4., 4.],
[5., 5., 5., 5.],
[6., 6., 6., 6.],
[7., 7., 7., 7.]])
>>> arr[[4,3,0,6]]#选特定的索引下标,选取第4,3,0,6行
array([[4., 4., 4., 4.],
[3., 3., 3., 3.],
[0., 0., 0., 0.],
[6., 6., 6., 6.]])
>>> arr[[-3,-5,-7]]#选择特定的索引下标,选取第-3,-5,-7列
array([[5., 5., 5., 5.],
[3., 3., 3., 3.],
[1., 1., 1., 1.]])

2.传入多个索引数组

>>> arr=np.arange(32).reshape((8,4))
>>> arr
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23],
[24, 25, 26, 27],
[28, 29, 30, 31]])
>>> arr[[1,5,7,2],[0,3,1,2]]#选取(1,0),(5,3),(7,1),(2,2)对应元素
array([ 4, 23, 29, 10])
>>> arr[[1,5,7,2]][:,[0,3,1,2]]#先选取第1,5,7,2行,再将每行按照0,3,1,2这个顺序交换
array([[ 4, 7, 5, 6],
[20, 23, 21, 22],
[28, 31, 29, 30],
[ 8, 11, 9, 10]])