Extracting Python "native" values from a numpy structured array

Date: 2021-08-22 21:32:00

I have a structured numpy array.

The numpy structure matches the type google.protobuf.Timestamp.

I need to extract the seconds (int64) and nanos (int32) fields from each element of the array and assign them to a real Timestamp message.

Below I list a script that does just that in a convenient way for anyone to test (numpy and protobuf Python modules need to be installed).

How do I get rid of (or work around) the TypeError noted near the end of the script and get the values from the numpy structure into the Timestamp variable?

import numpy as np
from google.protobuf import timestamp_pb2

# numpy structure that mimics google.protobuf.Timestamp
Timestamp_t = np.dtype([('seconds', np.int64), ('nanos', np.int32)])

# populate numpy array with above structure
x_values_size = 3
x_values = np.empty((x_values_size,), dtype=Timestamp_t)
x_values['seconds'] = np.linspace(0, 100, num=x_values_size, dtype=np.int64)
x_values['nanos']   = np.linspace(0, 10, num=x_values_size, dtype=np.int32)

# copy data from numpy structured array to a descriptor-created Timestamp
for elem in np.nditer(x_values) :
    # destination protobuf structure (actually, part of some sequence)
    # try 1: this will actually change the type of 'ts'
    ts1 = timestamp_pb2.Timestamp()
    print(type(ts1)) # Timestamp as expected
    ts1 = elem
    print(ts1) # now a numpy.ndarray
    print(type(ts1))
    print(ts1.dtype)

    # try 2: assign member by member
    ts2 = timestamp_pb2.Timestamp()
    # fails with:
    # TypeError: array(0, dtype=int64) has type <class 'numpy.ndarray'>, but expected one of: (<class 'int'>,)
    ts2.seconds = elem['seconds']
    ts2.nanos = elem['nanos']
    print("-----")

Disclaimer: hardcore newbie when it comes to python and numpy arrays.

1 answer

#1


So your array is:

In [112]: x_values
Out[112]: 
array([(  0,  0), ( 50,  5), (100, 10)], 
      dtype=[('seconds', '<i8'), ('nanos', '<i4')])

I don't usually recommend using nditer unless you need special behavior. Simple iteration over the array (over rows, if it is 2-D) is usually all you need. But to better understand what is happening, let's compare the two iteration methods:

In [114]: for elem in np.nditer(x_values):
     ...:     print(elem, elem.dtype)
     ...:     print(type(elem))   
(0, 0) [('seconds', '<i8'), ('nanos', '<i4')]
<class 'numpy.ndarray'>
(50, 5) [('seconds', '<i8'), ('nanos', '<i4')]
<class 'numpy.ndarray'>
(100, 10) [('seconds', '<i8'), ('nanos', '<i4')]
<class 'numpy.ndarray'>

In [115]: for elem in x_values:
     ...:     print(elem, elem.dtype)
     ...:     print(type(elem))
(0, 0) [('seconds', '<i8'), ('nanos', '<i4')]
<class 'numpy.void'>
(50, 5) [('seconds', '<i8'), ('nanos', '<i4')]
<class 'numpy.void'>
(100, 10) [('seconds', '<i8'), ('nanos', '<i4')]
<class 'numpy.void'>

The output is the same except for the element type: np.nditer yields 0-d np.ndarray elements, while plain iteration yields np.void records. The nditer variable is easier to modify in place.

Now do the same, but look at a single field:

In [119]: for elem in np.nditer(x_values):
     ...:     print(elem['seconds'], type(elem['seconds']))   
0 <class 'numpy.ndarray'>
50 <class 'numpy.ndarray'>
100 <class 'numpy.ndarray'>

In [120]: for elem in x_values:
     ...:     print(elem['seconds'], type(elem['seconds']))
0 <class 'numpy.int64'>
50 <class 'numpy.int64'>
100 <class 'numpy.int64'>
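
Both np.int64 and the 0-d ndarray are still numpy objects rather than plain Python ints. As a rough sketch of how to unwrap them (the variable names here are mine, not from the question), .item() on a field or record and .tolist() on the whole array should give native Python values:

```python
import numpy as np

# Same layout as the dtype in the question
Timestamp_t = np.dtype([('seconds', np.int64), ('nanos', np.int32)])
x_values = np.array([(0, 0), (50, 5), (100, 10)], dtype=Timestamp_t)

# Plain indexing/iteration yields np.void records; a field is a numpy scalar
rec = x_values[1]
sec_scalar = rec['seconds']          # numpy.int64
sec_native = rec['seconds'].item()   # plain Python int
rec_native = rec.item()              # tuple of plain Python values

# nditer yields 0-d arrays; .item() unwraps those as well
nditer_native = [elem['seconds'].item() for elem in np.nditer(x_values)]

# Whole array at once: a list of tuples of plain Python ints
all_native = x_values.tolist()

print(type(sec_scalar), type(sec_native))
print(rec_native, nditer_native, all_native)
```

The .tolist() form is handy when every record is needed anyway, since it converts everything in one call instead of field by field.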

I don't have the protobuf code, but I suspect

我没有protobuf代码,但我怀疑

ts2.seconds = elem['seconds']

will work better with the second iteration method, the one that produces np.int64 values. Or call .item() on the field, as in elem['seconds'].item(), which returns a plain Python int.
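
Putting that together, here is a minimal sketch of the whole copy loop, assuming the protobuf package is installed. The names rec and stamps are illustrative, and I use int() for the conversion, which should be equivalent to .item() for these integer fields:

```python
import numpy as np
from google.protobuf import timestamp_pb2

# Same layout as the dtype in the question
Timestamp_t = np.dtype([('seconds', np.int64), ('nanos', np.int32)])
x_values = np.array([(0, 0), (50, 5), (100, 10)], dtype=Timestamp_t)

stamps = []
for rec in x_values:                  # plain iteration, not np.nditer
    ts = timestamp_pb2.Timestamp()
    # int() turns the numpy scalar into a plain Python int,
    # which is the type the protobuf setter asks for in the error message
    ts.seconds = int(rec['seconds'])
    ts.nanos = int(rec['nanos'])
    stamps.append(ts)

for ts in stamps:
    print(ts.seconds, ts.nanos)
```

Converting explicitly keeps the loop independent of how strict a given protobuf version is about numpy scalar types.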
