How do I get data from an np.array into a std::vector in C++?

Time: 2022-06-25 16:35:37

This is my first question on this site.

First of all, I need to write a Python module in C++ that exposes a single function and works with NumPy through <numpy/arrayobject.h>. The function takes one NumPy array and returns two NumPy arrays. All arrays are one-dimensional.

The first question is how to get the data out of a NumPy array. I want to collect the array's contents in a std::vector so that I can easily work with them in C++.

The second: am I right that the function should return a tuple of arrays, so that a user of my module can write arr1, arr2 = foo(arr) in Python? And how do I return such a tuple?

Thank you very much.

1 solution

#1


NumPy includes lots of functions and macros that make it pretty easy to access the data of an ndarray object within a C or C++ extension. Given a 1D ndarray called v, one can access element i with PyArray_GETPTR1(v, i). So if you want to copy each element in the array to a std::vector of the same type, you can iterate over each element and copy it, like so (I'm assuming an array of doubles):

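npy_intp vsize = PyArray_SIZE(v);  // v is a PyArrayObject* holding doubles
std::vector<double> out(vsize);
for (npy_intp i = 0; i < vsize; i++) {  // npy_intp, not int, so large arrays do not overflow the index
    // PyArray_GETPTR1 returns a void* to element i, taking the array's stride into account
    out[i] = *reinterpret_cast<double*>(PyArray_GETPTR1(v, i));
}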
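To make the loop above less of a black box: PyArray_GETPTR1(v, i) computes the address of element i as the array's base pointer plus i times its first stride in bytes, which is why the loop also works for strided views such as arr[::2]. The sketch below imitates that arithmetic in plain C++ without NumPy; StridedView and copy_strided are hypothetical names made up for illustration, not part of the NumPy C API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the raw layout of a 1-D ndarray.
struct StridedView {
    const char* data;       // plays the role of PyArray_BYTES(v)
    std::ptrdiff_t size;    // plays the role of PyArray_SIZE(v)
    std::ptrdiff_t stride;  // plays the role of PyArray_STRIDES(v)[0], in bytes
};

// Copy every element into a std::vector<double>, stepping by the stride.
std::vector<double> copy_strided(const StridedView& v) {
    assert(v.size >= 0);
    std::vector<double> out(static_cast<std::size_t>(v.size));
    for (std::ptrdiff_t i = 0; i < v.size; ++i) {
        // Same address arithmetic as PyArray_GETPTR1(v, i).
        const char* p = v.data + i * v.stride;
        // Copying one element with memcpy stays well-defined even if p is misaligned.
        std::memcpy(&out[static_cast<std::size_t>(i)], p, sizeof(double));
    }
    return out;
}
```

A view over every other element of a six-element buffer would use a size of 3 and a stride of 2 * sizeof(double), and the copy then holds elements 0, 2 and 4 of the buffer.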

One could also do a bulk memcpy-like operation, but keep in mind that a NumPy ndarray may be non-contiguous (a strided view, for example), misaligned for its data type, or stored in non-native byte order, any of which makes a raw byte copy incorrect. Assuming you have ruled these out, one could do:

// Only valid when v is C-contiguous, aligned, and in native byte order
npy_intp vsize = PyArray_SIZE(v);
std::vector<double> out(vsize);
std::memcpy(out.data(), PyArray_DATA(v), sizeof(double) * vsize);  // needs <cstring>

Using either approach, out now contains a copy of the ndarray's data, and you can manipulate it however you like. Keep in mind that, unless you really need the data as a std::vector, the NumPy C API may be perfectly fine to use in your extension as a way to access and manipulate the data. That is, unless you need to pass the data to some other function which must take a std::vector or you want to use C++ library code that relies on std::vector, I'd consider doing all your processing directly on the native array types.

As to your last question, one generally uses Py_BuildValue to construct the tuple that your extension function returns. Your tuple would just contain two ndarray objects.
