This is my first question on this site.
First of all, I need to write a Python module in C++ that exposes a single function. It must work with NumPy through <numpy/arrayobject.h>. The function takes one NumPy array and returns two NumPy arrays. All arrays are one-dimensional.
The first question is: how do I get the data out of a NumPy array? I want to collect the contents of the array in a std::vector so that I can easily work with it in C++.
The second: am I right that the function should return a tuple of arrays, so that a user of my module can write arr1, arr2 = foo(arr) in Python? And how do I return a tuple like that?
Thank you very much.
1 Answer
NumPy includes many functions and macros that make it fairly easy to access the data of an ndarray object within a C or C++ extension. Given a 1D ndarray called v (a PyArrayObject*), one can get a pointer to element i with PyArray_GETPTR1(v, i). So if you want to copy each element of the array into a std::vector of the same type, you can iterate over the elements and copy them one by one, like so (I'm assuming an array of doubles):
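// v is assumed to be a PyArrayObject* holding a 1-D array of doubles.
npy_intp vsize = PyArray_SIZE(v);
std::vector<double> out(vsize);
// Index with npy_intp so the loop variable matches the type of vsize.
for (npy_intp i = 0; i < vsize; i++) {
    // PyArray_GETPTR1 follows the array's stride and returns a void*.
    out[i] = *static_cast<double*>(PyArray_GETPTR1(v, i));
}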
One could also do a bulk memcpy-like operation, but keep in mind that NumPy ndarrays may be misaligned for the data type, have non-native byte order, or have other subtle attributes (a non-contiguous layout, for example) that make such copies less than desirable. But assuming that you are aware of these, one could do:
// Only valid if v is a contiguous, aligned, native-byte-order array of doubles.
npy_intp vsize = PyArray_SIZE(v);
std::vector<double> out(vsize);
std::memcpy(out.data(), PyArray_DATA(v), sizeof(double) * vsize);
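To make the difference between the two copies concrete, here is a minimal sketch that does not use NumPy at all. StridedView, copy_strided and copy_bulk are hypothetical names invented for this illustration; the struct only models the three facts a 1-D ndarray carries (base pointer, stride in bytes, element count).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a 1-D ndarray of doubles.
struct StridedView {
    const char* base;             // address of element 0
    std::ptrdiff_t stride_bytes;  // distance between consecutive elements
    std::size_t size;             // number of elements
};

// Element-wise copy that follows the stride. The address computation
// base + i * stride is the same one PyArray_GETPTR1 performs.
std::vector<double> copy_strided(const StridedView& v) {
    std::vector<double> out(v.size);
    for (std::size_t i = 0; i < v.size; ++i) {
        const char* p = v.base + static_cast<std::ptrdiff_t>(i) * v.stride_bytes;
        // Copying one element with memcpy also tolerates misaligned data.
        std::memcpy(&out[i], p, sizeof(double));
    }
    return out;
}

// Bulk copy. Only correct when the elements are packed back to back.
std::vector<double> copy_bulk(const StridedView& v) {
    assert(v.stride_bytes == static_cast<std::ptrdiff_t>(sizeof(double)));
    std::vector<double> out(v.size);
    if (!out.empty()) {
        std::memcpy(out.data(), v.base, sizeof(double) * v.size);
    }
    return out;
}
```

A view with a stride of two doubles over a buffer holding 0 through 5 yields 0, 2, 4 from copy_strided, which is what a slice like arr[::2] looks like from the C side. copy_bulk would read the wrong elements for that view, hence the assertion.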
Using either approach, out now contains a copy of the ndarray's data, and you can manipulate it however you like. Keep in mind that, unless you really need the data as a std::vector, the NumPy C API may be perfectly fine to use in your extension as a way to access and manipulate the data. That is, unless you need to pass the data to some other function that must take a std::vector, or you want to use C++ library code that relies on std::vector, I'd consider doing all your processing directly on the native array types.
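One way to keep that option open is to write the numeric core against a pointer and a length rather than a std::vector. The sketch below is only an illustration; sum_of_squares is a hypothetical kernel, not something from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel written against pointer + length, so it does not care
// whether the memory belongs to a std::vector or to an ndarray.
double sum_of_squares(const double* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i] * data[i];
    }
    return total;
}

// Thin overload for callers that already hold a std::vector.
double sum_of_squares(const std::vector<double>& v) {
    return sum_of_squares(v.data(), v.size());
}
```

Inside an extension, the first overload could be called with static_cast<const double*>(PyArray_DATA(v)) and PyArray_SIZE(v), with no intermediate copy, provided v is a contiguous, aligned, native-byte-order array of doubles.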
As to your last question: yes, return a tuple. One generally uses Py_BuildValue to construct the tuple that an extension function returns. Your tuple would just contain two ndarray objects, and Python callers can then unpack it as arr1, arr2 = foo(arr).
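Putting the pieces together, a complete extension function might look roughly like the sketch below. Several things here are assumptions made for illustration: the module name example, the helper to_ndarray, and the placeholder computation (splitting the input into negative and non-negative values), since the question does not say what foo actually computes. It uses PyArray_FROM_OTF to obtain a contiguous, aligned array of doubles up front, and Py_BuildValue from the CPython API to build the tuple. It needs the Python and NumPy headers to compile.

```cpp
#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
#include <Python.h>
#include <numpy/arrayobject.h>

#include <algorithm>
#include <vector>

// Hypothetical helper: new 1-D float64 ndarray holding a copy of src.
// Returns a new reference, or nullptr with a Python error set.
static PyObject* to_ndarray(const std::vector<double>& src) {
    npy_intp dims[1] = {static_cast<npy_intp>(src.size())};
    PyObject* arr = PyArray_SimpleNew(1, dims, NPY_DOUBLE);
    if (arr != nullptr) {
        double* dst = static_cast<double*>(
            PyArray_DATA(reinterpret_cast<PyArrayObject*>(arr)));
        std::copy(src.begin(), src.end(), dst);
    }
    return arr;
}

// foo(arr) -> (arr1, arr2)
static PyObject* foo(PyObject* /*self*/, PyObject* args) {
    PyObject* input = nullptr;
    if (!PyArg_ParseTuple(args, "O", &input)) {
        return nullptr;
    }

    // Convert to an aligned, C-contiguous array of doubles (new reference;
    // NumPy copies only if the input does not already qualify).
    PyArrayObject* v = reinterpret_cast<PyArrayObject*>(
        PyArray_FROM_OTF(input, NPY_DOUBLE, NPY_ARRAY_IN_ARRAY));
    if (v == nullptr) {
        return nullptr;
    }
    if (PyArray_NDIM(v) != 1) {
        Py_DECREF(v);
        PyErr_SetString(PyExc_ValueError, "foo expects a 1-D array");
        return nullptr;
    }

    // Placeholder computation: split into negatives and non-negatives.
    const double* data = static_cast<const double*>(PyArray_DATA(v));
    const npy_intp n = PyArray_SIZE(v);
    std::vector<double> neg, nonneg;
    for (npy_intp i = 0; i < n; ++i) {
        if (data[i] < 0.0) {
            neg.push_back(data[i]);
        } else {
            nonneg.push_back(data[i]);
        }
    }
    Py_DECREF(v);

    PyObject* a1 = to_ndarray(neg);
    PyObject* a2 = to_ndarray(nonneg);
    if (a1 == nullptr || a2 == nullptr) {
        Py_XDECREF(a1);
        Py_XDECREF(a2);
        return nullptr;
    }
    // "N" hands our references over to the tuple; "O" would add new
    // references on top of ours and leak both arrays.
    return Py_BuildValue("NN", a1, a2);
}

static PyMethodDef methods[] = {
    {"foo", foo, METH_VARARGS, "Split a 1-D array into two arrays."},
    {nullptr, nullptr, 0, nullptr}
};

static struct PyModuleDef moduledef = {
    PyModuleDef_HEAD_INIT, "example", nullptr, -1, methods
};

PyMODINIT_FUNC PyInit_example(void) {
    import_array();  // must run before any other NumPy C API call
    return PyModule_Create(&moduledef);
}
```

Once built, Python code can call it as arr1, arr2 = example.foo(arr). Because the format string contains two items, Py_BuildValue returns a tuple rather than a single object.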